---
title: "AI Cost Math for Hospitals: Per-Seat vs Usage-Based in 2026"
slug: "ai-cost-math-for-hospitals-per-seat-vs-usage"
author: "ibl.ai Engineering"
date: "2026-05-30 10:00:00"
category: "Premium"
topics: "AI cost healthcare, hospital AI pricing, HIPAA AI, prior authorization AI, clinical documentation AI, per-seat vs usage-based, ChatGPT Enterprise healthcare, GPT-5 pricing, Claude Opus pricing, self-hosted healthcare AI"
summary: "What AI actually costs a hospital in 2026 — token pricing across the latest models (Claude Opus 4.7, GPT-5, Gemini 3 Pro, Llama 4), per-seat SaaS math, and why $60-per-clinician scales the wrong way for prior auth and clinical documentation."
banner: ""
thumbnail: ""
---

## Per-Seat Pricing Was Built for Software You Use Occasionally

A mid-size health system has 5,000 clinicians. ChatGPT Enterprise lists at around $60 per user per month. That's **$300,000 per month** — $3.6M per year — before a single prior-authorization letter is drafted.
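
The seat arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a pricing tool: `seat_cost` is a hypothetical helper name, and the $60 list price is the approximate figure quoted in this post.

```python
def seat_cost(seats: int, price_per_seat_month: float) -> tuple[float, float]:
    """Return (monthly, annual) cost of a per-seat license."""
    monthly = seats * price_per_seat_month
    return monthly, monthly * 12


# 5,000 clinicians at roughly $60 per user per month (figures from the text).
monthly, annual = seat_cost(seats=5_000, price_per_seat_month=60)
print(f"${monthly:,.0f}/month, ${annual:,.0f}/year")
# prints "$300,000/month, $3,600,000/year"
```

Note that usage never appears as an input: the bill depends only on headcount.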

The pricing model was built for collaboration software (Slack, Notion, Salesforce) — tools where most seats sit idle most of the day and the per-seat fee approximates "access." For AI that actually does work — drafting prior auths, summarizing visit notes, triaging messages — the seat model breaks. The cost scales with how many people *could* use it, not what they *do*.

The same workload, priced by tokens consumed, costs a fraction. The math is the post.

## What the Latest Models Actually Cost in 2026

Token pricing across the major providers, approximate as of mid-2026 (always check provider docs for current rates):

<table style="width:100%; border-collapse:collapse; margin:1.5rem 0; font-size:0.95rem;">
  <thead>
    <tr style="background:#f5f5f0; border-bottom:2px solid #2175C5;">
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Model</th>
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Provider</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">Input ($/MTok)</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">Output ($/MTok)</th>
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">HIPAA-eligible?</th>
    </tr>
  </thead>
  <tbody>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Claude Opus 4.7</strong></td>
      <td style="padding:0.75rem;">Anthropic</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$15</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$75</td>
      <td style="padding:0.75rem;">Yes (BAA)</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Claude Sonnet 4.6</strong></td>
      <td style="padding:0.75rem;">Anthropic</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$3</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$15</td>
      <td style="padding:0.75rem;">Yes (BAA)</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Claude Haiku 4.5</strong></td>
      <td style="padding:0.75rem;">Anthropic</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$1</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$5</td>
      <td style="padding:0.75rem;">Yes (BAA)</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>GPT-5</strong></td>
      <td style="padding:0.75rem;">OpenAI</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$10</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$30</td>
      <td style="padding:0.75rem;">Yes (Enterprise BAA)</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Gemini 3 Pro</strong></td>
      <td style="padding:0.75rem;">Google</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$3.50</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$10.50</td>
      <td style="padding:0.75rem;">Yes (Vertex BAA)</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Llama 4 (70B, self-hosted)</strong></td>
      <td style="padding:0.75rem;">Meta (open weights)</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$0</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$0</td>
      <td style="padding:0.75rem;">Yes (you control PHI)</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>DeepSeek-R1 (self-hosted)</strong></td>
      <td style="padding:0.75rem;">DeepSeek (open weights)</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$0</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$0</td>
      <td style="padding:0.75rem;">Yes (you control PHI)</td>
    </tr>
  </tbody>
</table>

For self-hosted open-weight models, "~$0 per token" means there is no per-token fee: the cost is GPU time, which is paid whether the GPU is busy or idle. A single A100 or H100 instance ($1–3/hour reserved) handles thousands of clinical requests per day.
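
To turn an hourly GPU rate into figures comparable with the table, a rough sketch follows. It assumes an always-on instance billed for about 730 hours per month (8,760 hours per year divided by 12) and a hypothetical volume of 2,000 requests per day; neither number comes from a provider, and the function names are illustrative. License and support costs are not modeled here.

```python
HOURS_PER_DAY = 24
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12


def gpu_monthly_cost(hourly_rate: float) -> float:
    """Monthly cost of one always-on GPU instance."""
    return hourly_rate * HOURS_PER_MONTH


def gpu_cost_per_request(hourly_rate: float, requests_per_day: int) -> float:
    """GPU cost attributed to each request at a given daily volume."""
    return hourly_rate * HOURS_PER_DAY / requests_per_day


# The $1-3/hour reserved range from the text, at a hypothetical 2,000 requests/day.
for rate in (1.0, 3.0):
    month = gpu_monthly_cost(rate)
    per_request = gpu_cost_per_request(rate, requests_per_day=2_000)
    print(f"${rate:.2f}/hour: ${month:,.0f}/month, ${per_request:.3f}/request")
```

Because the instance costs the same regardless of load, the per-request figure falls as daily volume rises.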

## A Real Workload: Prior Authorization at a 5,000-Clinician Health System

Prior authorization is the highest-volume, highest-pain administrative AI use case in any health system. A mid-size system processes roughly **10,000 prior-auth requests per month**. Each request is about 500 tokens in (patient context, clinical justification) and 1,500 tokens out (drafted letter with citations to medical-necessity criteria). For a deeper per-letter cost breakdown — including per-transaction specialty vendors (Cohere Health / Olive / Notable) and three scale tiers (community / regional / IDN) — see **[What AI Prior Authorization Actually Costs in 2026](/blog/what-ai-prior-authorization-actually-costs-2026)**.

That's **5M input + 15M output tokens per month** for the entire prior-auth workload. Spread across 5,000 clinicians, it averages 2 requests per clinician per month, though in practice the volume concentrates in a few high-volume specialties.
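
The volume figures follow directly from the per-request estimates. A minimal sketch, using the approximate numbers from this section (`monthly_token_volume` is a hypothetical helper name):

```python
def monthly_token_volume(
    requests: int, input_per_request: int, output_per_request: int
) -> tuple[int, int]:
    """Return (input_tokens, output_tokens) for a month of requests."""
    return requests * input_per_request, requests * output_per_request


REQUESTS_PER_MONTH = 10_000  # rough estimate from the text
CLINICIANS = 5_000

tokens_in, tokens_out = monthly_token_volume(REQUESTS_PER_MONTH, 500, 1_500)
print(f"{tokens_in:,} input tokens, {tokens_out:,} output tokens")
# prints "5,000,000 input tokens, 15,000,000 output tokens"
print(REQUESTS_PER_MONTH / CLINICIANS)
# prints "2.0" (average requests per clinician per month)
```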

### What it costs by deployment shape

<table style="width:100%; border-collapse:collapse; margin:1.5rem 0; font-size:0.95rem;">
  <thead>
    <tr style="background:#f5f5f0; border-bottom:2px solid #2175C5;">
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Deployment</th>
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Pricing shape</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">Monthly cost</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">Annual</th>
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">PHI residency</th>
    </tr>
  </thead>
  <tbody>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>ChatGPT Enterprise</strong></td>
      <td style="padding:0.75rem;">Per-seat ($60/user)</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;"><strong>$300,000</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$3,600,000</td>
      <td style="padding:0.75rem;">OpenAI cloud (BAA)</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Microsoft 365 Copilot</strong></td>
      <td style="padding:0.75rem;">Per-seat ($30/user)</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;"><strong>$150,000</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$1,800,000</td>
      <td style="padding:0.75rem;">Microsoft cloud (BAA)</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;">Direct API — Claude Sonnet 4.6</td>
      <td style="padding:0.75rem;">Token-based</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$240</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$2,880</td>
      <td style="padding:0.75rem;">Anthropic cloud (BAA)</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;">Direct API — GPT-5</td>
      <td style="padding:0.75rem;">Token-based</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$500</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$6,000</td>
      <td style="padding:0.75rem;">OpenAI cloud (BAA)</td>
    </tr>
    <tr style="background:#f0f9ff; border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>ibl.ai self-hosted (Llama 4 / DeepSeek-R1)</strong></td>
      <td style="padding:0.75rem;">Flat license + GPU</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#15803d;"><strong>~$3,000–5,000</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#15803d;">~$36,000–60,000</td>
      <td style="padding:0.75rem;"><strong>Inside your VPC / on-prem</strong></td>
    </tr>
  </tbody>
</table>

The ibl.ai row covers the GPU instance, the platform license, and ongoing support. The BAA conversation, the vendor risk review, and the re-architecture every time a vendor updates its data-processing terms do not appear in that row because they do not arise: there is no third-party vendor in the data path. The model runs on infrastructure you control.
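
The token-based rows in the table can be reproduced from the price list earlier in the post. The sketch below assumes the approximate mid-2026 rates quoted there and the 5M input / 15M output monthly volume; the `PRICES` dictionary and `monthly_api_cost` function are illustrative names, not part of any provider SDK.

```python
# Approximate $/MTok (input, output) as listed in this post; check provider docs.
PRICES = {
    "Claude Opus 4.7": (15.00, 75.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5": (10.00, 30.00),
    "Gemini 3 Pro": (3.50, 10.50),
}

INPUT_TOKENS = 5_000_000    # monthly prior-auth input volume from the text
OUTPUT_TOKENS = 15_000_000  # monthly prior-auth output volume from the text


def monthly_api_cost(model: str) -> float:
    """Monthly token cost in dollars for the prior-auth workload."""
    input_price, output_price = PRICES[model]
    return (INPUT_TOKENS / 1e6) * input_price + (OUTPUT_TOKENS / 1e6) * output_price


for model in PRICES:
    cost = monthly_api_cost(model)
    print(f"{model}: ~${cost:,.0f}/month, ~${cost * 12:,.0f}/year")
```

Claude Sonnet 4.6 comes out at about $240 per month and GPT-5 at about $500, matching the table. Output tokens dominate the bill because each letter is three times longer than its prompt and output is priced higher.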

## Why the Per-Seat Math Doesn't Work in Healthcare

Three reasons per-seat AI fails harder in healthcare than anywhere else:

**1. Usage is concentrated.** A handful of high-volume specialties (oncology, cardiology, GI) generate most of the prior-auth and documentation load. Buying a seat for every clinician means paying the same fee for those who barely touch it as for those who use it constantly. Token pricing aligns the bill to the actual work.

**2. The clinical workforce is large and lower-paid than the cost model assumes.** A 5,000-clinician system isn't 5,000 attending physicians — it's nurses, techs, residents, schedulers, coders, billers. The seat fee assumes a uniform "knowledge worker" who can absorb $60/month of overhead. For a coding clerk doing prior auth all day, $60/month is fine; for a triage nurse who touches AI twice a week, it's not.

**3. PHI residency forces re-purchase, not extension.** When a managed AI vendor updates its data-processing terms — or when the FDA / OCR publishes new guidance — every BAA gets re-papered. With self-hosted, the data never leaves; the model swap is a config change, not a procurement event.
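
The first reason can be made concrete with a break-even estimate: how many prior-auth drafts a single clinician would need per month before a $60 seat costs less than paying per token. This is a rough sketch that assumes the Claude Sonnet 4.6 rates and the per-request token counts quoted earlier in this post; the function names are illustrative.

```python
def cost_per_request(
    input_tokens: int, output_tokens: int, input_price: float, output_price: float
) -> float:
    """Dollar cost of one request, with prices given in $/MTok."""
    return input_tokens / 1e6 * input_price + output_tokens / 1e6 * output_price


def break_even_requests(seat_price: float, request_cost: float) -> float:
    """Requests per seat per month at which seat and token pricing cost the same."""
    return seat_price / request_cost


# 500 tokens in, 1,500 tokens out, at $3 / $15 per MTok (figures from the text).
per_request = cost_per_request(500, 1_500, input_price=3.0, output_price=15.0)
threshold = break_even_requests(seat_price=60.0, request_cost=per_request)
print(f"${per_request:.3f} per request, break-even at ~{threshold:,.0f} requests/month")
```

Under those assumptions each draft costs about 2.4 cents, so a seat pays for itself only at roughly 2,500 drafts per clinician per month, against the average of 2 in the prior-auth workload.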

## What Stays the Same, What Changes

Self-hosting the runtime doesn't mean rebuilding the platform. The chat UI, the clinician dashboards, the audit logs, the model-routing-with-fallbacks, the multi-agent orchestration — all of that stays managed by ibl.ai. The compute, the model, and the PHI move inside the hospital's perimeter.

The trade-off most health systems don't realize: **the per-seat SaaS line item is bigger than the all-in self-hosted infrastructure budget.** A $3.6M/year ChatGPT Enterprise contract pays for an internal AI platform team, dedicated GPUs, and the model-choice flexibility that comes with owning the stack, with money left over.

## Run the Numbers for Your Health System

For workload-specific calculations — prior auth, clinical documentation, patient messaging triage — use the **[AI Help Desk Cost Savings Calculator](/resources/calculators/ai-help-desk-savings-calculator)** as a starting point (the math generalizes to most high-volume clinical-administrative workloads).

For the deployment comparison side-by-side — including HIPAA posture, BAA reach, and air-gapped options — see **[Self-Hosted AI vs ChatGPT Enterprise for Healthcare](/resources/comparisons/self-hosted-ai-vs-chatgpt-enterprise-for-healthcare)**.

For the full HIPAA-aligned architecture (Managed VPC → on-premise → air-gapped tiers, Epic / Cerner / athenahealth integrations, TCO at 10K clinicians), read **[Healthcare AI Reference Architecture on ibl.ai](/blog/healthcare-ai-reference-architecture)**.

## Why Family-Owned and New York Matters Here

The sovereignty argument falls apart if the vendor on the other side of the BAA is on a five-year exit clock, foreign-owned, or acquired before the next OCR audit. ibl.ai is family-owned and operated from New York, NY — a long-term partner for U.S. health systems, defense, and regulated buyers, with a perpetual platform license and no investor exit pressure.

The runtime is open source. The data stays inside the covered boundary. The math works at 100 clinicians or 50,000.
