---
title: "What AI Prior Authorization Actually Costs in 2026"
slug: "what-ai-prior-authorization-actually-costs-2026"
author: "ibl.ai Engineering"
date: "2026-05-30 18:00:00"
category: "Premium"
topics: "AI prior authorization cost, prior auth automation, AI prior auth pricing, ChatGPT prior auth, Claude prior auth, HIPAA AI prior authorization, Cohere Health pricing, Olive AI cost, prior auth per-letter cost, self-hosted prior auth AI"
summary: "Per-letter token math for prior authorization across the latest models, monthly bills at community / regional / IDN scale, and why the per-transaction and per-clinician AI vendors are the wrong shape — even for the workload that started the AI-in-healthcare conversation."
banner: ""
thumbnail: ""
---

## Prior Authorization Is the Highest-Leverage AI Workload in Healthcare Admin

Every health system has the same pain. The AMA's surveys put prior-auth burden at the top of the list of administrative complaints: clinicians spend hours per week on it, denials and re-submissions slow patient care, and the back-office staff who manage the workflow hold the highest-turnover roles in the revenue cycle.

It's also the workload AI can actually do well. The task is structured: pull patient context, match it to payer-specific medical-necessity criteria, draft a letter with cited reasoning, route for review. A current-generation model handles a first-pass draft in seconds at a cost measured in cents.

The vendors in this space — Cohere Health, Olive AI, Notable Health, and now every per-seat AI suite — know it. They've built pricing around the value of the workload (per-transaction fees of $2–5 per letter, or per-clinician seats at $50–100/month) rather than the cost of producing the output.

The cost of producing the output is much smaller. This post shows the math.

## What a Prior-Auth Letter Actually Costs Per Token

A typical prior-authorization draft is about **2,000 input tokens** (patient context, visit-note excerpts, payer-specific medical-necessity criteria) and **2,500 output tokens** (the drafted letter with cited reasoning). Cost per letter on the major models:

<table style="width:100%; border-collapse:collapse; margin:1.5rem 0; font-size:0.95rem;">
  <thead>
    <tr style="background:#f5f5f0; border-bottom:2px solid #2175C5;">
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Model</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">Input ($/MTok)</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">Output ($/MTok)</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">$ per letter</th>
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">When to use it</th>
    </tr>
  </thead>
  <tbody>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Claude Opus 4.7</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$15</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$75</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$0.22</td>
      <td style="padding:0.75rem;">Complex appeals, peer-to-peer prep</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>GPT-5</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$10</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$30</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$0.095</td>
      <td style="padding:0.75rem;">Mixed cases, ambiguous criteria</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Claude Sonnet 4.6</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$3</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$15</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$0.044</td>
      <td style="padding:0.75rem;">Standard PA workhorse</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Gemini 3 Pro</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$3.50</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$10.50</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$0.033</td>
      <td style="padding:0.75rem;">Long-context (multi-document) cases</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Claude Haiku 4.5</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$1</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$5</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$0.015</td>
      <td style="padding:0.75rem;">High-volume routine cases</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Gemini 3 Flash</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$0.35</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$1.05</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$0.003</td>
      <td style="padding:0.75rem;">Cheapest hosted option</td>
    </tr>
    <tr style="background:#f0f9ff; border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Llama 4 / DeepSeek-R1 (self-hosted)</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#15803d;">~$0</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#15803d;">~$0</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#15803d;"><strong>~$0</strong></td>
      <td style="padding:0.75rem;"><strong>Inside the HIPAA boundary</strong></td>
    </tr>
  </tbody>
</table>

The most expensive frontier model produces a draft for **22 cents**. The mid-tier workhorse: **4 cents**. The cheap hosted option: **fractions of a cent**. Self-hosted on the hospital's own GPU: the marginal cost is the electricity.
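
The per-letter figures follow from one line of arithmetic: tokens times price per million tokens, summed over input and output. The sketch below reproduces them from the table's list prices and the 2,000 / 2,500 token assumption; the function and variable names are illustrative, and the Sonnet pair ($3 in, $15 out) is the one consistent with the $0.044 per-letter figure.

```python
# Prices in USD per million tokens (input, output), taken from the table above.
PRICES = {
    "Claude Opus 4.7": (15.00, 75.00),
    "GPT-5": (10.00, 30.00),
    "Claude Sonnet 4.6": (3.00, 15.00),  # pair consistent with $0.044 per letter
    "Gemini 3 Pro": (3.50, 10.50),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Gemini 3 Flash": (0.35, 1.05),
}

# Token counts for a typical draft, as assumed in this post.
INPUT_TOKENS = 2_000
OUTPUT_TOKENS = 2_500


def cost_per_letter(input_price, output_price,
                    input_tokens=INPUT_TOKENS, output_tokens=OUTPUT_TOKENS):
    """Return the USD cost of one draft given $/MTok prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for model, (input_price, output_price) in PRICES.items():
    print(f"{model:<20} ${cost_per_letter(input_price, output_price):.4f}")
```

Changing the two token constants shows how sensitive the per-letter cost is to longer chart excerpts or longer letters; output tokens dominate at every price point in the table.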

## Monthly Bills at Three Scale Tiers

Realistic prior-auth volume in 2026:

- **Community hospital** (200 beds): ~3,000 letters/month
- **Regional health system** (5,000 clinicians): ~12,000 letters/month
- **IDN / academic medical center**: ~30,000 letters/month

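The seat-priced rows below assume about 200, 5,000, and 20,000 clinicians for the three tiers, the head counts implied by their totals. Monthly cost using **Claude Sonnet 4.6** (the standard workhorse model for this workload) versus the per-transaction and per-seat alternatives: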

<table style="width:100%; border-collapse:collapse; margin:1.5rem 0; font-size:0.95rem;">
  <thead>
    <tr style="background:#f5f5f0; border-bottom:2px solid #2175C5;">
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Approach</th>
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Pricing shape</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">Community (3K/mo)</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">Regional (12K/mo)</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">IDN (30K/mo)</th>
    </tr>
  </thead>
  <tbody>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Specialty PA AI vendor</strong></td>
      <td style="padding:0.75rem;">Per-transaction (~$3/letter)</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$9,000</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$36,000</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;"><strong>$90,000</strong></td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Specialty PA AI (per-clinician)</strong></td>
      <td style="padding:0.75rem;">~$75/clinician/mo</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">~$15,000</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">~$375,000</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;"><strong>~$1,500,000</strong></td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>ChatGPT Enterprise</strong></td>
      <td style="padding:0.75rem;">$60/seat × all clinicians</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">~$12,000</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">~$300,000</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;"><strong>~$1,200,000</strong></td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;">Direct API — Claude Sonnet 4.6</td>
      <td style="padding:0.75rem;">Token-based</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$131</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$522</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$1,305</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;">Direct API — GPT-5</td>
      <td style="padding:0.75rem;">Token-based</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$285</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$1,140</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$2,850</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;">Direct API — Gemini 3 Flash</td>
      <td style="padding:0.75rem;">Token-based (cheapest hosted)</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$9</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$36</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$90</td>
    </tr>
    <tr style="background:#f0f9ff; border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>ibl.ai self-hosted (Llama 4 / DeepSeek-R1)</strong></td>
      <td style="padding:0.75rem;">Flat license + GPU</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#15803d;"><strong>~$2,000</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#15803d;"><strong>~$3,000–5,000</strong></td>
      <td style="padding:0.75rem; text-align:right; font-variant-numeric:tabular-nums; color:#15803d;"><strong>~$5,000–8,000</strong></td>
    </tr>
  </tbody>
</table>

The ibl.ai row covers the GPU instance (sized so a single H100 handles the IDN workload), the platform license, and ongoing support. The PHI never leaves the hospital's environment. At the IDN scale, the specialty per-clinician vendor is **~200× more expensive** for the same letters drafted.
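
The monthly rows can be rebuilt from three pricing shapes: a fee per letter, a price per seat, and a token cost per letter. The following is a minimal sketch, not a pricing tool. The clinician counts are an assumption inferred from the seat-priced totals (for example, $15,000 / $75 = 200), and the function names are illustrative.

```python
# Monthly letter volume per tier, from the list above.
LETTERS_PER_MONTH = {"community": 3_000, "regional": 12_000, "idn": 30_000}

# ASSUMPTION: head counts implied by the seat-priced rows, not stated for every tier.
CLINICIANS = {"community": 200, "regional": 5_000, "idn": 20_000}

# Claude Sonnet 4.6 cost per letter: (2,000 * $3 + 2,500 * $15) / 1,000,000.
SONNET_COST_PER_LETTER = 0.0435


def per_transaction_bill(letters, fee_per_letter=3.00):
    """Vendor charges a flat fee for every letter drafted."""
    return letters * fee_per_letter


def per_seat_bill(clinicians, seat_price):
    """Vendor charges for every clinician, whether or not they draft letters."""
    return clinicians * seat_price


def token_bill(letters, cost_per_letter):
    """Direct API usage: pay only for the tokens each letter consumes."""
    return letters * cost_per_letter


for tier, letters in LETTERS_PER_MONTH.items():
    seats = CLINICIANS[tier]
    print(
        f"{tier:<10}"
        f" per-transaction ${per_transaction_bill(letters):>12,.2f}"
        f" | $75/seat ${per_seat_bill(seats, 75):>14,.2f}"
        f" | $60/seat ${per_seat_bill(seats, 60):>14,.2f}"
        f" | Sonnet API ${token_bill(letters, SONNET_COST_PER_LETTER):>10,.2f}"
    )
```

Dividing the $1,500,000 per-clinician bill at IDN scale by the $5,000 to $8,000 self-hosted range gives a ratio of roughly 190× to 300×.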

## Why the Per-Transaction Vendor Math Is the Sneaky One

Per-seat pricing is obviously wrong for prior auth: most of a hospital's clinicians don't draft prior auths personally, and a small revenue-cycle team handles most of the volume on behalf of the medical staff. Buying a seat for every clinician, when 80% of them never touch the tool, is the same per-seat trap that shows up in every healthcare AI use case.

**Per-transaction is the trickier one.** $3 per letter sounds reasonable. It feels aligned to value — the hospital pays the vendor for each piece of work the AI does. But:

1. **The unit price doesn't drop with volume.** A 30,000-letter/month IDN pays the same $3/letter as a 3,000-letter/month community hospital. There's no scale benefit. Self-hosting the same workload costs *less* per letter as volume grows, because the GPU is amortized.

2. **The vendor's marginal cost is the same fraction of a cent the hospital would pay running it directly.** The $3 is value capture, not cost recovery. That's the vendor's business model, not a description of the underlying compute.

3. **PHI residency is still vendor-side.** Per-transaction or per-seat, the data still goes to the vendor's cloud, the BAA still needs annual review, and the model selection is still the vendor's choice — not the hospital's.
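
The first point can be made concrete by dividing a flat monthly cost by volume and comparing it with a fixed per-letter fee. This is a sketch using the monthly figures from the table in the previous section (which bundle license, GPU, and support); the function names are illustrative, and real contracts may include volume tiers this ignores.

```python
import math


def flat_cost_per_letter(monthly_flat_cost, letters):
    """Effective per-letter cost when the bill does not depend on volume."""
    return monthly_flat_cost / letters


def breakeven_letters(monthly_flat_cost, fee_per_letter):
    """Smallest monthly volume at which the flat bill is no higher than per-letter fees."""
    return math.ceil(monthly_flat_cost / fee_per_letter)


FEE_PER_LETTER = 3.00  # per-transaction vendor price used in this post

# (tier, monthly letters, self-hosted monthly cost at the top of the quoted range)
SCENARIOS = [
    ("community", 3_000, 2_000),
    ("regional", 12_000, 5_000),
    ("idn", 30_000, 8_000),
]

for tier, letters, flat_cost in SCENARIOS:
    effective = flat_cost_per_letter(flat_cost, letters)
    threshold = breakeven_letters(flat_cost, FEE_PER_LETTER)
    print(f"{tier:<10} flat: ${effective:.2f}/letter vs ${FEE_PER_LETTER:.2f}/letter "
          f"(break-even at {threshold:,} letters/month)")
```

Under these inputs the flat-cost deployment falls from about $0.67 per letter at community scale to about $0.27 at IDN scale, while the per-transaction price stays at $3.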

## Why HIPAA Makes Self-Hosting Non-Negotiable at Scale

PHI cannot leave the HIPAA-covered boundary without a BAA, and even with a BAA, every change in the vendor's data-processing terms, every sub-processor change, and every model switch is a re-papering event. For a workload that touches 30,000 patient records per month, that's a continuous compliance overhead the hospital doesn't see until an audit by the HHS Office for Civil Rights (OCR).

Self-hosting changes the geometry. The agent runtime (the claw) runs inside the hospital's existing covered environment. Patient data, clinical notes, prior-auth narratives, and payer correspondence never traverse a third-party cloud. The model swap (Sonnet for routine, Opus for complex appeals, Haiku for high-volume sorting) is a config change, not a procurement event.

The other compliance angle that doesn't get discussed often: **payer medical-necessity criteria change constantly.** Hospitals using a managed AI vendor depend on the vendor to keep the criteria library current. Self-hosting means the hospital owns the criteria library — and can update it the same day the payer publishes the change, not whenever the vendor's release cycle ships it.
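
One way to picture a hospital-owned criteria library is as a list of dated versions, where a lookup returns whichever version is in effect on the date of service. The sketch below is purely illustrative: the `CriteriaVersion` type, the `current_criteria` function, and the placeholder payer and service names are hypothetical and do not describe any particular product's schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CriteriaVersion:
    payer: str        # placeholder payer name
    service: str      # placeholder service or procedure label
    effective: date   # date the payer's criteria take effect
    text: str         # criteria text supplied to the drafting model


def current_criteria(library: List[CriteriaVersion], payer: str,
                     service: str, on: date) -> Optional[CriteriaVersion]:
    """Return the latest version effective on or before `on`, or None if there is none."""
    candidates = [
        c for c in library
        if c.payer == payer and c.service == service and c.effective <= on
    ]
    return max(candidates, key=lambda c: c.effective, default=None)


# Placeholder entries; a same-day update is just another appended version.
library = [
    CriteriaVersion("Example Payer", "example-service", date(2026, 1, 1), "criteria v1"),
    CriteriaVersion("Example Payer", "example-service", date(2026, 6, 1), "criteria v2"),
]

match = current_criteria(library, "Example Payer", "example-service", date(2026, 6, 1))
print(match.text if match else "no criteria on file")
```

Keeping older versions in the list also preserves which criteria applied to a letter drafted before a change, which is useful for the audit trail.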

## What Stays the Same, What Changes

Self-hosting prior-auth AI doesn't mean rebuilding the platform around it. The clinician-facing chat UI, the case-worker dashboards, the audit logs, the model-routing-with-fallbacks, the multi-agent orchestration, the Epic / Cerner / athenahealth integrations — all of that stays managed by ibl.ai. The compute, the model, and the PHI move inside the hospital's covered environment.

What disappears: the $90K/month per-transaction bill (or the $1.2M/month per-seat bill) at IDN scale.

What appears: a self-hosted prior-auth capability the hospital owns and controls, with the model-routing policy the medical-staff committee designed:

- **Sonnet** for standard prior auths (the bulk)
- **Opus** for complex appeals and peer-to-peer prep
- **Haiku** for triage sorting (which payer, which criteria set)
- **Llama 4 self-hosted** for the highest-volume routine cases when even pennies per letter add up
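
A routing policy like the one above can be expressed as a small lookup table, which is what makes a model swap a config change. This is a minimal sketch under stated assumptions: the case categories, the `pick_model` function, and the model labels are hypothetical names chosen for illustration, not identifiers from any vendor's API.

```python
from enum import Enum


class CaseType(Enum):
    TRIAGE = "triage"                  # which payer, which criteria set
    STANDARD = "standard"              # the bulk of prior auths
    COMPLEX_APPEAL = "complex_appeal"
    PEER_TO_PEER = "peer_to_peer"
    ROUTINE_BULK = "routine_bulk"      # highest-volume routine cases


# The policy is plain data, so changing a model means editing one entry.
ROUTING_POLICY = {
    CaseType.TRIAGE: "haiku",
    CaseType.STANDARD: "sonnet",
    CaseType.COMPLEX_APPEAL: "opus",
    CaseType.PEER_TO_PEER: "opus",
    CaseType.ROUTINE_BULK: "llama-4-self-hosted",
}


def pick_model(case_type, policy=ROUTING_POLICY, default="sonnet"):
    """Return the model label for a case type, falling back to the default."""
    return policy.get(case_type, default)


for case_type in CaseType:
    print(f"{case_type.value:<15} -> {pick_model(case_type)}")
```

A production router would also need fallbacks when a model is unavailable; that logic is omitted here to keep the policy itself visible.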

## Run the Numbers for Your Health System

For workload sizing and cost modeling for a prior-auth deployment, the **[AI Help Desk Cost Savings Calculator](/resources/calculators/ai-help-desk-savings-calculator)** generalizes well — prior auth is one of the highest-volume, most-amenable-to-automation administrative workloads in healthcare.

For the segment-wide cost-math context (not just prior auth), see **[AI Cost Math for Hospitals: Per-Seat vs Usage-Based in 2026](/blog/ai-cost-math-for-hospitals-per-seat-vs-usage)**.

For the deployment comparison side-by-side — including HIPAA posture, BAA reach, and air-gapped options — see **[Self-Hosted AI vs ChatGPT Enterprise for Healthcare](/resources/comparisons/self-hosted-ai-vs-chatgpt-enterprise-for-healthcare)**.

For the full HIPAA-aligned architecture (Epic / Cerner / athenahealth integrations, Managed VPC → on-prem → air-gapped tiers), read **[Healthcare AI Reference Architecture on ibl.ai](/blog/healthcare-ai-reference-architecture)**.

For the staged deployment recipe — Managed VPC for low-sensitivity workloads + on-prem for production — see **[Healthcare AI Blueprint: Managed VPC in 30/60/90 Days](/blog/healthcare-ai-blueprint-managed-vpc-30-60-90-days)**.

For the broader pricing landscape (every major model, every major per-seat vendor), the hub: **[What Does AI Actually Cost in 2026?](/blog/what-does-ai-actually-cost-in-2026)**.

## Why Family-Owned and New York Matter Here

A health system's AI vendor relationship for a workload as central as prior authorization is a multi-year commitment covering the workflows, the criteria library, the EHR integration, and the audit trail that revenue-cycle compliance relies on. ibl.ai is family-owned and operated from New York, NY: a long-term partner with a perpetual platform license and no investor exit pressure. The runtime is open source. The PHI stays inside the hospital's covered boundary. The math works at a 100-bed community hospital or a 30-hospital IDN.
