How much does AI prior authorization cost?

A prior-authorization letter is ~2,000 input tokens (patient context + medical-necessity criteria) and ~2,500 output tokens (drafted letter with cited reasoning). Per-letter cost: Claude Opus 4.7 $0.22, GPT-5 $0.095, Claude Sonnet 4.6 $0.044, Claude Haiku 4.5 $0.015, Gemini 3 Flash $0.003. Self-hosted on a hospital's own GPU: ~$0 marginal cost. Specialty per-transaction PA AI vendors charge about $3 per letter, which is roughly 14× the token cost on Claude Opus 4.7, about 70× on Claude Sonnet 4.6, and about 1,000× on Gemini 3 Flash.

Is per-transaction AI pricing fair for prior authorization?

Per-transaction pricing ($3/letter from specialty PA AI vendors) feels value-aligned but isn't. The vendor's marginal cost is cents per letter at most; the $3 is value capture, not cost recovery. The unit price doesn't drop with volume: a 30,000-letter/month IDN pays the same $3/letter as a 3,000-letter/month community hospital. Self-hosted gets cheaper per letter as volume grows because the GPU is amortized.

How much does Cohere Health or Olive AI cost vs ibl.ai?

Specialty prior-auth AI vendors (Cohere Health, Olive AI, Notable Health) typically charge $2–5 per transaction or $50–100 per clinician per month. At IDN scale (30,000 letters/month + 5,000 clinicians): per-transaction = ~$90K/month, per-clinician = ~$500K/month. ibl.ai self-hosted handles the same workload for ~$5,000–8,000/month all-in (Llama 4 on hospital GPU), with PHI inside the hospital's HIPAA-covered boundary.

Does payer medical-necessity criteria change matter for AI?

Yes — payer criteria change weekly per payer. Managed AI vendors typically lag 2–6 weeks updating their criteria library because of release-cycle constraints. Self-hosted on ibl.ai means the hospital owns the criteria library directly — updates the same day the payer publishes the change. This matters for compliance (every drafted letter cites the current criteria) and accuracy (no inadvertent use of superseded criteria).

What AI Prior Authorization Actually Costs in 2026

Miguel Amigot, May 30, 2026

Per-letter token math for prior authorization across the latest models, monthly bills at community / regional / IDN scale, and why the per-transaction and per-clinician AI vendors are the wrong shape — even for the workload that started the AI-in-healthcare conversation.

Prior Authorization Is the Highest-Leverage AI Workload in Healthcare Admin

Every health system has the same pain. The AMA's surveys put prior-auth burden at the top of the list of administrative complaints — clinicians spend hours per week on it, denials and re-submissions slow patient care, and the back-office staff who manage the workflow are the highest-turnover roles in the revenue cycle.

It's also the workload AI can actually do well. The task is structured: pull patient context, match it to payer-specific medical-necessity criteria, draft a letter with cited reasoning, route for review. A current-generation model handles a first-pass draft in seconds at a cost measured in cents.

The vendors in this space — Cohere Health, Olive AI, Notable Health, and now every per-seat AI suite — know it. They've built pricing around the value of the workload (per-transaction fees of $2–5 per letter, or per-clinician seats at $50–100/month) rather than the cost of producing the output.

The output cost is much smaller. This post shows the math.

What a Prior-Auth Letter Actually Costs Per Token

A typical prior-authorization draft is about 2,000 input tokens (patient context, visit notes excerpts, payer-specific medical-necessity criteria) and 2,500 output tokens (the drafted letter with cited reasoning). Cost-per-letter on the major models:

Model	Input ($/MTok)	Output ($/MTok)	$ per letter	When to use it
Claude Opus 4.7	$15	$75	$0.22	Complex appeals, peer-to-peer prep
GPT-5	$10	$30	$0.095	Mixed cases, ambiguous criteria
Claude Sonnet 4.6	$3	$15	$0.044	Standard PA workhorse
Gemini 3 Pro	$3.50	$10.50	$0.033	Long-context (multi-document) cases
Claude Haiku 4.5	$1	$5	$0.015	High-volume routine cases
Gemini 3 Flash	$0.35	$1.05	$0.003	Cheapest hosted option
Llama 4 / DeepSeek-R1 (self-hosted)	~$0	~$0	~$0	Inside the HIPAA boundary

The most expensive frontier model produces a draft for 22 cents. The mid-tier workhorse: 4 cents. The cheap hosted option: fractions of a cent. Self-hosted on the hospital's own GPU: the marginal cost is the electricity.
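The per-letter figures follow from one line of arithmetic: tokens times price, divided by a million. The sketch below reproduces them from the table's prices and the 2,000-in / 2,500-out token assumption. The function and variable names are illustrative, and the $15 output price for Claude Sonnet 4.6 is inferred from the $0.044 per-letter figure rather than read directly from the table.

```python
# USD per million tokens as (input, output), taken from the table above.
# ASSUMPTION: the Sonnet 4.6 output price ($15) is back-solved from the
# $0.044 per-letter figure.
PRICES_PER_MTOK = {
    "Claude Opus 4.7": (15.00, 75.00),
    "GPT-5": (10.00, 30.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Gemini 3 Pro": (3.50, 10.50),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Gemini 3 Flash": (0.35, 1.05),
}

INPUT_TOKENS = 2_000   # patient context, note excerpts, payer criteria
OUTPUT_TOKENS = 2_500  # drafted letter with cited reasoning


def cost_per_letter(model, input_tokens=INPUT_TOKENS, output_tokens=OUTPUT_TOKENS):
    """Return the token cost in USD of drafting one letter with `model`."""
    input_price, output_price = PRICES_PER_MTOK[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for model in PRICES_PER_MTOK:
    print(f"{model:<20} ${cost_per_letter(model):.4f} per letter")
```

Changing `INPUT_TOKENS` or `OUTPUT_TOKENS` shows how sensitive the result is to letter length: output tokens dominate the cost on every hosted model in the table.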

Monthly Bills at Three Scale Tiers

Realistic prior-auth volume in 2026:

Community hospital (200 beds): ~3,000 letters/month
Regional health system (5,000 clinicians): ~12,000 letters/month
IDN / academic medical center: ~30,000 letters/month

Monthly cost using Claude Sonnet 4.6 (the standard workhorse model for this workload) vs the per-transaction and per-seat alternatives:

Approach	Pricing shape	Community (3K/mo)	Regional (12K/mo)	IDN (30K/mo)
Specialty PA AI vendor	Per-transaction (~$3/letter)	$9,000	$36,000	$90,000
Specialty PA AI (per-clinician)	~$75/clinician/mo	~$15,000	~$375,000	~$1,500,000
ChatGPT Enterprise	$60/seat × all clinicians	~$12,000	~$300,000	~$1,200,000
Direct API — Claude Sonnet 4.6	Token-based	~$131	~$522	~$1,305
Direct API — GPT-5	Token-based	~$285	~$1,140	~$2,850
Direct API — Gemini 3 Flash	Token-based (cheapest hosted)	~$9	~$36	~$90
ibl.ai self-hosted (Llama 4 / DeepSeek-R1)	Flat license + GPU	~$2,000	~$3,000–5,000	~$5,000–8,000

The ibl.ai row covers the GPU instance (sized so a single H100 handles the IDN workload), the platform license, and ongoing support. The PHI never leaves the hospital's environment. At the IDN scale, the specialty per-clinician vendor is ~200× more expensive for the same letters drafted.
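The table rows can be rebuilt from three pricing shapes: a fee per letter, a fee per seat, and a token cost per letter. The sketch below is a rough model, not a quote. The clinician headcounts for the community and IDN tiers are not stated in the post; they are back-solved from the per-clinician row (for example, $15,000 / $75 = 200) and should be treated as assumptions. The Sonnet per-letter cost assumes $3 input and $15 output per million tokens.

```python
# Monthly letter volume per tier, from the post.
LETTERS = {"community": 3_000, "regional": 12_000, "idn": 30_000}

# ASSUMPTION: headcounts back-solved from the ~$75/clinician row of the table.
CLINICIANS = {"community": 200, "regional": 5_000, "idn": 20_000}

# 2,000 input tokens at $3/MTok plus 2,500 output tokens at $15/MTok.
SONNET_COST_PER_LETTER = 0.0435


def per_transaction_bill(letters, fee_per_letter=3.00):
    """Monthly bill when the vendor charges a flat fee per drafted letter."""
    return letters * fee_per_letter


def per_seat_bill(clinicians, price_per_seat):
    """Monthly bill when every clinician gets a paid seat."""
    return clinicians * price_per_seat


def token_bill(letters, cost_per_letter):
    """Monthly bill when paying the model provider directly by token."""
    return letters * cost_per_letter


for tier, letters in LETTERS.items():
    print(
        f"{tier:<10}"
        f" per-transaction ${per_transaction_bill(letters):>12,.2f}"
        f" per-seat ${per_seat_bill(CLINICIANS[tier], 75.0):>14,.2f}"
        f" tokens ${token_bill(letters, SONNET_COST_PER_LETTER):>10,.2f}"
    )
```

Only the per-seat bill depends on headcount, which is why it diverges so sharply from the other two shapes at IDN scale.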

Why the Per-Transaction Vendor Math Is the Sneaky One

Per-seat is obviously wrong for prior auth — most of a hospital's clinicians don't draft prior auths personally; a small revenue-cycle team does most of the volume on behalf of the medical staff. Buying a seat for every clinician to access an AI tool 80% of them never touch is the same per-seat trap that breaks for every healthcare AI use case.

Per-transaction is the trickier one. $3 per letter sounds reasonable. It feels aligned to value — the hospital pays the vendor for each piece of work the AI does. But:

The unit price doesn't drop with volume. A 30,000-letter/month IDN pays the same $3/letter as a 3,000-letter/month community hospital. There's no scale benefit. Self-hosting the same workload costs less per letter as volume grows, because the GPU is amortized.
The vendor's marginal cost is the same token cost, cents per letter at most, that the hospital would pay running it directly. The $3 is value capture, not cost recovery. That's the vendor's business model, not a description of the underlying compute.
PHI residency is still vendor-side. Per-transaction or per-seat, the data still goes to the vendor's cloud, the BAA still needs annual review, and the model selection is still the vendor's choice — not the hospital's.
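The amortization point is easier to see per letter. The sketch below divides a flat monthly self-hosting bill by volume and finds the volume at which that flat bill matches a $3-per-letter fee. The dollar figures come from the monthly table; the function names are illustrative, and treating the self-hosted bill as flat within a tier is a simplifying assumption.

```python
import math

PER_TRANSACTION_FEE = 3.00  # USD per letter, the post's typical vendor price


def self_hosted_cost_per_letter(flat_monthly_cost, letters_per_month):
    """Effective per-letter cost when a flat monthly bill is spread over volume."""
    if letters_per_month <= 0:
        raise ValueError("letters_per_month must be positive")
    return flat_monthly_cost / letters_per_month


def break_even_letters(flat_monthly_cost, fee_per_letter=PER_TRANSACTION_FEE):
    """Smallest whole monthly volume at which the flat bill is no higher
    than the per-transaction bill."""
    return math.ceil(flat_monthly_cost / fee_per_letter)


# (tier, flat monthly self-hosted cost, letters per month) from the table;
# the IDN row uses the upper end of the $5,000-8,000 range.
for tier, flat_cost, letters in [
    ("community", 2_000, 3_000),
    ("idn", 8_000, 30_000),
]:
    per_letter = self_hosted_cost_per_letter(flat_cost, letters)
    print(
        f"{tier:<10} self-hosted ${per_letter:.2f}/letter"
        f" vs per-transaction ${PER_TRANSACTION_FEE:.2f}/letter"
        f" (break-even at {break_even_letters(flat_cost)} letters/month)"
    )
```

Under these assumptions the per-transaction fee stays at $3 regardless of volume, while the self-hosted per-letter cost falls as the same fixed bill is spread over more letters.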

Why HIPAA Makes Self-Hosting Non-Negotiable at Scale

PHI cannot leave the HIPAA-covered boundary without a BAA — and even with a BAA, every change in the vendor's data-processing terms, every sub-processor change, and every model switch is a re-papering event. For a workload that touches 30,000 patient records per month, that's a continuous compliance overhead the hospital doesn't see until the OCR audit.

Self-hosting changes the geometry. The claw runs inside the hospital's existing covered environment. Patient data, clinical notes, prior-auth narratives, and payer correspondence never traverse a third-party cloud. The model swap (Sonnet for routine, Opus for complex appeals, Haiku for high-volume sorting) is a config change — not a procurement event.

The other compliance angle that doesn't get discussed often: payer medical-necessity criteria change constantly. Hospitals using a managed AI vendor depend on the vendor to keep the criteria library current. Self-hosting means the hospital owns the criteria library — and can update it the same day the payer publishes the change, not whenever the vendor's release cycle ships it.
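One way a hospital-owned criteria library could avoid citing superseded criteria is to store every version with its effective date and look up the version in force on the letter's date. The sketch below is purely illustrative: the class names, fields, payer name, and procedure code are hypothetical and do not describe ibl.ai's actual data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CriteriaVersion:
    payer: str
    procedure_code: str
    effective: date  # first day this version applies
    text: str        # the medical-necessity criteria as published


class CriteriaLibrary:
    """Hypothetical in-memory store of payer criteria, versioned by date."""

    def __init__(self):
        self._versions = []

    def publish(self, version):
        """Add a new version, e.g. the day the payer publishes a change."""
        self._versions.append(version)

    def current(self, payer, procedure_code, as_of):
        """Return the latest version effective on or before `as_of`."""
        candidates = [
            v for v in self._versions
            if v.payer == payer
            and v.procedure_code == procedure_code
            and v.effective <= as_of
        ]
        if not candidates:
            raise LookupError(f"no criteria for {payer} / {procedure_code} as of {as_of}")
        return max(candidates, key=lambda v: v.effective)


library = CriteriaLibrary()
library.publish(CriteriaVersion("ExamplePayer", "EXAMPLE-001", date(2026, 1, 1), "original criteria"))
library.publish(CriteriaVersion("ExamplePayer", "EXAMPLE-001", date(2026, 5, 15), "revised criteria"))

print(library.current("ExamplePayer", "EXAMPLE-001", date(2026, 5, 30)).text)
```

Keeping old versions instead of overwriting them also lets an auditor confirm which criteria a past letter was drafted against.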

What Stays the Same, What Changes

Self-hosting prior-auth AI doesn't mean rebuilding the platform around it. The clinician-facing chat UI, the case-worker dashboards, the audit logs, the model-routing-with-fallbacks, the multi-agent orchestration, the Epic / Cerner / athenahealth integrations — all of that stays managed by ibl.ai. The compute, the model, and the PHI move inside the hospital's covered environment.

What disappears: the $90K/month per-transaction bill (or the $1.2M/month per-seat bill) at IDN scale.

What appears: a self-hosted prior-auth capability the hospital owns and controls, with the model-routing policy the medical-staff committee designed:

Sonnet for standard prior auths (the bulk)
Opus for complex appeals and peer-to-peer prep
Haiku for triage sorting (which payer, which criteria set)
Llama 4 self-hosted for the highest-volume routine cases when even pennies per letter add up
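A routing policy like the one above can be expressed as a small lookup table, which is what makes a model swap a config change. The sketch below is a minimal illustration: the case-type labels, the default fallback, and the `route` function are hypothetical, and the model strings are display names from this post rather than real API identifiers.

```python
# Case type -> model, mirroring the policy listed above.
# ASSUMPTION: the case-type labels are invented for this example.
ROUTING_POLICY = {
    "triage": "Claude Haiku 4.5",             # which payer, which criteria set
    "standard": "Claude Sonnet 4.6",          # the bulk of prior auths
    "complex_appeal": "Claude Opus 4.7",      # appeals
    "peer_to_peer_prep": "Claude Opus 4.7",   # peer-to-peer preparation
    "high_volume_routine": "Llama 4 (self-hosted)",
}

# ASSUMPTION: unknown case types fall back to the standard workhorse.
DEFAULT_MODEL = "Claude Sonnet 4.6"


def route(case_type):
    """Return the model assigned to `case_type`, or the default if unmapped."""
    return ROUTING_POLICY.get(case_type, DEFAULT_MODEL)


for case_type in ["triage", "standard", "complex_appeal", "unmapped_case"]:
    print(f"{case_type:<20} -> {route(case_type)}")
```

Because the policy is plain data, the medical-staff committee can review and change it without touching the code that drafts letters.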

Run the Numbers for Your Health System

For workload sizing and cost modeling for a prior-auth deployment, the AI Help Desk Cost Savings Calculator generalizes well — prior auth is one of the highest-volume, most-amenable-to-automation administrative workloads in healthcare.

For the segment-wide cost-math context (not just prior auth), see AI Cost Math for Hospitals: Per-Seat vs Usage-Based in 2026.

For the deployment comparison side-by-side — including HIPAA posture, BAA reach, and air-gapped options — see Self-Hosted AI vs ChatGPT Enterprise for Healthcare.

For the full HIPAA-aligned architecture (Epic / Cerner / athenahealth integrations, Managed VPC → on-prem → air-gapped tiers), read Healthcare AI Reference Architecture on ibl.ai.

For the staged deployment recipe — Managed VPC for low-sensitivity workloads + on-prem for production — see Healthcare AI Blueprint: Managed VPC in 30/60/90 Days.

For the broader pricing landscape (every major model, every major per-seat vendor), the hub: What Does AI Actually Cost in 2026?.

For the most sensitive clinical workloads — appeals, research, anything that should never touch an external network — see the no-egress Air-Gapped Clinical AI Platform.

Why Family-Owned and New York Matter Here

A health system's AI vendor relationship for a workload as central as prior authorization is a multi-year commitment — the workflows, the criteria library, the EHR integration, the audit trail revenue-cycle compliance relies on. ibl.ai is family-owned and operated from New York, NY — a long-term partner with a perpetual platform license and no investor exit pressure. The runtime is open source. The PHI stays inside the hospital's covered boundary. The math works at a 100-bed community hospital or a 30-hospital IDN.
