Two Pricing Models, Very Different Math
Every AI buying decision in 2026 comes down to one question about the shape of the bill:
Per-seat SaaS: ChatGPT Enterprise ($60/user/month), Microsoft 365 Copilot ($30/user), Glean (~$40/user), Harvey ($300–500/lawyer), CoCounsel ($200–500/user). You buy a license for every employee who might use AI, whether they touch it daily or never.
Usage-based or self-hosted: token pricing on the underlying model, or a flat license on a runtime you own. You pay for the work actually done. At any organization above ~100 users, this is typically 10–100× cheaper for the same workload.
The per-seat model was borrowed from collaboration software (Slack, Notion, Salesforce), where the seat fee approximated "access." AI does real work; the cost should scale with the work, not the org chart.
This post is the 2026 reference: every major model's token price, every major per-seat vendor's headcount math, and the segment-by-segment breakdown of what the gap looks like.
What the Latest Models Actually Cost (Token Pricing)
Approximate as of mid-2026 — always check provider docs for current rates. Prices are dollars per million tokens (MTok).
| Model | Provider | Input ($/MTok) | Output ($/MTok) | Tier |
|---|---|---|---|---|
| Claude Opus 4.7 | Anthropic | $15 | $75 | Frontier reasoning |
| GPT-5 | OpenAI | $10 | $30 | Frontier reasoning |
| Gemini 3 Pro | Google | $3.50 | $10.50 | Frontier (long context) |
| Claude Sonnet 4.6 | Anthropic | $3 | $15 | Mid-tier workhorse |
| GPT-5 mini | OpenAI | ~$1.50 | ~$6 | Mid-tier workhorse |
| Claude Haiku 4.5 | Anthropic | $1 | $5 | Cheap/fast |
| Gemini 3 Flash | Google | $0.35 | $1.05 | Cheap/fast (cheapest hosted) |
| DeepSeek-R1 (hosted) | DeepSeek | ~$0.55 | ~$2.20 | Cheap reasoning (hosted) |
| Llama 4 / DeepSeek-R1 / Qwen 3 (self-hosted) | Open weights | ~$0 | ~$0 | GPU cost only (~$1–3/hour) |
The bottom row is the punchline. Self-hosted open-weight models have no per-token charge, just GPU time. A reserved H100 instance ($1.50–3/hour) can handle tens of thousands of requests per day, and its cost does not change with the size of the organization it serves.
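To put the hourly GPU rate on the same monthly footing as the other prices in this post, here is a minimal sketch. It assumes an always-on reserved instance and an average of 730 hours per month; the function name `gpu_monthly_cost` is illustrative and not part of any vendor tooling.

```python
# Assumption: an always-on reserved instance, billed for every hour of the month.
HOURS_PER_MONTH = 730  # 8,760 hours per year / 12


def gpu_monthly_cost(hourly_rate: float, gpus: int = 1) -> float:
    """Monthly cost of reserved GPUs at a flat hourly rate."""
    return hourly_rate * HOURS_PER_MONTH * gpus


low = gpu_monthly_cost(1.50)   # low end of the quoted H100 range
high = gpu_monthly_cost(3.00)  # high end of the quoted H100 range
print(f"1x H100: ${low:,.0f}-${high:,.0f} per month")
```

Under these assumptions a single H100 comes to about $1,095–2,190 per month for GPU time alone, before any platform license or support.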
What the Per-Seat Vendors Charge
Same disclaimer — approximate as of mid-2026.
| Product | Target buyer | $/user/month | @ 1,000 users | @ 10,000 users |
|---|---|---|---|---|
| ChatGPT Enterprise | Large org, horizontal | $60 | $60K/mo | $600K/mo |
| Microsoft 365 Copilot | M365 customers | $30 | $30K/mo | $300K/mo |
| Glean | Enterprise work AI | ~$40 | $40K/mo | $400K/mo |
| Harvey | Law firms | $300–500 | $300–500K/mo | N/A |
| Thomson Reuters CoCounsel | Law firms | $200–500 | $200–500K/mo | N/A |
| ChatGPT Team | SMB | $25 | $25K/mo | $250K/mo |
| ChatGPT Edu | K-12 / higher ed | ~$25 | $25K/mo | $250K/mo |
Every row scales linearly with headcount — and headcount has no relationship to how much AI work an organization actually generates.
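The headcount columns in the table are a single multiplication, which a short sketch makes explicit. The prices are the approximate list figures quoted above; `per_seat_monthly` is a hypothetical helper, and the point is that no usage figure appears anywhere in its inputs.

```python
# Approximate mid-2026 list prices from the table, in $/user/month.
PRICE_PER_USER = {
    "ChatGPT Enterprise": 60,
    "Microsoft 365 Copilot": 30,
    "Glean": 40,
}


def per_seat_monthly(price_per_user: float, headcount: int) -> float:
    """Per-seat bill: a function of headcount only, never of tokens used."""
    return price_per_user * headcount


for product, price in PRICE_PER_USER.items():
    at_1k = per_seat_monthly(price, 1_000)
    at_10k = per_seat_monthly(price, 10_000)
    print(f"{product}: ${at_1k:,.0f}/mo at 1,000 users, ${at_10k:,.0f}/mo at 10,000 users")
```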
The Same Workload, Three Ways
Take a representative workload: 100 million input + 50 million output tokens per month. That's roughly what a 5,000-person org generates for high-engagement AI use cases (drafting, classification, Q&A, agent automation).
| Approach | Math | Monthly cost | Annual |
|---|---|---|---|
| Per-seat — ChatGPT Enterprise | $60 × 5,000 users | $300,000 | $3,600,000 |
| Per-seat — Microsoft 365 Copilot | $30 × 5,000 users | $150,000 | $1,800,000 |
| Direct API — Claude Sonnet 4.6 | 100M×$3 + 50M×$15 | $1,050 | $12,600 |
| Direct API — GPT-5 | 100M×$10 + 50M×$30 | $2,500 | $30,000 |
| Direct API — Gemini 3 Flash | 100M×$0.35 + 50M×$1.05 | $87.50 | $1,050 |
| ibl.ai self-hosted (Llama 4 / DeepSeek-R1) | Flat license + 1× H100 | ~$3,000–8,000 | ~$36,000–96,000 |
ChatGPT Enterprise is roughly 286× more expensive than the same workload on the direct Claude Sonnet API ($300,000 vs $1,050 per month). Even compared to the all-in self-hosted line, which includes GPU, platform license, and support, ChatGPT Enterprise is about 38–100× more expensive.
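The direct-API rows of the comparison can be reproduced with a few lines of arithmetic. This is a sketch using the approximate prices and the 100M input / 50M output workload stated above; `api_monthly` and the constant names are illustrative.

```python
# Workload from the text: millions of tokens per month.
INPUT_MTOK = 100
OUTPUT_MTOK = 50
HEADCOUNT = 5_000

# Approximate prices from the token table: (input $/MTok, output $/MTok).
API_PRICES = {
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5": (10.00, 30.00),
    "Gemini 3 Flash": (0.35, 1.05),
}


def api_monthly(input_price: float, output_price: float,
                input_mtok: float = INPUT_MTOK,
                output_mtok: float = OUTPUT_MTOK) -> float:
    """Monthly token bill: volume times price, summed over input and output."""
    return input_mtok * input_price + output_mtok * output_price


per_seat = 60 * HEADCOUNT  # ChatGPT Enterprise at $60/user/month

for model, (inp, out) in API_PRICES.items():
    cost = api_monthly(inp, out)
    print(f"{model}: ${cost:,.2f}/mo, per-seat is {per_seat / cost:,.0f}x")
```

With these inputs the per-seat bill works out to about 286× the Sonnet cost, 120× the GPT-5 cost, and 3,429× the Gemini 3 Flash cost.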
Why the Per-Seat Model Doesn't Survive Contact With Real Usage
Three reasons it breaks at scale:
1. Usage is concentrated, not distributed. In any organization, 10–20% of users generate 80% of the AI work. Per-seat means paying for 100% of the seats to serve the 10–20% who do most of the work.
2. Headcount and AI work are uncorrelated. A 5,000-person org might generate the same monthly AI workload as a 500-person org if the use case is automation rather than personal productivity. Per-seat invoices headcount; the work doesn't care.
3. The savings story unwinds. AI is supposed to do more with less. A per-seat bill that scales with headcount is the opposite — every new hire makes AI more expensive, not the work more productive.
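The first reason can be turned into a number: the effective price per user who actually generates AI work. The sketch below assumes the 10–20% active share quoted above and the $60 list price; `cost_per_active_user` is a hypothetical helper.

```python
def cost_per_active_user(price_per_user: float, headcount: int,
                         active_share: float) -> float:
    """Total per-seat bill divided by the users who actually generate AI work."""
    total_bill = price_per_user * headcount
    active_users = headcount * active_share
    return total_bill / active_users


# Assumption: 5,000 seats at $60/user/month, with 10% or 20% of users active.
for share in (0.10, 0.20):
    effective = cost_per_active_user(60, 5_000, share)
    print(f"{share:.0%} active: ${effective:,.0f} per active user per month")
```

At a 20% active share the $60 seat effectively costs about $300 per active user per month, and at 10% about $600.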
What This Looks Like in Your Segment
The math changes a bit by segment — different workloads, different compliance constraints, different per-seat villains. The shape doesn't:
- AI Cost Math for Hospitals — Prior auth at a 5K-clinician system. ChatGPT Enterprise $300K/mo vs ~$3–5K/mo self-hosted. HIPAA + BAA reach.
- AI Cost Math for Law Firms — Due diligence at a 200-lawyer firm. Harvey $80K/mo vs ~$5–8K/mo self-hosted. ABA Rule 1.6 privilege.
- AI Cost Math for Financial Services — AML triage at a 10K-employee bank. Per-seat $300–600K/mo vs ~$5–15K/mo. FINRA + SR 11-7 model risk.
- AI Cost Math for Government Agencies — FOIA + case management at a 15K-employee state agency. Per-seat $450–900K/mo vs ~$5–15K/mo. FedRAMP + IL4/IL5.
- AI Cost Math for K-12 Districts — Tutoring + lesson planning + IEP at a 50K-student district. Per-seat $75–90K/mo vs ~$3–6K/mo. FERPA + COPPA.
- AI Cost Math for Higher Education — Advising + tutoring + course content at a 30K-student university. ChatGPT Edu $825K/mo vs ~$5–10K/mo. FERPA + LMS/SIS integration.
- AI Cost Math for Small Business — Customer-support automation at a 20-person company. Per-seat $500–600/mo vs ~$100–250/mo. Flat-rate VPS deployment.
And three earlier higher-ed deep-dives that take different angles on the same problem:
- Cost Math University CFOs Love With ibl.ai — campus platform vs per-seat for a 30K-student university.
- University AI Per-Seat Cost: True Math — what the per-seat invoice actually looks like for higher ed at scale.
- The Most Cost-Effective Way to Adopt AI in Higher Ed Isn't Per-Seat SaaS — It's a Campus Platform — the procurement-shape argument.
What Stays the Same, What Changes
Self-hosting the runtime doesn't mean rebuilding the platform. With ibl.ai, the chat UI, the agent dashboards, the audit logs, model routing with fallbacks, multi-agent orchestration, and the integrations with the systems your organization already runs all stay managed by ibl.ai. The compute, the model, and the data move inside your perimeter.
What disappears: the per-seat line item that scales with headcount.
What appears: an AI capability your organization owns, with model-choice flexibility — frontier reasoning models (Opus, GPT-5) for the high-stakes work, mid-tier models (Sonnet, GPT-5 mini) for the workhorse queue, fast/cheap models (Haiku, Flash) for high-volume routing, and open-weight models (Llama 4, DeepSeek-R1, Qwen 3) for the bulk and sensitive workloads.
Why Family-Owned and New York Matter Here
When the AI vendor contract becomes a multi-million-dollar annual line item, the structure of the vendor matters. ibl.ai is family-owned and operated from New York, NY: a U.S.-headquartered, domestically owned, long-term partner with a perpetual platform license and no investor exit pressure. The runtime is open source. The data stays inside your perimeter. The math works at 20 employees or 50,000.