---
title: "What Does AI Actually Cost in 2026? Latest LLM Pricing + Per-Seat Math"
slug: "what-does-ai-actually-cost-in-2026"
author: "ibl.ai Engineering"
date: "2026-05-30 16:00:00"
category: "Premium"
topics: "LLM pricing 2026, AI cost comparison, GPT-5 pricing, Claude Opus 4.7 pricing, Gemini 3 Pro pricing, ChatGPT Enterprise cost, Microsoft Copilot cost, per-seat vs usage-based, self-hosted AI cost, AI cost calculator"
summary: "The 2026 pricing landscape — every major LLM (Claude Opus 4.7, GPT-5, Gemini 3 Pro, Llama 4, DeepSeek-R1) and every major per-seat AI vendor (ChatGPT Enterprise, Microsoft Copilot, Glean, Harvey) — with the math that shows why per-seat breaks at scale and what shape actually works."
banner: ""
thumbnail: ""
---

## Two Pricing Models, Very Different Math

Every AI buying decision in 2026 comes down to one question: what shape does the pricing take?

**Per-seat SaaS:** ChatGPT Enterprise ($60/user/month), Microsoft 365 Copilot ($30/user), Glean (~$40/user), Harvey ($300–500/lawyer), CoCounsel ($200–500/user). You buy a license for every employee who *might* use AI, whether they touch it daily or never.

**Usage-based or self-hosted:** Token pricing on the underlying model, or a flat license on a runtime you own. You pay for the actual work done. At organizations above roughly 100 users, this is typically 10–100× cheaper for the same workload, and sometimes more.

The per-seat model was borrowed from collaboration software (Slack, Notion, Salesforce), where the seat fee approximated "access." AI does real work; the cost should scale with the work, not the org chart.

This post is the 2026 reference: every major model's token price, every major per-seat vendor's headcount math, and the segment-by-segment breakdown of what the gap looks like.

## What the Latest Models Actually Cost (Token Pricing)

Approximate as of mid-2026 — always check provider docs for current rates. Prices are dollars per million tokens (MTok).

<table style="width:100%; border-collapse:collapse; margin:1.5rem 0; font-size:0.95rem;">
  <thead>
    <tr style="background:#f5f5f0; border-bottom:2px solid #2175C5;">
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Model</th>
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Provider</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">Input ($/MTok)</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">Output ($/MTok)</th>
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Tier</th>
    </tr>
  </thead>
  <tbody>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Claude Opus 4.7</strong></td>
      <td style="padding:0.75rem;">Anthropic</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$15</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$75</td>
      <td style="padding:0.75rem;">Frontier reasoning</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>GPT-5</strong></td>
      <td style="padding:0.75rem;">OpenAI</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$10</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$30</td>
      <td style="padding:0.75rem;">Frontier reasoning</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Gemini 3 Pro</strong></td>
      <td style="padding:0.75rem;">Google</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$3.50</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$10.50</td>
      <td style="padding:0.75rem;">Frontier (long context)</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Claude Sonnet 4.6</strong></td>
      <td style="padding:0.75rem;">Anthropic</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$3</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$15</td>
      <td style="padding:0.75rem;">Mid-tier workhorse</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>GPT-5 mini</strong></td>
      <td style="padding:0.75rem;">OpenAI</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$1.50</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$6</td>
      <td style="padding:0.75rem;">Mid-tier workhorse</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Claude Haiku 4.5</strong></td>
      <td style="padding:0.75rem;">Anthropic</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$1</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$5</td>
      <td style="padding:0.75rem;">Cheap/fast</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Gemini 3 Flash</strong></td>
      <td style="padding:0.75rem;">Google</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$0.35</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$1.05</td>
      <td style="padding:0.75rem;">Cheap/fast (cheapest hosted)</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>DeepSeek-R1 (hosted)</strong></td>
      <td style="padding:0.75rem;">DeepSeek</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$0.55</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$2.20</td>
      <td style="padding:0.75rem;">Cheap reasoning (hosted)</td>
    </tr>
    <tr style="background:#f0f9ff; border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Llama 4 / DeepSeek-R1 / Qwen 3 (self-hosted)</strong></td>
      <td style="padding:0.75rem;">Open weights</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#15803d;">~$0</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#15803d;">~$0</td>
      <td style="padding:0.75rem;"><strong>GPU cost only (~$1–3/hour)</strong></td>
    </tr>
  </tbody>
</table>

The bottom row is the punchline. Self-hosted open-weight models have no per-token charge, just GPU time. A reserved H100 instance (roughly $1–3/hour, depending on provider and commitment) can handle tens of thousands of requests per day, and the bill stays the same whether 50 people or 5,000 send those requests.
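To turn that hourly rate into a monthly figure, here is a minimal sketch. It assumes one GPU running around the clock and an average month of 730 hours; the function names are illustrative, and the 150M-token volume is borrowed from the workload modeled later in this post.

```python
HOURS_PER_MONTH = 730  # average month: 8,760 hours per year / 12


def gpu_monthly_cost(hourly_rate: float, gpus: int = 1) -> float:
    """Monthly cost of running `gpus` instances 24/7 at a flat hourly rate."""
    return hourly_rate * HOURS_PER_MONTH * gpus


def effective_cost_per_mtok(monthly_cost: float, tokens_per_month: float) -> float:
    """Spread a fixed monthly GPU bill over the tokens actually processed."""
    return monthly_cost / (tokens_per_month / 1_000_000)


low = gpu_monthly_cost(1.00)   # bottom of the ~$1-3/hour range
high = gpu_monthly_cost(3.00)  # top of the range
print(f"One GPU, 24/7: ${low:,.0f} to ${high:,.0f} per month")

# Assumed volume: 150M tokens/month (100M input + 50M output).
tokens = 150_000_000
print(f"Effective rate: ${effective_cost_per_mtok(low, tokens):.2f} "
      f"to ${effective_cost_per_mtok(high, tokens):.2f} per MTok")
```

The per-token charge is zero, but the fixed bill is not: the effective rate per million tokens falls as more work runs through the same GPU, so utilization matters more than headcount.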

## What the Per-Seat Vendors Charge

Same disclaimer — approximate as of mid-2026.

<table style="width:100%; border-collapse:collapse; margin:1.5rem 0; font-size:0.95rem;">
  <thead>
    <tr style="background:#f5f5f0; border-bottom:2px solid #2175C5;">
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Product</th>
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Target buyer</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">$/user/month</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">@ 1,000 users</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">@ 10,000 users</th>
    </tr>
  </thead>
  <tbody>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>ChatGPT Enterprise</strong></td>
      <td style="padding:0.75rem;">Large org, horizontal</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$60</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$60K/mo</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;"><strong>$600K/mo</strong></td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Microsoft 365 Copilot</strong></td>
      <td style="padding:0.75rem;">M365 customers</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$30</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$30K/mo</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;"><strong>$300K/mo</strong></td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Glean</strong></td>
      <td style="padding:0.75rem;">Enterprise work AI</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$40</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$40K/mo</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;"><strong>$400K/mo</strong></td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Harvey</strong></td>
      <td style="padding:0.75rem;">Law firms</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$300–500</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$300–500K/mo</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">N/A</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Thomson Reuters CoCounsel</strong></td>
      <td style="padding:0.75rem;">Law firms</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$200–500</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$200–500K/mo</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">N/A</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>ChatGPT Team</strong></td>
      <td style="padding:0.75rem;">SMB</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$25</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$25K/mo</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;"><strong>$250K/mo</strong></td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>ChatGPT Edu</strong></td>
      <td style="padding:0.75rem;">K-12 / higher ed</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">~$25</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$25K/mo</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;"><strong>$250K/mo</strong></td>
    </tr>
  </tbody>
</table>

Every row scales linearly with headcount, and headcount is a poor proxy for how much AI work an organization actually generates.
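The linear scaling is easy to reproduce. The sketch below recomputes the right-hand columns of the table from the list prices; the dictionary and function names are illustrative, and the prices are the approximate figures quoted above, not negotiated rates.

```python
# Approximate list prices in $/user/month, copied from the table above.
SEAT_PRICES = {
    "ChatGPT Enterprise": 60,
    "Microsoft 365 Copilot": 30,
    "Glean": 40,
    "ChatGPT Team": 25,
}


def per_seat_monthly(price_per_user: float, users: int) -> float:
    """Per-seat billing: every licensed user costs the same, active or not."""
    return price_per_user * users


for product, price in SEAT_PRICES.items():
    at_1k = per_seat_monthly(price, 1_000)
    at_10k = per_seat_monthly(price, 10_000)
    print(f"{product:<22} ${at_1k:>9,.0f}/mo  ${at_10k:>9,.0f}/mo")
```

Nothing in the function knows how many requests were sent. The only input that moves the bill is the user count.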

## The Same Workload, Three Ways

Take a representative workload: **100 million input + 50 million output tokens per month**. That's roughly what a 5,000-person org generates for high-engagement AI use cases (drafting, classification, Q&A, agent automation).

<table style="width:100%; border-collapse:collapse; margin:1.5rem 0; font-size:0.95rem;">
  <thead>
    <tr style="background:#f5f5f0; border-bottom:2px solid #2175C5;">
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Approach</th>
      <th style="text-align:left; padding:0.75rem; color:#5f6368;">Math</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">Monthly cost</th>
      <th style="text-align:right; padding:0.75rem; color:#5f6368;">Annual</th>
    </tr>
  </thead>
  <tbody>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>Per-seat — ChatGPT Enterprise</strong></td>
      <td style="padding:0.75rem;">$60 × 5,000 users</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;"><strong>$300,000</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$3,600,000</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;">Per-seat — Microsoft 365 Copilot</td>
      <td style="padding:0.75rem;">$30 × 5,000 users</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$150,000</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#b91c1c;">$1,800,000</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;">Direct API — Claude Sonnet 4.6</td>
      <td style="padding:0.75rem;">100M×$3 + 50M×$15</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$1,050</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$12,600</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;">Direct API — GPT-5</td>
      <td style="padding:0.75rem;">100M×$10 + 50M×$30</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$2,500</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$30,000</td>
    </tr>
    <tr style="border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;">Direct API — Gemini 3 Flash</td>
      <td style="padding:0.75rem;">100M×$0.35 + 50M×$1.05</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$87.50</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums;">$1,050</td>
    </tr>
    <tr style="background:#f0f9ff; border-bottom:1px solid #e5e7eb;">
      <td style="padding:0.75rem;"><strong>ibl.ai self-hosted (Llama 4 / DeepSeek-R1)</strong></td>
      <td style="padding:0.75rem;">Flat license + 1× H100</td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#15803d;"><strong>~$3,000–8,000</strong></td>
      <td style="text-align:right; padding:0.75rem; font-variant-numeric:tabular-nums; color:#15803d;">~$36,000–96,000</td>
    </tr>
  </tbody>
</table>

ChatGPT Enterprise costs roughly **285× as much** as the same workload on the direct Claude Sonnet API ($300,000 vs. $1,050 per month). Even against the all-in self-hosted line, which includes GPU, platform license, and support, it is roughly **40–100× more expensive** ($300,000 vs. $3,000–8,000 per month).
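The table's arithmetic can be checked in a few lines. This is a sketch under the same assumptions as the table (100M input and 50M output tokens per month, list prices per million tokens, a $60 seat for 5,000 users); the names are illustrative.

```python
# ($ per million input tokens, $ per million output tokens), from the pricing table.
TOKEN_PRICES = {
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5": (10.00, 30.00),
    "Gemini 3 Flash": (0.35, 1.05),
}


def api_monthly_cost(input_mtok: float, output_mtok: float,
                     price_in: float, price_out: float) -> float:
    """Usage-based bill: millions of tokens times the per-MTok price, per direction."""
    return input_mtok * price_in + output_mtok * price_out


INPUT_MTOK, OUTPUT_MTOK = 100, 50   # the representative monthly workload
SEAT_BILL = 60 * 5_000              # ChatGPT Enterprise at 5,000 users

for model, (p_in, p_out) in TOKEN_PRICES.items():
    cost = api_monthly_cost(INPUT_MTOK, OUTPUT_MTOK, p_in, p_out)
    ratio = SEAT_BILL / cost
    print(f"{model:<18} ${cost:>9,.2f}/mo   per-seat is about {ratio:,.0f}x")
```

Swapping in your own token volumes and seat count is the quickest way to see where your organization lands relative to these ratios.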

## Why the Per-Seat Model Doesn't Survive Contact With Real Usage

Three reasons it breaks at scale:

**1. Usage is concentrated, not distributed.** In any organization, 10–20% of users generate 80% of the AI work. Per-seat means paying for 100% of the headcount to serve the 20% who actually use it.

**2. Headcount and AI work are only loosely related.** A 5,000-person org might generate the same monthly AI workload as a 500-person org if the use case is automation rather than personal productivity. Per-seat invoices headcount; the work doesn't care.

**3. The savings story unwinds.** AI is supposed to do more with less. A per-seat bill that scales with headcount is the opposite — every new hire makes AI more expensive, not the work more productive.
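One way to put numbers on the first two points is a break-even calculation: how many tokens would a single user need to consume each month before a seat costs less than the API? The sketch below assumes Claude Sonnet 4.6 list prices, the same 2:1 input-to-output mix as the workload above, and a 20% active-user share taken from the upper end of point 1. All function names are illustrative.

```python
def blended_price_per_mtok(price_in: float, price_out: float, input_share: float) -> float:
    """Average $/MTok for a token mix where `input_share` of tokens are input."""
    return input_share * price_in + (1 - input_share) * price_out


def break_even_mtok_per_user(seat_price: float, blended_price: float) -> float:
    """Millions of tokens per user per month at which API cost equals the seat price."""
    return seat_price / blended_price


def cost_per_active_user(seat_price: float, active_share: float) -> float:
    """Effective seat price when only `active_share` of licensed users do the work."""
    return seat_price / active_share


blended = blended_price_per_mtok(3.00, 15.00, input_share=2 / 3)
break_even = break_even_mtok_per_user(60, blended)
average = (100 + 50) / 5_000  # MTok per user per month in the modeled workload

print(f"Blended price:        ${blended:.2f} per MTok")
print(f"Break-even usage:     {break_even:.2f} MTok per user per month")
print(f"Modeled average:      {average:.2f} MTok per user per month")
print(f"Cost per active user: ${cost_per_active_user(60, 0.20):,.0f} per month")
```

Under these assumptions the break-even sits in the millions of tokens per user per month, far above the modeled average, which is why the gap in the table is so wide.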

## What This Looks Like in Your Segment

The math changes a bit by segment — different workloads, different compliance constraints, different per-seat villains. The shape doesn't:

- **[AI Cost Math for Hospitals](/blog/ai-cost-math-for-hospitals-per-seat-vs-usage)** — Prior auth at a 5K-clinician system. ChatGPT Enterprise $300K/mo vs ~$3–5K/mo self-hosted. HIPAA + BAA reach.
- **[AI Cost Math for Law Firms](/blog/ai-cost-math-for-law-firms-per-seat-vs-usage)** — Due diligence at a 200-lawyer firm. Harvey $80K/mo vs ~$5–8K/mo self-hosted. ABA Rule 1.6 privilege.
- **[AI Cost Math for Financial Services](/blog/ai-cost-math-for-financial-services-per-seat-vs-usage)** — AML triage at a 10K-employee bank. Per-seat $300–600K/mo vs ~$5–15K/mo. FINRA + SR 11-7 model risk.
- **[AI Cost Math for Government Agencies](/blog/ai-cost-math-for-government-per-seat-vs-usage)** — FOIA + case management at a 15K-employee state agency. Per-seat $450–900K/mo vs ~$5–15K/mo. FedRAMP + IL4/IL5.
- **[AI Cost Math for K-12 Districts](/blog/ai-cost-math-for-k12-districts-per-seat-vs-usage)** — Tutoring + lesson planning + IEP at a 50K-student district. Per-seat $75–90K/mo vs ~$3–6K/mo. FERPA + COPPA.
- **[AI Cost Math for Higher Education](/blog/ai-cost-math-for-higher-education-per-seat-vs-usage)** — Advising + tutoring + course content at a 30K-student university. ChatGPT Edu $825K/mo vs ~$5–10K/mo. FERPA + LMS/SIS integration.
- **[AI Cost Math for Small Business](/blog/ai-cost-math-for-small-business-per-seat-vs-usage)** — Customer-support automation at a 20-person company. Per-seat $500–600/mo vs ~$100–250/mo. Flat-rate VPS deployment.

And three earlier higher-ed deep-dives that take different angles on the same problem:

- **[Cost Math University CFOs Love With ibl.ai](/blog/cost-math-university-cfos-love-with-iblai)** — campus platform vs per-seat for a 30K-student university.
- **[University AI Per-Seat Cost: True Math](/blog/university-ai-per-seat-cost-true-math)** — what the per-seat invoice actually looks like for higher ed at scale.
- **[The Most Cost-Effective Way to Adopt AI in Higher Ed Isn't Per-Seat SaaS — It's a Campus Platform](/blog/the-most-cost-effective-way-to-adopt-ai-in-higher-ed-isnt-per-seat-saas-its-a-campus-platform)** — the procurement-shape argument.

## What Stays the Same, What Changes

Self-hosting the runtime doesn't mean rebuilding the platform. With ibl.ai, the chat UI, the agent dashboards, the audit logs, model routing with fallbacks, multi-agent orchestration, and the integrations with the systems your organization already runs all stay managed by ibl.ai. The compute, the model, and the data move inside your perimeter.

What disappears: the per-seat line item that scales with headcount.

What appears: an AI capability your organization owns, with model-choice flexibility — frontier reasoning models (Opus, GPT-5) for the high-stakes work, mid-tier models (Sonnet, GPT-5 mini) for the workhorse queue, fast/cheap models (Haiku, Flash) for high-volume routing, and open-weight models (Llama 4, DeepSeek-R1, Qwen 3) for the bulk and sensitive workloads.
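To see what that tiering does to the token bill, here is a rough sketch that splits the earlier 100M-input / 50M-output workload across four tiers. The traffic shares are hypothetical, chosen only for illustration, and this is not a description of how ibl.ai's routing actually assigns work. Prices come from the token pricing table; self-hosted tokens are priced at zero because GPU time is billed separately.

```python
# (input $/MTok, output $/MTok) from the pricing table above.
TIER_PRICES = {
    "frontier": (15.00, 75.00),    # Claude Opus 4.7
    "mid": (3.00, 15.00),          # Claude Sonnet 4.6
    "cheap": (1.00, 5.00),         # Claude Haiku 4.5
    "open_weight": (0.00, 0.00),   # self-hosted; GPU cost is a separate line
}

# Hypothetical share of the monthly workload sent to each tier (sums to 1.0).
TRAFFIC_SHARE = {
    "frontier": 0.05,
    "mid": 0.25,
    "cheap": 0.30,
    "open_weight": 0.40,
}


def blended_token_cost(input_mtok: float, output_mtok: float,
                       prices: dict, shares: dict) -> float:
    """Monthly token bill when the workload is split across tiers by share."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1.0")
    total = 0.0
    for tier, share in shares.items():
        price_in, price_out = prices[tier]
        total += share * (input_mtok * price_in + output_mtok * price_out)
    return total


monthly = blended_token_cost(100, 50, TIER_PRICES, TRAFFIC_SHARE)
print(f"Blended token bill: ${monthly:,.2f}/mo (plus GPU time for the open-weight share)")
```

Changing the shares shows how sensitive the bill is to the frontier tier: a small slice of frontier traffic can cost as much as a much larger slice of mid-tier traffic.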

## Why Family-Owned and New York-Based Matter Here

When the AI vendor contract becomes a multi-million-dollar annual line item, the structure of the vendor matters. ibl.ai is family-owned and operated from New York, NY: a U.S.-headquartered, domestically owned, long-term partner with a perpetual platform license and no investor exit pressure. The runtime is open source. The data stays inside your perimeter. The math works at 20 employees or 50,000.
