The 15,000-Employee Agency Math: $30–60 × Headcount Is Not the Right Number
A mid-size state agency has 15,000 civil servants — case workers, attorneys, analysts, inspectors, IT, back-office. ChatGPT Enterprise at $60 per user per month would run $900,000 per month — $10.8M per year. Microsoft 365 Copilot at $30 per user is $450,000 per month — $5.4M per year. The procurement officer reading those numbers already knows the per-seat model can't be how this gets bought.
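The per-seat arithmetic above is easy to reproduce for any headcount. The sketch below is a minimal illustration using only the list prices quoted in this post; the function and dictionary names are made up for the example, and real contract pricing may differ.

```python
# Per-seat list prices quoted in the text, in USD per user per month.
SEAT_PRICES = {
    "ChatGPT Enterprise": 60,
    "Microsoft 365 Copilot": 30,
}


def per_seat_cost(headcount: int, price_per_seat: int) -> tuple[int, int]:
    """Return (monthly, annual) cost in USD when every employee gets a seat."""
    monthly = headcount * price_per_seat
    return monthly, monthly * 12


for product, price in SEAT_PRICES.items():
    monthly, annual = per_seat_cost(15_000, price)
    print(f"{product}: ${monthly:,}/month, ${annual:,}/year")
```

Swapping 15_000 for your own headcount shows how quickly the bill grows with staff size rather than with usage.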
Federal and gov-cloud variants of the same vendors don't change the shape — they add FedRAMP overhead and usually charge the same or more per seat. The pricing is borrowed from commercial productivity software where the seat is the unit; for AI doing real work — FOIA drafting, case-management copilot, citizen-service triage, internal policy Q&A — the cost should scale with the work, the data should stay inside the agency's authorization boundary, and the model choice should be the agency's, not the vendor's.
The math is the post.
What the Latest Models Actually Cost in 2026
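Token pricing across the major providers, approximate as of mid-2026 (always check provider docs for current rates). The self-hosted rows show ~$0 because open weights carry no per-token fee; the GPU and license cost for those models appears in the deployment table further down: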
| Model | Provider | Input ($/MTok) | Output ($/MTok) | Authorization fit |
|---|---|---|---|---|
| Claude Opus 4.7 | Anthropic | $15 | $75 | Commercial cloud; GovCloud variant via AWS Bedrock |
| Claude Sonnet 4.6 | Anthropic | $3 | $15 | Commercial cloud; GovCloud variant via AWS Bedrock |
| Claude Haiku 4.5 | Anthropic | $1 | $5 | Same; cheap routing/classification |
| GPT-5 | OpenAI | $10 | $30 | Commercial cloud; ChatGPT Gov for FedRAMP-High |
| Gemini 3 Pro | Google | $3.50 | $10.50 | Commercial cloud; GCP Assured Workloads / IL5 |
| Llama 4 (70B, self-hosted) | Meta (open weights) | ~$0 | ~$0 | Air-gappable; runs inside any boundary |
| DeepSeek-R1 (self-hosted) | DeepSeek (open weights) | ~$0 | ~$0 | Air-gappable; cost-sensitive workloads |
For the most sensitive environments (CUI, CJIS, IL4/IL5), the realistic options collapse to two: a frontier-lab government-cloud variant, which still means a fixed model and a vendor-controlled data path, or a self-hosted open-weight stack the agency runs end-to-end. There isn't a third option.
A Real Workload: FOIA Drafting + Case Management at a State Agency
Most agencies have two high-volume AI workloads: FOIA response drafting and case-management narrative generation. Take the state agency: roughly 4,000 FOIA requests per month (each ~5,000 input + 2,000 output tokens) and 25,000 case-management updates per month (each ~1,000 input + 800 output tokens). Combined: 45M input + 28M output tokens per month. For a deeper per-FOIA-request cost breakdown — including a side-by-side against Granicus, NextRequest, Mark43, and Tyler Technologies AI add-ons at three scale tiers (municipal / county / state-federal) — see What AI FOIA Drafting Actually Costs in 2026.
That's distributed across maybe 800–1,500 actual users — the FOIA officers, the case workers, the supervising attorneys. Not the 15,000-person headcount the per-seat vendor would invoice for.
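The combined token volume follows directly from the request counts and per-request token sizes given above. A minimal sketch, assuming those figures; the `Workload` class and its field names are illustrative, not part of any vendor tool:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    requests_per_month: int
    input_tokens: int   # tokens sent to the model per request
    output_tokens: int  # tokens generated per request


# Figures taken from the example state agency in the text.
WORKLOADS = [
    Workload("FOIA drafting", 4_000, 5_000, 2_000),
    Workload("Case-management updates", 25_000, 1_000, 800),
]


def monthly_tokens(workloads: list[Workload]) -> tuple[int, int]:
    """Return total (input, output) tokens per month across all workloads."""
    total_in = sum(w.requests_per_month * w.input_tokens for w in workloads)
    total_out = sum(w.requests_per_month * w.output_tokens for w in workloads)
    return total_in, total_out


total_in, total_out = monthly_tokens(WORKLOADS)
print(f"{total_in / 1e6:.0f}M input + {total_out / 1e6:.0f}M output tokens per month")
```

Adding a third `Workload` entry (citizen-service triage, for instance) is how you would size an agency with a different mix.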
What it costs by deployment shape
| Deployment | Pricing shape | Monthly cost | Annual | Authorization posture |
|---|---|---|---|---|
| ChatGPT Enterprise | Per-seat ($60/user × 15K) | $900,000 | $10,800,000 | OpenAI commercial cloud |
| Microsoft 365 Copilot (Gov) | Per-seat ($30+/user × 15K) | $450,000+ | $5,400,000+ | Microsoft Gov cloud (FedRAMP-High) |
| Direct API — Claude Sonnet 4.6 (Bedrock GovCloud) | Token-based | ~$555 | ~$6,660 | AWS GovCloud (IL4-eligible) |
| Direct API — GPT-5 (ChatGPT Gov) | Token-based | ~$1,290 | ~$15,480 | OpenAI Gov cloud (FedRAMP-High) |
| ibl.ai self-hosted (Llama 4 / DeepSeek-R1) | Flat license + GPU | ~$5,000–15,000 | ~$60,000–180,000 | Air-gappable; runs in your boundary |
The ibl.ai row covers GPU instance, platform license, and ongoing support — and works in environments where the gov-cloud variants don't reach (IL5, fully air-gapped enclaves, agencies with their own ATO process that doesn't accept managed-cloud AI vendors).
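The two token-based rows in the table can be checked from the per-token prices and the 45M input / 28M output monthly volume. This is a rough sketch using the approximate rates listed earlier in this post; names are illustrative, and current provider rates should be substituted before relying on the result.

```python
# USD per million tokens as (input, output), copied from the pricing table in this post.
PRICES_PER_MTOK = {
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5": (10.00, 30.00),
}

# Monthly volume for the example agency, in millions of tokens.
INPUT_MTOK = 45
OUTPUT_MTOK = 28


def monthly_api_cost(input_mtok: float, output_mtok: float,
                     price_in: float, price_out: float) -> float:
    """Token-based monthly cost in USD: volume times price, input plus output."""
    return input_mtok * price_in + output_mtok * price_out


for model, (price_in, price_out) in PRICES_PER_MTOK.items():
    monthly = monthly_api_cost(INPUT_MTOK, OUTPUT_MTOK, price_in, price_out)
    print(f"{model}: ${monthly:,.0f}/month, ${monthly * 12:,.0f}/year")
```

Other rows from the pricing table can be added to `PRICES_PER_MTOK` to compare models against the same workload.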
Why Per-Seat Pricing Fails Harder in Government
Three structural reasons:
1. Civil-service headcount is large; AI usage is concentrated. A state agency's headcount includes inspectors, field staff, and clerks who barely touch the AI system. The FOIA drafting and case-management workloads live with a much smaller group of specialists (roughly 800–1,500 people in the example above). Per-seat means buying for everyone; usage-based means paying for the work those specialists actually do.
2. ATO and authorization boundaries are per-system, not per-vendor. A managed AI vendor with FedRAMP-High authorization passes some procurement filters, but the agency still has to authorize the new system into its own boundary and re-do the package every time the vendor changes the model, the region, or the underlying compute. A self-hosted stack inside an existing boundary doesn't trigger a new ATO event.
3. IL5 and air-gapped environments rule out managed AI vendors entirely. Defense workloads, certain intelligence-community use cases, and a growing number of state-level critical-infrastructure environments require AI that runs inside the boundary with no external network dependency. There is no "per-seat ChatGPT" answer to those requirements — only self-hosted open-weight runtimes connected to a platform like ibl.ai.
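To put the first reason in numbers: if the whole headcount is licensed but only the 800–1,500 active users from the example actually use the system, the effective price per active user is far above the list price. A minimal sketch under that assumption, with illustrative names:

```python
HEADCOUNT = 15_000
ACTIVE_USERS_LOW = 800     # low estimate of real users, from the example
ACTIVE_USERS_HIGH = 1_500  # high estimate of real users, from the example


def effective_price_per_active_user(seat_price: float, headcount: int,
                                    active_users: int) -> float:
    """Monthly USD paid per person who actually uses the tool."""
    return seat_price * headcount / active_users


for seat_price in (30, 60):
    best = effective_price_per_active_user(seat_price, HEADCOUNT, ACTIVE_USERS_HIGH)
    worst = effective_price_per_active_user(seat_price, HEADCOUNT, ACTIVE_USERS_LOW)
    print(f"${seat_price}/seat list price -> ${best:,.2f} to ${worst:,.2f} "
          f"per active user per month")
```

The calculation ignores any negotiated option to license fewer seats; it only illustrates what headcount-wide licensing implies.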
What Stays the Same, What Changes
Self-hosting the runtime doesn't mean rebuilding the agency's AI tooling. The chat UI, the case-worker dashboards, the audit logs, model routing with fallbacks, multi-agent orchestration, and the integration with existing case-management and FOIA-tracking systems all stay managed by ibl.ai. The compute, the model, and the case data move inside the agency's authorization boundary.
What disappears: the $5.4–10.8M/year per-seat line item. What appears: an AI capability the agency owns, with the model-choice flexibility that procurement requires: Opus for the high-stakes regulatory analysis, Sonnet for the case-update queue, Llama 4 for the air-gapped enclave.
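As a rough illustration of what routing work to different models with fallbacks can look like, here is a generic sketch. It is not ibl.ai's API: the route table, model identifiers, `ModelUnavailable` exception, and `fake_call_model` stub are all hypothetical and exist only for this example.

```python
from typing import Callable

# Hypothetical routing table: task type -> models to try, in priority order.
# The air-gapped route lists only a self-hosted model, so a fallback can
# never send data outside the boundary.
ROUTES = {
    "regulatory_analysis": ["claude-opus", "claude-sonnet"],
    "case_update": ["claude-sonnet", "llama-4-self-hosted"],
    "air_gapped": ["llama-4-self-hosted"],
}


class ModelUnavailable(Exception):
    """Raised by a model client when the model cannot serve the request."""


def route(task_type: str, prompt: str,
          call_model: Callable[[str, str], str]) -> tuple[str, str]:
    """Try each configured model in order; return (model_used, response)."""
    last_error = None
    for model in ROUTES[task_type]:
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as exc:
            last_error = exc  # remember the failure and try the next model
    raise RuntimeError(f"no model available for {task_type!r}") from last_error


def fake_call_model(model: str, prompt: str) -> str:
    """Stand-in for a real client; simulates an outage of the primary model."""
    if model == "claude-opus":
        raise ModelUnavailable(model)
    return f"[{model}] draft for: {prompt}"


print(route("regulatory_analysis", "summarize rule change", fake_call_model))
```

Keeping the fallback list per task type, rather than one global list, is the design choice that lets sensitive workloads stay on in-boundary models even when a primary model fails.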
Run the Numbers for Your Agency
For workload sizing for FOIA, case management, and citizen-service triage, the AI Help Desk Cost Savings Calculator generalizes to most high-volume government-administrative workloads.
For the deployment comparison side-by-side — including FedRAMP / StateRAMP / CJIS / IL4-IL5 posture and air-gapped options — see Self-Hosted AI vs ChatGPT Enterprise for Government.
For the full NIST 800-53 aligned architecture (PIV/CAC identity, GovCloud → on-prem → air-gapped tiers, sovereignty benchmark vs gov-cloud AI assistants), read Government AI Reference Architecture on ibl.ai.
For the staged deployment recipe — FedRAMP GovCloud pilot → on-prem CUI → air-gapped IL4/IL5 — see Government AI Blueprint: GovCloud Pilot to IL4/IL5.
Why Family-Owned and New York Matters Here
For U.S. federal, state, and defense procurement, the structure of the vendor matters as much as the price. ibl.ai is family-owned and operated from New York, NY — a U.S.-headquartered, domestically-owned, long-term partner with a perpetual platform license and no investor exit pressure. The runtime is open source. The case data stays inside the agency's authorization boundary. The math works at a 500-employee municipal agency or a 50,000-employee federal department.