FOIA Is the Workload Every Agency Is Already Behind On
The Freedom of Information Act and its 50+ state analogs generate millions of requests per year, almost all handled by overworked records officers operating against statutory response clocks. AI doesn't change the policy questions (what to release, what to withhold, what to redact), but it can do the substantial drafting work that takes records officers weeks per request.
The vendors going after this category, including Granicus, NextRequest (now part of CivicPlus), Mark43, Tyler Technologies, and a growing list of FedRAMP-authorized AI add-ons, have priced the way records-management software vendors always have: per-request fees, per-officer seats, or annual licenses scaled to agency size. None of those pricing shapes tracks the actual cost of doing the work.
The actual cost is small. The math is the post.
What a FOIA Response Draft Actually Costs Per Token
A typical FOIA response draft is about 5,000 input tokens (the request, the responsive documents, the agency's redaction policy, exemption guidance) and 2,000 output tokens (the drafted response with exemption citations and a redaction map). Cost-per-request on the major models:
| Model | Input ($/MTok) | Output ($/MTok) | $ per request | When to use it |
|---|---|---|---|---|
| Claude Opus 4.7 | $15 | $75 | $0.225 | Sensitive / litigation-adjacent requests |
| GPT-5 | $10 | $30 | $0.110 | Mixed-complexity multi-exemption cases |
| Claude Sonnet 4.6 | $3 | $15 | $0.045 | Standard FOIA response workhorse |
| Gemini 3 Pro | $3.50 | $10.50 | $0.039 | Long-context (multi-thousand-page) requests |
| Claude Haiku 4.5 | $1 | $5 | $0.015 | Routing, request classification, simple requests |
| Llama 4 / DeepSeek-R1 (self-hosted) | ~$0 | ~$0 | ~$0 | Inside the agency's boundary |
Most expensive frontier model: about 23 cents per response. Standard workhorse: about 4.5 cents. Self-hosted in the agency's existing authorization boundary: the marginal cost is the GPU's idle-vs-busy delta.
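The per-request column in the table above is plain arithmetic: tokens divided by one million, times the per-million price, summed over input and output. A minimal sketch, assuming the 5,000/2,000 token split and the list prices quoted above (the function and constant names are illustrative, not part of any vendor SDK):

```python
# Token counts for a typical draft, taken from the paragraph above.
INPUT_TOKENS = 5_000
OUTPUT_TOKENS = 2_000

# (input, output) prices in dollars per million tokens, copied from the table.
PRICES_PER_MTOK = {
    "Claude Opus 4.7": (15.00, 75.00),
    "GPT-5": (10.00, 30.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Gemini 3 Pro": (3.50, 10.50),
    "Claude Haiku 4.5": (1.00, 5.00),
}


def cost_per_request(input_price, output_price,
                     input_tokens=INPUT_TOKENS, output_tokens=OUTPUT_TOKENS):
    """Dollar cost of one draft at the given per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for model, (input_price, output_price) in PRICES_PER_MTOK.items():
    print(f"{model:<20} ${cost_per_request(input_price, output_price):.4f} per request")
```

Swapping in an agency's own average token counts (long responsive-document sets push the input side up quickly) is the only change needed to re-run the table.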
Monthly Bills at Three Scale Tiers
- Municipal agency (small city): ~200 FOIA requests/month
- County / mid-size state agency: ~1,500 requests/month
- Large state / federal agency: ~4,000+ requests/month
Monthly cost using Claude Sonnet 4.6 vs the records-management and per-seat AI alternatives:
| Approach | Pricing shape | Municipal (200/mo) | County (1.5K/mo) | State/Federal (4K/mo) |
|---|---|---|---|---|
| FOIA-software vendor with AI add-on | Per-request (~$8/request) | $1,600 | $12,000 | $32,000 |
| Per-records-officer seat | ~$200/officer/mo × team | ~$1,000 | ~$8,000 | ~$24,000 |
| ChatGPT Enterprise (gov) | $60/seat × agency headcount | ~$30,000 | ~$180,000 | ~$900,000+ |
| Direct API — Claude Sonnet 4.6 (Bedrock GovCloud) | Token-based | ~$9 | ~$68 | ~$180 |
| Direct API — GPT-5 (ChatGPT Gov) | Token-based | ~$22 | ~$165 | ~$440 |
| ibl.ai self-hosted (Llama 4 / DeepSeek-R1) | Flat license + GPU | ~$1,500 | ~$3,000–5,000 | ~$5,000–10,000 |
At state/federal scale, the records-software vendor's AI add-on costs roughly 3 to 6 times as much as self-hosting ($32,000 against $5,000 to $10,000), and ChatGPT Enterprise per-seat costs roughly 90 to 180 times as much, for the same drafted responses.
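The usage-priced rows in the table above are volume times per-request cost. A short sketch, assuming the three monthly volumes and the per-request figures already quoted in this post (the dictionary and function names are hypothetical):

```python
# Requests per month for the three tiers listed above.
TIERS = {"Municipal": 200, "County": 1_500, "State/Federal": 4_000}

# Dollars per request: the vendor fee is the ~$8 figure from the table;
# the API figures come from the per-token math earlier in the post.
PER_REQUEST_DOLLARS = {
    "Vendor AI add-on": 8.00,
    "Claude Sonnet 4.6": 0.045,
    "GPT-5": 0.110,
}


def monthly_bill(requests_per_month: int, dollars_per_request: float) -> float:
    """Monthly spend for a purely usage-priced approach."""
    return requests_per_month * dollars_per_request


for approach, rate in PER_REQUEST_DOLLARS.items():
    row = "  ".join(
        f"{tier}: ${monthly_bill(volume, rate):>9,.2f}"
        for tier, volume in TIERS.items()
    )
    print(f"{approach:<20} {row}")
```

The per-seat and flat-license rows do not scale with request volume, which is why they are left out of this sketch.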
Why the Per-Request Pricing Is Worse in Government Than Elsewhere
Per-request pricing makes sense for software that handles a transactional unit of work: DocuSign envelopes, payment processing, parcel deliveries. FOIA pricing borrows the pattern, but the underlying unit cost is about 4.5 cents on the workhorse model and under 25 cents on the most expensive one; the $8 the vendor charges per request is value capture, not cost recovery.
There's a second problem specific to government: the agency can't pass the cost through. A bank prices customer-facing AI into product fees; a hospital prices it into billed services. An agency processing FOIA is doing statutorily mandated work funded by appropriations. Every dollar of vendor markup on a per-request fee is a dollar the agency can't spend on a records officer to handle the harder cases the AI can't.
Why ATO and IL Boundaries Make Self-Hosting Non-Negotiable at Sensitivity
A managed AI vendor with FedRAMP-High authorization passes some procurement filters, but the agency still has to authorize the new system into its own boundary and re-do the package every time the vendor changes the model, the region, or the underlying compute. For sensitive workloads (CUI, CJIS, IL4/IL5), the answer is usually one of:
- Hold all FOIA processing manually — the default for many agencies, and the reason response times slip past statutory deadlines
- Use a managed AI vendor's gov-cloud variant — works for some workloads, fails the boundary for others
- Self-host in the agency's existing boundary — the runtime sits inside an environment that already has an ATO
The third option, self-hosting, is the only one that scales to CUI and CJIS workloads without a new authorization event for every model change.
What Stays the Same, What Changes
Self-hosting FOIA-drafting AI doesn't mean rebuilding the agency's records-management tooling. The records-officer-facing chat UI, the request-tracking dashboards, the audit logs, model routing with fallbacks, multi-agent orchestration, and the integrations with the existing FOIA-tracking system and document repository all stay managed by ibl.ai. The compute, the model, and the responsive documents move inside the agency's authorization boundary.
What disappears: the $24–32K/month per-request or per-officer bill at state/federal scale.
What appears: a self-hosted FOIA-drafting capability the agency owns, with a model-routing recipe designed by records counsel:
- Opus for litigation-adjacent or politically sensitive requests
- Sonnet for standard responsive-document drafts (the bulk)
- Haiku for request classification, routing, and simple acknowledgments
- Llama 4 self-hosted for the highest-volume routine routing inside a CUI / IL4 boundary
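The routing recipe above can be sketched as a small decision function. This is an illustration only: the `Task` enum, the `FoiaRequest` fields, and the returned model labels are hypothetical, and the precedence (routine tasks checked first, with every routine task inside the CUI boundary sent to the self-hosted model) is an assumption the list does not spell out.

```python
from dataclasses import dataclass
from enum import Enum


class Task(Enum):
    CLASSIFY = "classify"
    ROUTE = "route"
    ACKNOWLEDGE = "acknowledge"
    DRAFT = "draft"


@dataclass(frozen=True)
class FoiaRequest:
    # All fields are hypothetical flags a records officer or classifier would set.
    task: Task
    litigation_adjacent: bool = False
    politically_sensitive: bool = False
    inside_cui_boundary: bool = False


def choose_model(req: FoiaRequest) -> str:
    """Map a request to a model tier following the four-tier recipe above."""
    if req.task is not Task.DRAFT:
        # Routine work: self-hosted model inside the boundary, Haiku otherwise.
        return "llama-4-self-hosted" if req.inside_cui_boundary else "haiku"
    if req.litigation_adjacent or req.politically_sensitive:
        return "opus"
    # Standard responsive-document drafts, which are the bulk of the volume.
    return "sonnet"


print(choose_model(FoiaRequest(task=Task.DRAFT, litigation_adjacent=True)))
print(choose_model(FoiaRequest(task=Task.ROUTE, inside_cui_boundary=True)))
```

In practice the flags would be set by records counsel's own criteria; keeping the rule in one pure function makes it easy to audit and to change without touching the rest of the pipeline.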
Run the Numbers for Your Agency
For the segment-wide cost-math context (FOIA, case management, citizen services), see AI Cost Math for Government Agencies: Per-Seat vs Usage-Based in 2026.
For the deployment comparison side-by-side — including FedRAMP / StateRAMP / CJIS / IL4-IL5 posture and air-gapped options — see Self-Hosted AI vs ChatGPT Enterprise for Government.
For the full NIST 800-53 aligned architecture (PIV/CAC identity, GovCloud → on-prem → air-gapped tiers, sovereignty benchmark vs gov-cloud AI assistants), read Government AI Reference Architecture on ibl.ai.
For the staged deployment recipe — FedRAMP GovCloud pilot → on-prem CUI → air-gapped IL4/IL5 — see Government AI Blueprint: GovCloud Pilot to IL4/IL5.
For the broader pricing landscape across every model and per-seat vendor, the hub: What Does AI Actually Cost in 2026?.
Why Family-Owned and New York Matters Here
For U.S. federal, state, and local procurement, the structure of the AI vendor matters as much as the price — especially for workloads inside a CUI or IL boundary. ibl.ai is family-owned and operated from New York, NY — a U.S.-headquartered, domestically-owned, long-term partner with a perpetual platform license and no investor exit pressure. The runtime is open source. The responsive documents stay inside the agency's authorization boundary. The math works at a 50-employee municipal records office or a 50,000-employee federal department.