---
title: "What AI FOIA Drafting Actually Costs in 2026"
slug: "what-ai-foia-drafting-actually-costs-2026"
author: "ibl.ai Engineering"
date: "2026-05-30 19:30:00"
category: "Premium"
topics: "AI FOIA cost, FOIA drafting automation, Granicus AI pricing, NextRequest AI, public records AI, GovCloud AI, FedRAMP AI cost, government AI per-request, self-hosted FOIA AI"
summary: "Per-request token math for FOIA drafting across the latest models, monthly bills at municipal / county / state agency scale, and why the per-request and per-seat AI vendors are the wrong shape, including in the GovCloud variants."
banner: ""
thumbnail: ""
---

## FOIA Is the Workload Every Agency Is Already Behind On

The Freedom of Information Act and its 50+ state analogs generate millions of requests per year, almost all handled by overworked records officers working against statutory response clocks. AI doesn't change the policy questions (what to release, what to withhold, what to redact), but it can do much of the drafting work that can take records officers weeks per request.

The vendors going after this category, including Granicus, NextRequest (now part of CivicPlus), Mark43, Tyler Technologies, and a growing list of FedRAMP-authorized AI add-ons, have priced the way records-management software vendors always have: per-request fees, per-officer seats, or annual licenses scaled to agency size. None of those pricing shapes tracks the actual cost of doing the work.

The actual cost is small. The math is the post.

## What a FOIA Response Draft Actually Costs Per Token

A typical FOIA response draft is about **5,000 input tokens** (the request, the responsive documents, the agency's redaction policy, exemption guidance) and **2,000 output tokens** (the drafted response with exemption citations and a redaction map). Cost per request on the major models:

| Model | Input ($/MTok) | Output ($/MTok) | $ per request | When to use it |
|---|---|---|---|---|
| Claude Opus 4.7 | $15 | $75 | $0.225 | Sensitive / litigation-adjacent requests |
| GPT-5 | $10 | $30 | $0.110 | Mixed-complexity multi-exemption cases |
| Claude Sonnet 4.6 | $3 | $15 | $0.045 | Standard FOIA response workhorse |
| Gemini 3 Pro | $3.50 | $10.50 | $0.039 | Long-context (multi-thousand-page) requests |
| Claude Haiku 4.5 | $1 | $5 | $0.015 | Routing, request classification, simple requests |
| Llama 4 / DeepSeek-R1 (self-hosted) | ~$0 | ~$0 | ~$0 | Inside the agency's boundary |
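
The arithmetic behind the table can be reproduced in a few lines. The sketch below is illustrative, not any vendor's billing logic: the function name `cost_per_request` and the `PRICES_PER_MTOK` dictionary are assumptions introduced here, and the prices and token counts are simply the figures quoted above.

```python
# Token counts for one draft, taken from the estimate in the text.
INPUT_TOKENS = 5_000
OUTPUT_TOKENS = 2_000

# (input, output) prices in dollars per million tokens, copied from the table.
PRICES_PER_MTOK = {
    "Claude Opus 4.7": (15.00, 75.00),
    "GPT-5": (10.00, 30.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Gemini 3 Pro": (3.50, 10.50),
    "Claude Haiku 4.5": (1.00, 5.00),
}


def cost_per_request(input_price, output_price,
                     input_tokens=INPUT_TOKENS, output_tokens=OUTPUT_TOKENS):
    """Dollar cost of one draft, given $/MTok prices and token counts."""
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    for model, (input_price, output_price) in PRICES_PER_MTOK.items():
        cost = cost_per_request(input_price, output_price)
        print(f"{model}: ${cost:.4f} per request")
```

Swapping in an agency's own average token counts (for example, requests with longer responsive documents) shows how the per-request figure scales.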

Most-expensive frontier model: **about 23 cents per response.** Standard workhorse: **about 4.5 cents.** Self-hosted in the agency's existing authorization boundary: the marginal cost is the GPU's idle-vs-busy delta.

## Monthly Bills at Three Scale Tiers

- **Municipal agency** (small city): ~200 FOIA requests/month
- **County / mid-size state agency**: ~1,500 requests/month
- **Large state / federal agency**: 4,000+ requests/month

Monthly cost using **Claude Sonnet 4.6** vs the records-management and per-seat AI alternatives:

| Approach | Pricing shape | Municipal (200/mo) | County (1.5K/mo) | State/Federal (4K/mo) |
|---|---|---|---|---|
| FOIA-software vendor with AI add-on | Per-request (~$8/request) | $1,600 | $12,000 | $32,000 |
| Per-records-officer seat | ~$200/officer/mo × team | ~$1,000 | ~$8,000 | ~$24,000 |
| ChatGPT Enterprise (gov) | $60/seat × agency headcount | ~$30,000 | ~$180,000 | ~$900,000+ |
| Direct API: Claude Sonnet 4.6 (Bedrock GovCloud) | Token-based | ~$9 | ~$68 | ~$180 |
| Direct API: GPT-5 (ChatGPT Gov) | Token-based | ~$22 | ~$165 | ~$440 |
| ibl.ai self-hosted (Llama 4 / DeepSeek-R1) | Flat license + GPU | ~$1,500 | ~$3,000–5,000 | ~$5,000–10,000 |
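
The token-based and per-request rows follow from multiplying each tier's monthly volume by a per-request figure. The sketch below is a minimal illustration under stated assumptions: it uses the per-request costs from the earlier table and the ~$8 vendor fee quoted here, and `monthly_bill` and `monthly_table` are hypothetical helper names. The token rows count model usage only.

```python
# Requests per month at the three tiers described in the text.
TIERS = {"Municipal": 200, "County": 1_500, "State/Federal": 4_000}

# Per-request token cost (5,000 input / 2,000 output tokens), from the earlier table.
COST_PER_REQUEST = {"Claude Sonnet 4.6": 0.045, "GPT-5": 0.110}

# Approximate per-request fee for a FOIA-software vendor's AI add-on, as quoted.
VENDOR_FEE_PER_REQUEST = 8.00


def monthly_bill(requests_per_month, dollars_per_request):
    """Monthly spend for a given volume and per-request price."""
    return requests_per_month * dollars_per_request


def monthly_table():
    """Build {tier: {approach: monthly dollars}} for the approaches above."""
    rows = {}
    for tier, volume in TIERS.items():
        rows[tier] = {
            model: monthly_bill(volume, cost)
            for model, cost in COST_PER_REQUEST.items()
        }
        rows[tier]["Per-request vendor"] = monthly_bill(volume, VENDOR_FEE_PER_REQUEST)
    return rows


if __name__ == "__main__":
    for tier, bills in monthly_table().items():
        for approach, dollars in bills.items():
            print(f"{tier:>13} | {approach:<18} | ${dollars:,.2f}")
```

Changing the entries in `TIERS` to an agency's actual monthly volume gives a first-pass estimate to set against a vendor quote.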

At state/federal scale, even the records-software vendor's AI add-on is **at least ~3× more expensive** than self-hosted ($32,000 vs. $5,000–10,000 per month), and ChatGPT Enterprise per-seat is **at least ~90× more expensive**, for the same drafted responses.

## Why the Per-Request Pricing Is Worse in Government Than Elsewhere

Per-request pricing makes sense for software that handles a transactional unit of work: DocuSign envelopes, payment processing, parcel deliveries. FOIA pricing borrows the pattern, but the underlying unit cost is a few cents per request; the $8 the vendor charges per request is value capture, not cost recovery.

There's a second problem specific to government: **the agency can't pass the cost through.** A bank prices customer-facing AI into product fees; a hospital prices it into billed services. An agency processing FOIA is doing statutorily mandated work funded by appropriations. Every dollar of vendor markup on a per-request fee is a dollar the agency can't spend on a records officer to handle the harder cases the AI can't.

## Why ATO and IL Boundaries Make Self-Hosting Non-Negotiable at Sensitivity

A managed AI vendor with FedRAMP-High authorization passes some procurement filters, but the agency still has to authorize the new system into its own boundary and redo the package every time the vendor changes the model, the region, or the underlying compute. For sensitive workloads (CUI, CJIS, IL4/IL5), the answer is usually one of:

1. **Keep all FOIA processing manual**: the default for many agencies, and the reason response times slip past statutory deadlines
2. **Use a managed AI vendor's gov-cloud variant**: works for some workloads, fails the boundary for others
3. **Self-host in the agency's existing boundary**: the runtime sits inside an environment that already has an authority to operate (ATO)

Option 3 is the only one that scales to CUI and CJIS workloads without a new authorization event for every model change.

## What Stays the Same, What Changes

Self-hosting FOIA-drafting AI doesn't mean rebuilding the agency's records-management tooling. The records-officer-facing chat UI, the request-tracking dashboards, the audit logs, model routing with fallbacks, the multi-agent orchestration, and the integration with the existing FOIA-tracking system and the document repository all stay managed by ibl.ai. The compute, the model, and the responsive documents move inside the agency's authorization boundary.

What disappears: the $24–32K/month per-request or per-officer bill at state/federal scale.

What appears: a self-hosted FOIA-drafting capability the agency owns, with a model-routing recipe designed by records counsel:

- **Opus** for litigation-adjacent or politically sensitive requests
- **Sonnet** for standard responsive-document drafts (the bulk)
- **Haiku** for request classification, routing, and simple acknowledgments
- **Llama 4 self-hosted** for the highest-volume routine routing inside a CUI / IL4 boundary

## Run the Numbers for Your Agency

For the segment-wide cost-math context (FOIA, case management, citizen services), see **[AI Cost Math for Government Agencies: Per-Seat vs Usage-Based in 2026](/blog/ai-cost-math-for-government-per-seat-vs-usage)**.

For the side-by-side deployment comparison, including FedRAMP / StateRAMP / CJIS / IL4-IL5 posture and air-gapped options, see **[Self-Hosted AI vs ChatGPT Enterprise for Government](/resources/comparisons/self-hosted-ai-vs-chatgpt-enterprise-for-government)**.

For the full NIST 800-53 aligned architecture (PIV/CAC identity, GovCloud → on-prem → air-gapped tiers, sovereignty benchmark vs gov-cloud AI assistants), read **[Government AI Reference Architecture on ibl.ai](/blog/government-ai-reference-architecture)**.

For the staged deployment recipe (FedRAMP GovCloud pilot → on-prem CUI → air-gapped IL4/IL5), see **[Government AI Blueprint: GovCloud Pilot to IL4/IL5](/blog/government-ai-blueprint-govcloud-to-il4-il5)**.

For the broader pricing landscape across every model and per-seat vendor, the hub: **[What Does AI Actually Cost in 2026?](/blog/what-does-ai-actually-cost-in-2026)**.

## Why Family-Owned and New York Matters Here

For U.S. federal, state, and local procurement, the structure of the AI vendor matters as much as the price, especially for workloads inside a CUI or IL boundary. ibl.ai is family-owned and operated from New York, NY: a U.S.-headquartered, domestically owned, long-term partner with a perpetual platform license and no investor exit pressure. The runtime is open source. The responsive documents stay inside the agency's authorization boundary. The math works at a 50-employee municipal records office or a 50,000-employee federal department.