How much does AI cost for banks?

At regional-bank scale (10,000 employees, 40,000 AML alerts/month), per-seat AI runs $300–600K/month (ChatGPT Enterprise $600K, Microsoft Copilot $300K, Glean $400K). Direct Claude Sonnet API for the same workload costs ~$816/month. ibl.ai self-hosted runs ~$5,000–15,000/month all-in, with transaction data inside the bank's VPC.

What is SR 11-7 compliant AI?

SR 11-7-compliant AI means the bank validates, governs, and monitors the AI model throughout its lifecycle. Under the guidance, validation covers conceptual soundness, ongoing monitoring, and outcomes analysis. Managed AI vendors that control model selection make this harder, because the bank can validate only inputs and outputs. Self-hosted AI lets the bank's MRM team inspect the full stack, pin model versions, and own the validation pack.

Why do banks need air-gapped AI?

Air-gapped deployment is the default for bank AI because of the combined reach of FINRA rules, SR 11-7, GLBA, and examiner subpoenas. The AML triage, KYC, advisor copilot, and trading-desk workloads all carry regulator-facing audit obligations that are hard to meet at scale once a third-party cloud has custody of the data. Running the models inside the bank's existing audit perimeter removes that vendor from the compliance chain.

What's the alternative to specialty AML AI vendors like Quantexa and NICE Actimize?

ibl.ai is the alternative — same AML alert triage workload (transaction-context summarization, sanctions-hit narration, disposition recommendation, cited reasoning) but the runtime executes inside the bank's VPC. ~20× cheaper at G-SIB scale than the specialty per-analyst vendors, with SR 11-7 governance running through the bank's existing MRM tooling.


AI Cost Math for Financial Services: Per-Seat vs Usage-Based in 2026

Miguel Amigot, May 30, 2026


What AI actually costs a regional bank in 2026 — token pricing for the latest models against the $300–600K/month ChatGPT Enterprise and Copilot bills, with KYC/AML workload math and SR 11-7 model risk on a stack you can audit.

The Regional Bank Math: $60 × 10,000 Employees Is Not the Right Number

A regional bank has 10,000 employees — relationship managers, compliance analysts, back-office operations, IT, branch staff. ChatGPT Enterprise at $60 per user per month is $600,000 per month — $7.2M per year. Microsoft 365 Copilot at $30 per user is $300,000 per month — $3.6M per year. Most of those seats touch AI a handful of times per week, if that.
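The per-seat arithmetic above is simple enough to check in a few lines. This is a minimal sketch using only the headcount and list prices quoted in this article; the function and variable names are illustrative.

```python
# Per-seat bill: seats x list price, regardless of how often a seat is used.
# Headcount and prices are the figures quoted in this article.
SEATS = 10_000

PER_SEAT_USD = {
    "ChatGPT Enterprise": 60,
    "Microsoft 365 Copilot": 30,
}


def per_seat_bill(seats: int, price_per_seat: int) -> tuple[int, int]:
    """Return (monthly, annual) cost in USD for a per-seat license."""
    monthly = seats * price_per_seat
    return monthly, monthly * 12


for product, price in PER_SEAT_USD.items():
    monthly, annual = per_seat_bill(SEATS, price)
    print(f"{product}: ${monthly:,}/month, ${annual:,}/year")
```

Note that usage never enters the formula, which is the problem the rest of this post works through.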

The per-seat model was built for productivity software where every desk needs occasional access. For AI doing real work — KYC document review, AML alert triage, advisor copilot, internal policy Q&A — the cost should scale with the work, not the org chart. And the data should stay in the bank's VPC, not a vendor's cloud where every quarter's DPA refresh is a compliance event.

The math is the post.

What the Latest Models Actually Cost in 2026

Token pricing across the major providers, approximate as of mid-2026 (always check provider docs for current rates):

Model	Provider	Input ($/MTok)	Output ($/MTok)	Best for
Claude Opus 4.7	Anthropic	$15	$75	Complex KYC narratives, advisor copilot
Claude Sonnet 4.6	Anthropic	$3	$15	AML alert triage, document classification
Claude Haiku 4.5	Anthropic	$1	$5	High-volume routing, transaction tagging
GPT-5	OpenAI	$10	$30	Sanctions-screening narration, internal Q&A
Gemini 3 Pro	Google	$3.50	$10.50	Long-context filings & disclosures
Llama 4 (70B, self-hosted)	Meta (open weights)	~$0	~$0	In-VPC bulk workloads, sensitive desks
DeepSeek-R1 (self-hosted)	DeepSeek (open weights)	~$0	~$0	Cost-sensitive batch reasoning

For self-hosted open-weight models there is no per-token fee, which is why the table shows ~$0; the marginal cost is GPU time. A reserved H100 instance ($1.50–3/hour) handles tens of thousands of bank workflows per day inside the bank's VPC.
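To put the hourly GPU rate on the same monthly footing as the other numbers in this post, here is a rough conversion. It assumes a single instance running around the clock and 730 hours per month (8,760 hours per year divided by 12); both are assumptions for illustration, not figures from a provider quote.

```python
# Convert an hourly GPU rate into an always-on monthly cost.
# Assumption: 730 hours per month (8,760 hours per year / 12).
HOURS_PER_MONTH = 730


def gpu_monthly_cost(hourly_rate_usd: float, instances: int = 1) -> float:
    """Monthly USD cost of running `instances` GPUs continuously."""
    return hourly_rate_usd * HOURS_PER_MONTH * instances


low = gpu_monthly_cost(1.50)   # bottom of the quoted $1.50-3/hour range
high = gpu_monthly_cost(3.00)  # top of the quoted range
print(f"One H100, always on: ${low:,.0f} to ${high:,.0f} per month")
```

Under those assumptions one instance lands at roughly $1,095 to $2,190 per month. Real sizing depends on throughput and redundancy requirements, so treat the instance count as a parameter to vary.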

A Real Workload: AML Alert Triage at a Regional Bank

AML alert triage is the highest-volume, highest-pain compliance AI use case in retail and commercial banking. A regional bank generates roughly 40,000 alerts per month. A typical alert is 800 input tokens (transaction context, customer history, sanctions hits) and 1,200 output tokens (narrative explaining the disposition with cited reasoning). For a deeper per-alert cost breakdown — including a side-by-side against Quantexa, NICE Actimize, Hawk AI, ComplyAdvantage, and Feedzai at three scale tiers (community / regional / G-SIB) — see What AI AML Alert Triage Actually Costs in 2026.

That's 32M input + 48M output tokens per month for the entire alert workload — concentrated on a few hundred compliance analysts, not spread across the bank's 10K headcount.
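The token volumes follow directly from the alert count and the per-alert sizes. A small sketch, with names chosen for illustration, makes it easy to substitute your own bank's numbers:

```python
# Monthly token volume for the AML alert workload described above.
ALERTS_PER_MONTH = 40_000
INPUT_TOKENS_PER_ALERT = 800     # transaction context, customer history, sanctions hits
OUTPUT_TOKENS_PER_ALERT = 1_200  # disposition narrative with cited reasoning


def monthly_tokens(alerts: int, tokens_in: int, tokens_out: int) -> tuple[int, int]:
    """Return (input_tokens, output_tokens) per month."""
    return alerts * tokens_in, alerts * tokens_out


tokens_in, tokens_out = monthly_tokens(
    ALERTS_PER_MONTH, INPUT_TOKENS_PER_ALERT, OUTPUT_TOKENS_PER_ALERT
)
print(f"{tokens_in / 1e6:.0f}M input + {tokens_out / 1e6:.0f}M output tokens/month")
```

The per-alert sizes are averages; a bank with longer customer histories or richer narratives would scale the two token constants accordingly.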

What it costs by deployment shape

Deployment	Pricing shape	Monthly cost	Annual	Data residency
ChatGPT Enterprise	Per-seat ($60/user × 10K)	$600,000	$7,200,000	OpenAI cloud (DPA)
Microsoft 365 Copilot	Per-seat ($30/user × 10K)	$300,000	$3,600,000	Microsoft cloud (DPA)
Glean	Per-seat (~$40/user × 10K)	$400,000	$4,800,000	Glean cloud (DPA)
Direct API — Claude Sonnet 4.6	Token-based	~$816	~$9,792	Anthropic cloud (bank DPA)
Direct API — GPT-5	Token-based	~$1,760	~$21,120	OpenAI cloud (bank DPA)
ibl.ai self-hosted (Llama 4 / DeepSeek-R1)	Flat license + GPU	~$5,000–15,000	~$60,000–180,000	Inside the bank's VPC / on-prem

The ibl.ai row covers the GPU instance, the platform license, and ongoing support. There is no third-party vendor in the data path, no managed-cloud DPA to renegotiate, and no question about whether the model provider could be examiner-subpoenaed for transaction records.
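The token-based rows in the table can be reproduced from the workload volume (32M input, 48M output tokens per month) and the per-million-token prices listed earlier. The sketch below is a plain calculation, not a call to any provider's API, and the prices are the approximate figures from this article.

```python
# Monthly API cost = input MTok x input price + output MTok x output price.
INPUT_MTOK = 32   # million input tokens per month (40,000 alerts x 800)
OUTPUT_MTOK = 48  # million output tokens per month (40,000 alerts x 1,200)

# (input $/MTok, output $/MTok), approximate prices from the table in this article
PRICES = {
    "Claude Sonnet 4.6": (3, 15),
    "GPT-5": (10, 30),
}


def monthly_api_cost(input_mtok: float, output_mtok: float,
                     in_price: float, out_price: float) -> float:
    """USD per month for a token-priced API at the given volumes."""
    return input_mtok * in_price + output_mtok * out_price


for model, (in_price, out_price) in PRICES.items():
    monthly = monthly_api_cost(INPUT_MTOK, OUTPUT_MTOK, in_price, out_price)
    print(f"{model}: ${monthly:,.0f}/month, ${monthly * 12:,.0f}/year")
```

This yields $816 and $1,760 per month, matching the Direct API rows. Output tokens dominate both bills, so narrative length is the lever that matters most.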

Why Per-Seat Pricing Fails Harder in Financial Services

Three structural reasons:

1. Usage is concentrated in compliance, risk, and front-office advisory. A retail-bank teller doesn't generate AML narratives; a compliance analyst does. Buying a seat for every employee means paying for the 9,500 who barely use AI in order to cover the 500 who depend on it. Token pricing, or a flat-rate platform, aligns the bill to the work.

2. SR 11-7 model risk applies to the whole stack, not just the model. The Federal Reserve's SR 11-7, issued jointly with the OCC as Bulletin 2011-12 and later adopted by the FDIC, requires validation, governance, and ongoing monitoring of any model affecting bank decisions. A managed AI vendor that controls the model selection, the training data, and the inference path is a sole-source dependency that risk committees have to underwrite as a single point of failure. A self-hosted, model-agnostic stack is easier to defend because it is inspectable and swappable.

3. Examiner subpoenas don't stop at the bank's perimeter. When the OCC, FINRA, or a state regulator asks for the full reasoning behind a flagged transaction, the bank produces it. When that reasoning lives inside a third-party AI vendor's cloud, the bank introduces a chain-of-custody question that doesn't exist when the model runs inside the bank's VPC.
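The first point can be made concrete by dividing the per-seat bill by the number of people who actually depend on the tool. This sketch uses the 10,000 seats and the roughly 500 heavy users mentioned in point 1; treating the other 9,500 as contributing no usage is a simplifying assumption.

```python
# Effective monthly cost per user who actually depends on AI,
# when every employee gets a seat.
def cost_per_active_user(seats: int, price_per_seat: float, active_users: int) -> float:
    """Total per-seat bill divided by the number of heavy users."""
    return seats * price_per_seat / active_users


for product, price in {"ChatGPT Enterprise": 60, "Microsoft 365 Copilot": 30}.items():
    effective = cost_per_active_user(10_000, price, 500)
    print(f"{product}: ${effective:,.0f} per active user per month")
```

On those assumptions a $60 seat works out to $1,200 per active user per month, and a $30 seat to $600.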

What Stays the Same, What Changes

Self-hosting the runtime doesn't mean rebuilding the bank's AI tooling. The chat UI, the agent dashboards, the audit logs, model routing with fallbacks, multi-agent orchestration, and the integrations with Bloomberg, Refinitiv, and FIS all stay managed by ibl.ai. The compute, the model, and the transaction data move inside the bank's VPC.

What disappears: the $3.6–7.2M/year per-seat line item. What appears: an internal AI capability the bank owns and audits, with the model-choice flexibility that model risk committees require. That means Opus for the high-stakes advisor copilot, Sonnet for the AML triage queue, and Llama 4 for the air-gapped trading-desk workload.
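One way to picture that per-workload model choice is a small routing table. The sketch below is hypothetical: the workload keys, model identifiers, fallback choices, and the `in_vpc_only` flag are all invented for illustration and do not describe ibl.ai's actual configuration format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical routing entry: a primary model plus ordered fallbacks."""
    primary: str
    fallbacks: tuple[str, ...] = ()
    in_vpc_only: bool = False  # True: never send this workload to a hosted API


# Illustrative identifiers only; the fallback choices are assumptions.
SELF_HOSTED = {"llama-4-self-hosted", "deepseek-r1-self-hosted"}

ROUTES = {
    "advisor_copilot": Route("claude-opus", ("claude-sonnet",)),
    "aml_triage": Route("claude-sonnet", ("llama-4-self-hosted",)),
    "trading_desk": Route("llama-4-self-hosted", in_vpc_only=True),
}


def candidates(workload: str) -> list[str]:
    """Ordered list of models to try for a workload, honoring the VPC constraint."""
    route = ROUTES[workload]
    models = [route.primary, *route.fallbacks]
    if route.in_vpc_only:
        models = [m for m in models if m in SELF_HOSTED]
    return models


print(candidates("aml_triage"))
print(candidates("trading_desk"))
```

Keeping the routing in a declarative table like this means a model swap is a one-line change that the MRM team can review and version alongside the validation pack.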

Run the Numbers for Your Bank

For workload sizing and cost modeling for your AML, KYC, and advisory teams, the AI Help Desk Cost Savings Calculator generalizes to most high-volume bank-administrative workloads.

For the deployment comparison side-by-side — including FINRA / SR 11-7 / GLBA posture and air-gapped options for trading and private-client desks — see Self-Hosted AI vs ChatGPT Enterprise for Financial Services.

For the full SEC / FINRA / SOX / PCI / SR 11-7 aligned architecture (Bloomberg / Refinitiv / FIS integration, model-output versioning, air-gapped tier), read Financial Services AI Reference Architecture on ibl.ai.

For the staged deployment recipe — Managed VPC for low-sensitivity workloads + air-gapped for trading and private-client desks — see Financial Services Blueprint: Air-Gapped AI in 90 Days.

The per-seat-vs-usage gap isn't unique to financial services — it breaks the same way in every regulated industry. For the cross-sector version of this math, see What Does AI Actually Cost in 2026?.

Why Family-Owned and New York Matters Here

The regulatory exposure of a bank's AI vendor relationship is non-trivial — every DPA refresh, every change in data-processing terms, every vendor acquisition is an event the bank's third-party-risk team has to underwrite. ibl.ai is family-owned and operated from New York, NY — a long-term partner with a perpetual platform license and no investor exit pressure. The runtime is open source. The transaction data stays inside the bank's network. The math works at a 500-employee community bank or a 50,000-employee regional.


