Customer Support Is Where the Per-Conversation Pricing Trap Really Bites
Of all the AI use cases this site covers, customer support is the most visibly competitive. Intercom Fin charges $0.99 per AI-resolved conversation. Ada, Drift, Ultimate.ai, Forethought, Cresta — every legacy customer-service vendor has a per-conversation or per-resolution AI add-on, with prices in the same neighborhood.
That neighborhood is roughly 40–1,600× the underlying token cost, depending on the vendor's price and the model running underneath. The vendor's pitch is alignment to value: you only pay when the AI resolves a ticket. The reality is that the price doesn't drop with volume, doesn't drop as the cheap models improve, and includes a healthy margin on top of the actual compute.
The actual compute is fractions of a cent. The math is the post.
What a Customer-Support Conversation Actually Costs Per Token
A typical customer-support exchange — customer asks a question, agent responds with order/account context, customer follows up — is about 600 input tokens (ticket + order context + knowledge-base hit) and 400 output tokens (drafted response). Cost per ticket on the major models:
| Model | Input ($/MTok) | Output ($/MTok) | $ per ticket | When to use it |
|---|---|---|---|---|
| Claude Sonnet 4.6 | $3 | $15 | $0.008 | Standard support workhorse |
| GPT-5 mini | ~$1.50 | ~$6 | $0.003 | General-purpose mid-volume |
| Claude Haiku 4.5 | $1 | $5 | $0.003 | High-volume FAQ / order status |
| Gemini 3 Flash | $0.35 | $1.05 | $0.0006 | Cheapest hosted (sub-cent) |
| Llama 4 (self-hosted on small VPS) | ~$0 | ~$0 | ~$0 | Flat-rate; whole-company workloads |
Standard workhorse: less than a penny per ticket. Cheapest hosted: six hundredths of a cent. Self-hosted on a small VPS: flat-rate, marginal cost rounds to zero.
For comparison, Intercom Fin charges $0.99 per AI-resolved conversation. That's roughly 125–1,600× the underlying token cost, depending on which model the vendor is actually running.
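The per-ticket figures can be reproduced with a few lines of arithmetic. The sketch below uses only the token counts and list prices quoted above (the GPT-5 mini prices are the approximate values from the table); the function and variable names are illustrative, not part of any vendor SDK.

```python
# Per-ticket token cost from the token counts and $/MTok prices quoted above.
INPUT_TOKENS = 600    # ticket + order context + knowledge-base hit
OUTPUT_TOKENS = 400   # drafted response

# (input $/MTok, output $/MTok) as listed in the table; GPT-5 mini is approximate.
PRICES = {
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5 mini": (1.50, 6.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Gemini 3 Flash": (0.35, 1.05),
}

FIN_PRICE = 0.99  # Intercom Fin, $ per AI-resolved conversation


def cost_per_ticket(input_price: float, output_price: float) -> float:
    """Dollar cost of one ticket at the given $/MTok prices."""
    return (INPUT_TOKENS * input_price + OUTPUT_TOKENS * output_price) / 1_000_000


for model, (input_price, output_price) in PRICES.items():
    cost = cost_per_ticket(input_price, output_price)
    multiple = FIN_PRICE / cost
    print(f"{model:<20} ${cost:.5f} per ticket; Fin costs {multiple:,.0f}x that")
```

The table rounds these values, so the unrounded output differs slightly in the last digit. Changing `INPUT_TOKENS` and `OUTPUT_TOKENS` to match a real ticket mix is the quickest way to adapt the estimate.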
Monthly Bills at Three Scale Tiers
- Small business (20 employees, e-commerce): ~5,000 tickets/month
- Mid-market SaaS (200 employees): ~50,000 tickets/month
- Enterprise / consumer brand: ~500,000 tickets/month
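Monthly cost using direct API access or self-hosting vs the per-conversation alternatives. Claude Haiku 4.5 (the standard cheap-and-fast model for high-volume customer support) is the baseline for the token-based rows, which use the rounded per-ticket costs above. The per-resolution rows assume every ticket is billed as an AI resolution, which is the upper bound on that bill: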
| Approach | Pricing shape | SMB (5K/mo) | Mid-market (50K/mo) | Enterprise (500K/mo) |
|---|---|---|---|---|
| Intercom Fin | $0.99 per AI-resolved conversation | $4,950 | $49,500 | $495,000 |
| Ada / Ultimate.ai / Forethought | ~$0.30–0.80 per resolution | ~$1,500–4,000 | ~$15K–40K | ~$150K–400K |
| Per-agent seat (legacy support SaaS) | ~$50/agent/mo × team | ~$500 | ~$5,000 | ~$30,000 |
| Direct API — Claude Haiku 4.5 | Token-based | ~$15 | ~$150 | ~$1,500 |
| Direct API — Gemini 3 Flash | Token-based (cheapest) | ~$3 | ~$30 | ~$300 |
| ibl.ai self-hosted (Llama 4) | Flat license + VPS/GPU | ~$100–250 | ~$1,000–2,500 | ~$5,000–10,000 |
At enterprise scale, Intercom Fin is ~50–100× more expensive than self-hosted on ibl.ai, ~330× more expensive than calling Claude Haiku 4.5 directly, and ~1,650× more expensive than the cheapest hosted model (Gemini 3 Flash), all for the same conversations resolved.
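The monthly bills are ticket volume times a per-ticket price, so the table is easy to recompute for other volumes. This is a minimal sketch using the rounded per-ticket costs from earlier in the post and the self-hosted enterprise range from the table; it assumes every ticket is billed as an AI resolution, and the helper names are illustrative.

```python
# Monthly bill = tickets per month x price per ticket.
TICKETS_PER_MONTH = {"SMB": 5_000, "Mid-market": 50_000, "Enterprise": 500_000}

# $ per ticket; the token-based figures are the rounded per-ticket costs.
PRICE_PER_TICKET = {
    "Intercom Fin": 0.99,
    "Claude Haiku 4.5 (direct API)": 0.003,
    "Gemini 3 Flash (direct API)": 0.0006,
}


def monthly_bill(tickets: int, price_per_ticket: float) -> float:
    """Monthly cost, assuming every ticket is billed at the per-ticket price."""
    return tickets * price_per_ticket


def cost_multiple(expensive: float, cheap: float) -> float:
    """How many times larger one bill is than another."""
    return expensive / cheap


for approach, price in PRICE_PER_TICKET.items():
    bills = ", ".join(
        f"{tier}: ${monthly_bill(tickets, price):,.0f}"
        for tier, tickets in TICKETS_PER_MONTH.items()
    )
    print(f"{approach:<32} {bills}")

# Enterprise-scale multiples relative to Intercom Fin.
enterprise = TICKETS_PER_MONTH["Enterprise"]
fin_bill = monthly_bill(enterprise, PRICE_PER_TICKET["Intercom Fin"])
for approach in ("Claude Haiku 4.5 (direct API)", "Gemini 3 Flash (direct API)"):
    other = monthly_bill(enterprise, PRICE_PER_TICKET[approach])
    print(f"Fin vs {approach}: {cost_multiple(fin_bill, other):,.0f}x")

# Self-hosted enterprise range taken from the table ($5,000-10,000 per month).
for self_hosted in (5_000, 10_000):
    print(f"Fin vs self-hosted at ${self_hosted:,}: "
          f"{cost_multiple(fin_bill, self_hosted):,.1f}x")
```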
Why the Per-Conversation Pricing Trap Is Different Here
Customer-support AI vendors have a specific argument for per-conversation pricing: "we only charge when we deliver value." It sounds aligned. It's not. Three reasons:
1. The marginal cost is fractions of a cent. A vendor charging $0.99 per resolution and running Haiku 4.5 underneath has a 99%+ margin on the actual model spend. The $0.99 is value capture, not cost recovery.
2. The unit price doesn't drop with scale. A 500K-ticket/month enterprise pays the same $0.99/ticket as a 5K-ticket SMB. With self-hosted infrastructure, the per-ticket cost falls as volume grows: in the table above, from roughly $0.02–0.05 at 5K tickets/month to roughly $0.01–0.02 at 500K.
3. "Resolution" is the vendor's definition. What counts as an AI-resolved ticket (vs handed off to a human, vs escalated, vs re-opened) is the vendor's call — and changes the bill. Token-based or self-hosted has no such ambiguity.
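Two of these points are easy to quantify. The sketch below computes the margin on model spend for a $0.99 resolution served by Haiku 4.5, and shows how a per-resolution bill moves with the share of tickets the vendor counts as resolved. The resolution rates are hypothetical placeholders, not vendor data, and the margin ignores the vendor's other costs (hosting, support, integrations).

```python
FIN_PRICE = 0.99     # $ per AI-resolved conversation (from the text)
HAIKU_COST = 0.0026  # $ per ticket: 600 in / 400 out tokens at $1 / $5 per MTok


def margin_on_model_spend(price: float, model_cost: float) -> float:
    """Fraction of the price left after paying for the model call only."""
    return 1 - model_cost / price


def per_resolution_bill(tickets: int, resolution_rate: float,
                        price: float = FIN_PRICE) -> float:
    """Monthly bill when only `resolution_rate` of tickets count as resolved."""
    return tickets * resolution_rate * price


print(f"Margin on model spend: {margin_on_model_spend(FIN_PRICE, HAIKU_COST):.1%}")

# Hypothetical resolution rates: the same 50,000 tickets, three different bills.
for rate in (0.40, 0.50, 0.60):
    bill = per_resolution_bill(50_000, rate)
    print(f"resolution rate {rate:.0%}: ${bill:,.0f}/month")
```

Each ten-point shift in what counts as "resolved" moves the mid-market bill by about $4,950/month in this sketch, which is why the definition matters.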
What Stays the Same, What Changes
Self-hosting customer-support AI doesn't mean rebuilding the support stack. The customer-facing chat widget, the agent dashboards, the ticket routing, the integrations with the existing CRM (Salesforce, HubSpot, Zendesk, Front), the order system (Shopify, WooCommerce, Stripe), and the audit logs all stay managed by ibl.ai. The compute, the model, and the customer conversation data move inside the company's infrastructure (which for SMB is often a $20–50/month VPS).
What disappears: the per-conversation bill that scales linearly with ticket volume, which is a line item of nearly $500K/month at enterprise scale.
What appears: a self-hosted customer-support AI with a model-routing recipe the support team designed:
- Sonnet for complex multi-turn cases requiring reasoning across several systems
- Haiku for standard support (the bulk)
- Gemini Flash for high-volume order-status / FAQ / routing
- Llama 4 self-hosted for the bulk of routine queries on flat-rate infrastructure
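A routing recipe like the one above can be expressed as a lookup from ticket category to model tier, plus a blended per-ticket cost for a given traffic mix. This is a sketch only: the category names, the fallback tier, and the example mix are hypothetical, and the per-ticket costs are the unrounded token costs for 600 input and 400 output tokens, with self-hosted Llama treated as zero marginal cost.

```python
from enum import Enum


class Tier(Enum):
    SONNET = "Claude Sonnet 4.6"
    HAIKU = "Claude Haiku 4.5"
    GEMINI_FLASH = "Gemini 3 Flash"
    LLAMA_SELF_HOSTED = "Llama 4 (self-hosted)"


# Hypothetical ticket categories mapped to the tiers in the recipe above.
ROUTES = {
    "complex_multi_system": Tier.SONNET,
    "standard_support": Tier.HAIKU,
    "order_status": Tier.GEMINI_FLASH,
    "faq": Tier.GEMINI_FLASH,
    "routing": Tier.GEMINI_FLASH,
    "routine": Tier.LLAMA_SELF_HOSTED,
}

# $ per ticket at 600 input / 400 output tokens; self-hosted is ~0 marginal.
COST_PER_TICKET = {
    Tier.SONNET: 0.0078,
    Tier.HAIKU: 0.0026,
    Tier.GEMINI_FLASH: 0.00063,
    Tier.LLAMA_SELF_HOSTED: 0.0,
}


def pick_model(category: str) -> Tier:
    """Return the tier for a category; unknown categories fall back to Haiku
    (an assumption made for this sketch)."""
    return ROUTES.get(category, Tier.HAIKU)


def blended_cost(mix: dict[Tier, float]) -> float:
    """Average $ per ticket for a traffic mix whose shares sum to 1."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(share * COST_PER_TICKET[tier] for tier, share in mix.items())


# Hypothetical traffic mix, for illustration only.
example_mix = {
    Tier.SONNET: 0.05,
    Tier.HAIKU: 0.25,
    Tier.GEMINI_FLASH: 0.30,
    Tier.LLAMA_SELF_HOSTED: 0.40,
}
print(pick_model("order_status").value)
print(f"Blended cost: ${blended_cost(example_mix):.5f} per ticket")
```

The blended figure covers marginal token spend only; the flat VPS or GPU cost of the self-hosted tier is billed separately.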
Run the Numbers for Your Business
For SMB-specific cost modeling, two interactive tools:
- AI Cost Calculator — Small Business — compare flat-rate vs per-seat / per-conversation subscriptions
- AI Readiness Assessment — Small Business — 5-question quiz to score deployment readiness
For the segment-wide cost-math context, see AI Cost Math for Small Business: Per-Seat vs Usage-Based in 2026.
For higher-volume / mid-market / enterprise contexts, the AI Help Desk Cost Savings Calculator generalizes to customer-support workloads.
For the broader pricing landscape across every model and per-seat vendor, the hub: What Does AI Actually Cost in 2026?.
Why Family-Owned and New York Matters Here
For a business of any size, the customer-support AI vendor relationship is a multi-year commitment that affects every conversation customers have with the brand. ibl.ai is family-owned and operated from New York, NY — a long-term partner with a perpetual platform license and no investor exit pressure. The runtime is open source. The customer conversation data stays inside the business's infrastructure. The math works at a 5-person startup or a 50,000-employee multi-brand enterprise.