The Custom Silicon Race Signals Enterprise AI's Next Phase

Mikel Amigot, June 30, 2026

Enterprise AI spending has shifted from training to inference. Custom silicon startups are racing to capture this market — and the implications for enterprise AI strategy are profound.

The Short Answer

The AI industry's center of gravity is shifting from training to inference: from "how big is the model" to "how cheaply can we run it." Custom inference silicon, like Etched's Sohu ASIC (Etched raised $800M and signed more than $1B in customer contracts before shipping a chip), is purpose-built to serve transformer models far more cheaply than general-purpose GPUs.

For enterprises, the lesson is that inference cost is about to fall sharply — which makes owning a self-hosted AI stack more compelling than renting per-seat access to someone else's models.

The enterprises positioned to capture that gain own a model-agnostic platform they can move onto whatever silicon wins. ibl.ai is that stack: full source-code ownership, any-LLM routing, and deploy-anywhere — so a hardware shift becomes your cost advantage, not your vendor's.

The Training Era Is Over. The Inference Era Has Arrived.

For years, the AI industry measured progress by one metric: how big can we make the model?

That question has quietly become irrelevant for most enterprises.

The new question is simpler and more urgent: how cheaply and quickly can we run the model we already have?

$800M Says Inference Is the Bottleneck

Etched, a startup founded by two Harvard dropouts, just emerged from stealth with numbers that tell the whole story.

They raised $800 million and secured over $1 billion in customer contracts — all before shipping their first chip.

Their product is Sohu, a transformer-specific ASIC designed entirely in the post-ChatGPT era.

Unlike NVIDIA's GPUs, which handle everything from gaming to scientific computing, Sohu does exactly one thing: serve transformer-based AI models as fast and cheaply as possible.

The team behind it — 400+ engineers recruited from NVIDIA, Google TPU, Broadcom, SK Hynix, and TSMC — represents one of the largest talent concentrations in custom AI silicon outside the hyperscalers.

Why This Matters for Enterprise AI

The shift from training to inference isn't just a technical footnote.

It changes the economics of every enterprise AI deployment.

Training a frontier model is a one-time cost measured in hundreds of millions of dollars.

Serving that model to millions of users is an ongoing cost that compounds daily.

Industry data shows 71% of enterprise AI spending now goes to inference — running models in production, not building new ones.
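The one-time versus ongoing distinction can be made concrete with a little arithmetic. The sketch below is illustrative only: the `CostModel` class and every figure in it (training cost, traffic, price per thousand requests) are hypothetical assumptions, not numbers from any vendor or from the data cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Toy model of one-time training cost vs. recurring serving cost.

    All inputs are hypothetical placeholders chosen for round numbers.
    """
    training_cost_usd: float         # paid once
    daily_requests: int              # production traffic per day
    usd_per_thousand_requests: float # blended serving price

    @property
    def daily_inference_usd(self) -> float:
        return self.daily_requests / 1000 * self.usd_per_thousand_requests

    def breakeven_days(self) -> float:
        """Days until cumulative inference spend equals the training cost."""
        return self.training_cost_usd / self.daily_inference_usd

    def inference_share(self, days: int) -> float:
        """Fraction of cumulative spend that is inference after `days` days."""
        inference = self.daily_inference_usd * days
        return inference / (self.training_cost_usd + inference)


model = CostModel(
    training_cost_usd=100_000_000,   # assumed, not a real figure
    daily_requests=50_000_000,       # assumed
    usd_per_thousand_requests=4.0,   # assumed
)

print(f"Daily inference spend: ${model.daily_inference_usd:,.0f}")       # $200,000
print(f"Break-even after {model.breakeven_days():.0f} days")             # 500 days
print(f"Inference share after 1,500 days: {model.inference_share(1500):.0%}")  # 75%
```

Under these assumed inputs, serving overtakes training as the larger cumulative cost after 500 days and keeps growing from there, which is the dynamic the paragraph above describes.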

This is why the silicon war has shifted.

NVIDIA dominates training with its GPU ecosystem.

But for inference, the market is fragmenting.

OpenAI designed Jalapeño with Broadcom, pricing it at roughly half the cost of a comparable GPU for inference workloads.

Google continues to iterate on its TPU line.

Amazon built Inferentia and Trainium for its own cloud customers.

And now Etched enters with a transformer-only architecture that trades generality for raw serving performance.

The Enterprise Decision Framework

For enterprise leaders, this fragmentation creates both opportunity and risk.

The opportunity is clear: more competition in inference silicon means falling costs.

Organizations that architect their AI infrastructure to be hardware-agnostic will benefit from every new entrant.

The risk is equally clear: betting on a single silicon vendor creates the same lock-in problem that plagued the enterprise software era.

The organizations best positioned for this transition share three characteristics.

First, they use model-agnostic platforms that can route inference to whatever hardware offers the best price-performance at any given moment.

Second, they own their inference infrastructure rather than renting it, converting ongoing costs into capitalizable assets.

Third, they deploy on their own infrastructure — whether cloud, on-premise, or air-gapped — maintaining control over where models run and how data flows.
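The first characteristic, routing inference to whichever hardware currently offers the best price-performance, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any particular platform: the `Backend` type, the `pick_backend` function, the pool names, and all prices and latencies are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Backend:
    """One pool of inference hardware. All values are illustrative."""
    name: str                      # hypothetical label, not a real product
    usd_per_million_tokens: float  # assumed serving price
    p95_latency_ms: float          # assumed 95th-percentile latency
    healthy: bool = True


def pick_backend(backends: Iterable[Backend], max_latency_ms: float) -> Optional[Backend]:
    """Return the cheapest healthy backend that meets the latency budget.

    Returns None when no backend qualifies, so the caller can queue or fail over.
    """
    eligible = [
        b for b in backends
        if b.healthy and b.p95_latency_ms <= max_latency_ms
    ]
    return min(eligible, key=lambda b: b.usd_per_million_tokens, default=None)


fleet = [
    Backend("pool-a", usd_per_million_tokens=2.00, p95_latency_ms=120.0),
    Backend("pool-b", usd_per_million_tokens=0.90, p95_latency_ms=150.0),
    Backend("pool-c", usd_per_million_tokens=1.40, p95_latency_ms=90.0),
]

choice = pick_backend(fleet, max_latency_ms=130.0)
print(choice.name if choice else "no eligible backend")  # pool-c
```

Because the selection depends only on published price and measured latency, adding a new hardware pool is a one-line change to `fleet`. A real router would also need to account for model compatibility, capacity, and data-residency rules, which this sketch omits.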

What Comes Next

The custom silicon race is still in its early innings.

Etched's $1 billion in contracts before shipping suggests enterprise demand for inference optimization is massive and largely unmet.

But hardware alone won't solve the enterprise inference problem.

The software layer — model routing, load balancing, cost optimization, and governance — is equally critical.

Organizations that invest in inference-aware architectures today will have a structural cost advantage as these new chips come to market.

The training era created the models.

The inference era determines who actually uses them at scale.

That's where enterprise value gets created.
