ibl.ai Agentic AI Blog

The Custom Silicon Race Signals Enterprise AI's Next Phase

Mikel Amigot, June 30, 2026

Enterprise AI spending has shifted from training to inference. Custom silicon startups are racing to capture this market, and the implications for enterprise AI strategy are profound.

The Short Answer

The AI industry's center of gravity is shifting from training to inference: from "how big is the model" to "how cheaply can we run it." Custom inference silicon, like Etched's Sohu ASIC, is purpose-built to serve transformer models far more cheaply than general-purpose GPUs. Etched raised $800M and secured more than $1B in contracts before shipping a chip.

For enterprises, the lesson is that inference cost is about to fall sharply, which makes owning a self-hosted AI stack more compelling than renting per-seat access to someone else's models.

The enterprises positioned to capture that gain own a model-agnostic platform they can move onto whatever silicon wins. ibl.ai is that stack: it offers full source-code ownership, any-LLM routing, and deploy-anywhere flexibility, so a hardware shift becomes your cost advantage, not your vendor's.

The Training Era Is Over. The Inference Era Has Arrived.

For years, the AI industry measured progress by one metric: how big can we make the model?

That question has quietly become irrelevant for most enterprises.

The new question is simpler and more urgent: how cheaply and quickly can we run the model we already have?

$800M Says Inference Is the Bottleneck

Etched, a startup founded by two Harvard dropouts, just emerged from stealth with numbers that tell the whole story.

They raised $800 million and secured over $1 billion in customer contracts, all before shipping their first chip.

Their product is Sohu, a transformer-specific ASIC designed entirely in the post-ChatGPT era.

Unlike NVIDIA's GPUs, which handle everything from gaming to scientific computing, Sohu does exactly one thing: serve transformer-based AI models as fast and cheaply as possible.

The team behind it, 400+ engineers recruited from NVIDIA, Google TPU, Broadcom, SK Hynix, and TSMC, represents one of the largest talent concentrations in custom AI silicon outside the hyperscalers.

Why This Matters for Enterprise AI

The shift from training to inference isn't just a technical footnote.

It changes the economics of every enterprise AI deployment.

Training a frontier model is a one-time cost measured in hundreds of millions of dollars.

Serving that model to millions of users is an ongoing cost that compounds daily.
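To make the one-time versus ongoing distinction concrete, here is a minimal sketch that finds the first day on which cumulative serving spend overtakes a fixed training cost. Every figure in it is a hypothetical placeholder chosen for illustration, not data from this article or from any vendor, and the function name `days_until_inference_exceeds_training` is invented for the example. It also assumes a flat daily serving cost, which understates the effect when usage grows.

```python
import math


def days_until_inference_exceeds_training(
    training_cost_usd: float,
    daily_inference_cost_usd: float,
) -> int:
    """Return the first day on which cumulative inference spend exceeds training cost.

    Assumes a flat daily inference cost, which is a simplification:
    real serving spend usually changes with usage.
    """
    if training_cost_usd < 0:
        raise ValueError("training_cost_usd must be non-negative")
    if daily_inference_cost_usd <= 0:
        raise ValueError("daily_inference_cost_usd must be positive")
    # Smallest whole number of days d such that d * daily cost > training cost.
    return math.floor(training_cost_usd / daily_inference_cost_usd) + 1


# Hypothetical inputs: a $100M training run and $500k per day of serving spend.
crossover_day = days_until_inference_exceeds_training(100_000_000, 500_000)
print(crossover_day)
```

Under those assumed inputs, serving spend passes the training bill in well under a year, and it keeps accumulating afterward. That is why a per-token saving on inference matters more over time than a one-off saving on training.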

Industry data shows 71% of enterprise AI spending now goes to inference: running models in production, not building new ones.

This is why the silicon war has shifted.

NVIDIA dominates training with its GPU ecosystem.

But for inference, the market is fragmenting.

OpenAI designed Jalapeño with Broadcom, pricing it at roughly half a comparable GPU for inference workloads.

Google continues to iterate on its TPU line.

Amazon built Inferentia and Trainium for its own cloud customers.

And now Etched enters with a transformer-only architecture that trades generality for raw serving performance.

The Enterprise Decision Framework

For enterprise leaders, this fragmentation creates both opportunity and risk.

The opportunity is clear: more competition in inference silicon means falling costs.

Organizations that architect their AI infrastructure to be hardware-agnostic will benefit from every new entrant.

The risk is equally clear: betting on a single silicon vendor creates the same lock-in problem that plagued the enterprise software era.

The organizations best positioned for this transition share three characteristics.

First, they use model-agnostic platforms that can route inference to whatever hardware offers the best price-performance at any given moment.

Second, they own their inference infrastructure rather than renting it, converting ongoing costs into capitalizable assets.

Third, they deploy on their own infrastructure (whether cloud, on-premise, or air-gapped), maintaining control over where models run and how data flows.
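The first characteristic, routing inference to whatever hardware offers the best price-performance, can be sketched in a few lines. This is an illustration under stated assumptions, not ibl.ai's implementation: the `Backend` type, the `pick_backend` function, the pool names, and all prices and latencies are hypothetical. The policy shown is the simplest reasonable one: choose the cheapest healthy backend that meets a latency budget.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Backend:
    """One place a model can be served. All fields are illustrative."""
    name: str
    usd_per_million_tokens: float
    p95_latency_ms: float
    healthy: bool = True


def pick_backend(backends: list[Backend], max_latency_ms: float) -> Optional[Backend]:
    """Return the cheapest healthy backend that meets the latency budget.

    Returns None when no backend qualifies, so the caller can decide
    whether to queue, degrade, or fail the request.
    """
    eligible = [
        b for b in backends
        if b.healthy and b.p95_latency_ms <= max_latency_ms
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda b: b.usd_per_million_tokens)


# Hypothetical fleet: names and prices are placeholders, not benchmarks.
fleet = [
    Backend("gpu-pool", usd_per_million_tokens=2.00, p95_latency_ms=400),
    Backend("asic-pool", usd_per_million_tokens=0.80, p95_latency_ms=250),
    Backend("tpu-pool", usd_per_million_tokens=1.20, p95_latency_ms=300, healthy=False),
]

choice = pick_backend(fleet, max_latency_ms=500)
print(choice.name if choice else "no eligible backend")
```

Because the application only ever calls `pick_backend`, adding a new class of chip means adding one entry to the fleet list rather than rewriting callers. That is the practical meaning of hardware-agnostic in this context.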

What Comes Next

The custom silicon race is still in its early innings.

Etched's $1 billion in contracts before shipping suggests enterprise demand for inference optimization is massive and largely unmet.

But hardware alone won't solve the enterprise inference problem.

The software layer (model routing, load balancing, cost optimization, and governance) is equally critical.

Organizations that invest in inference-aware architectures today will have a structural cost advantage as these new chips come to market.

The training era created the models.

The inference era determines who actually uses them at scale.

That's where enterprise value gets created.
