Smart Model Routing: One Way to Prevent Vendor Lock-In?

Smart Model Routing, implemented in my Dynamic Model Router (a Pi extension, details below), is more than just a technical solution. The idea for Smart Model Routing had been brewing for a while, especially as I grappled with vendor lock-in and token costs. While finishing the dynamic routing extension for Pi, I noticed that many readers had found my last blog post through Mastodon. One user’s take on it got me thinking about how my experiment with smart model routing fit into the bigger picture:

“GSD makes MistralAI usable.
I have access to the full GSD feature set with a non-Claude model, and I haven’t noticed any differences.”
Mistral AI, pi.dev, GSD: A Real Alternative

This lines up with what Satya Nadella recently posted on X in A frontier without an ecosystem is not stable:

“This means the real opportunity is not in picking the best model but instead in building a learning loop on top of models where human capital and token capital compound.”
x.com/satyanadella/status/2066

At the time, I hadn’t read the article yet, and since I no longer use X, accessing it was tricky. For those unfamiliar with Nadella’s article, here it is in full:

Satya Nadella – A frontier without an ecosystem is not stable

(Source: x.com/satyanadella/status/2066, May 2026)

A frontier without an ecosystem is not stable.

I’ve been thinking a lot about the future of the firm in an AI-driven economy.

This transition is different than any previous platform shift. In the past, we used digital systems to enhance human capital. This is the first time we can create a real cognitive loop between people and digital systems. That is a mind-bender, because it changes how we even conceptualize work inside an enterprise.

What is at stake is not some digital tool or system and its use, but how organizations continue to learn, build IP, differentiate, and thrive in a world where AI models can continuously absorb the expertise of humans and organizations and commoditize it.

Every company is going to have to build what I think of as human capital and token capital. Human capital comprises the knowledge, judgment, relationships, ingenuity, and pattern recognition of its people, while token capital is the firm’s AI capability it builds and owns.

Importantly, human capital does not become less valuable as token capital grows. It only becomes more valuable! I believe human agency will be the driver of token capital growth. Humans will set ambitious goals, connect dots across domains, build relationships, and recognize patterns that matter most. Without human direction, you have compute running in circles.

This means the real opportunity is not in picking the best model but instead in building a learning loop on top of models where human capital and token capital compound. You can offload a task, or even a job, but you can never offload your learning. The future of the firm is the ability to compound that learning across people and AI.

This requires a new architectural approach where every business is able to build agentic systems that improve over time, while still retaining control over their IP. A company should be able to switch out a generalist model without losing the company veteran expertise built into their learning system. This is the key test of your control and sovereignty in the era ahead.

Companies need to turn their workflows, domain knowledge, and accumulated judgment into AI systems that improve with each use. Private evals should capture whether a model is actually improving against outcomes that matter to the business (not just external benchmarks!). Private reinforcement learning environments should let models grow stronger on real traces from inside the organization. Its knowledge base makes institutional memory queryable and use of tokens more efficient.

This loop becomes the new IP of the firm. I think of it as a hill climbing machine. And unlike most assets, it compounds. Every improved workflow generates better training signal, which accelerates the accumulation of tacit knowledge unique to the firm. The companies that build this early will have an advantage that is hard to replicate, regardless of any new individual model capability.

The last thing any of us want is a world where every company across every sector is ceding value to a few models that eat everything they see. If all the value is accrued by only a few models, the political economy will simply not tolerate it. There is no societal permission for an AI future that hollows out entire industries.

Think about what happened in the first phase of globalization where entire industrial economies were hollowed out by outsourcing. The GDP numbers looked fine on the surface, but the displacement was real and the consequences are still being felt. Let us not bring that dynamic into the AI era, with a small number of AI systems capturing all the economic returns, while entire industries find their knowledge commoditized right out from underneath them.

In my view, our priority has to be building a frontier ecosystem, not just a frontier model, so value flows broadly across every company, every industry, and every country. One where every organization can own the learning loop that encodes its institutional knowledge, compounding its human and token capital.

This is the ethos I’ve grown up with where platforms enable more value on top than is captured inside, and where every company can continuously innovate and build value of its own.

When that happens, companies will create value for themselves and for the economy around them. Employees will see their expertise amplified and their judgment become part of systems that make it replicable and scalable and the benefits accrue to the companies and communities around them.

That is how companies drive value for themselves and the broader economy. And it is the stable equilibrium we should build together.

Core Thesis

In the AI era, the decisive factor is not the best model but the ability to build a stable ecosystem that connects and multiplies Human Capital (human knowledge, judgment, relationships) and Token Capital (a company’s own AI capabilities).

Nadella defines Token Capital as a company’s own AI capability: a strategic asset, not the API bill at the end of the month. But that’s exactly where the practical question arises: how do I build the ecosystem Nadella describes without burning through tokens, and money, in the process?

His key takeaway puts it succinctly:

The future of the firm is the ability to compound learning across people and AI. You can offload a task, but you can never offload your learning.

I find it interesting, and somewhat flattering, that my last blog post was linked to these ideas.

Recently, I’ve been thinking about how to save tokens. One thing quickly becomes clear: with frontier models, you can burn through a lot of tokens, and thus a lot of money, very quickly.

So, how do I build an ecosystem that serves me in the best and most efficient way possible? How do I ensure that a request doesn’t always use the largest possible model but the best-fitting one? For example, if I want to access a file and generate a simple summary, I don’t need the largest and most performant model. But if I then want to use what I’ve learned to design an architecture, I need a better model, even the best one available.

The Dynamic Model Router

The answer to my question came in the form of a Pi extension, the Dynamic Model Router, built on the foundation of a-canary/pi-model-router. That base extension already handles basic routing based on model quality, availability, and price. What was missing for me was the automatic decision of which model group is the right one for a given request, and that’s exactly what I added in my fork.

Pi supports the concept of Model Groups, where you can tell the system which models are responsible for which tasks. The base extension’s router already makes intelligent choices:

Free models over paid ones,
Cheaper over more expensive,
Automatic retry and key rotation if a model doesn’t respond.
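The preference rules above can be sketched in a few lines. This is only an illustration of the idea, not the base extension's actual code: the `Model` type, the price field, and the `send` callable are hypothetical names introduced here, and key rotation is left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Model:
    name: str              # hypothetical identifier
    price_per_mtok: float  # assumed price per million tokens; 0.0 means free


def order_candidates(models: Sequence[Model]) -> list[Model]:
    """Free models first, then cheapest first; ties keep their configured order."""
    return sorted(models, key=lambda m: (m.price_per_mtok > 0, m.price_per_mtok))


def call_with_fallback(
    models: Sequence[Model],
    prompt: str,
    send: Callable[[Model, str], str],
) -> tuple[str, str]:
    """Try each candidate in preference order until one responds."""
    errors = []
    for model in order_candidates(models):
        try:
            return model.name, send(model, prompt)
        except Exception as exc:  # a real router would catch narrower errors
            errors.append(f"{model.name}: {exc}")
    raise RuntimeError("no model responded: " + "; ".join(errors))


def flaky_send(model: Model, prompt: str) -> str:
    """Stand-in transport for the demo: the free model times out, others answer."""
    if model.price_per_mtok == 0:
        raise TimeoutError("no response")
    return f"{model.name} answered"


models = [Model("paid-big", 15.0), Model("free-local", 0.0), Model("paid-small", 0.5)]
print(call_with_fallback(models, "Summarize this file", flaky_send))
```

In this demo the free model is tried first, fails, and the request falls through to the cheapest paid model.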

But what was still missing was content-based analysis of the request itself.

That’s where my fork comes in. Before a request is sent to a model, a small local model (served through Ollama, either gemma4:12b-mlx or gemma2:2b, so it runs on my own machine and costs nothing) classifies the prompt into one of six categories:

Simple code changes,
Complex code,
Architecture design,
Planning,
Exploration,
Fallback (if unclear).
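A classification step like this could look roughly as follows. It is a sketch under assumptions, not the fork's real code: the snake_case labels, the prompt wording, and the `ask_local_model` callable (whatever function actually talks to the local model) are all hypothetical. The one behavior taken from the list above is that anything unclear ends up in the fallback category.

```python
from typing import Callable

# The six categories from the list above; the snake_case labels are my own naming.
CATEGORIES = (
    "simple_code_change",
    "complex_code",
    "architecture_design",
    "planning",
    "exploration",
    "fallback",
)


def build_classifier_prompt(user_prompt: str) -> str:
    """Assumed prompt wording; the real classifier prompt may look different."""
    labels = ", ".join(CATEGORIES)
    return (
        "Classify the request into exactly one of these labels: "
        f"{labels}.\nAnswer with the label only.\n\nRequest:\n{user_prompt}"
    )


def classify(user_prompt: str, ask_local_model: Callable[[str], str]) -> str:
    """Return one of CATEGORIES; anything unparseable becomes 'fallback'."""
    try:
        reply = ask_local_model(build_classifier_prompt(user_prompt))
    except Exception:
        return "fallback"  # classifier unavailable: do not block the request
    # Normalize small formatting differences such as spaces, quotes, or a final dot.
    label = reply.strip().lower().replace(" ", "_").strip("`'\".")
    return label if label in CATEGORIES else "fallback"
```

Passing the model call in as a function keeps the sketch independent of any particular runtime and makes the parsing logic easy to test with a fake model.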

Based on this classification, the request is routed to the appropriate Model Group:

Operational – for simple tasks, using local or low-cost models,
Tactical – for medium complexity,
Strategic – for architecture decisions that require the best available model,
Scout – for open research tasks.
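The step from category to Model Group is essentially a lookup table. The sketch below shows the shape of such a table. The label strings are hypothetical names for the six categories, and only three rows reflect my reading of the descriptions above; the remaining rows are guesses and may not match the fork's configuration.

```python
from enum import Enum


class Group(Enum):
    OPERATIONAL = "operational"  # simple tasks, local or low-cost models
    TACTICAL = "tactical"        # medium complexity
    STRATEGIC = "strategic"      # architecture decisions, best available model
    SCOUT = "scout"              # open research tasks


# ASSUMPTION: rows marked "reading of the post" follow the group descriptions;
# rows marked "guess" are illustrative only.
CATEGORY_TO_GROUP = {
    "simple_code_change": Group.OPERATIONAL,  # reading of the post
    "complex_code": Group.TACTICAL,           # guess
    "architecture_design": Group.STRATEGIC,   # reading of the post
    "planning": Group.TACTICAL,               # guess
    "exploration": Group.SCOUT,               # reading of the post
    "fallback": Group.TACTICAL,               # guess: middle tier when unclear
}


def route(category: str) -> Group:
    """Map a classifier label to a model group; unknown labels use the fallback row."""
    return CATEGORY_TO_GROUP.get(category, CATEGORY_TO_GROUP["fallback"])
```

Keeping the mapping as plain data means it can be adjusted without touching the routing logic, for example to send planning requests to the strategic group instead.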

In practice, this means that if I want to summarize a file, the request goes to a small, fast model. If I then want to design an architecture, Claude Opus or GPT-4o automatically takes over, without me having to switch manually. The system doesn’t choose the largest model but the best-fitting one.

Additionally, there’s an escalation mechanism: if a session stalls because the model keeps returning incorrect or unusable answers, the router detects this and escalates to the next group. Operational becomes Tactical, and Tactical becomes Strategic. No manual intervention is required.
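One simple way to picture the escalation is a counter of consecutive unusable answers per session. This is a sketch under assumptions: how the router actually detects a stalled session is not described here, so the `usable` flag and the threshold of two failures are hypothetical, and the Scout group is left out of the ladder because the text does not say where it escalates to.

```python
from dataclasses import dataclass

# Escalation ladder as described: operational -> tactical -> strategic.
# ASSUMPTION: strategic is the top tier and scout sessions are left unchanged.
NEXT_GROUP = {"operational": "tactical", "tactical": "strategic"}


@dataclass
class Session:
    group: str = "operational"
    failures: int = 0  # consecutive unusable answers in the current group


def record_result(session: Session, usable: bool, threshold: int = 2) -> str:
    """Update the session after an answer and return the group for the next request."""
    if usable:
        session.failures = 0
        return session.group
    session.failures += 1
    if session.failures >= threshold and session.group in NEXT_GROUP:
        session.group = NEXT_GROUP[session.group]
        session.failures = 0  # start counting fresh in the new group
    return session.group
```

A usable answer resets the counter, so a single bad response does not push a whole session onto a more expensive model.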

Beyond efficiency, the router is a strategic tool for AI sovereignty. By diversifying model usage across groups, you avoid dependency on any single provider, a lesson reinforced by Anthropic’s Fable case (discussed below). This isn’t just about saving tokens; it’s about retaining control over your AI supply chain.

The router is available as an open-source project on GitHub: github.com/ANierbeck/pi-model-dynamic-router.

Connecting Back to Nadella

While discussing his article, I was reminded of a quote by John J. Pershing:

John J. Pershing:
Infantry wins battles, logistics wins wars.

This isn’t about war, but in essence, economics is nothing more than the optimization of supply chains. Here, the human plays a critical role. Relying on a single source, whether a single service provider or a single model, is a strategic mistake. The moment a supplier achieves an absolute position of power, you lose control. But it’s not just about avoiding dependency; it’s about recognizing that humans are the linchpin of the value chain. Without human expertise, judgment, and oversight, even the best AI systems will fail to deliver meaningful results.

How does this relate to AI? In today’s AI landscape (June 2026; by the time you read this, it might already be outdated), everything revolves around models: Which model is the best? How much does it cost? How do I use it most effectively? In the broadest sense, this is about the supply chains of the software industry.

There’s no need to debate whether AI is here to stay; that’s not a productive discussion. But we do need to discuss how much, how far, and in what ways we should use AI, and how humans fit into this AI-driven environment. Humans are not just users of AI; they are the architects, the validators, and the ultimate decision-makers. And that’s where we return to the point Nadella made:

The last thing any of us want is a world where every company across every sector is ceding value to a few models that eat everything they see.

We only need to look at Anthropic, which was pressured by the US government to restrict its best model to US users only (how verification would even work remains unclear). The consequence? Anthropic completely removed the Fable and Mystic models from the market (the latter was already available only to a very limited audience).

Thus, the Fable case shows that AI sovereignty is not an option but a necessity. Whoever doesn’t control their supply chain is a hostage to politics, providers, and chance.

But how does this connect to Nadella’s vision of a learning loop? The answer is simple: you can’t have a learning loop without the freedom to choose and adapt. While many tools allow you to switch between models, the Dynamic Model Router goes further. It doesn’t just enable freedom of choice; it actively optimizes model selection based on the task at hand. By classifying requests and routing them to the most suitable model group, it aims to have every task handled by the best-fitting model. And if a model fails to deliver useful results, the router automatically escalates to the next group, reducing trial and error. By giving you the freedom to select the best-fitting model for each task without being locked into a single vendor or constrained by cost, it removes barriers to building a true learning ecosystem. Without this freedom, you’re stuck in a world where a few models dictate the terms, and no real learning can happen.

This router is already a practical step toward Nadella’s vision. By using different models and seeing firsthand which one is good enough or just fails miserably at your task, you’re taking the first steps toward compounding knowledge. I might add an audit trail in the future to better understand the reasoning behind model selection, but even without it, the router could be another building block, right alongside the Pi harness tool, in creating a true AI learning loop.

It gives you more opportunities to use your token capital efficiently, which leaves more for your human capital.

Disclaimer: This blog post was written by me, with the help of Vibe (previously known as Le Chat), an AI assistant by Mistral AI.
