
Beyond the LLM Hype: The Quiet Power of Small Language Models

Everyone's obsessed with LLMs right now, but while the spotlight's on the giants, I keep thinking a lot of people are sleeping on Small Language Models (SLMs).

I was recently reading a paper about an LLM performing a million steps of execution, error-free. Except it didn't. The LLM's actual job was to break the giant task into bite-sized chunks and hand them off to SLMs that did the heavy lifting.

As I dug into the paper, I started thinking about how I've been building up SLMs over the past couple of years to quietly do the heavy lifting in my home setup.

The Reality of Home Automation: Constant Decision Making

Home Infrastructure Overview:
  • Media catalogs and streaming management
  • Custom DNS-based ad blocking systems
  • Comprehensive home automation platform
  • Complex multi-VLAN network architecture
  • Real-time security monitoring and threat detection

I run a bunch of quality-of-life services at home. Media catalogs, custom ad blocking, a home automation system that may or may not be sentient, and an absurdly complicated home network. Things mostly run themselves... until the universe decides to have a little fun, and then I'm neck-deep in logs, blocklists, schema changes, and whatever else the tech gods feel like throwing at me that day.

The common thread across all these services is that something somewhere always needs a decision. A strange DNS lookup happens: should I block it? Some device starts sending traffic that looks like it came from a very confused refrigerator: do I intervene? A media provider changes its schema again because apparently consistency is overrated. It's always something.

Sure, I could handle some of this with regex and logic trees, but over the years I've discovered two universal truths: the variability (and thus complexity) of data is near infinite, because developers can't agree on standards; and there is a surprising overlap between the behaviors and outputs of "well-intentioned" developers and those of malicious actors.

So I turned to SLMs to offload some of the decision making.

SLMs in Practice: Structured Decision Making

These days I run the Small Language Models on Ollama, but LM Studio or any other local model runner works just fine. I give each SLM a tiny, explicit prompt along with a structured schema I validate with Pydantic. The goal's simple: shrink the problem until the SLM can give me near-deterministic answers. Once I get there, automation becomes way easier. Of course the model still has to be accurate enough to trust, because "deterministic but wrong" is Not Helpful™.

SLM Implementation Stack:
  • Model Runtime: Ollama for local model execution
  • Schema Validation: Pydantic for structured output validation
  • Safety Layer: Custom MCP server for additional guardrails
  • Integration: Direct API calls to home automation systems
  • Monitoring: Structured logging and decision audit trails to NAS

The SLM handles messy input, and I use its structured output downstream. This isn't groundbreaking, but it saves me a shocking amount of code because the model handles the interpretation and judgment inside the guardrails of my prompt.
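To make the pattern concrete, here's a minimal sketch of the validate-or-reject loop. The schema, field names, and action values are hypothetical, and stdlib `json` checks stand in for the Pydantic model the post describes; the model call itself is stubbed out with a hard-coded reply rather than a real Ollama request.

```python
import json

# Hypothetical schema for a DNS-blocking decision; the real setup
# uses a Pydantic model, but the pattern is the same.
ALLOWED_ACTIONS = {"allow", "block", "flag_for_review"}
REQUIRED_KEYS = {"action", "confidence", "reason"}

def validate_decision(raw: str):
    """Parse the SLM's JSON reply; reject anything off-schema."""
    try:
        decision = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(decision, dict) or set(decision) != REQUIRED_KEYS:
        return None
    if decision["action"] not in ALLOWED_ACTIONS:
        return None
    conf = decision["confidence"]
    if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        return None
    return decision

def decide(raw_reply: str) -> dict:
    """Fail safe: an unparseable or off-schema reply becomes a review flag."""
    return validate_decision(raw_reply) or {
        "action": "flag_for_review",
        "confidence": 0.0,
        "reason": "schema validation failed",
    }

# In the real pipeline, raw_reply would come from the local model runner;
# here it is hard-coded for illustration.
reply = '{"action": "block", "confidence": 0.92, "reason": "matches known tracker pattern"}'
print(decide(reply)["action"])  # block
```

The key design choice is the fallback: a reply that doesn't fit the schema never reaches automation, it just gets queued for a human. That's what makes a probabilistic model safe to wire into deterministic plumbing.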

And because I don't fully trust my future AI overlords yet, everything runs through an MCP server I wrote to enforce another set of safety rules. Security should have layers. Just like ogres. And onions. And parfaits. Everybody loves parfaits.
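The layered idea can be sketched as a policy function that sits between the model's (already schema-validated) decision and any real action. This is a toy stand-in, not the actual MCP server from the post; the device names, thresholds, and rules are all made up for illustration.

```python
# Hypothetical second guardrail layer: even a schema-valid model decision
# must pass hard-coded policy rules before any action fires.

CRITICAL_DEVICES = {"thermostat", "door-lock", "smoke-detector"}  # never auto-block

def enforce_policy(decision: dict, device: str) -> str:
    """Downgrade risky automated actions to human review."""
    action = decision.get("action", "flag_for_review")
    # Layer 1: the model never acts unilaterally on life-safety devices.
    if device in CRITICAL_DEVICES and action == "block":
        return "flag_for_review"
    # Layer 2: low-confidence decisions get a human in the loop.
    if decision.get("confidence", 0.0) < 0.8:
        return "flag_for_review"
    return action

print(enforce_policy({"action": "block", "confidence": 0.95}, "smart-bulb"))  # block
print(enforce_policy({"action": "block", "confidence": 0.95}, "door-lock"))   # flag_for_review
```

The point of the second layer is that its rules are boring and auditable: no matter how confident the model is, some actions are simply off the table.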

Why SLMs Excel at Edge Cases

The beauty of SLMs in this context isn't their raw intelligence. It's their ability to handle the long tail of edge cases that would require increasingly complex rule-based systems to manage. When a new IoT device joins the network with a MAC address that doesn't follow standard conventions, or when a streaming service decides to encode metadata in a completely novel way, the SLM can make reasonable decisions without me having to anticipate every possible scenario.

This approach has proven particularly valuable for security decisions. Traditional rule-based systems excel at catching known bad patterns but struggle with the gray areas where legitimate traffic might look suspicious, or where malicious actors are trying to blend in. An SLM can evaluate context, consider multiple factors simultaneously, and make nuanced decisions that would be difficult to encode in traditional logic.

Is it always going to be right? Heavens no, though I certainly expect it to improve substantially over the next few years.

The Economics of Small Models

There's also a practical advantage to SLMs that often gets overlooked: cost and latency. Running a 7B parameter model locally for decision-making tasks costs essentially nothing after the initial setup. There are no external API call costs, no usage limits, and no concerns about sending potentially sensitive home network data to external services.

The response times are fast enough for real-time decision making, which is crucial when you're trying to automatically block suspicious network traffic or respond to home automation events. A 200ms decision loop is perfectly acceptable for most home automation scenarios, and often faster than the mechanical systems that need to respond anyway.
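A latency budget like that is easy to enforce with a tiny timing harness. This sketch uses a trivial lambda as a stand-in for the actual model call, and the 200 ms figure is just the budget mentioned above, not a benchmark result.

```python
import time

DEADLINE_MS = 200  # acceptable budget for a home-automation decision

def timed_decision(fn, *args):
    """Run a decision function and report whether it met the latency budget."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms, elapsed_ms <= DEADLINE_MS

# Stand-in for a local SLM call; in practice this would invoke the
# model runner and the validation layer described earlier.
result, ms, on_time = timed_decision(lambda query: "block", "ads.example.net")
print(result, on_time)
```

Logging `elapsed_ms` alongside each decision also gives you the audit trail for free: slow decisions show up in the same place as wrong ones.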

The Future is Small and Focused

SLM Benefits in Practice:
  • Zero ongoing operational costs after initial setup
  • Sub-200ms response times for real-time decisions
  • Complete data privacy with local processing
  • Deterministic outputs through structured schemas
  • Simplified codebase by offloading interpretation logic

SLMs have made my home ecosystem simpler and much more reliable. They let me offload the ambiguity of unstructured data, and in return I get consistent, actionable decisions. I think the future of AI isn't just about bigger models; it's also about the small ones quietly doing one thing really well.

The real insight from that research paper wasn't about the million-step execution. It was about the division of labor: large models for orchestration and planning, small models for focused execution. This pattern applies far beyond academic research. In production systems, in home automation, and in countless other domains, the most effective AI implementations might not be the ones that make headlines.

While everyone's racing to build bigger models, there's enormous untapped potential in making smaller models incredibly good at specific tasks. The future of AI might be less about artificial general intelligence and more about artificial specialized intelligence, deployed at the edge, making millions of small decisions that collectively create seamless, intelligent systems.

Sometimes the most powerful solution is the one that quietly handles the details while everyone else is focused on the spectacle.
