The Toolbox Nobody Talks About: Why MCP Servers Are the Real Story in Agentic AI

By PS Advisory Team

Everyone wants to talk about the model.

Which one reasons better. Which one has the longest context window. Which one can run for six hours without losing the thread. And to be fair, those things matter. Model capability, intelligence, and the ability to handle long-running autonomous tasks are all genuinely critical advances.

But here's what almost nobody is talking about, and what I think is actually the more important story right now: an agent without tools is just a very articulate mechanic standing in an empty garage. It doesn't matter how much that mechanic knows about engines if there's no wrench in reach. It doesn't matter how brilliant the architect is if there's no drafting table, no compass, no straightedge. Knowledge without the ability to act on the world is a party trick, not a business capability.

That's what tools are for an agent. And in the world we're building in right now, the way an agent gets tools is through something called an MCP server.

What an MCP server actually is

MCP stands for Model Context Protocol. Strip away the acronym and what you're really talking about is a standardized way for an AI agent to reach out of its own head and into the real world: your CRM, your policy admin system, your document repository, your internal APIs, and actually do something. Look something up. Create a record. Kick off a workflow. Pull a file.

Without an MCP server, an agent can reason brilliantly about your business and still be completely unable to touch it. With one, that same agent can go look at the actual account, actually check the actual claim, actually update the actual record, not describe what it would theoretically do, but do it.

We just finished building one for a client of ours (an insurance carrier I'll leave unnamed for now) alongside the application engine that powers their platform. And going through that build end to end crystallized something for me: the MCP isn't a supporting piece of the architecture. It's the piece where the architecture stops being theoretical.

Where the rubber meets the road

Picture the layers of an agentic system stacked on top of each other. At the top, you've got the model, the reasoning engine. Below that, the orchestration layer that manages tasks, memory, and workflow. And at the bottom, where all of that reasoning either becomes real or stays hypothetical, sits the MCP layer.

This is the crossroads. It's where potential becomes action. You can have the smartest possible model sitting on top of the most elegant orchestration layer in the world, and if the MCP layer beneath it is thin, poorly designed, or insecure, none of that intelligence ever actually reaches your business.

I keep coming back to a car analogy for this, maybe because it's such an easy one to picture: everybody wants to talk about horsepower, but the tires are what actually put that horsepower onto the road. You can have a race-car engine and still go nowhere fast if the tires are wrong. MCPs are the tires. They're unglamorous, they're rarely the headline, and they're the entire reason the powerful part of the system means anything at all.

Four things that actually matter when you build one

Having now gone through this build, there are four pieces that, in my view, separate a real MCP implementation from a demo:

Designing it well. What tools does the agent actually need, scoped to what it should actually be allowed to do? This isn't "expose everything and hope for the best." It's a real design exercise, figuring out the right surface area for the agent to operate on.
Giving it the right tools. Not every capability your system has should become a tool the agent can reach for. The tools you expose are, in effect, the agent's entire vocabulary for acting in the world. Choose deliberately.
Testing it thoroughly. An agent will use a tool exactly as it's built to be used, including all the edge cases and unintended consequences you didn't think through. This has to be tested like production infrastructure, because it is production infrastructure.
Hosting it securely. This is the layer with the keys to your actual systems. Authentication, authorization, and access boundaries here aren't a nice-to-have; they're the whole ballgame.

What this actually looks like: a rating and pricing MCP

Abstractions only go so far, so let me make this concrete. Say you're building an MCP server that sits in front of a rating and pricing engine, the kind of system an underwriter uses to price a commercial account. What would the agent's actual toolbox look like?

Something like this:

price_account: takes a submission and produces a bindable-quality quote, running it through the actual rating logic rather than approximating it.
compare_options: prices a base account alongside two or three named variations side by side, so a broker asking "what if we raised the deductible" gets a real answer, not a guess.
lookup_rate: pulls the actual rows from a rate table, optionally as of a specific effective date, so the agent is citing the rate that was actually filed rather than reconstructing one from memory.
check_class_codes: a pre-check that validates payroll class codes against the current rating manual before the agent commits to a full pricing run.
explain_premium: returns the ordered derivation chain behind one number, meaning which factors, tables, and rules combined to produce that premium. This one matters as much for trust as for function. A rating tool nobody can explain is a rating tool nobody will bind on.
render_quote: turns a priced account into an actual print-ready quotation document.

Notice what's not on that list: nothing that lets the agent silently change a bound policy, override an authority limit, or skip a referral. Those either don't exist as tools, or they route through a check_authority-style tool that flags when something needs a human. That's the design discipline from pillar one showing up directly in the tool list. The agent's capability boundary and the business's risk boundary should be the same line.

This is also where the "guidance" question from a few weeks back becomes very real rather than theoretical. A tool called price_account with a two-word description will get called at the wrong moment with the wrong inputs. A tool called price_account with a description that says what a complete submission requires, what "bindable-quality" means versus an indication, and when to reach for compare_options instead, gets used the way an underwriter would actually use it. The tool list is the org chart of what the agent is allowed to touch. The descriptions are the training manual for how to touch it well.

Why every company will eventually build these

I'll say the quiet part out loud: I think within a few years, every company of any real size is going to be designing, hosting, and maintaining its own MCP servers, the same way every company today maintains its own APIs. This isn't optional infrastructure for the ambitious few. It's going to be table stakes.

The companies that get there first, that treat the MCP layer with the same rigor as their core systems rather than as an afterthought bolted on to make a demo work, are the ones whose agents will actually be useful inside their business, rather than just impressive in a sales pitch.

The model gets the headlines. The MCP is what makes any of it real. That's the part worth paying attention to.

(One distinction I'll save for a future post: how an MCP server differs from a "Skill" in the Claude sense. They solve related but genuinely different problems, and conflating them is a mistake I'm already seeing people make.)