# Context, Agents, MCP: A Working Glossary for Platform Teams

*Published: May 11, 2026 | Author: David Tuite | Tags: context-engineering, mcp-servers, idp, ai*


The vocabulary around AI agents is a mess right now. "Context" means different things depending on which tool or blog post you're reading. "MCP" gets used for the protocol and the server in the same sentence. "Agent" covers everything from a one-shot LLM call to a fully autonomous system that writes code and merges PRs.

If you're building on a platform team and trying to hold a coherent conversation about how AI fits into your stack, shared definitions help. These are the terms we use at Roadie. They cluster into three groups: agents, context, and MCP. For each term below: a definition, a concrete example from platform engineering, and a note on how it connects to the others. If you want the full argument for why context architecture matters before diving into definitions, [Smart Agents Need Smart Context](/blog/smart-agents-smart-context/) covers that ground.

## Agents

An agent is an LLM-driven system that can plan and execute multi-step work by combining reasoning with tool use. In practice, an agent wraps a model with a goal or task specification, access to tools and data sources, memory or state, and guardrails covering permissions, policies, and evaluation hooks.

What separates an agent from a plain prompt-and-response is that the model decides what steps to take, uses tools to get information or take actions, and checks its work against the original goal. An agent investigating a production incident queries logs, checks recent deployments, correlates alerts, and produces a structured finding. It can open a ticket or page a team.
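That plan-act-check loop can be sketched in a few lines. This is a minimal illustration, not any particular framework's API: the `plan_next_step` method stands in for the LLM call that decides the next action, and the tool names (`query_logs`, `get_deployments`) are hypothetical.

```python
# A minimal sketch of an agent loop: the model plans a step, calls a
# tool, and repeats until the goal is satisfied. All names illustrative.
from dataclasses import dataclass, field

@dataclass
class Agent:
    goal: str
    tools: dict                      # tool name -> callable
    max_steps: int = 5
    trace: list = field(default_factory=list)

    def plan_next_step(self, observations):
        # Stand-in for the LLM deciding what to do next. Here: query
        # logs first, then check deployments, then declare the goal met.
        if not observations:
            return ("query_logs", {"service": "payments"})
        if len(observations) == 1:
            return ("get_deployments", {"service": "payments"})
        return None                  # goal satisfied, stop

    def run(self):
        observations = []
        for _ in range(self.max_steps):
            step = self.plan_next_step(observations)
            if step is None:
                break
            name, args = step
            result = self.tools[name](**args)    # tool call
            self.trace.append((name, result))    # record what happened
            observations.append(result)
        return observations
```

The `trace` list is the seed of the observability a real harness would provide: every tool call and its result, inspectable after the run.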

Agents need context to do useful work. That context comes from the context layer and is accessed via tools. MCP is the protocol that makes tool access standardised across different systems.

### Harness

A harness is the surrounding runtime and scaffolding that makes an agent reliable and testable.

If the agent is the reasoning engine, the harness is everything around it: the system instructions that set the agent's behaviour, the adapters that connect it to tools and handle retries, the state management that tracks intermediate work across a multi-step task, and the logging and evaluation hooks that let you see what the agent did and whether it did it correctly.

When a team says "we've built an agent for X", they usually mean they've built a harness around a model for X. The model is often off-the-shelf. The harness is the engineering work - and it's where most of the investment sits. Swapping the underlying model in a well-built harness takes days. Rebuilding the harness from scratch takes months.

In the context of this glossary, the harness is what connects an agent to its context sources and MCP tools. It handles auth, retries, and observability so the agent logic doesn't have to.
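One of those responsibilities - retries plus call logging - can be sketched concretely. This is an illustration of the pattern, not a real framework: the `Harness` class and its method names are hypothetical.

```python
# A sketch of harness responsibilities: the harness, not the agent
# logic, retries transient tool failures and records every call.
import time

class Harness:
    def __init__(self, tools, retries=2, backoff=0.0):
        self.tools = tools        # tool name -> callable
        self.retries = retries
        self.backoff = backoff
        self.log = []             # observability: every attempt, pass or fail

    def call_tool(self, name, **args):
        for attempt in range(self.retries + 1):
            try:
                result = self.tools[name](**args)
                self.log.append({"tool": name, "attempt": attempt, "ok": True})
                return result
            except Exception:
                self.log.append({"tool": name, "attempt": attempt, "ok": False})
                if attempt == self.retries:
                    raise            # exhausted retries, surface the error
                time.sleep(self.backoff)
```

The agent logic calls `call_tool` and never sees the retry; that separation is what makes the model swappable while the harness investment persists.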

### Skill

A skill is a reusable, named capability that packages decision logic, procedural knowledge, and optionally tool use into a unit an agent can invoke for a specific class of task. Where a tool is a discrete function - fetching data or taking an action - a skill is the logic that defines how to approach a task: which steps to take, which tools to call, what conventions to apply, and how to structure the result. Some skills are purely instructional, encoding domain knowledge or output standards with no tool calls. Others orchestrate sequences of tool calls. Most non-trivial ones combine both.

An agent handling an on-call alert might invoke a "triage-deployment-failure" skill, which encodes the steps to follow, calls the deployment history tool, queries active alerts, checks the owning team's runbook, and returns a structured finding. The calling agent gets a result without reconstructing that logic each time.
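That triage skill can be sketched as portable logic over a tool interface. This is illustrative: the tool names and the finding's fields are assumptions, and a real skill would carry richer decision logic.

```python
# A sketch of a skill: it encodes the steps and tool sequence for one
# class of task, and runs against any harness exposing these tools.
def triage_deployment_failure(call_tool, service):
    """Return a structured finding for a suspected deployment failure."""
    deploys = call_tool("query_deployment_history", service=service)
    alerts = call_tool("query_active_alerts", service=service)
    runbook = call_tool("get_team_runbook", service=service)
    return {
        "service": service,
        "last_deploy": deploys[0] if deploys else None,
        "active_alerts": alerts,
        "runbook": runbook,
        # Minimal decision logic: recent deploy + firing alerts is suspicious.
        "suspected_cause": "recent_deploy" if deploys and alerts else "unknown",
    }
```

Because the skill takes `call_tool` as a parameter rather than binding to a specific harness, it is also the natural testing boundary described below: feed it canned tool responses and assert on the finding.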

For platform teams, skills are the unit of reuse in a multi-agent system. A blast-radius-assessment skill, once defined, can be called by any agent in the stack - an incident-responder, a change-management agent, a developer-facing assistant. They're also a natural testing boundary: you can verify that a skill produces the right output for a given set of tool responses without testing the full agent reasoning loop.

Skills are distinct from a harness, which wraps a specific agent and manages its runtime. A skill is portable logic that any harness exposing the right tools can invoke.

## Context

Context is the information an agent needs to do useful work for a specific situation.

This is distinct from the model's general training. A model is trained on broad data. Context is what you inject at runtime to make that general capability specific to the task at hand. When an agent reviews a pull request, the relevant context isn't everything the model knows about code review - it's this PR, this repo's conventions, this team's recent deployment history, and the services this change touches.

Good context is relevant to the task, scoped so the model isn't processing noise, and trustworthy enough to act on. Bad context - stale, incomplete, or inaccurate - doesn't just fail to help. It causes the agent to act on wrong information, which is often worse than no information at all. The sub-concepts below describe different ways of structuring and thinking about context when you build infrastructure to support it.

### Context Plane / Context Layer

The context plane - also called the context layer - is what an agent has access to at runtime: entities, relationships, temporal state, rules, provenance, and standard operating procedures compiled into a queryable store. Usually this is a graph.

In a platform engineering context, services are nodes, ownership and dependency relationships are edges, and metadata attaches to each node. An agent assessing the blast radius of a proposed change queries this graph and gets a typed, traversable result - not a text description. The context plane is what turns a service catalog from a documentation tool into a data source agents can actually use at runtime.
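The blast-radius query above reduces to a graph traversal. A minimal sketch, assuming dependency edges of the form (service, thing it depends on) - the service names are made up:

```python
# The context plane as a graph: services are nodes, "depends_on"
# relationships are edges, and blast radius is a reverse traversal.
from collections import deque

def blast_radius(edges, changed_service):
    """Return every service reachable from the changed one via reverse
    dependency edges, i.e. everything the change could affect."""
    dependents = {}
    for upstream, downstream in edges:       # upstream depends on downstream
        dependents.setdefault(downstream, set()).add(upstream)
    seen, queue = set(), deque([changed_service])
    while queue:
        node = queue.popleft()
        for dep in dependents.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen
```

The point is the return type: a set of service nodes an agent can act on directly, rather than a prose description it has to re-parse.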

The quality of the context plane depends on the quality and freshness of its underlying sources - catalog data, deployment records, observability outputs, runbooks. A graph built on stale or partial data produces stale or partial answers.

### Context Lake

A context lake is the broader collection of raw context sources available to a platform: service catalog data, repository metadata, observability signals, incident history, ticketing systems, runbooks, deployment records, and anything else an agent might eventually draw from.

The context lake is everything that could feed the context layer. It's distinct from the context layer in that not all of it is structured, current, or relevant to any given task. An agent investigating an incident needs recent deployment history and active alerts, not every runbook in the organisation. The context layer selects and structures what's relevant for each task; the context lake is where those raw inputs live.

The term is in wider market use. What Roadie means by it specifically: the full set of raw data sources that platform teams manage and that the context layer compiles from - not the compiled, queryable result, but the underlying inputs.

### Business Context Layer

The business context layer is the set of organisational and operational facts that sit above raw technical data: team ownership, cost attribution, service criticality, SLOs, compliance requirements, and incident accountability.

Technical context tells an agent what a service does and how it's connected. The business context layer tells it who owns it, what it costs to run, and what the consequences of a failure are. An agent that can traverse a dependency graph but doesn't know which team owns a downstream service - or whether that service is customer-facing - doesn't have enough context to make reliable decisions about escalation or rollback.

For most platform teams, the business context layer is the hardest part of the context problem. Not because the information doesn't exist, but because it's distributed across ticketing systems, cost dashboards, spreadsheets, and institutional memory.

### Minimum Viable Context

Minimum Viable Context (MVC) is the smallest set of context that enables an agent to complete a specific task reliably. More context is not always better. A larger context window costs more, takes longer to process, and can distract the model from what it actually needs to do. Too little context produces hallucinations and errors. Minimum Viable Context is the engineering discipline of finding the right set - the context that's necessary, and no more.

Determining the MVC for a task means asking: what does this agent actually need to do this job without making critical errors? For an agent investigating a deployment failure, the MVC might be the deployment record, the last three alerts, and the owning team's runbook. It probably doesn't need the full incident history for every service in the organisation.

Getting MVC right is one of the main levers for improving agent reliability and cost. It's also a harness design problem: a well-built harness assembles only the context the task requires, rather than passing everything available into the model's context window.
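One way a harness can enforce MVC is a per-task declaration of required context slices. This is a sketch under assumptions: the task names, slice names, and the idea of zero-argument fetchers are all illustrative.

```python
# Minimum Viable Context as a harness concern: each task type declares
# the context slices it needs, and the assembler fetches only those.
TASK_CONTEXT = {
    "investigate_deployment_failure": [
        "deployment_record", "recent_alerts", "team_runbook",
    ],
    "review_pull_request": [
        "pr_diff", "repo_conventions", "touched_services",
    ],
}

def assemble_context(task, sources):
    """sources maps slice name -> zero-arg fetcher; only required slices run."""
    required = TASK_CONTEXT[task]
    return {name: sources[name]() for name in required}
```

Everything else in the context lake stays unfetched, which is where the cost and latency savings come from.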

## MCP

MCP, or Model Context Protocol, is a lightweight protocol that standardises how an application or agent provides tools and context to an LLM. MCP defines a common interface for discovering tools, calling them with structured inputs, and returning structured outputs - so models can interact with external systems without custom integration code for each one.

The practical benefit is that MCP solves an integration problem at the protocol level. Before MCP, connecting a model to an external tool meant writing bespoke code for that tool, and repeating that work for every new tool. MCP replaces that with a shared specification that tools and models both implement once.
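The shared specification is JSON-RPC based: a client discovers tools with a `tools/list` request and invokes one with `tools/call`. The shapes below follow the MCP specification as we understand it, shown as Python dicts; the tool name and arguments are illustrative.

```python
# The wire shape of an MCP tool invocation (JSON-RPC 2.0).
tool_call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_deployment_history",   # discovered via tools/list
        "arguments": {"service": "payments", "limit": 3},
    },
}
```

Every MCP-conformant server accepts this same envelope, which is what removes the per-tool bespoke integration code.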

### MCP Server

An MCP Server is a service that implements the MCP specification and exposes capabilities to models. It publishes a catalog of available tools, validates inputs, enforces auth and permissions, executes tool calls or forwards them to the underlying service, and returns results in a consistent, machine-readable format.

In a platform engineering context, a service catalog backed by an MCP Server means any agent can ask "which teams own services in the payments domain?" via a tool call and get a typed, structured answer - not a blob of markdown from a docs search. The MCP Server handles the translation between the model's tool call and the catalog's underlying data model.

A catalog-backed MCP Server tends to be the highest-value starting point for platform teams building agent tooling, because service ownership and dependency data is already maintained and is high signal for most agent tasks.

### MCP Gateway

An MCP Gateway is a routing and policy layer that sits in front of one or more MCP Servers.

As you add more MCP Servers - one for the catalog, one for your observability platform, one for your ticketing system - you end up with scattered auth configurations, inconsistent rate limits, and no single point for setting policies about what agents can do. The gateway centralises all of that: authentication and tenant isolation, tool allowlists and safety policies, request routing and load balancing, observability across all tool calls, and version compatibility between clients and servers.

Most platform teams don't need a gateway on day one. In our experience, it becomes worth introducing somewhere between three and ten MCP Servers. Before that point, direct connections work fine. After it, the overhead of managing each server's auth and policies separately makes a gateway the cheaper option overall.
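Two of the gateway's jobs - per-agent allowlists and a single audit point across all tool calls - can be sketched together. This is an illustration of the pattern, not a real gateway product; all names are hypothetical.

```python
# A sketch of what an MCP gateway centralises: per-agent tool
# allowlists, routing each call to the server that owns the tool,
# and one audit log across everything.
class Gateway:
    def __init__(self, servers, allowlists):
        self.servers = servers        # tool name -> server callable
        self.allowlists = allowlists  # agent id -> permitted tool names
        self.audit = []               # single observability point

    def call(self, agent_id, tool, **args):
        if tool not in self.allowlists.get(agent_id, set()):
            self.audit.append((agent_id, tool, "denied"))
            raise PermissionError(f"{agent_id} may not call {tool}")
        self.audit.append((agent_id, tool, "allowed"))
        return self.servers[tool](**args)
```

With direct connections, each of those checks and logs would be re-implemented per server - which is exactly the overhead that makes the gateway cheaper past a handful of servers.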

### Tools

Tools are the discrete, callable functions exposed via MCP that let an LLM take actions or fetch data. Each tool has three parts: a name and description so the model knows the tool exists and when to use it, a strict input schema so calls are valid and typed, and a structured output so results are usable without parsing. Examples: "get service entity", "query deployment history", "search runbooks", "create incident".

Tools are what make agents useful rather than just knowledgeable. A model without tools can reason about a problem. A model with the right tools can act on it. In a platform engineering context, the quality of the tools - specifically how precisely their schemas match real workflows - determines most of the difference between agents that work and agents that look good in demos.
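The three parts of a tool can be shown side by side. A sketch, assuming a catalog-backed "get service entity" tool; the schema follows JSON Schema conventions, and the fields in the output are illustrative.

```python
# A tool's three parts: name + description (so the model knows when to
# use it), a strict input schema, and structured output from the handler.
get_service_entity = {
    "name": "get_service_entity",
    "description": "Fetch a service from the catalog by name.",
    "inputSchema": {
        "type": "object",
        "properties": {"name": {"type": "string"}},
        "required": ["name"],
    },
}

def handle_get_service_entity(arguments, catalog):
    # Structured output: typed fields the agent can act on,
    # not a blob of prose it has to parse.
    svc = catalog[arguments["name"]]
    return {"name": svc["name"], "owner": svc["owner"], "tier": svc["tier"]}
```

The precision of that schema - does `tier` mean what on-call engineers mean by it? - is the "schemas match real workflows" question that separates working agents from demo-ware.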
