Agents, MCP, and Kubernetes, Part 1
Learn how to deploy AI agents and Model Context Protocol (MCP) servers on Kubernetes to build secure, scalable autonomous systems. Part 1 of a three-part series exploring production-ready agentic architecture.
- Prashant Ramhit
- 9 min read

Introduction
In this three-part series, I want to share my recent findings from deploying AI agents, Model Context Protocol (MCP) servers, and related tools on Kubernetes. At Mirantis, I’ve been working on a number of R&D projects in a small skunkworks team focused on “agentic AI.” While it’s still early days in terms of determining “agentic best practices,” some emerging patterns and anti-patterns are already gaining ground. One emerging pattern is the use of MCP¹ as the de facto lingua franca of agent-to-tool integration. Another is the use of Kubernetes as the platform on top of which everything runs. Anti-patterns include building MCP servers that simply mirror the functionality of their REST API equivalents 1:1.
In this first part of the series, we will focus exclusively on the baseline of deploying agents and MCP servers on Kubernetes, using an example multi-agent workflow. In the subsequent parts, we will discuss how to add more advanced security-related features for locking down and hardening your agents and tools for production usage.
Let’s get started!
Why MCP Matters for AI Agents
In every agentic workflow, there are four pillars:
- Agents - autonomous workers that have “agency” (ownership) and personas, and that execute domain-specific tasks
- LLMs - provide core reasoning and (fuzzy) business logic to agents
- Tools - capabilities and services the agents can call on behalf of LLMs
- Context - the ongoing state that informs the LLM’s reasoning
The biggest challenge is enabling agents (and hence LLMs) to interact with tools safely, consistently, and with proper governance.
This is where Model Context Protocol (MCP) comes in.
MCP provides:
- A standardized interface for tools.
- Structured schemas for requests and responses.
- Secure, typed (schema-validated) access to internal and external data sources.
- Pre-built connectors for common data sources and services, eliminating the need for custom development.
- And probably soon, the ability to execute multi-step functions via code.
These capabilities are why MCP is a compelling solution that is being rapidly adopted for agentic tool calling.
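To make the “standardized interface” and “structured schemas” points concrete, here is a minimal sketch of an MCP tool using the official Python SDK’s FastMCP class. The tool name and fields are hypothetical; the point is that the function signature and docstring become a typed, discoverable schema with no custom client code:

```python
# A minimal MCP server using the official Python SDK (pip install mcp).
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("code-review-tools")

@mcp.tool()
def get_pr_metadata(repo_url: str, pr_number: int) -> dict:
    """Hypothetical tool: return basic metadata for a pull request."""
    # A real implementation would call the GitHub API here; stubbed for brevity.
    return {"repo": repo_url, "pr": pr_number, "status": "open"}

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```

Any MCP-capable agent can now discover and invoke this tool over a standard transport, which is exactly the interoperability the protocol is designed to provide.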
Why Kubernetes Matters for AI Agents
A recent conversation with a Mirantis FinServ customer surfaced a compelling data point: only about 25% of their production Kubernetes workloads are truly cloud-native. The remainder consists of legacy systems progressively adopting Kubernetes to gain orchestration capabilities and self-healing benefits.
This ubiquity is precisely what positions Kubernetes as the natural, battle-tested platform for AI agents. Consider the infrastructure choices of the major AI players: Anthropic’s backend, OpenAI’s serving layer, and LangChain deployments all run on Kubernetes (and in some cases, they use open source k0s). Even local development environments are converging on this stack - Docker Desktop now integrates MCP gateway tooling. Meanwhile, projects like KServe, Kubeflow, and Ray continue to cement Kubernetes as the definitive AI runtime standard.
The operational benefits are simply too decisive to ignore. Declarative deployments ensure reproducibility across environments. Resource quotas prevent contention between competing workloads. Horizontal pod autoscaling elastically meets workload demand. And perhaps most critically, organizations gain cloud portability and self-healing infrastructure without requiring architectural rework of their existing applications.
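As a rough illustration of those primitives, the sketch below uses the official Kubernetes Python client to declare an agent Deployment with explicit resource requests and limits; the name, image, and namespace are placeholders, not values from any real cluster:

```python
# pip install kubernetes
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() inside a pod

# Declarative spec: two replicas of a hypothetical agent, with resource
# requests/limits so quotas can prevent contention with other workloads.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "pr-agent", "labels": {"app": "pr-agent"}},
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "pr-agent"}},
        "template": {
            "metadata": {"labels": {"app": "pr-agent"}},
            "spec": {
                "containers": [{
                    "name": "pr-agent",
                    "image": "registry.example.com/pr-agent:0.1.0",
                    "resources": {
                        "requests": {"cpu": "250m", "memory": "256Mi"},
                        "limits": {"cpu": "1", "memory": "1Gi"},
                    },
                }]
            },
        },
    },
}

client.AppsV1Api().create_namespaced_deployment(namespace="agents", body=deployment)
```

The same spec could be paired with a HorizontalPodAutoscaler to scale replicas with demand.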
If we accept that MCP has emerged as the de facto standard protocol for agent-to-tool communication, and that Kubernetes has become the de facto standard hosting platform for production AI workloads, then the natural question becomes: how do we put it all together?
An AI-Powered Code Review System
To demonstrate this, we built a full working implementation of an autonomous code review system that showcases production-ready agentic architecture.
Our implementation is an AI-driven Autonomous Code Reviewer running on Kubernetes, using MCP for tool integration and standard microservices patterns for agent coordination. This is a relatively simple multi-agent system composed of the following:
- PR-Agent: focuses on code quality and logic
- MCP Server for Code Scanning: focuses on vulnerability detection via Semgrep
- Summarizer Agent: focuses on executive synthesis
- Orchestrator Agent: focuses on workflow coordination
- MCP Server for GitHub: allows adding comments to GitHub PRs
- Kubernetes: hosts all of the above
All of this is triggered via GitHub webhooks to the main Orchestrator Agent.
Autonomous Code Reviewer Architecture
The diagram below describes this agentic workflow in more detail.
Let’s quickly walk through the workflow and components.
- GitHub PR Webhook Trigger
- The GH repository is configured to send a webhook to the Orchestrator Agent on new PRs.
- Orchestrator Agent
- Receives webhook requests, triggering a sequence of events
- It clones the GH repository locally, extracts the PR metadata into a clear format, and provides a clean internal context.
- It sends the context to the PR-Agent for code review and analysis.
- It then triggers the MCP Server for Code Scanning to run static code analysis and vulnerability checks.
- After collecting the results from both, it sends everything to the Executive Summary Agent.
- The Summary Agent merges everything into a single unified report.
- Finally, the Orchestrator receives the unified report and delivers it to the GitHub repository’s PR as a comment with a list of actions.
- PR Agent
- Analyzes the Pull Request context.
- With the help of LLMs, it reviews the code and produces structured metadata and code-aware insights.
- MCP Server for Code Scanning
- Runs static code analysis using Semgrep, via MCP, to detect vulnerabilities and coding issues on demand.
- Executive Summary Agent
- Generates a high-level, human-readable summary of the PR (e.g., quality, risks, and findings).
- GitHub MCP Server
- Provides secure, MCP-based access to the GitHub repository for posting comments to the PR.
- LLM Services
- Publicly available LLMs that analyze, reason about, and summarize requests from the above microservices: OpenAI GPT-4o for the PR-Agent and Claude Sonnet 4 for the Executive Summary Agent.
Autonomous Code Reviewer Workflow
Webhook Reception and PR Context Extraction
The workflow begins when GitHub sends the PR webhook to the Orchestrator’s endpoint. The Orchestrator validates the payload and extracts metadata such as commit SHAs, repo details, and changed files, then clones the PR branch locally. This ensures only authentic and meaningful PR events enter the automation pipeline. The Orchestrator converts the raw webhook payload into a normalized PR context (a JSON payload). The cloned code is shared, and the context is then dispatched to the PR-Agent and the Static Code Analysis server.
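Here is a minimal sketch of that reception step, assuming a FastAPI endpoint and GitHub’s standard X-Hub-Signature-256 HMAC header; the route, secret variable, and context fields are illustrative rather than the demo repo’s actual code:

```python
import hashlib
import hmac
import os

from fastapi import FastAPI, Header, HTTPException, Request

app = FastAPI()
WEBHOOK_SECRET = os.environ["GITHUB_WEBHOOK_SECRET"]

@app.post("/webhook")
async def github_webhook(request: Request, x_hub_signature_256: str = Header(None)):
    body = await request.body()
    # GitHub signs the raw payload with HMAC-SHA256 using the shared secret.
    expected = "sha256=" + hmac.new(WEBHOOK_SECRET.encode(), body, hashlib.sha256).hexdigest()
    if not x_hub_signature_256 or not hmac.compare_digest(expected, x_hub_signature_256):
        raise HTTPException(status_code=401, detail="invalid signature")

    event = await request.json()
    pr = event.get("pull_request") or {}
    if not pr:
        return {"status": "ignored"}  # not a pull_request event

    # Normalize the raw payload into a clean internal PR context.
    context = {
        "repo": event["repository"]["clone_url"],
        "pr_number": event["number"],
        "head_sha": pr["head"]["sha"],
        "title": pr["title"],
    }
    # dispatch(context) -> PR-Agent and Static Code Analysis server (not shown)
    return {"status": "accepted"}
```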
Semantic Code Review with AI
The PR-Agent uses GPT-4o, prompted to act as a senior software engineer reviewing the PR code. It analyzes how the PR changes the code’s feature sets and application behavior, traces the affected functions, detects bugs, highlights potential performance issues, and suggests refactoring. The summary is then sent back to the Orchestrator. This step supplies the semantic understanding that pattern-based static analysis alone cannot provide.
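A simplified sketch of how such a review call might look with the OpenAI Python SDK; the prompt wording and JSON output keys are assumptions, not the demo’s exact prompt:

```python
# pip install openai; assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are a senior software engineer reviewing a pull request. "
    "Identify bugs, performance issues, and refactoring opportunities. "
    "Respond in JSON with keys: summary, issues, suggestions."
)

def review_pr(diff: str, metadata: dict) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"PR metadata: {metadata}\n\nDiff:\n{diff}"},
        ],
        response_format={"type": "json_object"},  # keep output machine-parseable
    )
    return response.choices[0].message.content
```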
Static Code Analysis and Security Scanning
Next, the Orchestrator engages the Semgrep MCP Server, passing the repo URL and PR commit SHA. The MCP server then executes static code analysis rulesets (e.g., correctness, security, OWASP, maintainability), and the result (in JSON) is returned to the Orchestrator, containing the filenames, severities, and fix suggestions.
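A sketch of what such an MCP server could look like, wrapping the Semgrep CLI behind a single tool; the tool name, argument names, and result shape are illustrative, and the real demo server may differ:

```python
# pip install mcp; assumes the semgrep and git CLIs are on PATH.
import json
import subprocess
import tempfile

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("semgrep-scanner")

@mcp.tool()
def scan_repo(repo_url: str, commit_sha: str) -> dict:
    """Clone a repo at a given commit and run Semgrep, returning findings."""
    with tempfile.TemporaryDirectory() as workdir:
        subprocess.run(["git", "clone", repo_url, workdir], check=True)
        subprocess.run(["git", "checkout", commit_sha], cwd=workdir, check=True)
        result = subprocess.run(
            ["semgrep", "scan", "--config", "auto", "--json", "."],
            cwd=workdir, capture_output=True, text=True,
        )
        findings = json.loads(result.stdout).get("results", [])
        return {
            "commit": commit_sha,
            "findings": [
                {
                    "path": f["path"],
                    "severity": f["extra"]["severity"],
                    "message": f["extra"]["message"],
                }
                for f in findings
            ],
        }

if __name__ == "__main__":
    mcp.run()
```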
Unified Report Generation and GitHub Integration
After receiving the outputs from both the PR-Agent and the Static Code Analysis Server, the Orchestrator sends them to the Executive Summarizer Agent, which uses Claude Sonnet 4 to aggregate, deduplicate, summarize, and create a unified, GitHub-ready report covering the entire PR analysis. The report is returned to the Orchestrator, which routes it to the GitHub MCP Server, which posts the report as a comment on the PR.
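The summarization call itself can be a single Anthropic API request; a hedged sketch, with the model ID and prompt wording as illustrative assumptions:

```python
# pip install anthropic; assumes ANTHROPIC_API_KEY is set in the environment.
import anthropic

client = anthropic.Anthropic()

def summarize(review_json: str, scan_json: str) -> str:
    message = client.messages.create(
        model="claude-sonnet-4-20250514",  # illustrative Sonnet 4 model ID
        max_tokens=2000,
        system=(
            "Merge the code review and static-analysis findings into one "
            "deduplicated, GitHub-flavored Markdown report with sections for "
            "code quality, security risks, and recommended actions."
        ),
        messages=[{
            "role": "user",
            "content": f"Code review:\n{review_json}\n\nScan findings:\n{scan_json}",
        }],
    )
    return message.content[0].text
```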
Event Logging
The Orchestrator logs all workflow states, timestamps, and API exchanges before closing the event. This completes the autonomous PR review loop with traceability.
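A minimal sketch of per-run structured logging, emitting one JSON line per workflow stage so an entire review can be traced by run ID; the field names are illustrative:

```python
import json
import logging
import time
import uuid

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("orchestrator")

def log_event(run_id: str, stage: str, **fields) -> None:
    """Emit one structured JSON log line per workflow stage."""
    logger.info(json.dumps({"run_id": run_id, "stage": stage, "ts": time.time(), **fields}))

run_id = str(uuid.uuid4())
log_event(run_id, "webhook_received", pr_number=42)
log_event(run_id, "review_complete", issues_found=3)
log_event(run_id, "report_posted")
```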
Et voilà!
Why This Architecture?
The architecture separates orchestration from analysis so that each agent can operate cleanly, with its own intent, persona, and context, which is an agentic best practice. The Orchestrator manages data flow between the PR, the agents, and GitHub, while Kubernetes enables independent scaling and upgrades for each component without disrupting the overall workflow. MCP servers act as adapters between the Orchestrator and various tools, providing a standardized protocol that eliminates custom integration code and enables composable evolution.
Separation of Concerns
- Orchestrator handles PR data flow coordination; Agents handle specialized analysis
- Components can be upgraded independently without workflow disruption
- Kubernetes enables independent scaling for both Orchestrator and Agents
Parallel Processing
- Orchestrator engages PR-Agent and Semgrep static analysis simultaneously, avoiding sequential bottlenecks (see the sketch after this list)
- Agents run as individual pods with full isolation
- Workloads distribute across nodes, preventing resource contention
- Each agent scales independently based on demand
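A sketch of that parallel fan-out from the Orchestrator, assuming both services expose HTTP endpoints inside the cluster; the service URLs are hypothetical in-cluster DNS names:

```python
# pip install httpx
import asyncio

import httpx

async def fan_out(context: dict) -> tuple[dict, dict]:
    """Call the PR-Agent and the Semgrep MCP server concurrently."""
    async with httpx.AsyncClient(timeout=300.0) as client:
        review_resp, scan_resp = await asyncio.gather(
            client.post("http://pr-agent:8080/review", json=context),
            client.post("http://semgrep-mcp:8080/scan", json=context),
        )
        return review_resp.json(), scan_resp.json()

# results = asyncio.run(fan_out(pr_context))
```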
MCP for Simplicity and Extensibility
- MCP servers provide standardized adapters to various tools (internal and external):
- e.g., Semgrep, GitHub, and possible future tools such as Context7
- Protocol handles authentication, requests, and responses
- Each MCP server runs independently and scales independently
- Tools can be added, updated, or removed without modifying the Orchestrator
- The adapter pattern provides loose coupling, shielding the pipeline from vendor API changes
Different LLMs for Different Purposes
- Optimize between cost, performance, and function
- The right tool for the job
- Allow for using local LLMs when it makes sense (sketched below)
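One lightweight way to express this routing is a per-agent model table, including a local OpenAI-compatible endpoint (such as vLLM or Ollama) when cost or data sensitivity warrants it; the endpoint URL, model names, and agent keys below are illustrative:

```python
from openai import OpenAI

# Hosted frontier model for deep code review (assumes OPENAI_API_KEY is set).
hosted_llm = OpenAI()

# Local OpenAI-compatible endpoint, e.g. Ollama or vLLM; no real key needed.
local_llm = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

# Hypothetical per-agent routing table: the right model for the job.
MODELS = {
    "pr_agent": (hosted_llm, "gpt-4o"),
    "quick_triage": (local_llm, "llama3.1:8b"),
}

def complete(agent: str, prompt: str) -> str:
    client, model = MODELS[agent]
    response = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content
```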
Run it Yourself
The complete implementation is open source and available on GitHub.
- https://github.com/Mirantis/agensys-codereview-demo
The repo includes:
- Complete source code for all four agents
- Docker Compose configuration for local deployment
- Kubernetes manifests for production deployment
- MCP server implementations
- Comprehensive documentation and deployment guides
Prerequisites
Before running the application, you will need all of the following:
- A Kubernetes cluster with a minimum of 1 worker node.
- OpenAI API Key
- Anthropic API Key
- GitHub PAT (token)
- Semgrep App Token
Quick Start
- Clone the repository
- Follow the instructions in the README.md to run, test, and stop the application, either locally or on a Kubernetes cluster
Building the Foundation for Autonomous Systems
Deploying AI agents and MCP servers on Kubernetes is not merely a technical choice, but a strategic one. What we’ve demonstrated here is an initial set of best practices that transforms isolated agents into a cohesive, scalable system by leveraging infrastructure primitives that enterprise teams already understand and trust.
MCP eliminates the integration tax that has historically plagued tool connectivity, replacing bespoke API clients with standardized adapters that can evolve independently. Kubernetes provides the orchestration layer that makes this evolution operationally simple: declarative configuration, automatic scaling, self-healing capabilities, and cloud portability.
This matters because Kubernetes has emerged as the universal runtime for modern application development, and AI agents are no exception. As agents become central to enterprise business processes, they deserve the same enterprise-grade guarantees we extend to any production system.
In Parts 2 and 3 of this blog series, we’ll extend this foundation by exploring how to harden and secure your agents and MCP servers for production environments. Deploying agents is only the beginning. Keeping them safe, secure, governed, and observable is where the real fun begins.
Stay tuned!
Footnotes
1. If you are new to MCP, see this MCP primer, or What is MCP, and Why is Everyone Suddenly Talking About It? ↩︎