Agents, MCP, and Kubernetes, Part 3

Learn how to implement application-level security and intelligent routing for AI agents and MCP servers using kGateway and AgentGateway. Part 3 of a three-part series exploring production-ready agentic architecture.

  • Prashant Ramhit & Yevhen Khvastunov
  • 7 min read

Introduction

In Part 1 and Part 2 of this series, we focused on getting AI agents and MCP servers running on Kubernetes, secured with Istio Ambient Mode and a zero-trust network model.

That work established a solid baseline: workloads can authenticate each other, connections are encrypted, and access is explicitly authorized.

What it does not address is how those workloads behave once a connection is established.

When an agent executes a workflow, it assembles prompts, calls MCP tools, chains services together, and makes decisions that affect production systems. At the transport layer, Istio only sees encrypted traffic moving between endpoints. It has no visibility into whether a request contains sensitive data, targets the right backend, or follows internal governance rules.

Once agents run production workloads, transport security alone is no longer sufficient. The focus shifts from connectivity to controlling execution behavior.

In this third part, we extend the platform beyond transport security and introduce application-level controls. By placing MCP-aware gateways above Istio’s data plane, we can inspect requests, validate protocol usage, and apply policy based on what agents are actually doing.

Network security and application governance operate independently. Together, they form a layered control model where failures in one layer do not automatically expose the system.

From Network Security to Application Intelligence

Figure: L4 to L7 Architecture

Istio Ambient Mode provides strong transport guarantees: SPIFFE identities, mutual TLS, and explicit authorization rules that control which workloads can establish connections.

This works well for securing service-to-service traffic. It does not help with understanding the content of that traffic.

At Layer 4, the platform sees TCP streams and connection metadata. When an agent sends an MCP request, Istio sees bytes. It does not see:

  • Which tool is being invoked
  • What parameters are being passed
  • What prompt triggered the call
  • Whether the workflow violates internal policy
  • Whether sensitive data is being exposed

An agent with valid credentials can use any capability supported by the protocol, regardless of whether that behavior is expected or acceptable.
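To make the gap concrete, the following minimal sketch builds one MCP tool invocation and contrasts what each layer can observe. The `tools/call` method and its `name` and `arguments` parameters follow the MCP specification; the tool name `create_issue` and its arguments are hypothetical and are not taken from the demo repository.

```python
import json

# A hypothetical MCP tool invocation, encoded as JSON-RPC 2.0.
# The tool name and arguments are illustrative only.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "create_issue",
        "arguments": {"repo": "example/repo", "title": "Fix login bug"},
    },
}

wire_bytes = json.dumps(request).encode("utf-8")


def l4_view(payload: bytes) -> dict:
    """What a transport-layer proxy works with.

    It forwards the stream without interpreting it, so payload size
    (plus connection metadata, not modeled here) is all it has.
    """
    return {"byte_count": len(payload)}


def l7_view(payload: bytes) -> dict:
    """What an MCP-aware gateway can observe after parsing the message."""
    message = json.loads(payload)
    params = message.get("params", {})
    return {
        "method": message.get("method"),
        "tool": params.get("name"),
        "argument_keys": sorted(params.get("arguments", {})),
    }


print(l4_view(wire_bytes))
print(l7_view(wire_bytes))
```

Only the second view exposes the tool name and parameters, which is the information any request-level policy needs.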

This is where operational risk starts to appear.

Application-level gateways change the enforcement model. Instead of authorizing connections, they evaluate individual requests. They understand MCP, JSON-RPC, and LLM provider APIs, and can make decisions based on message structure and content.

This enables controls that are not possible at the transport layer:

  • Inspecting prompts for injection patterns or secrets
  • Validating MCP message formats
  • Restricting tool access per agent
  • Applying different limits to LLM calls and tool invocations
  • Tracking cost and usage per workflow
  • Auditing actual agent behavior
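Two of the controls listed above, per-agent tool restrictions and secret inspection, can be sketched in a few lines. This is an illustration of the idea, not AgentGateway's implementation: the agent names, tool names, and the two regular expressions are all assumptions chosen for the example.

```python
import re

# Hypothetical allow-list: which MCP tools each agent identity may call.
TOOL_ALLOWLIST = {
    "code-review-agent": {"get_pull_request", "run_scan"},
    "triage-agent": {"list_issues"},
}

# Illustrative secret patterns only; a real deployment would rely on a
# maintained rule set rather than two regular expressions.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key ID shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]


def check_tool_call(agent_id: str, tool: str, arguments: dict) -> tuple[bool, str]:
    """Return (allowed, reason) for a single tool invocation."""
    allowed_tools = TOOL_ALLOWLIST.get(agent_id)
    if allowed_tools is None:
        return False, "unknown agent"
    if tool not in allowed_tools:
        return False, f"tool '{tool}' not permitted for {agent_id}"
    # Scan string arguments for anything that looks like a credential.
    for key, value in arguments.items():
        if isinstance(value, str) and any(p.search(value) for p in SECRET_PATTERNS):
            return False, f"argument '{key}' appears to contain a secret"
    return True, "ok"
```

Returning a reason alongside the decision keeps every rejection auditable, which supports the last item in the list.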

At this point, responsibilities are clearly separated:

  • Layer 4 controls identity and connectivity
  • Layer 7 controls behavior and intent

Both checks are required. Each fails in different ways. Neither replaces the other.

If an agent’s credentials are compromised, Layer 7 still limits what it can do. If a gateway rule is misconfigured, the request still has to pass transport authorization first.

This dual control model keeps failures contained and makes agent behavior predictable.

What L7 Adds on Top of L4

Capability Area     | L4 Only (Istio Ambient) | L4 + L7 Gateway
--------------------|-------------------------|-------------------------------
Identity & Trust    | SPIFFE identities, mTLS | SPIFFE identities, mTLS
Authorization Scope | Connection-level        | Request-level
Traffic Visibility  | TCP flows               | MCP messages, prompts
Policy Enforcement  | Allow / deny            | Prompt and protocol validation
Routing Logic       | Static                  | Content-aware
Rate Limiting       | Coarse                  | Per-agent, per-request
Observability       | L4 metrics              | Request tracing, cost tracking
Failure Handling    | Retries                 | LLM-aware failover
Governance Model    | Network trust           | Behavioral governance

Layer 4 continues to guarantee secure transport. Layer 7 governs how that transport is used.

Introducing AgentGateway

Most ingress controllers and API gateways are designed for predictable HTTP traffic: fixed endpoints, stable routes, and clearly defined services.

Agentic systems do not fit that model.

Agents generate requests dynamically, use JSON-RPC and MCP, chain tools without predefined paths, and encode business logic inside prompts rather than URLs. Traffic patterns change continuously and depend on model output.

We use AgentGateway from Solo.io because it handles these patterns out of the box. It understands OpenAI and Anthropic APIs, validates MCP requests, inspects prompts, injects credentials, tracks token usage, and applies inline CEL policies.

Its multi-port design separates LLM traffic, MCP calls, and webhooks so each class of traffic can be governed independently.

Why AgentGateway Instead of kGateway

Solo.io provides both kGateway and AgentGateway. We chose AgentGateway for three practical reasons.

Istio Ambient Compatibility

kGateway manages its own data plane and is designed around sidecars. In Ambient Mode, this overlaps with ztunnel’s responsibilities and creates competing enforcement paths.

AgentGateway runs as a standard application proxy above Istio and does not interfere with the transport layer.

AI-Focused Capabilities

Features such as provider integrations, credential injection, streaming support, and token accounting are built in. With kGateway, these would require custom filters and external processors.

Operational Simplicity

Configuration is YAML-based, policies use inline CEL expressions, and there is no separate control plane. Istio handles transport security. AgentGateway handles application logic.

Each layer has a clearly defined role.

Architecture Overview

Before introducing the gateway:

Agent -> Istio ztunnel -> LLM Provider
Agent -> Istio ztunnel -> MCP Server

With AgentGateway:

Agent -> Istio ztunnel -> AgentGateway -> LLM Provider
Agent -> Istio ztunnel -> AgentGateway -> MCP Server

Figure: AgentGateway Operation

Istio is not replaced. It remains responsible for identity and encryption.

All traffic passes through two independent checks:

  • ztunnel verifies workload identity and connection policy
  • AgentGateway validates content and enforces application rules

This keeps enforcement centralized without introducing additional control planes.

Security Layers

Layer | Component     | Enforcement           | Failure Mode
------|---------------|-----------------------|-------------------
L4    | Istio ztunnel | Identity, mTLS, authz | Connection blocked
L7    | AgentGateway  | Content and policy    | Request rejected

A request must pass both layers to proceed.

Why Both Layers Matter

Transport security answers one question: who is allowed to connect.

Application governance answers another: what is allowed to happen after the connection exists.

Every request is evaluated twice:

  • L4: Is this workload authorized?
  • L7: Is this request acceptable?

This reduces the impact of individual failures and makes agent behavior easier to reason about.
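The double evaluation can be modeled as two independent checks applied in order, each with its own failure mode from the Security Layers table. The sketch below is a conceptual model only: the SPIFFE ID, the tool names, and the `evaluate` function are hypothetical, and in the real platform the two checks run in separate components (ztunnel and AgentGateway) rather than in one function.

```python
from dataclasses import dataclass


@dataclass
class Request:
    source_identity: str  # SPIFFE ID presented by the calling workload
    tool: str             # MCP tool the request wants to invoke


# Hypothetical policy data; the identity and tool names are illustrative.
L4_ALLOWED_IDENTITIES = {"spiffe://cluster.local/ns/agents/sa/review-agent"}
L7_ALLOWED_TOOLS = {"get_pull_request", "run_scan"}


def evaluate(request: Request) -> str:
    """Apply the L4 check first, then the L7 check."""
    # L4: is this workload authorized to connect at all?
    if request.source_identity not in L4_ALLOWED_IDENTITIES:
        return "connection blocked"
    # L7: is this particular request acceptable?
    if request.tool not in L7_ALLOWED_TOOLS:
        return "request rejected"
    return "forwarded"
```

A compromised but valid identity still stops at the second check, and a permissive L7 rule never sees traffic from an identity that fails the first.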

Overall Architecture

Figure: AgentGateway Architecture

Port  | Purpose      | Backend
------|--------------|------------------
9080  | Orchestrator | Internal service
9081  | Anthropic    | api.anthropic.com
9082  | OpenAI       | api.openai.com
9083  | MCP          | GitHub, Semgrep
15000 | Admin        | Management UI
15020 | Metrics      | Prometheus

The agent generates different types of traffic as part of its workflow: webhooks, LLM API calls, and MCP tool invocations. Instead of connecting directly to backend services, all requests pass through AgentGateway.

The gateway exposes multiple dedicated listeners, each on its own port. The orchestrator proxy (9080) handles internal coordination and applies secret filtering and rate limits. The Anthropic (9081) and OpenAI (9082) proxies manage LLM traffic, inject API keys, inspect prompts, and enforce provider-specific limits. The MCP proxy (9083) validates JSON-RPC requests, checks agent identity, and routes calls to GitHub and Semgrep MCP servers.
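Credential injection on the LLM listeners can be pictured as a header rewrite keyed by port. The sketch below is an assumption about how such a step might look, not AgentGateway code: the environment variable names and the `inject_credentials` helper are hypothetical, while the header conventions (`x-api-key` for Anthropic, `Authorization: Bearer` for OpenAI) are those of the providers' public HTTP APIs.

```python
import os

# Per-listener credential settings. The environment variable names are
# assumptions made for this sketch.
PROVIDERS = {
    9081: {"env": "ANTHROPIC_API_KEY", "header": "x-api-key", "prefix": ""},
    9082: {"env": "OPENAI_API_KEY", "header": "Authorization", "prefix": "Bearer "},
}


def inject_credentials(port: int, headers: dict, env=os.environ) -> dict:
    """Return a copy of `headers` with the provider credential attached.

    Any credential header supplied by the agent is discarded first, so the
    agent never needs to hold, and cannot override, the real key.
    """
    provider = PROVIDERS[port]
    outgoing = {
        name: value
        for name, value in headers.items()
        if name.lower() not in ("x-api-key", "authorization")
    }
    outgoing[provider["header"]] = provider["prefix"] + env[provider["env"]]
    return outgoing
```

Keeping keys at the gateway means rotating a provider credential touches one component instead of every agent.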

Operational endpoints are exposed separately: Prometheus scrapes metrics on port 15020, and the admin UI on port 15000 is used for monitoring and debugging.

Each listener enforces its own policies before forwarding traffic to backend services. This prevents agents from bypassing controls and keeps enforcement centralized.

Overall, the diagram illustrates a layered model where all agent traffic is inspected, governed, and observed before reaching external systems, enabling predictable and secure operation at scale.

Separating traffic this way prevents noisy components from affecting others and simplifies monitoring.
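One way to see why per-listener separation contains noisy traffic is a token bucket per port: exhausting one bucket leaves the others untouched. The rates and capacities below are made-up numbers, and the `TokenBucket` class is a generic textbook sketch rather than AgentGateway's rate limiter.

```python
import time
from typing import Optional


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per listener port; explicit timestamps keep the example deterministic.
LIMITS = {
    9081: TokenBucket(rate=1.0, capacity=2, now=0.0),    # Anthropic listener
    9083: TokenBucket(rate=10.0, capacity=20, now=0.0),  # MCP listener
}


def admit(port: int, now: float) -> bool:
    """Charge the request against the bucket for its own listener only."""
    return LIMITS[port].allow(now)
```

With these numbers, a burst of three LLM calls at the same instant has its third call refused, while an MCP call arriving at that moment is still admitted.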

Implementation

The full configuration is available in:

https://github.com/Mirantis/agensys-codereview-demo/tree/main/agent-gateway

agent-gateway/
├── 01-namespace.yaml
├── 02-secrets.yaml
├── 03-agentgateway-config.yaml
├── 04-agentgateway-deployment.yaml
├── 05-agentgateway-service.yaml
├── 06-mcp-servers.yaml
├── 07-istio-integration.yaml
└── test-scripts/

Core Configuration Structure

binds:
  - port: <PORT>
    listeners:
      - routes:
          - matches: [...]
            policies: [...]
            backends: [...]

Key concepts:

  • Binds define listening ports
  • Listeners group routes
  • Routes match requests
  • Policies enforce rules
  • Backends define upstreams
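The relationship between these five concepts can be modeled as nested data plus a lookup. The sketch below is a conceptual model under stated assumptions, not AgentGateway's matching logic: the class names mirror the concepts, path-prefix matching is the only match type shown, and the backend name `github-mcp` and the policy are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# A policy here is any callable that returns True to allow the request.
Policy = Callable[[dict], bool]


@dataclass
class Route:
    path_prefix: str
    backend: str
    policies: list[Policy] = field(default_factory=list)


@dataclass
class Listener:
    routes: list[Route]


@dataclass
class Bind:
    port: int
    listeners: list[Listener]


def resolve(binds: list[Bind], port: int, path: str, request: dict) -> Optional[str]:
    """Return the backend name, or None if nothing matches or a policy denies."""
    for bind in binds:
        if bind.port != port:
            continue
        for listener in bind.listeners:
            for route in listener.routes:
                if not path.startswith(route.path_prefix):
                    continue
                if all(policy(request) for policy in route.policies):
                    return route.backend
                return None  # a route matched, but one of its policies rejected
    return None


# Hypothetical configuration: one MCP bind whose route requires an agent ID.
binds = [
    Bind(port=9083, listeners=[Listener(routes=[
        Route(path_prefix="/mcp",
              backend="github-mcp",
              policies=[lambda req: "agent_id" in req]),
    ])]),
]
```

Policies sit between the match and the backend, so a request that matches a route is still dropped if any policy on that route refuses it.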

MCP Routing

Example GitHub MCP configuration:

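# Entry under the top-level binds: list
- port: 9083
  listeners:
    - routes:
        - matches:
            # Path and header conditions belong to the same match entry
            - path:
                pathPrefix: /mcp
              headers:
                - name: x-mcp-server
                  value: github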

Requests are routed using the x-mcp-server header. JSON-RPC validation and agent identification are enforced before forwarding.
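The order of operations described here, validate first and route second, can be sketched as follows. The checks cover only the basic JSON-RPC 2.0 request shape (a `jsonrpc` field equal to `"2.0"` and a string `method`); the backend names and both helper functions are hypothetical, and agent identification is left out to keep the example short.

```python
import json

# Hypothetical mapping from x-mcp-server header values to upstream names.
MCP_BACKENDS = {"github": "github-mcp", "semgrep": "semgrep-mcp"}


def validate_jsonrpc(body: bytes) -> dict:
    """Parse a JSON-RPC 2.0 request; raise ValueError if it is malformed."""
    try:
        message = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError("body is not valid JSON") from exc
    if not isinstance(message, dict):
        raise ValueError("expected a single JSON-RPC object")
    if message.get("jsonrpc") != "2.0":
        raise ValueError("missing or wrong 'jsonrpc' version")
    if not isinstance(message.get("method"), str):
        raise ValueError("'method' must be a string")
    return message


def route_mcp(headers: dict, body: bytes) -> str:
    """Validate the message, then pick the backend from x-mcp-server."""
    validate_jsonrpc(body)
    server = headers.get("x-mcp-server")
    if server not in MCP_BACKENDS:
        raise ValueError(f"unknown MCP server: {server!r}")
    return MCP_BACKENDS[server]
```

Rejecting malformed messages before the routing decision means no backend ever receives a request the gateway could not parse.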

Conclusion: From Protocol to Platform

MCP provides a standard way for agents to interact with external systems. On its own, that is not enough.

Once agents run in production, they need the same governance, auditability, and operational controls as any other platform component.

Combining Istio Ambient Mode with AgentGateway creates a clear separation of responsibilities. Transport security handles identity and encryption. Application gateways handle content, policy, and cost.

This keeps the system manageable as workloads scale and complexity increases.

At that point, MCP is no longer just a protocol. It becomes part of an operational model that supports long-running, autonomous systems.

The challenge going forward is not whether agents can automate work. It is whether they can do so reliably, predictably, and within defined boundaries.

This architecture is one practical way to achieve that.
