TrojAI Defend for MCP: Runtime Defense for Agentic Workflows

TrojAI is proud to announce TrojAI Defend for MCP, a runtime defense solution for Model Context Protocol (MCP) that helps secure agentic AI workflows. TrojAI Defend for MCP gives security teams the visibility, policy control, and protection needed to secure MCP deployments.

Model Context Protocol (MCP) and agentic AI

Model Context Protocol (MCP) is an open client-server framework that standardizes how AI models and agents communicate with external tools and data through a common schema. It lets MCP clients dynamically discover and invoke the capabilities that MCP servers expose, in a structured, auditable way. In this way, it forms the foundation for secure and interoperable agentic AI workflows. Because MCP standardizes the format for function calls, tool usage, memory, and state, it eliminates the need for custom integrations for each new data source.

Acting as a coordination layer between models and their environment, MCP allows AI systems to safely interact with APIs, databases, and enterprise platforms through a consistent interface. If interoperability is the key to agentic AI, MCP is the universal translator that ensures every system speaks the same language.
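
To make the discovery and invocation flow concrete, here is a minimal sketch of the JSON-RPC 2.0 messages an MCP client and server exchange for tools. The tool name `lookup_customer`, its description, and its schema are hypothetical and are used only for illustration.

```python
# Client asks the server which tools it exposes.
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Server answers with tool metadata: a name, a description, and a JSON Schema
# for the arguments. The tool shown here is a made-up example.
list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "lookup_customer",
                "description": "Look up a customer record by ID.",
                "inputSchema": {
                    "type": "object",
                    "properties": {"customer_id": {"type": "string"}},
                    "required": ["customer_id"],
                },
            }
        ]
    },
}

# Client invokes one of the advertised tools by name.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "lookup_customer", "arguments": {"customer_id": "42"}},
}


def advertised_tool_names(response: dict) -> list[str]:
    """Return the names of the tools listed in a tools/list response."""
    return [tool["name"] for tool in response["result"]["tools"]]
```

The model sees the name, description, and schema as part of its context, which is why that metadata matters for security as much as the tool's code does.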

Risks of MCP

As with any other expansion of the tech stack, MCP deployments can introduce new risks. Each connection, tool, and server introduces another moving part that traditional security tools weren’t built to see or control. Agentic AI-driven systems create and use tools dynamically, opening the door to new attacks:

Unapproved MCP servers: Malicious or unverified servers can expose tools that perform unauthorized actions or leak sensitive data.
Unapproved tools: Even on trusted servers, unapproved tools can slip past security controls to act maliciously.
Malicious tool descriptions: Attackers can hide prompt injections inside tool metadata such as names, descriptions, or parameters that seem harmless but change how an AI model behaves.
Post-approval server and tool integrity: Changes to server or tool metadata after approval could signal tampering or a rug-pull attack, in which a trusted tool is quietly rewritten to do harm.

Securing MCP with TrojAI Defend for MCP

TrojAI Defend for MCP gives security teams the visibility, policy control, and runtime enforcement needed to secure MCP deployments. It extends TrojAI Defend to the MCP layer, ensuring that every server, agent, and tool operates within approved governance and audit frameworks. TrojAI Defend for MCP eliminates blind spots, detects tampering, and stops unauthorized use before it becomes a breach.

The key features of TrojAI Defend for MCP include an MCP registry, visibility into all MCP traffic, drift and tampering protection, and a comprehensive, customizable policy engine.

MCP server registry and tool approval

TrojAI Defend for MCP features an MCP registry that discovers all MCP servers in your environment. Each newly discovered server is automatically blocked until you explicitly approve it; only then does it become available to the platform.

Not only does the MCP registry eliminate shadow MCP, but it also prevents malicious or unverified servers from exposing tools that could perform unauthorized actions or exfiltrate data.

Even when a server is approved, individual tools require additional review and approval before being exposed to the LLM. TrojAI Defend for MCP finds every tool associated with each server, allowing security professionals the flexibility to approve only those that meet enterprise security standards. This prevents servers from introducing new tools post-approval that bypass security controls.
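
The two-level, default-deny approval described above can be sketched as follows. This is an illustrative model only, not TrojAI's implementation; the class `McpRegistry` and all of its method names are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"    # discovered, blocked until reviewed
    APPROVED = "approved"


@dataclass
class ServerEntry:
    status: Status = Status.PENDING
    # Tool name -> status; tools are reviewed separately from their server.
    tools: dict[str, Status] = field(default_factory=dict)


class McpRegistry:
    """Hypothetical default-deny registry for MCP servers and tools."""

    def __init__(self) -> None:
        self._servers: dict[str, ServerEntry] = {}

    def discover(self, server: str, tools: list[str]) -> None:
        # Newly discovered servers and tools start out blocked (PENDING).
        entry = self._servers.setdefault(server, ServerEntry())
        for tool in tools:
            entry.tools.setdefault(tool, Status.PENDING)

    def approve_server(self, server: str) -> None:
        self._servers[server].status = Status.APPROVED

    def approve_tool(self, server: str, tool: str) -> None:
        self._servers[server].tools[tool] = Status.APPROVED

    def is_allowed(self, server: str, tool: str) -> bool:
        # A call is allowed only if both the server and the tool are approved.
        entry = self._servers.get(server)
        if entry is None or entry.status is not Status.APPROVED:
            return False
        return entry.tools.get(tool) is Status.APPROVED
```

Because `discover` never upgrades a status, a tool that an approved server adds later still starts as pending and stays blocked until someone reviews it.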

MCP traffic visibility

Once all servers and tools have been identified, TrojAI Defend for MCP monitors MCP traffic to and from each server, including all prompts and responses. With this visibility into MCP traffic, you can easily block connections to unregistered or rogue servers to eliminate hidden communication paths, which helps protect against attacks like prompt injection and data exfiltration.

Visibility into MCP traffic is essential because it turns opaque AI interactions into auditable, governable, and secure communication. It allows you to detect anomalous or high-risk behavior before it turns into a breach. It also allows you to enforce policies. Visibility into MCP traffic gives you the ground truth of what your AI agents are actually doing, not just what you think they’re doing. And once you know this, you can take the steps to secure it.
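
As a rough sketch of what inspecting MCP traffic can look like, the function below checks each message's destination against a set of approved servers and records every decision in an audit log. The function name, the log format, and the idea of passing the server name alongside the message are assumptions made for this example, not details of the product.

```python
import json
import time
from typing import Any


def inspect_mcp_message(
    message: dict[str, Any],
    server: str,
    approved_servers: set[str],
    audit_log: list[str],
) -> bool:
    """Return True if the message may be forwarded; always log the decision."""
    allowed = server in approved_servers
    # Record both allowed and blocked traffic so the log reflects what agents
    # actually attempted, not only what succeeded.
    audit_log.append(json.dumps({
        "ts": time.time(),
        "server": server,
        "method": message.get("method"),
        "decision": "allow" if allowed else "block",
    }))
    return allowed
```

A caller would forward the message only when the function returns True, so a connection to an unregistered server is both blocked and visible in the log.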

Server and tool change detection

Registering and approving servers and tools is not a one-and-done event. Tool definitions could be tampered with to hide a prompt injection attack inside their metadata. Similarly, a seemingly legitimate and previously approved MCP server could later be covertly modified to be malicious. This rug-pull attack exploits the trust the user has placed in the tool.

TrojAI Defend for MCP addresses these risks by continuously monitoring tool and server metadata to detect unauthorized changes that may signal tampering, drift, or poisoning. During onboarding, tool descriptions are reviewed and approved. Any subsequent modification, such as an update to a name, description, or parameters, automatically blocks the tool until it is re-approved. This process stops prompt injection attempts hidden inside tool metadata, like malicious instructions to read local files or conceal actions. By tracking and blocking metadata changes in real time, TrojAI Defend prevents silent redefinitions or “rug-pull” attacks where a previously trusted tool suddenly alters its behavior.
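
One common way to detect this kind of metadata change is to store a hash of the approved metadata and compare it with a hash of what the server currently advertises. The sketch below assumes the watched fields are `name`, `description`, and `inputSchema` (the field MCP tool definitions use for parameters); the helper names are hypothetical, and the source does not say which fields or hash function the product uses.

```python
import hashlib
import json

# Metadata fields treated as security-relevant in this sketch.
WATCHED_FIELDS = ("name", "description", "inputSchema")


def fingerprint(tool: dict) -> str:
    """Return a SHA-256 hex digest over the watched metadata fields."""
    watched = {key: tool.get(key) for key in WATCHED_FIELDS}
    # Sorted keys and fixed separators give a stable serialization, so the
    # same metadata always produces the same digest.
    canonical = json.dumps(watched, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_drifted(approved_fingerprint: str, live_tool: dict) -> bool:
    """True if the live metadata no longer matches what was approved."""
    return fingerprint(live_tool) != approved_fingerprint
```

A tool for which `has_drifted` returns True would be blocked and sent back for review, regardless of how small the edit to its description was.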

MCP policy engine

In addition to the capabilities highlighted above, TrojAI Defend for MCP includes a comprehensive policy engine that secures every stage of your MCP workflows. Purpose-built for MCP, these policies inspect, audit, and enforce security controls in real time. The policy engine strengthens governance by ensuring that all agent interactions align with enterprise data handling requirements. It also provides a detailed audit trail to support compliance and incident response.
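
The source does not describe the policy format, so the following is only a generic sketch of how a policy engine can inspect tool calls: each policy is a function that returns a violation reason or nothing, and the engine collects the reasons. The `Policy` type, the `no_private_keys` rule, and the `evaluate` function are all hypothetical.

```python
import re
from typing import Callable, Optional

# A policy inspects one tool call and returns a violation reason, or None.
Policy = Callable[[dict], Optional[str]]


def no_private_keys(call: dict) -> Optional[str]:
    """Example data-handling rule: flag PEM private keys in tool arguments."""
    text = str(call.get("arguments", ""))
    if re.search(r"-----BEGIN [A-Z ]*PRIVATE KEY-----", text):
        return "private key material in tool arguments"
    return None


def evaluate(call: dict, policies: list[Policy]) -> list[str]:
    """Run every policy; an empty result means the call may proceed."""
    return [reason for policy in policies if (reason := policy(call)) is not None]
```

Returning the reasons rather than a bare yes or no gives the audit trail something concrete to record for each blocked call.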

Real-World Attack: Runtime Description Mutation

The following scenario shows how TrojAI Defend for MCP stops an attempt to compromise an MCP environment.

Imagine you’ve onboarded a tool into your MCP environment. It’s been reviewed, approved, and logged in the MCP registry. Everything looks clean, but at runtime, something changes. The server quietly flips the tool’s description on first invocation, injecting attacker instructions that tell the model to perform hidden actions like reading local files or masking activity. To make matters worse, the server may even try to re-register this altered version, hoping the change propagates to clients before anyone notices.

This type of runtime description mutation represents a subtle but dangerous attack vector: manipulating metadata, not code. Without proper defenses, an agent could ingest the malicious description and unknowingly follow injected instructions.

TrojAI Defend for MCP stops this kind of attack cold by anchoring tool definitions to a registry-centric protection model. When a client requests a list of available tools or descriptions, it doesn’t ask the server directly. Instead, it queries the registry via a proxy, which always returns the last approved version. The registry maintains a canonical record of truth, storing each tool’s verified description, hash, and version.

If a server attempts to alter a tool’s description, whether through a push, a re-registration, or an observed drift, the registry automatically detects the mismatch, flags the update as pending, and blocks it from being distributed to clients. During this hold period, the proxy continues serving the last approved version, ensuring that no malicious content ever reaches production agents.

Clients may briefly see a neutral placeholder message such as “Tool description unavailable” or continue to see the previously accepted description to avoid disruption. Either way, agents are protected.
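
The registry-centric flow described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the class `RegistryProxy`, its methods, and the use of SHA-256 are hypothetical, and a real proxy would sit in the network path instead of being called directly.

```python
import hashlib
from dataclasses import dataclass


def digest(description: str) -> str:
    return hashlib.sha256(description.encode("utf-8")).hexdigest()


@dataclass
class ApprovedTool:
    description: str
    sha256: str
    version: int


class RegistryProxy:
    """Hypothetical proxy that answers tool listings from approved records."""

    def __init__(self) -> None:
        self._approved: dict[str, ApprovedTool] = {}
        self._pending: dict[str, str] = {}  # tool name -> unreviewed description

    def approve(self, name: str, description: str) -> None:
        # Each approval creates a new version of the canonical record.
        previous = self._approved.get(name)
        version = previous.version + 1 if previous else 1
        self._approved[name] = ApprovedTool(description, digest(description), version)
        self._pending.pop(name, None)

    def observe(self, name: str, live_description: str) -> None:
        # Called with whatever the server currently advertises.
        record = self._approved.get(name)
        if record is None or digest(live_description) != record.sha256:
            # Mismatch: hold the change for review; it is never served.
            self._pending[name] = live_description

    def describe(self, name: str) -> str:
        # Clients only ever receive the last approved description.
        record = self._approved.get(name)
        if record is None:
            return "Tool description unavailable"
        return record.description
```

In this sketch, a server that mutates a description at runtime only changes what `observe` sees. What `describe` returns to clients stays the same until a reviewer calls `approve` again.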

The result is a closed loop of trust. Even if a server mutates its live description during a call, those attacker instructions never reach the model. The registry remains the single source of truth, enforcing re-approval for every change and logging every attempt for audit.

Securing the future of agentic AI with TrojAI

TrojAI is redefining how enterprises protect the next generation of intelligent systems so they can confidently embrace agentic AI innovation securely, transparently, and at scale.

As organizations adopt the Model Context Protocol (MCP) and deploy increasingly autonomous AI agents, the attack surface is expanding beyond traditional model risks to include the complex web of tools, data, and workflows these agents rely on. TrojAI delivers end-to-end visibility and control across this new landscape, securing the entire AI ecosystem from the model layer to the agent layer to the infrastructure that connects them.

How TrojAI can help

TrojAI delivers security for AI. Our mission is to enable the secure rollout of AI in the enterprise. Our comprehensive platform protects AI models, applications, and agents, empowering enterprises to safeguard AI systems both at build time and run time.

In addition to TrojAI Defend for MCP, we offer the following solutions:

TrojAI Detect automatically red teams AI models during development, evaluating behavioral risks and providing actionable remediation guidance before deployment.
TrojAI Defend acts as an AI application firewall, protecting enterprises from real-time threats as AI systems interact with live data, tools, and users, including in MCP-based workflows.

By assessing and mitigating AI risk throughout the development lifecycle and continuously defending against runtime attacks, TrojAI provides the unified protection enterprises need to deploy agentic AI with confidence.

Want to learn more about how TrojAI secures the largest enterprises globally with a highly scalable, performant, and extensible solution?

Book a demo now.
