MCP Tool Poisoning: Attacking AI Agents Through Malicious Tool Descriptions

Model Context Protocol tool descriptions can be weaponized to hijack agent behavior.

The Model Context Protocol (MCP) allows AI agents to discover and use external tools. We demonstrate that malicious MCP servers can craft tool descriptions containing hidden instructions that override the agent's original task. In our tests across 4 major agent frameworks, we achieved a 78% success rate in redirecting agent behavior through tool description injection.