Test MCP Tools

Test MCP tool integrations in Arena by connecting real MCP servers and verifying tool calls in your scenarios.

This guide covers stdio (command) and static HTTP+SSE (url) servers — long-lived processes shared across all scenarios. For per-scenario or per-session containers (e.g. codegen sandboxes), see Provision an MCP Sandbox per Scenario.


Add an mcp_servers block to your Arena config:

# config.arena.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Arena
metadata:
  name: mcp-test
spec:
  mcp_servers:
    - name: everything
      command: npx
      args:
        - "-y"
        - "@modelcontextprotocol/server-everything"
  prompt_configs:
    - id: assistant
      file: prompts/assistant.yaml
  scenarios:
    - file: scenarios/echo-test.yaml
  providers:
    - file: providers/mock-provider.yaml

Arena starts the MCP server, discovers its tools, and registers them for use in scenarios.
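
The example above launches a stdio server with command. For a static HTTP+SSE server, the url field mentioned at the top of this guide would take the place of command and args. The following is a minimal sketch under that assumption; the server name and endpoint are hypothetical, and this page does not state which other fields apply to url servers.

```yaml
mcp_servers:
  # Hypothetical entry for an already-running HTTP+SSE server.
  - name: remote-tools
    # Assumed endpoint; replace with the address your server listens on.
    url: http://localhost:8080/sse
```

Because the server is addressed by URL, it presumably needs to be running before the Arena run starts.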


MCP servers often expose many tools. Use tool_filter to limit which tools are available:

mcp_servers:
  - name: everything
    command: npx
    args: ["-y", "@modelcontextprotocol/server-everything"]
    tool_filter:
      allowlist:
        - echo
        - get-sum

Or use a blocklist to exclude specific tools and keep the rest:

mcp_servers:
  - name: database
    command: python
    args: ["mcp_db_server.py"]
    tool_filter:
      blocklist:
        - drop_table
        - truncate_table

Pass environment variables to the server process:

mcp_servers:
  - name: github
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_TOKEN: "${GITHUB_TOKEN}"

Set the directory the server process runs in with working_dir:

mcp_servers:
  - name: filesystem
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "./data"]
    working_dir: /path/to/project

Set a per-request timeout in milliseconds:

mcp_servers:
  - name: everything
    command: npx
    args: ["-y", "@modelcontextprotocol/server-everything"]
    timeout_ms: 10000

Enable MCP tools in your prompt config with allowed_tools. MCP tools follow the naming pattern mcp__{serverName}__{toolName}:

# prompts/assistant.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: PromptConfig
spec:
  task_type: assistant
  version: v1.0.0
  description: Assistant with MCP tools
  allowed_tools:
    - mcp__everything__echo
    - mcp__everything__get-sum
  system_template: |
    You are a helpful assistant with access to echo and math tools.

Configure the mock LLM to issue tool calls against MCP tools:

# mock-responses.yaml
scenarios:
  echo-test:
    turns:
      1:
        tool_calls:
          - name: "mcp__everything__echo"
            arguments:
              message: "Hello from Arena!"
      2:
        response: "The echo tool returned: Hello from Arena!"

Then write a scenario that asserts the tool was called and its result appears in the response:

# scenarios/echo-test.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Scenario
metadata:
  name: echo-test
spec:
  id: echo-test
  task_type: assistant
  description: Test MCP echo tool
  tool_policy:
    tool_choice: auto
    max_tool_calls_per_turn: 3
  turns:
    - role: user
      content: "Echo the message: Hello from Arena!"
      assertions:
        - type: tools_called
          params:
            tools:
              - mcp__everything__echo
        - type: content_includes
          params:
            patterns:
              - "Hello from Arena"
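
The same mock-and-assert pattern should extend to the other allowed tool, get-sum. The sketch below adds a second entry to the mock responses file. The scenario id sum-test is hypothetical, and the argument names a and b are assumptions about the tool's input schema, so check the schema your server actually reports before relying on them.

```yaml
# mock-responses.yaml (additional entry; sum-test is a hypothetical scenario id)
scenarios:
  sum-test:
    turns:
      1:
        tool_calls:
          - name: "mcp__everything__get-sum"
            arguments:
              a: 2   # assumed argument name
              b: 3   # assumed argument name
      2:
        response: "The sum is 5."
```

A matching scenario would list mcp__everything__get-sum under a tools_called assertion, in the same way the echo scenario lists mcp__everything__echo.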

Register multiple servers in the same config:

mcp_servers:
  - name: everything
    command: npx
    args: ["-y", "@modelcontextprotocol/server-everything"]
    tool_filter:
      allowlist: [echo, get-sum]
  - name: filesystem
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "./data"]
    tool_filter:
      allowlist: [read_file, list_directory]
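
With two servers registered, each tool is referenced in allowed_tools under its own server prefix, following the mcp__{serverName}__{toolName} pattern described earlier. As a sketch, a prompt config fragment enabling the four allowlisted tools might look like this (the fragment is illustrative and omits the other spec fields):

```yaml
# prompts/assistant.yaml (fragment)
spec:
  allowed_tools:
    # Tools from the "everything" server
    - mcp__everything__echo
    - mcp__everything__get-sum
    # Tools from the "filesystem" server
    - mcp__filesystem__read_file
    - mcp__filesystem__list_directory
```

The server prefix keeps tool names distinct if two servers expose a tool with the same name.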

A complete configuration that combines these options:

# config.arena.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Arena
metadata:
  name: mcp-everything-test
spec:
  mcp_servers:
    - name: everything
      command: npx
      args: ["-y", "@modelcontextprotocol/server-everything"]
      timeout_ms: 10000
      tool_filter:
        allowlist: [echo, get-sum]
  prompt_configs:
    - id: mcp-assistant
      file: prompts/mcp-assistant.yaml
  scenarios:
    - file: scenarios/echo-and-add.scenario.yaml
  providers:
    - file: providers/mock-provider.yaml
  defaults:
    temperature: 0.7
    max_tokens: 500
    concurrency: 1
    output:
      dir: out
      formats: ["json", "html"]

Run it:

cd examples/mcp-everything-test
promptarena run -c config.arena.yaml