# Test A2A Agents
Test agent-to-agent delegation using Arena’s built-in mock A2A infrastructure. Define mock agents in your Arena config, write scenarios that trigger tool calls, and verify the full delegation pipeline.
## Overview

```mermaid
sequenceDiagram
    participant U as User Turn
    participant ML as Mock LLM
    participant E as a2aExecutor
    participant MS as Mock A2A Server

    U->>ML: "Search for papers about quantum"
    ML->>E: tool_call: a2a__research_agent__search_papers
    E->>MS: JSON-RPC message/send
    MS->>E: Task (completed, artifacts)
    E->>ML: tool result
    ML->>U: "Found 3 papers..."
```

Arena spins up mock A2A servers from your config, registers their skills as tools via the Tool Bridge, and connects them to the mock LLM provider. You control both sides — the LLM responses and the A2A agent responses — for fully deterministic tests.
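The `message/send` step in the diagram is a JSON-RPC 2.0 call. As a rough illustration, the request envelope might look like the sketch below; the field names (`messageId`, `parts`, `kind`) follow the public A2A spec, and the exact payload Arena builds may differ:

```python
import uuid


def build_message_send(text: str) -> dict:
    """Build a minimal A2A message/send JSON-RPC request (illustrative only)."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "messageId": str(uuid.uuid4()),
                "parts": [{"kind": "text", "text": text}],
            }
        },
    }


req = build_message_send("Search for papers about quantum computing")
print(req["method"])  # message/send
```

The mock A2A server answers such a request with a completed Task carrying the canned artifacts, which the executor converts back into a tool result for the LLM.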
## Configure A2A Agents

Add an `a2a_agents` block to your Arena config:
```yaml
# config.arena.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Arena
metadata:
  name: a2a-demo
spec:
  providers:
    - file: providers/mock-provider.yaml
  prompt_configs:
    - id: assistant
      file: prompts/assistant.yaml
  scenarios:
    - file: scenarios/delegated_research.yaml
  a2a_agents:
    - name: research_agent
      card:
        name: Research Agent
        description: A mock research agent that returns papers
        skills:
          - id: search_papers
            name: Search Papers
            description: Search for academic papers on a given topic
            tags: [research, papers, search]
      responses:
        - skill: search_papers
          match:
            contains: quantum
          response:
            parts:
              - text: |
                  Found 3 papers on quantum computing:
                  1. Quantum Computing Fundamentals (2024)
                  2. Advances in Quantum Error Correction (2024)
                  3. Practical Quantum Algorithms for Optimization (2023)
        - skill: search_papers
          response:
            parts:
              - text: |
                  Found 1 paper on the requested topic:
                  1. General Survey of the Field (2024)
```

Each agent produces a tool named `a2a__{name}__{skillId}` — in this case, `a2a__research_agent__search_papers`.
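The naming scheme is simple enough to express as a one-line helper, which can be handy when generating `allowed_tools` entries for many agents (a sketch, not part of Arena's API):

```python
def a2a_tool_name(agent_name: str, skill_id: str) -> str:
    """Derive the bridged tool name using the a2a__{name}__{skillId} scheme."""
    return f"a2a__{agent_name}__{skill_id}"


print(a2a_tool_name("research_agent", "search_papers"))
# a2a__research_agent__search_papers
```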
## Prompt Config

Enable the A2A tool in your prompt config with `allowed_tools`:
```yaml
# prompts/assistant.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: PromptConfig
spec:
  task_type: assistant
  version: v1.0.0
  description: An assistant that can delegate research to A2A agents
  allowed_tools:
    - a2a__research_agent__search_papers
  system_template: |
    You are a helpful assistant with access to research tools.
    When asked to search for papers or research a topic, use the
    available tools. Summarize the results for the user.
```

## Mock LLM Responses

Configure the mock LLM to issue tool calls and then summarize results:
```yaml
# mock-responses.yaml
scenarios:
  delegated-research:
    turns:
      1:
        tool_calls:
          - name: "a2a__research_agent__search_papers"
            arguments:
              query: "Search for papers about quantum computing"
      2:
        response: |
          Based on the research results, I found 3 papers on quantum computing:
          1. Quantum Computing Fundamentals (2024)
          2. Advances in Quantum Error Correction (2024)
          3. Practical Quantum Algorithms for Optimization (2023)
```

Turn 1: the mock LLM emits a tool call. Arena's `a2aExecutor` routes it to the mock A2A server, which matches the "quantum" rule and returns the canned response. Turn 2: the mock LLM sees the tool result and produces a final summary.
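Conceptually, the mock provider is just a turn-indexed lookup table: each turn number maps to either a scripted tool call or a scripted text response, independent of the actual input. A minimal Python model of that behavior (illustrative, not Arena's implementation):

```python
# Turn-indexed canned responses, mirroring the mock-responses.yaml structure above.
canned = {
    1: {"tool_calls": [{
        "name": "a2a__research_agent__search_papers",
        "arguments": {"query": "Search for papers about quantum computing"},
    }]},
    2: {"response": "Based on the research results, I found 3 papers..."},
}


def mock_llm(turn: int) -> dict:
    """Return the scripted response for the given turn, regardless of input."""
    return canned[turn]


print(mock_llm(1)["tool_calls"][0]["name"])
# a2a__research_agent__search_papers
```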
## Scenario Setup

Write a scenario that sends the user message and asserts on the outcome:
```yaml
# scenarios/delegated_research.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Scenario
metadata:
  name: delegated-research
spec:
  id: delegated-research
  task_type: assistant
  description: Test delegation of research queries to an A2A agent
  tool_policy:
    tool_choice: auto
    max_tool_calls_per_turn: 3
    max_total_tool_calls: 5
  turns:
    - role: user
      content: "Search for papers about quantum computing"
      assertions:
        - type: tools_called
          params:
            tools:
              - a2a__research_agent__search_papers
        - type: content_includes
          params:
            patterns:
              - "Quantum Computing Fundamentals"
```

## Assertions

| Assertion | Description |
|---|---|
| `tools_called` | Verify that specific tools were invoked during the turn |
| `content_includes` | Check that the final response contains expected patterns |
| `agent_invoked` | Verify that specific agents were delegated to (checks for agent tool calls) |
| `agent_not_invoked` | Verify that specific agents were NOT delegated to |
| `agent_response_contains` | Verify that an agent's response contains expected text |
The `agent_invoked` and `agent_response_contains` assertions provide a higher-level alternative to `tools_called` for agent delegation testing. Instead of checking raw tool names, you can express intent directly:
```yaml
assertions:
  # Lower-level: check the raw tool name
  - type: tools_called
    params:
      tools:
        - a2a__research_agent__search_papers

  # Higher-level: check that an agent was invoked and returned expected content
  - type: agent_invoked
    params:
      agents:
        - a2a__research_agent__search_papers
  - type: agent_response_contains
    params:
      agent: a2a__research_agent__search_papers
      contains: "Quantum Computing Fundamentals"
```

## Run and Verify

```sh
cd examples/a2a-demo
promptarena run -c arena/config.arena.yaml
```

Arena will:
- Start the mock A2A server from your config
- Register agent skills as tools via the Tool Bridge
- Run each scenario turn with the mock LLM
- Route tool calls through the `a2aExecutor` to the mock server
- Evaluate assertions and produce a report
Check the output directory for results:
```sh
ls arena/out/
# report-2026-02-08T17-55-25.html
# report-2026-02-08T17-55-25-data.json
```

## Multiple Agents

Add more agents to test multi-agent delegation:
```yaml
a2a_agents:
  - name: research_agent
    card:
      name: Research Agent
      skills:
        - id: search_papers
          name: Search Papers
    responses:
      - skill: search_papers
        response:
          parts:
            - text: "Research results..."

  - name: translation_agent
    card:
      name: Translation Agent
      skills:
        - id: translate
          name: Translate
    responses:
      - skill: translate
        response:
          parts:
            - text: "Translated text..."
```

This registers both `a2a__research_agent__search_papers` and `a2a__translation_agent__translate` as available tools.
Use `agent_invoked` and `agent_not_invoked` assertions to verify which agents are delegated to:
```yaml
turns:
  - role: user
    content: "Research quantum computing and translate the summary to French"
    assertions:
      - type: agent_invoked
        params:
          agents:
            - a2a__research_agent__search_papers
            - a2a__translation_agent__translate
        message: "Should delegate to both agents"
      - type: agent_not_invoked
        params:
          agents:
            - a2a__admin_agent__execute
        message: "Should not invoke admin agent"
      - type: agent_response_contains
        params:
          agent: a2a__research_agent__search_papers
          contains: "Research results"
        message: "Research agent should return results"
```

For conversation-wide checks, use `conversation_assertions`:
```yaml
conversation_assertions:
  - type: agent_invoked
    params:
      agent_names:
        - a2a__research_agent__search_papers
      min_calls: 1
    message: "Research agent should be invoked at least once across the conversation"
  - type: agent_not_invoked
    params:
      agent_names:
        - a2a__admin_agent__execute
    message: "Admin agent should never be invoked"
```
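Semantically, a conversation-wide `agent_invoked` check just counts matching tool calls across every turn. A rough sketch of the assumed semantics in Python (not Arena's actual evaluator):

```python
def agent_invoked(all_tool_calls: list[str], agent_names: list[str], min_calls: int = 1) -> bool:
    """True if every named agent tool appears at least min_calls times in the conversation."""
    return all(all_tool_calls.count(name) >= min_calls for name in agent_names)


def agent_not_invoked(all_tool_calls: list[str], agent_names: list[str]) -> bool:
    """True if none of the named agent tools were ever called."""
    return all(name not in all_tool_calls for name in agent_names)


calls = ["a2a__research_agent__search_papers", "a2a__translation_agent__translate"]
assert agent_invoked(calls, ["a2a__research_agent__search_papers"], min_calls=1)
assert agent_not_invoked(calls, ["a2a__admin_agent__execute"])
```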
## Remote A2A Agents

Instead of mock agents, you can test against real remote A2A servers. Set `url` on the agent config — Arena discovers the agent's skills from its `/.well-known/agent.json` endpoint.
### Basic Remote Agent

```yaml
a2a_agents:
  - name: echo_agent
    url: "http://localhost:9877"
```

When `url` is set, `card` and `responses` are ignored — the agent card is fetched from the remote server.
### Authentication

Use `auth` to send credentials with every request:
```yaml
a2a_agents:
  - name: echo_agent
    url: "http://localhost:9877"
    auth:
      scheme: Bearer
      token: "test-token-123"
```

To keep tokens out of config files, use `token_env` to read from an environment variable:
```yaml
a2a_agents:
  - name: echo_agent
    url: "https://agent.example.com"
    auth:
      scheme: Bearer
      token_env: AGENT_TOKEN
```

### Headers

Add static headers or headers resolved from environment variables:
```yaml
a2a_agents:
  - name: echo_agent
    url: "https://agent.example.com"
    headers:
      X-Tenant-ID: acme
      X-Request-Source: arena
    headers_from_env:
      - "X-API-Key=API_KEY_ENV"
```

`headers_from_env` entries use the format `Header-Name=ENV_VAR_NAME`. If the env var is unset, the header is silently skipped.
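The `Header-Name=ENV_VAR_NAME` resolution described above can be sketched in a few lines of Python (assumed behavior, matching the skip-if-unset rule):

```python
import os


def resolve_env_headers(entries: list[str]) -> dict[str, str]:
    """Resolve 'Header-Name=ENV_VAR_NAME' entries; unset env vars are skipped silently."""
    headers = {}
    for entry in entries:
        header_name, _, env_var = entry.partition("=")
        value = os.environ.get(env_var)
        if value is not None:
            headers[header_name] = value
    return headers


os.environ["API_KEY_ENV"] = "secret"
os.environ.pop("SOME_UNSET_VAR", None)  # ensure the second entry is skipped
print(resolve_env_headers(["X-API-Key=API_KEY_ENV", "X-Gone=SOME_UNSET_VAR"]))
# {'X-API-Key': 'secret'}
```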
### Skill Filtering

Limit which skills from the remote agent are exposed as tools:
```yaml
a2a_agents:
  - name: echo_agent
    url: "http://localhost:9877"
    skill_filter:
      allowlist:
        - echo
```

Use `blocklist` to exclude specific skills instead:
```yaml
skill_filter:
  blocklist:
    - debug
    - admin
```
### Timeout

Set a per-request timeout in milliseconds:
```yaml
a2a_agents:
  - name: echo_agent
    url: "http://localhost:9877"
    timeout_ms: 5000
```

### Complete Remote Agent Config

```yaml
a2a_agents:
  - name: echo_agent
    url: "http://localhost:9877"
    auth:
      scheme: Bearer
      token: "test-token-123"
    headers:
      X-Tenant-ID: acme
    headers_from_env:
      - "X-API-Key=API_KEY_ENV"
    timeout_ms: 5000
    skill_filter:
      allowlist:
        - echo
```

## Response Rules Reference

### Match Config
| Field | Description |
|---|---|
| `contains` | Case-insensitive substring match on message text |
| `regex` | Regular expression match on message text |
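Combining the two match types with the fallback behavior seen in the research_agent example (a rule with no `match` block applies to everything, and rules appear to be tried in declaration order), the matching logic might look like the following sketch; this is assumed semantics, not Arena's implementation:

```python
import re


def rule_matches(rule: dict, text: str) -> bool:
    """Evaluate a response rule's match config against the incoming message text."""
    match = rule.get("match", {})
    if "contains" in match:
        return match["contains"].lower() in text.lower()  # case-insensitive substring
    if "regex" in match:
        return re.search(match["regex"], text) is not None
    return True  # no match config: rule acts as a catch-all fallback


rules = [
    {"match": {"contains": "quantum"}, "response": "3 quantum papers"},
    {"response": "1 general paper"},  # fallback
]


def pick_response(text: str) -> str:
    """Return the response of the first matching rule, in declaration order."""
    return next(r["response"] for r in rules if rule_matches(r, text))


print(pick_response("papers about Quantum computing"))  # 3 quantum papers
print(pick_response("papers about biology"))            # 1 general paper
```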
### Error Responses

Return a failed task instead of a successful response:
```yaml
responses:
  - skill: search_papers
    error: "Service temporarily unavailable"
```

## Next Steps
- A2A Concept — learn about the A2A protocol
- Use Mock Server — use the mock programmatically in Go tests
- Write Scenarios — general scenario authoring guide
- Assertions Reference — all available assertion types