
A2A (Agent-to-Agent)

A deep dive into the A2A protocol design, task lifecycle, and how PromptKit implements it.


Modern AI applications often need multiple specialized agents working together. A research agent might delegate citation lookups to a knowledge agent, or an orchestrator might fan out tasks to domain experts.

A2A solves this by treating agents as services — each agent publishes a card describing its capabilities, and other agents call it over standard HTTP. This enables:

  • Skill delegation — route tasks to the best-suited agent
  • Composability — build complex systems from simple, focused agents
  • Language independence — any HTTP client can call any A2A server
  • Discovery — agents self-describe their skills, input/output modes, and capabilities

A2A uses JSON-RPC 2.0 over HTTP. All method calls go to a single endpoint (POST /a2a), and agent discovery uses a well-known URL.

Endpoint                     Method           Purpose
GET /.well-known/agent.json  HTTP GET         Agent discovery
POST /a2a                    message/send     Send a message (synchronous)
POST /a2a                    message/stream   Send a message (SSE streaming)
POST /a2a                    tasks/get        Get task by ID
POST /a2a                    tasks/cancel     Cancel a running task
POST /a2a                    tasks/list       List tasks by context ID
POST /a2a                    tasks/subscribe  Subscribe to task updates (SSE)

Every request is a standard JSON-RPC envelope:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "message/send",
  "params": { ... }
}

An Agent Card is a JSON document served at /.well-known/agent.json that describes an agent’s identity and capabilities:

{
  "name": "Research Agent",
  "description": "Searches academic papers on a given topic",
  "version": "1.0.0",
  "capabilities": {
    "streaming": true
  },
  "skills": [
    {
      "id": "search_papers",
      "name": "Search Papers",
      "description": "Search for academic papers on a given topic",
      "tags": ["research", "papers"]
    }
  ],
  "defaultInputModes": ["text/plain"],
  "defaultOutputModes": ["text/plain"]
}

Key fields:

Field               Description
name                Human-readable agent name
description         What the agent does
capabilities        Feature flags (streaming, push notifications)
skills              List of specific tasks the agent can perform
defaultInputModes   MIME types the agent accepts (e.g., text/plain, image/png)
defaultOutputModes  MIME types the agent can produce

Skills can override the agent’s default input/output modes with their own inputModes and outputModes.
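A client discovers an agent by fetching and decoding its card. The struct and field names below are assumptions mirroring the JSON shown above, not PromptKit's actual `runtime/a2a` types:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AgentCard mirrors the agent-card JSON shown above (illustrative types).
type AgentCard struct {
	Name               string       `json:"name"`
	Description        string       `json:"description"`
	Version            string       `json:"version"`
	Capabilities       Capabilities `json:"capabilities"`
	Skills             []Skill      `json:"skills"`
	DefaultInputModes  []string     `json:"defaultInputModes"`
	DefaultOutputModes []string     `json:"defaultOutputModes"`
}

type Capabilities struct {
	Streaming bool `json:"streaming"`
}

type Skill struct {
	ID          string   `json:"id"`
	Name        string   `json:"name"`
	Description string   `json:"description"`
	Tags        []string `json:"tags"`
	InputModes  []string `json:"inputModes,omitempty"`  // optional per-skill override
	OutputModes []string `json:"outputModes,omitempty"` // optional per-skill override
}

// decodeCard parses a card fetched from GET /.well-known/agent.json.
func decodeCard(raw []byte) (AgentCard, error) {
	var card AgentCard
	err := json.Unmarshal(raw, &card)
	return card, err
}

func main() {
	raw := []byte(`{"name":"Research Agent","version":"1.0.0",` +
		`"capabilities":{"streaming":true},"skills":[{"id":"search_papers"}]}`)
	card, err := decodeCard(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(card.Name, card.Capabilities.Streaming, len(card.Skills))
}
```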


Every message creates a Task that progresses through a state machine:

stateDiagram-v2
    [*] --> submitted
    submitted --> working
    working --> completed
    working --> failed
    working --> canceled
    working --> input_required
    working --> auth_required
    working --> rejected
    input_required --> working
    input_required --> canceled
    auth_required --> working
    auth_required --> canceled

State           Meaning
submitted       Task created, not yet processing
working         Agent is actively processing
completed       Task finished successfully
failed          Task encountered an error
canceled        Task was canceled by the caller
input_required  Agent needs more information from the caller
auth_required   Agent requires authentication
rejected        Agent declined the task

Terminal states (completed, failed, canceled, rejected) cannot transition further. The input_required and auth_required states allow the caller to provide additional input and resume processing.
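A server enforcing this lifecycle needs little more than a terminal-state check. Here is a minimal sketch; the `TaskState` constants match the protocol's state names, but the Go identifiers and helper methods are assumptions, not PromptKit's exact API:

```go
package main

import "fmt"

// TaskState enumerates the A2A task lifecycle states.
type TaskState string

const (
	Submitted     TaskState = "submitted"
	Working       TaskState = "working"
	Completed     TaskState = "completed"
	Failed        TaskState = "failed"
	Canceled      TaskState = "canceled"
	InputRequired TaskState = "input_required"
	AuthRequired  TaskState = "auth_required"
	Rejected      TaskState = "rejected"
)

// IsTerminal reports whether the task can transition no further.
func (s TaskState) IsTerminal() bool {
	switch s {
	case Completed, Failed, Canceled, Rejected:
		return true
	}
	return false
}

// CanResume reports whether the caller may supply more input (or
// credentials) to move the task back to working.
func (s TaskState) CanResume() bool {
	return s == InputRequired || s == AuthRequired
}

func main() {
	fmt.Println(Completed.IsTerminal(), InputRequired.CanResume()) // true true
}
```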


Messages consist of parts — each part carries one type of content:

type Part struct {
    Text      *string        `json:"text,omitempty"`
    Raw       []byte         `json:"raw,omitempty"`
    URL       *string        `json:"url,omitempty"`
    Data      map[string]any `json:"data,omitempty"`
    Metadata  map[string]any `json:"metadata,omitempty"`
    Filename  string         `json:"filename,omitempty"`
    MediaType string         `json:"mediaType,omitempty"`
}

A message has a role (user or agent), a list of parts, and optional metadata:

type Message struct {
    MessageID string         `json:"messageId"`
    ContextID string         `json:"contextId,omitempty"`
    TaskID    string         `json:"taskId,omitempty"`
    Role      Role           `json:"role"`
    Parts     []Part         `json:"parts"`
    Metadata  map[string]any `json:"metadata,omitempty"`
}

The contextId groups related tasks into a conversation. If omitted, the server generates one automatically.


When a task completes, the agent’s output is stored as artifacts on the task:

type Artifact struct {
    ArtifactID  string `json:"artifactId"`
    Name        string `json:"name,omitempty"`
    Description string `json:"description,omitempty"`
    Parts       []Part `json:"parts"`
}

A single task can produce multiple artifacts (e.g., text response + generated image).
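A common consumer task is flattening an artifact's text parts into one string, skipping binary or structured parts. A minimal sketch (the `artifactText` helper is an assumption, not a PromptKit function):

```go
package main

import (
	"fmt"
	"strings"
)

// Part and Artifact reproduce the types shown above, trimmed to the
// fields this example uses.
type Part struct {
	Text *string `json:"text,omitempty"`
}

type Artifact struct {
	ArtifactID string `json:"artifactId"`
	Parts      []Part `json:"parts"`
}

// artifactText concatenates an artifact's text parts in order,
// ignoring parts that carry no text.
func artifactText(a Artifact) string {
	var sb strings.Builder
	for _, p := range a.Parts {
		if p.Text != nil {
			sb.WriteString(*p.Text)
		}
	}
	return sb.String()
}

func main() {
	hello, world := "Hello, ", "world"
	a := Artifact{ArtifactID: "artifact-0", Parts: []Part{{Text: &hello}, {Text: &world}}}
	fmt.Println(artifactText(a)) // Hello, world
}
```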


The message/stream method returns Server-Sent Events (SSE) instead of a single JSON response. The server sends two types of events:

TaskStatusUpdateEvent — emitted when the task state changes:

{
  "taskId": "abc123",
  "contextId": "ctx456",
  "status": { "state": "working" }
}

TaskArtifactUpdateEvent — emitted as the agent produces output:

{
  "taskId": "abc123",
  "contextId": "ctx456",
  "artifact": {
    "artifactId": "artifact-0",
    "parts": [{ "text": "The capital" }]
  },
  "append": true
}

Each SSE event is wrapped in a JSON-RPC response envelope:

data: {"jsonrpc":"2.0","id":1,"result":{"taskId":"abc123","status":{"state":"working"}}}

data: {"jsonrpc":"2.0","id":1,"result":{"taskId":"abc123","artifact":{"artifactId":"artifact-0","parts":[{"text":"Hello"}]},"append":true}}

The client parses these events and delivers them as a channel of StreamEvent values, each containing either a StatusUpdate or ArtifactUpdate.
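Unwrapping one such event amounts to stripping the `data: ` prefix, decoding the JSON-RPC envelope, and checking which field of the result is present. The type names below are illustrative, not PromptKit's actual `StreamEvent`:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// streamEvent holds either a status update or an artifact update;
// exactly one of Status/Artifact is non-nil per event.
type streamEvent struct {
	TaskID    string `json:"taskId"`
	ContextID string `json:"contextId"`
	Status    *struct {
		State string `json:"state"`
	} `json:"status,omitempty"`
	Artifact *struct {
		ArtifactID string `json:"artifactId"`
	} `json:"artifact,omitempty"`
	Append bool `json:"append,omitempty"`
}

// rpcEnvelope is the JSON-RPC wrapper around each SSE payload.
type rpcEnvelope struct {
	Result streamEvent `json:"result"`
}

// parseSSELine unwraps one "data: {...}" line into its result payload.
func parseSSELine(line string) (streamEvent, error) {
	payload := strings.TrimPrefix(line, "data: ")
	var env rpcEnvelope
	err := json.Unmarshal([]byte(payload), &env)
	return env.Result, err
}

func main() {
	line := `data: {"jsonrpc":"2.0","id":1,"result":{"taskId":"abc123","status":{"state":"working"}}}`
	ev, err := parseSSELine(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(ev.TaskID, ev.Status.State) // abc123 working
}
```

A real client would read the response body line by line, feed each `data:` line through a parser like this, and send the decoded events down a channel.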


JSON-RPC provides a clean request/response model with typed methods, error codes, and request IDs — all over a single HTTP endpoint. This avoids the complexity of REST path design and keeps the protocol simple.

Agent cards enable dynamic discovery. A client can connect to any A2A server, fetch its card, and understand what it can do — no hardcoded knowledge required. The Tool Bridge uses this to automatically generate tool descriptors from agent skills.

PromptKit splits A2A into two packages:

  • runtime/a2a — protocol types, client, tool bridge, mock server, and conversion helpers. This is for consuming A2A services.
  • sdk — A2A server (A2AServer), task store, and conversation opener. This is for exposing an agent as an A2A service.

This separation keeps the runtime free of SDK dependencies while letting the SDK build on top of the protocol types.


A2A handles agent-to-agent communication — agents calling other agents as services. For the complementary problem of connecting agents to frontend applications (user interfaces, chat widgets, dashboards), PromptKit supports the AG-UI protocol. Together, these protocols cover the full communication surface:

Protocol  Direction          Use Case
MCP       Agent ↔ Tools      Tool discovery and execution
A2A       Agent ↔ Agents     Task delegation between agents
AG-UI     Agent ↔ Frontends  Real-time streaming to user interfaces