
# Runtime Protocols

The AgentCore runtime serves three communication protocols on port 8080 (HTTP bridge). Clients can choose between blocking JSON, Server-Sent Events (SSE) streaming, or WebSocket depending on their latency and interactivity requirements.

The runtime also serves the A2A protocol on port 9000 for agent-to-agent communication.

The protocol field in the deploy config controls which servers the runtime starts:

| Value | Port 8080 (HTTP bridge) | Port 9000 (A2A server) | Use case |
| --- | --- | --- | --- |
| `"both"` (default) | Started | Started | Standard deployment. Supports external HTTP clients and inter-agent A2A calls. |
| `"http"` | Started | Skipped | External-facing agents that do not participate in multi-agent A2A networks. |
| `"a2a"` | Skipped | Started | Internal agents that are only called by other agents via A2A. |

When omitted, the runtime defaults to "both".
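For example, an internal-only agent could skip the HTTP bridge entirely. A minimal deploy-config fragment (only the `protocol` field from the table above is shown; surrounding config keys are omitted):

```json
{
  "protocol": "a2a"
}
```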

All HTTP bridge endpoints are served on port 8080.

| Endpoint | Method | Description |
| --- | --- | --- |
| `/invocations` | POST | Agent invocation (blocking JSON or SSE streaming) |
| `/ws` | GET (upgrade) | WebSocket bidirectional messaging |
| `/ping` | GET | Health check |

A blocking invocation sends a message to the agent and waits for the complete response. This is the default mode when no `Accept: text/event-stream` header is present.

```http
POST /invocations HTTP/1.1
Content-Type: application/json
X-Amzn-Bedrock-AgentCore-Runtime-Session-Id: session-123   (optional)

{
  "prompt": "What is the capital of France?",
  "metadata": {
    "user_id": "u-abc",
    "trace_id": "t-xyz"
  }
}
```

Request fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `prompt` | string | Yes (or `input`) | The user's message. Takes priority over `input`. |
| `input` | string | Yes (or `prompt`) | Alternative field name for the user's message. Used when `prompt` is empty. |
| `metadata` | object | No | Arbitrary metadata forwarded to the A2A server as message-level metadata. |

Any additional top-level fields beyond `prompt`, `input`, and `metadata` are captured and forwarded under `metadata.payload` to avoid collisions with explicit metadata.
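The folding behavior described above can be sketched as follows (a minimal sketch of the documented behavior; `foldRequest` is a hypothetical helper, not part of the runtime):

```javascript
// Sketch: move any top-level field other than prompt, input, and metadata
// under metadata.payload, mirroring the bridge's documented folding rule.
function foldRequest(body) {
  const { prompt, input, metadata = {}, ...extra } = body;
  const folded = { ...metadata };
  if (Object.keys(extra).length > 0) folded.payload = extra;
  // prompt takes priority over input, per the request-field table
  return { prompt: prompt || input || "", metadata: folded };
}
```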

Headers:

| Header | Required | Description |
| --- | --- | --- |
| `Content-Type` | Yes | Must be `application/json`. |
| `X-Amzn-Bedrock-AgentCore-Runtime-Session-Id` | No | Session ID for multi-turn conversation continuity. Maps to the A2A `contextId`. |
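To keep multi-turn continuity, a client reuses the same session header on every call. A sketch (`invocationHeaders` is a hypothetical helper, not part of the runtime):

```javascript
// Build request headers for /invocations; reusing the same sessionId
// across calls keeps the A2A contextId (and thus the conversation) stable.
function invocationHeaders(sessionId) {
  const headers = { "Content-Type": "application/json" };
  if (sessionId) {
    headers["X-Amzn-Bedrock-AgentCore-Runtime-Session-Id"] = sessionId;
  }
  return headers;
}
```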
```json
{
  "response": "The capital of France is Paris.",
  "status": "success",
  "task_id": "task-001",
  "context_id": "session-123",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 8
  }
}
```

Response fields:

| Field | Type | Always present | Description |
| --- | --- | --- | --- |
| `response` | string | Yes | The agent's text response. Concatenated from all artifact parts. |
| `status` | string | Yes | `"success"` or `"error"`. |
| `task_id` | string | No | The A2A task ID. Omitted when empty. |
| `context_id` | string | No | The A2A context ID (session). Omitted when empty. |
| `usage` | object | No | Token usage from the LLM. Omitted when not available. |
| `usage.input_tokens` | integer | No | Number of input tokens consumed. |
| `usage.output_tokens` | integer | No | Number of output tokens generated. |

On failure, the response has `status: "error"` and the error message in `response`:

```json
{
  "response": "rate limit exceeded",
  "status": "error"
}
```

HTTP status codes:

| Code | Cause |
| --- | --- |
| 200 | Success (check the `status` field for application-level errors) |
| 400 | Missing or invalid JSON body, or missing `prompt`/`input` |
| 500 | Internal error |
| 502 | A2A server unavailable |
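Because application-level failures still return HTTP 200, a client should check both the HTTP code and the body's `status` field. A hedged sketch (`interpretInvocation` is a hypothetical helper, not part of the runtime):

```javascript
// Distinguish transport errors (non-200 HTTP code) from application
// errors (status: "error" inside a 200 body), per the table above.
function interpretInvocation(httpStatus, body) {
  if (httpStatus !== 200) {
    return { ok: false, error: `HTTP ${httpStatus}` };
  }
  if (body.status === "error") {
    return { ok: false, error: body.response };
  }
  return { ok: true, text: body.response, usage: body.usage };
}
```

A caller would pass it `response.status` and the parsed JSON body from a `fetch` to `/invocations`.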

When the client sends Accept: text/event-stream, the bridge switches to streaming mode. Instead of waiting for the full response, it relays individual events as they arrive from the A2A server.

Same as blocking mode, with the addition of the Accept header:

```http
POST /invocations HTTP/1.1
Content-Type: application/json
Accept: text/event-stream
X-Amzn-Bedrock-AgentCore-Runtime-Session-Id: session-123   (optional)

{
  "prompt": "Write a short poem about clouds."
}
```

The response is a standard SSE stream. Each event is a `data:` line containing a JSON object:

```
data: {"type":"status","state":"working","task_id":"task-001","context_id":"session-123"}

data: {"type":"text","content":"Soft pillows ","task_id":"task-001","context_id":"session-123"}

data: {"type":"text","content":"drift across ","task_id":"task-001","context_id":"session-123"}

data: {"type":"text","content":"the azure sky.","task_id":"task-001","context_id":"session-123"}

data: {"type":"status","state":"completed","task_id":"task-001","context_id":"session-123"}

data: {"type":"done"}
```

SSE event fields:

| Field | Type | Description |
| --- | --- | --- |
| `type` | string | Event type: `"status"`, `"text"`, `"error"`, or `"done"`. |
| `content` | string | Text content (for `"text"` and `"error"` events). |
| `state` | string | Task state (for `"status"` events): `"working"`, `"completed"`, `"failed"`, `"canceled"`, `"rejected"`. |
| `task_id` | string | The A2A task ID. |
| `context_id` | string | The A2A context ID (session). |

Event sequence:

  1. status with state: "working" — the agent has started processing.
  2. Zero or more text events — incremental response chunks.
  3. status with a terminal state (completed, failed, canceled, or rejected).
  4. done — signals the end of the stream. Always the last event.
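The sequence above can be consumed with a small reducer that accumulates text chunks and records the terminal state (`reduceStream` is a hypothetical helper, not part of the runtime):

```javascript
// Fold a parsed event sequence into { text, state, error }, mirroring
// steps 1-4 above: accumulate "text" chunks, track the latest "status"
// state, capture any "error", and stop at "done".
function reduceStream(events) {
  const out = { text: "", state: null, error: null };
  for (const e of events) {
    if (e.type === "text") out.text += e.content;
    else if (e.type === "status") out.state = e.state;
    else if (e.type === "error") out.error = e.content;
    else if (e.type === "done") break;
  }
  return out;
}
```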

Error during stream:

If the A2A server returns a JSON-RPC error mid-stream, it appears as an error event:

```
data: {"type":"error","content":"model overloaded"}
```

Response headers:

| Header | Value |
| --- | --- |
| `Content-Type` | `text/event-stream` |
| `Cache-Control` | `no-cache` |
| `Connection` | `keep-alive` |

The /ws endpoint provides bidirectional messaging over a persistent WebSocket connection. Each message sent by the client triggers a blocking A2A invocation, and the response is written back to the same connection.

```
ws://host:8080/ws
```

The WebSocket upgrade uses permissive origin checks (the bridge is only reachable from within the AgentCore VPC). Maximum message size is 1 MiB.
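A client can guard against the 1 MiB cap before sending (a sketch; `fitsWsLimit` is a hypothetical helper, and it assumes the limit applies to the serialized message):

```javascript
// Check the serialized size of a client message against the documented
// 1 MiB maximum before writing it to the WebSocket.
const MAX_WS_MESSAGE_BYTES = 1024 * 1024; // 1 MiB

function fitsWsLimit(message) {
  const encoded = new TextEncoder().encode(JSON.stringify(message));
  return encoded.length <= MAX_WS_MESSAGE_BYTES;
}
```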

```json
{
  "prompt": "What is 2 + 2?",
  "metadata": {
    "user_id": "u-abc"
  }
}
```

Client message fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `prompt` | string | Yes (or `input`) | The user's message. Takes priority over `input`. |
| `input` | string | Yes (or `prompt`) | Alternative field name for the user's message. |
| `metadata` | object | No | Arbitrary metadata forwarded to the A2A server. |

For each client message, the server sends two messages:

Success:

```
{"type":"text","content":"2 + 2 = 4","task_id":"task-001","context_id":"ctx-abc","usage":{"input_tokens":8,"output_tokens":6}}
{"type":"done"}
```

Error:

```
{"type":"error","content":"agent unavailable"}
```

Server message fields:

| Field | Type | Description |
| --- | --- | --- |
| `type` | string | `"text"`, `"error"`, or `"done"`. |
| `content` | string | Response text (for `"text"`) or error message (for `"error"`). |
| `task_id` | string | The A2A task ID (present on `"text"` responses). |
| `context_id` | string | The A2A context ID (present on `"text"` responses). |
| `usage` | object | Token usage (present on `"text"` responses when available). |

- The connection stays open after each request/response exchange.
- Multiple messages can be sent sequentially on the same connection.
- If an error occurs (invalid JSON, missing prompt, A2A failure), the server sends an `error` message but keeps the connection open for subsequent messages.
- The connection closes when the client disconnects or sends a close frame.
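Because each request produces its own reply frames on the shared connection, a client that pipelines prompts should serialize them, sending the next prompt only after the previous request's terminal frame arrives. A sketch (`WsRequestQueue` is a hypothetical helper, not part of the runtime):

```javascript
// Serialize WebSocket requests: one in-flight prompt at a time, so each
// "text"/"done" (or "error") reply is unambiguously matched to its prompt.
class WsRequestQueue {
  constructor(ws) {
    this.ws = ws;
    this.queue = [];
    this.current = null;
    ws.onmessage = (event) => this.handleFrame(JSON.parse(event.data));
  }
  send(prompt) {
    return new Promise((resolve, reject) => {
      this.queue.push({ prompt, resolve, reject, text: null });
      this.pump();
    });
  }
  pump() {
    if (this.current || this.queue.length === 0) return;
    this.current = this.queue.shift();
    this.ws.send(JSON.stringify({ prompt: this.current.prompt }));
  }
  handleFrame(msg) {
    if (!this.current) return;
    if (msg.type === "text") this.current.text = msg.content;
    else if (msg.type === "error") {
      this.current.reject(new Error(msg.content));
      this.current = null;
      this.pump(); // the connection stays open; continue with queued prompts
    } else if (msg.type === "done") {
      this.current.resolve(this.current.text);
      this.current = null;
      this.pump();
    }
  }
}
```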

Health check endpoint. Returns the runtime’s readiness status.

```json
{"status": "healthy"}
```

HTTP 200.

```json
{"status": "draining"}
```

HTTP 503. Returned during graceful shutdown after SIGTERM/SIGINT.
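A load balancer probe might map the two documented responses to routing decisions like this (a sketch; the action names are illustrative, not part of the runtime):

```javascript
// Map the /ping result to a routing decision: 200 "healthy" keeps the
// instance in rotation, 503 "draining" stops new traffic during graceful
// shutdown, anything else treats the instance as unhealthy.
function interpretPing(httpStatus, body) {
  if (httpStatus === 200 && body.status === "healthy") return "route";
  if (httpStatus === 503 && body.status === "draining") return "drain";
  return "remove";
}
```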

| Scenario | Recommended protocol | Why |
| --- | --- | --- |
| Simple request/response | Blocking `/invocations` | Simplest integration. Single HTTP call. |
| Real-time token streaming | SSE `/invocations` | Low-latency incremental output. Standard SSE client libraries. |
| Interactive chat UI | WebSocket `/ws` | Persistent connection avoids per-message overhead. Supports multi-turn without reconnecting. |
| Agent-to-agent calls | A2A (port 9000) | Native A2A protocol with task lifecycle management. |
| Health monitoring | GET `/ping` | Lightweight liveness probe for load balancers. |

Blocking:

```sh
curl -X POST http://localhost:8080/invocations \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello"}'
```

SSE streaming:

```sh
curl -X POST http://localhost:8080/invocations \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{"prompt": "Tell me a story"}'
```

Health check:

```sh
curl http://localhost:8080/ping
```
Node.js SSE client:

```javascript
const response = await fetch("http://localhost:8080/invocations", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Accept": "text/event-stream",
  },
  body: JSON.stringify({ prompt: "Tell me a story" }),
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop(); // keep the trailing incomplete line
  for (const line of lines) {
    if (!line.startsWith("data: ")) continue;
    const event = JSON.parse(line.slice(6));
    if (event.type === "text") process.stdout.write(event.content);
    if (event.type === "done") console.log("\n[done]");
  }
}
```
WebSocket client:

```javascript
const ws = new WebSocket("ws://localhost:8080/ws");

ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  if (msg.type === "text") console.log("Agent:", msg.content);
  if (msg.type === "error") console.error("Error:", msg.content);
  if (msg.type === "done") console.log("[done]");
};

ws.onopen = () => {
  ws.send(JSON.stringify({ prompt: "What is the meaning of life?" }));
};
```