Exec Hooks

Run external programs as hooks to intercept provider calls, tool executions, and session events. Hooks can be written in any language — the runtime communicates over stdin/stdout using JSON.


Add a provider hook to your RuntimeConfig YAML:

spec:
  hooks:
    pii_redactor:
      command: ./hooks/pii-redactor
      hook: provider
      phases: [before_call]
      mode: filter
      timeout_ms: 3000

The runtime starts ./hooks/pii-redactor before each provider call, sends the request as JSON on stdin, and reads the verdict from stdout.
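The exchange described above can be sketched as a small hook skeleton. This is a minimal illustration, not the project's reference implementation: the function names `decide` and `run` are hypothetical, and the allow-everything policy is only a placeholder.

```python
import io
import json
import sys


def decide(payload: dict) -> dict:
    """Return a verdict for one invocation (placeholder policy: allow everything)."""
    return {"allow": True}


def run(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Read one JSON payload from stdin and write one JSON verdict to stdout."""
    payload = json.load(stdin)
    json.dump(decide(payload), stdout)
    stdout.flush()


# Exercise the skeleton in memory instead of through a real subprocess.
fake_stdin = io.StringIO(
    json.dumps({"hook": "provider", "phase": "before_call", "request": {}})
)
fake_stdout = io.StringIO()
run(fake_stdin, fake_stdout)
verdict = json.loads(fake_stdout.getvalue())
print(verdict)
```

In an actual hook script, calling `run()` with no arguments at the end of the file would connect the same logic to the process's real stdin and stdout.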


There are four hook types, each receiving a different payload on stdin.

Provider hooks intercept LLM provider requests and responses.

| Phase | When it fires |
| --- | --- |
| before_call | Before the request is sent to the provider |
| after_call | After the provider returns a response |

Tool hooks intercept tool executions.

| Phase | When it fires |
| --- | --- |
| before_execution | Before a tool handler runs |
| after_execution | After a tool handler returns |

Session hooks observe session lifecycle events.

| Phase | When it fires |
| --- | --- |
| session_start | A new session begins |
| session_update | The session state changes |
| session_end | The session ends |

Eval hooks observe eval results as they are produced by the runner. They are always fire-and-forget: mode and phases are ignored, and the subprocess never gates execution.

| Phase | When it fires |
| --- | --- |
| (implicit) | Once per executed eval, after the handler runs, before the result is emitted |

Each hook runs in one of two modes.

In filter mode, the hook decides whether the operation proceeds. The mode is fail-closed: if the subprocess returns {"allow": false}, crashes, or exceeds the timeout, the operation is denied.

pii_redactor:
  command: ./hooks/pii-redactor
  hook: provider
  phases: [before_call, after_call]
  mode: filter
  timeout_ms: 3000
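The fail-closed rule for filter hooks can be pictured as a pure function that maps a subprocess outcome to a decision. The sketch below is illustrative only: `filter_decision` is a hypothetical name, and treating a non-zero exit status or unparseable stdout as a crash is an assumption, not documented runtime behavior.

```python
import json
from typing import Optional, Tuple


def filter_decision(
    returncode: Optional[int], stdout_text: str, timed_out: bool
) -> Tuple[bool, str]:
    """Map a finished (or killed) hook subprocess to (allowed, reason)."""
    if timed_out:
        return False, "hook timed out"
    if returncode != 0:
        # Assumption: any non-zero exit status counts as a crash.
        return False, f"hook exited with status {returncode}"
    try:
        verdict = json.loads(stdout_text)
    except json.JSONDecodeError:
        # Assumption: unparseable output is treated like a crash.
        return False, "hook produced invalid JSON"
    if not isinstance(verdict, dict):
        return False, "hook verdict is not a JSON object"
    if verdict.get("allow") is True:
        return True, ""
    return False, str(verdict.get("reason", "denied by hook"))


print(filter_decision(0, '{"allow": true}', False))
print(filter_decision(0, '{"allow": false, "reason": "PII detected in input"}', False))
print(filter_decision(None, "", True))
```

Only an explicit `{"allow": true}` lets the operation through; every other outcome falls through to a denial, which is what fail-closed means in practice.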

In observe mode, the hook receives the event but cannot block the pipeline. Subprocess failures are swallowed, and the operation always continues.

audit_logger:
  command: ./hooks/audit-logger
  hook: session
  phases: [session_start, session_update, session_end]
  mode: observe
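For contrast with filter mode, the following sketch shows what swallowing subprocess failures might look like from the caller's side. The helper `notify` and its `timeout_s` parameter are hypothetical, and the stand-in hooks are inline Python one-liners rather than real hook binaries.

```python
import json
import subprocess
import sys


def notify(argv: list, payload: dict, timeout_s: float = 3.0) -> bool:
    """Send payload to an observe-mode hook; report success but never raise."""
    try:
        proc = subprocess.run(
            argv,
            input=json.dumps(payload),
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Missing binary, spawn failure, or timeout: swallowed.
        return False
    return proc.returncode == 0


# Stand-in hook that always crashes; the caller carries on regardless.
crashing_hook = [sys.executable, "-c", "import sys; sys.exit(3)"]
event = {"hook": "session", "phase": "session_start"}
ok = notify(crashing_hook, event, timeout_s=10)
print("hook succeeded:", ok, "(the pipeline continues either way)")
```

The return value exists only so the caller can log the failure; nothing about the operation itself depends on it.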

Hooks only run for the phases listed in phases. A provider hook configured with phases: [before_call] will not fire after the provider responds.

spec:
  hooks:
    input_guard:
      command: ./hooks/input-guard
      hook: provider
      phases: [before_call] # runs before the call only
      mode: filter
    response_logger:
      command: ./hooks/response-logger
      hook: provider
      phases: [after_call] # runs after the call only
      mode: observe
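Phase filtering amounts to a lookup over the configured hooks. The sketch below models that selection on a plain dictionary that mirrors the YAML above; `hooks_for` is a hypothetical helper written for illustration, not part of the runtime's API.

```python
def hooks_for(hooks: dict, hook_type: str, phase: str) -> list:
    """Return the names of configured hooks that should fire for this event."""
    selected = []
    for name, cfg in hooks.items():
        if cfg.get("hook") != hook_type:
            continue
        # Eval hooks ignore `phases`; every other type must list the phase.
        if hook_type == "eval" or phase in cfg.get("phases", []):
            selected.append(name)
    return selected


hooks = {
    "input_guard": {"hook": "provider", "phases": ["before_call"], "mode": "filter"},
    "response_logger": {"hook": "provider", "phases": ["after_call"], "mode": "observe"},
}
print(hooks_for(hooks, "provider", "before_call"))  # ['input_guard']
print(hooks_for(hooks, "provider", "after_call"))   # ['response_logger']
```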

The runtime starts the subprocess, writes a JSON object to stdin, and reads a JSON object from stdout.

stdin (provider hook):

{
  "hook": "provider",
  "phase": "before_call",
  "request": {
    "messages": [{"role": "user", "content": "..."}],
    "model": "gpt-4o"
  }
}

stdout — allow:

{"allow": true}

stdout — deny:

{"allow": false, "reason": "PII detected in input"}

stdout — deny with enforcement detail:

{"allow": false, "enforced": true, "reason": "PII redacted"}
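To try a hook against this wire format without starting the runtime, a small harness can play the runtime's role. This is a sketch under assumptions: `invoke_hook` is a hypothetical helper, and the hook under test is an inline Python one-liner that denies every request.

```python
import json
import subprocess
import sys


def invoke_hook(argv: list, payload: dict, timeout_s: float) -> dict:
    """Run a hook once: JSON in on stdin, JSON out on stdout."""
    proc = subprocess.run(
        argv,
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError on a non-zero exit
    )
    return json.loads(proc.stdout)


# Stand-in hook: reads the payload, then denies unconditionally.
DENY_ALL = (
    "import json, sys; json.load(sys.stdin); "
    "json.dump({'allow': False, 'reason': 'PII detected in input'}, sys.stdout)"
)
payload = {
    "hook": "provider",
    "phase": "before_call",
    "request": {"messages": [{"role": "user", "content": "..."}], "model": "gpt-4o"},
}
verdict = invoke_hook([sys.executable, "-c", DENY_ALL], payload, timeout_s=10)
print(verdict)
```

Replacing the argv list with the path to a real hook, such as `["./hooks/pii-redactor"]`, would exercise that hook with the same payload.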

stdin (tool hook):

{
  "hook": "tool",
  "phase": "before_execution",
  "request": {
    "name": "db_query",
    "args": {"sql": "SELECT * FROM users"}
  }
}

stdout — allow:

{"allow": true}

stdout — deny:

{"allow": false, "reason": "Query not in allowlist"}

stdin (session hook):

{
  "hook": "session",
  "phase": "session_start",
  "event": {
    "session_id": "abc-123",
    "messages": []
  }
}

stdout:

{"ack": true}

The eval runner writes the raw EvalResult JSON to the subprocess’s stdin. Stdout is ignored — this is strictly fire-and-forget.

stdin:

{
  "eval_id": "assertion_1_tool_called",
  "type": "assertion",
  "score": 1.0,
  "passed": true,
  "duration_ms": 3,
  "explanation": "tool 'lookup_order' was called",
  "details": {"tool_name": "lookup_order"}
}

stdout: discarded.

Errors, non-zero exits, and timeouts are logged via the runtime logger but never propagate to the eval pipeline. Missing stdout, empty stdout, or any other I/O anomaly is not an error.


Block requests containing personally identifiable information before they reach the provider.

spec:
  hooks:
    pii_redactor:
      command: ./hooks/pii-redactor
      hook: provider
      phases: [before_call, after_call]
      mode: filter
      timeout_ms: 3000

A minimal implementation in Python:

#!/usr/bin/env python3
import json
import re
import sys

# Matches US Social Security numbers written as 123-45-6789.
PII_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

payload = json.load(sys.stdin)
messages = payload.get("request", {}).get("messages", [])

for msg in messages:
    content = msg.get("content", "")
    if isinstance(content, str) and PII_PATTERN.search(content):
        # Deny on the first match and stop.
        json.dump({"allow": False, "reason": "PII detected in input"}, sys.stdout)
        sys.exit(0)

# No match in any message: let the request through.
json.dump({"allow": True}, sys.stdout)
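One way to make a check like this easier to test is to separate the decision from the stdin/stdout plumbing. The variant below is a sketch that reuses the same SSN pattern; the function names `check` and `payload_with` are hypothetical and exist only for this illustration.

```python
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def check(payload: dict) -> dict:
    """Return a deny verdict if any string message content matches the SSN pattern."""
    for msg in payload.get("request", {}).get("messages", []):
        content = msg.get("content", "")
        if isinstance(content, str) and SSN_PATTERN.search(content):
            return {"allow": False, "reason": "PII detected in input"}
    return {"allow": True}


def payload_with(text: str) -> dict:
    """Build a provider before_call payload holding a single user message."""
    return {
        "hook": "provider",
        "phase": "before_call",
        "request": {"messages": [{"role": "user", "content": text}]},
    }


print(check(payload_with("My SSN is 123-45-6789")))      # denied
print(check(payload_with("Order 1234-56-789 shipped")))  # allowed
```

The second input is allowed because its digit groups do not have the 3-2-4 shape the pattern requires.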

Log session events to an external system without blocking the pipeline.

spec:
  hooks:
    audit_logger:
      command: ./hooks/audit-logger
      hook: session
      phases: [session_start, session_update, session_end]
      mode: observe

#!/usr/bin/env python3
import datetime
import json
import sys

payload = json.load(sys.stdin)
entry = {
    # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12.
    "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    "phase": payload["phase"],
    "session_id": payload.get("event", {}).get("session_id"),
}
# Append one JSON object per line.
with open("/var/log/promptkit-audit.jsonl", "a") as f:
    f.write(json.dumps(entry) + "\n")
json.dump({"ack": True}, sys.stdout)

Push every eval result to an external metrics backend without blocking the eval pipeline.

spec:
  hooks:
    eval_metrics:
      command: python3
      args: [./hooks/eval-metrics.py]
      hook: eval
      timeout_ms: 5000

#!/usr/bin/env python3
import json
import sys
import urllib.request

result = json.load(sys.stdin)
payload = {
    "eval_id": result["eval_id"],
    "score": result.get("score", 0.0),
    "passed": result.get("passed", False),
    "duration_ms": result.get("duration_ms", 0),
}
req = urllib.request.Request(
    "https://metrics.internal/evals",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req, timeout=2)  # stdout is ignored

Only permit pre-approved SQL queries to run.

spec:
  hooks:
    query_allowlist:
      command: python3
      args: [./hooks/query-allowlist.py]
      hook: tool
      phases: [before_execution]
      mode: filter

#!/usr/bin/env python3
import json
import sys

ALLOWED_QUERIES = {
    "SELECT * FROM products WHERE category = ?",
    "SELECT COUNT(*) FROM orders WHERE status = ?",
}

payload = json.load(sys.stdin)
request = payload.get("request", {})

# Only db_query calls are checked; every other tool passes through.
if request.get("name") != "db_query":
    json.dump({"allow": True}, sys.stdout)
    sys.exit(0)

sql = request.get("args", {}).get("sql", "")
if sql in ALLOWED_QUERIES:
    json.dump({"allow": True}, sys.stdout)
else:
    json.dump({"allow": False, "reason": "Query not in allowlist"}, sys.stdout)
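Exact string comparison rejects an approved query that differs only in line breaks or a trailing semicolon. If that proves too strict, one option is to normalize both sides before comparing. The sketch below is an optional variation, not part of the hook above; `normalize` and `is_allowed` are hypothetical helpers, and the comparison stays case-sensitive on purpose.

```python
import re

ALLOWED_QUERIES = {
    "SELECT * FROM products WHERE category = ?",
    "SELECT COUNT(*) FROM orders WHERE status = ?",
}


def normalize(sql: str) -> str:
    """Collapse runs of whitespace and drop trailing semicolons."""
    return re.sub(r"\s+", " ", sql).strip().rstrip(";").rstrip()


# Normalize the allowlist once so lookups compare like with like.
NORMALIZED_ALLOWED = {normalize(q) for q in ALLOWED_QUERIES}


def is_allowed(sql: str) -> bool:
    """Return True if the normalized query is in the allowlist."""
    return normalize(sql) in NORMALIZED_ALLOWED


print(is_allowed("SELECT *\n  FROM products\n  WHERE category = ?;"))
print(is_allowed("SELECT * FROM users"))
```

Whitespace inside string literals is collapsed as well, so this approach fits parameterized queries with `?` placeholders better than queries that embed literal values.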

spec:
  hooks:
    pii_redactor:
      command: ./hooks/pii-redactor
      hook: provider
      phases: [before_call, after_call]
      mode: filter
      timeout_ms: 3000
    query_allowlist:
      command: python3
      args: [./hooks/query-allowlist.py]
      hook: tool
      phases: [before_execution]
      mode: filter
    audit_logger:
      command: ./hooks/audit-logger
      hook: session
      phases: [session_start, session_update, session_end]
      mode: observe
    eval_metrics:
      command: python3
      args: [./hooks/eval-metrics.py]
      hook: eval
      timeout_ms: 5000

| Field | Required | Description |
| --- | --- | --- |
| command | Yes | Path to the executable or interpreter |
| args | No | Additional arguments passed to the command |
| hook | Yes | Hook type: provider, tool, session, or eval |
| phases | Yes (ignored for eval) | List of phases this hook fires on |
| mode | Yes (ignored for eval) | filter (fail-closed) or observe (fire-and-forget) |
| timeout_ms | No | Subprocess timeout in milliseconds |
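The field rules in the table can be checked mechanically before a config is deployed. The validator below is a sketch: `validate_hook` is a hypothetical helper that works on an already-parsed hook entry, and the requirement that `timeout_ms` be a positive integer is an assumption rather than a documented constraint.

```python
HOOK_TYPES = {"provider", "tool", "session", "eval"}
MODES = {"filter", "observe"}


def validate_hook(name: str, cfg: dict) -> list:
    """Return a list of problems found in one hook entry (empty if none)."""
    problems = []
    if not cfg.get("command"):
        problems.append(f"{name}: 'command' is required")
    hook_type = cfg.get("hook")
    if hook_type not in HOOK_TYPES:
        problems.append(f"{name}: 'hook' must be one of {sorted(HOOK_TYPES)}")
    # `phases` and `mode` are required for every type except eval, which ignores them.
    if hook_type != "eval":
        if not cfg.get("phases"):
            problems.append(f"{name}: 'phases' is required")
        if cfg.get("mode") not in MODES:
            problems.append(f"{name}: 'mode' must be 'filter' or 'observe'")
    timeout = cfg.get("timeout_ms")
    # Assumption: when present, the timeout should be a positive integer.
    if timeout is not None and (not isinstance(timeout, int) or timeout <= 0):
        problems.append(f"{name}: 'timeout_ms' must be a positive integer")
    return problems


eval_cfg = {
    "command": "python3",
    "args": ["./hooks/eval-metrics.py"],
    "hook": "eval",
    "timeout_ms": 5000,
}
print(validate_hook("eval_metrics", eval_cfg))   # []
print(validate_hook("broken", {"hook": "tool"}))  # three problems reported
```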