Generate Mock Responses from Arena Results
Convert recorded Arena JSON results into mock provider YAML and replay conversations without calling an external LLM. This is ideal for tightening CI feedback loops and keeping IoT maintenance demos deterministic.
Prerequisites
- Arena run outputs in out/*.json (generated by promptarena run ... --format json)
- Go toolchain installed (the CLI builds on demand)
- A workspace where you want to store mock responses (e.g., providers/mock-generated.yaml or per-scenario files)
Steps
- Run Arena and capture JSON results

  promptarena run \
    --scenario hardware-faults \
    --provider openai-gpt4o \
    --format json \
    --out out

- Generate mocks from the recorded runs

  promptarena mocks generate \
    --input out \
    --scenario hardware-faults \
    --provider openai-gpt4o \
    --output providers/mock-generated.yaml \
    --merge

  - --input can be a single run file or a directory of JSON results.
  - --scenario/--provider filter which runs are included.
  - --merge overlays onto an existing mock file instead of overwriting.

- (Optional) Split per scenario

  promptarena mocks generate \
    --input out \
    --per-scenario \
    --output providers/responses \
    --merge

  This writes one YAML per scenario under providers/responses/.

- (Optional) Preview without writing

  promptarena mocks generate --input out --dry-run

  Prints the generated YAML to stdout.
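Because the dry-run preview goes to stdout, it can be compared against a committed mock file to spot drift in CI. The following is a minimal sketch, not part of the CLI: the function name check_mock_drift is hypothetical, and it assumes --dry-run emits only the generated YAML on stdout. A committed file built with --merge or with --scenario/--provider filters may legitimately differ from an unfiltered preview, so adjust the flags to match how the file was produced.

```shell
#!/bin/sh
# Sketch: report drift between a committed mock file and the YAML that the
# recorded runs would generate now. Uses only --input and --dry-run.
# Assumption: --dry-run prints nothing but the YAML to stdout.

check_mock_drift() {
  input_dir="$1"   # directory of recorded JSON results, e.g. out
  committed="$2"   # mock file under version control
  preview="$(mktemp)" || return 2

  # Preview the generated YAML without touching the committed file.
  if ! promptarena mocks generate --input "$input_dir" --dry-run > "$preview"; then
    rm -f "$preview"
    return 2
  fi

  # diff exits 0 when the files are identical and 1 when they differ.
  status=0
  diff -u "$committed" "$preview" || status=$?
  rm -f "$preview"
  return "$status"
}
```

Calling check_mock_drift out providers/mock-generated.yaml returns 0 when the committed file matches the preview, 1 when they differ (with a unified diff on stdout), and 2 when the preview could not be generated.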
Example: IoT Maintenance
Using the hardware-faults run artifacts in tools/arena/templates/testdata:
promptarena mocks generate \
--input tools/arena/templates/testdata \
--scenario hardware-faults \
--output iot-maintenance-demo/providers/responses/mock-assistant.yaml \
--merge
This refreshes the IoT maintenance demo mocks with real tool calls and responses captured from a prior OpenAI run.
Tips
- Add a --default-response if you want a fallback when no turn-specific response exists.
- Keep recorded JSON fixtures under version control (tools/arena/templates/testdata/) so tests stay deterministic.
- After generating mocks, run your flows with the mock provider to validate determinism before committing.
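One way to approach the last tip is to replay the same scenario twice against the mock provider and compare the two result directories. This is only a sketch under stated assumptions: check_replay_determinism is a hypothetical helper, the provider id you pass must be whatever your own configuration calls the mock provider, and the comparison assumes the JSON results carry no timestamps or run ids and use stable file names. If they do not, normalize those fields before diffing.

```shell
#!/bin/sh
# Sketch: run one scenario twice with the mock provider and diff the outputs.
# Uses only the run flags shown in this guide (--scenario, --provider,
# --format json, --out). The provider id is supplied by the caller.

check_replay_determinism() {
  scenario="$1"    # e.g. hardware-faults
  provider="$2"    # id of your mock provider (hypothetical; depends on your config)
  work="$(mktemp -d)" || return 2

  for n in 1 2; do
    mkdir -p "$work/run$n"
    if ! promptarena run \
        --scenario "$scenario" \
        --provider "$provider" \
        --format json \
        --out "$work/run$n"; then
      rm -rf "$work"
      return 2
    fi
  done

  # diff -r exits 0 when both directories match and 1 when they differ.
  status=0
  diff -r "$work/run1" "$work/run2" || status=$?
  rm -rf "$work"
  return "$status"
}
```

A return value of 0 suggests the replay is stable, 1 means the two runs produced different results, and 2 means a run failed before any comparison was made.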