Test Scenario Format
PromptPack is an open-source specification for defining LLM prompts, test scenarios, and configurations in a portable, version-controllable format.
Official Documentation
Section titled “Official Documentation”For complete specification documentation, please visit:
The official PromptPack specification site includes:
- Specification Overview - Understanding the PromptPack format
- File Format & Structure - Pack JSON structure
- Schema Reference - JSON schema validation
- Real-World Examples - Complete example packs
- Getting Started Guide - Quick start instructions
- Version History - v1.0, v1.1, etc.
PromptArena Implementation
Section titled “PromptArena Implementation”PromptArena is a reference implementation and testing tool for PromptPack files.
Supported Features
Section titled “Supported Features”- ✅ PromptPack v1.1 with multimodal support (images, audio, video)
- ✅ Kubernetes-style YAML resources:
Arena,PromptConfig,Scenario,Provider,Tool,Persona - ✅ Multi-provider testing: OpenAI, Anthropic, Google Gemini, Azure, Bedrock, and Mock
- ✅ MCP (Model Context Protocol) server integration
- ✅ Comprehensive assertion framework for validation
- ✅ HTML, JSON, and Markdown output formats
Quick Start
Section titled “Quick Start”# Run a test scenariopromptarena run examples/arena-media-test/arena.yaml
# Test across multiple providerspromptarena run arena.yaml --provider openai,anthropic --format htmlQuick Links
Section titled “Quick Links”- Schema: v1.1 JSON Schema
- Local Examples:
examples/directory in this repository - Arena Guides: Writing Scenarios | Assertions | Self-Play
- Community: GitHub Discussions
PromptArena-Specific Extensions
Section titled “PromptArena-Specific Extensions”While implementing the PromptPack specification, PromptArena adds these testing-focused features:
1. Arena Configuration Resource
Section titled “1. Arena Configuration Resource”The Arena resource orchestrates testing across multiple prompts, providers, and scenarios:
apiVersion: promptkit.altairalabs.ai/v1alpha1kind: Arenametadata: name: my-test-suitespec: prompt_configs: - id: support file: prompts/support-bot.yaml
providers: - file: providers/openai-gpt4o.yaml - file: providers/claude-sonnet.yaml
scenarios: - file: scenarios/test-1.yaml
# MCP server integration mcp_servers: filesystem: command: npx args: ["@modelcontextprotocol/server-filesystem", "/data"]
defaults: output: dir: out formats: ["html", "json"]2. Enhanced Assertions
Section titled “2. Enhanced Assertions”PromptArena extends standard assertions with testing-specific validators:
# Turn-level assertionsassertions: # Content validation - type: content_includes - type: content_matches
# Tool usage validation - type: tools_called - type: tools_not_called
# JSON validation - type: is_valid_json - type: json_schema - type: json_path
# Multimodal validation - type: image_format - type: image_dimensions - type: audio_format - type: audio_duration - type: video_resolution - type: video_duration
# LLM Judge - type: llm_judge
# Conversation-level assertions (in conversation_assertions field)conversation_assertions: - type: tools_called - type: tools_not_called - type: tool_calls_with_args - type: tools_not_called_with_args - type: content_includes_any - type: content_not_includes - type: llm_judge_conversationSee the Assertions Guide for complete documentation.
3. Multimodal Testing
Section titled “3. Multimodal Testing”PromptArena implements PromptPack v1.1 multimodal support with comprehensive testing capabilities:
# In PromptConfigspec: media: enabled: true supported_types: [image, audio, video, document] image: max_size_mb: 20 allowed_formats: [jpeg, png, webp] document: max_size_mb: 32 allowed_formats: [pdf]# In Scenarioturns: - role: user content: - type: text patterns: ["What's in this image?"] - type: image image_url: url: "path/to/image.jpg" detail: "high" - type: document document_url: url: "path/to/document.pdf"See examples/arena-media-test/ and examples/document-analysis/ for complete examples.
4. Mock Provider Support
Section titled “4. Mock Provider Support”Test without API costs using the Mock provider with configurable responses:
apiVersion: promptkit.altairalabs.ai/v1alpha1kind: Providermetadata: name: mock-providerspec: type: mock model: mock-modelConfigure responses in providers/mock-responses.yaml. See Mock Provider Usage.
5. Self-Play Testing
Section titled “5. Self-Play Testing”Define AI personas to automatically test conversational flows:
apiVersion: promptkit.altairalabs.ai/v1alpha1kind: Personametadata: name: frustrated-customerspec: system_prompt: | You are a frustrated customer... max_turns: 8 goal: "Get reassurance about order delivery and feel heard"See the Self-Play Guide for details.
Directory Structure
Section titled “Directory Structure”Recommended project layout for PromptArena tests:
my-project/├── arena.yaml # Main Arena configuration├── prompts/│ ├── support.yaml│ └── sales.yaml├── scenarios/│ ├── smoke-tests/│ └── regression/├── providers/│ ├── mock.yaml│ └── openai.yaml├── tools/│ └── weather.yaml└── out/ # Generated reports (add to .gitignore)Version Support
Section titled “Version Support”| PromptPack Version | PromptArena Support | Key Features |
|---|---|---|
| v1.0 | ✅ Full | Core specification |
| v1.1 | ✅ Full | Multimodal support (images, audio, video) |
| v1alpha1 | ✅ Full | Kubernetes-style resource format |
Learn More
Section titled “Learn More”PromptPack Specification
Section titled “PromptPack Specification”- PromptPack.org - Official specification
- GitHub Repository - Spec source and discussions
PromptArena Guides
Section titled “PromptArena Guides”- Writing Scenarios - Create effective test cases
- Assertions Reference - Complete assertion documentation
- Self-Play Testing - AI-driven testing with personas
- MCP Integration - Model Context Protocol servers
Examples
Section titled “Examples”examples/customer-support/- Basic support botexamples/arena-media-test/- Multimodal testingexamples/mcp-chatbot/- MCP server integration
Questions? Visit PromptPack.org or GitHub Discussions