PromptPack is an open-source specification for defining LLM prompts, test scenarios, and configurations in a portable, version-controllable format.

Official Documentation

For the complete specification documentation, visit PromptPack.org 📘.

PromptArena Implementation

PromptArena is a reference implementation and testing tool for PromptPack files.

Quick Start

# Run a test scenario
promptarena run examples/arena-media-test/arena.yaml

# Test across multiple providers
promptarena run arena.yaml --provider openai,anthropic --format html

PromptArena-Specific Extensions

In addition to implementing the PromptPack specification, PromptArena adds these testing-focused features:

1. Arena Configuration Resource

The Arena resource orchestrates testing across multiple prompts, providers, and scenarios:

apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Arena
metadata:
  name: my-test-suite
spec:
  prompt_configs:
    - id: support
      file: prompts/support-bot.yaml
  
  providers:
    - file: providers/openai-gpt4o.yaml
    - file: providers/claude-sonnet.yaml
  
  scenarios:
    - file: scenarios/test-1.yaml
  
  # MCP server integration
  mcp_servers:
    filesystem:
      command: npx
      args: ["@modelcontextprotocol/server-filesystem", "/data"]
  
  defaults:
    output:
      dir: out
      formats: ["html", "json"]
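
Because an Arena config mostly points at other files, a quick pre-flight check can catch a mistyped path before a run starts. The following is a minimal sketch, not part of PromptArena: the function name `check_arena_files` is hypothetical, and it assumes that `file:` paths are relative to the directory containing the config, are unquoted, and contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper (not a PromptArena command): report every `file:` path
# in an Arena config that does not exist on disk.
# Assumption: paths are relative to the config's own directory.
check_arena_files() {
  config="$1"
  base=$(dirname "$config")
  missing=0
  # Extract the value after "file:" on lines such as
  # "      file: prompts/support-bot.yaml" or "    - file: providers/x.yaml".
  for path in $(sed -n 's/^[[:space:]]*-\{0,1\}[[:space:]]*file:[[:space:]]*//p' "$config"); do
    if [ ! -f "$base/$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done
  # Exit status 0 when every referenced file exists, 1 otherwise.
  return "$missing"
}
```

A call such as `check_arena_files arena.yaml && promptarena run arena.yaml` would then skip the run whenever a referenced file is absent.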

2. Enhanced Assertions

PromptArena extends standard assertions with testing-specific validators:

# Turn-level assertions
assertions:
  # Content validation
  - type: content_includes
  - type: content_matches
  
  # Tool usage validation
  - type: tools_called
  - type: tools_not_called
  
  # JSON validation
  - type: is_valid_json
  - type: json_schema
  - type: json_path
  
  # Multimodal validation
  - type: image_format
  - type: image_dimensions
  - type: audio_format
  - type: audio_duration
  - type: video_resolution
  - type: video_duration
  
  # LLM Judge
  - type: llm_judge

# Conversation-level assertions (in conversation_assertions field)
conversation_assertions:
  - type: tools_called
  - type: tools_not_called
  - type: tool_calls_with_args
  - type: tools_not_called_with_args
  - type: content_includes_any
  - type: content_not_includes
  - type: llm_judge_conversation

See the Assertions Guide for complete documentation.

3. Multimodal Testing

PromptArena implements PromptPack v1.1 multimodal support. Media limits are declared in the PromptConfig, and media parts are attached to individual scenario turns:

# In PromptConfig
spec:
  media:
    enabled: true
    supported_types: [image, audio, video]
    image:
      max_size_mb: 20
      allowed_formats: [jpeg, png, webp]

# In Scenario
turns:
  - role: user
    content:
      - type: text
        patterns: ["What's in this image?"]
      - type: image
        image_url:
          url: "path/to/image.jpg"
          detail: "high"

See examples/arena-media-test/ for complete examples.

4. Mock Provider Support

Use the Mock provider, which returns configurable responses, to run tests without incurring API costs:

apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: mock-provider
spec:
  type: mock
  model: mock-model

Configure responses in providers/mock-responses.yaml. See Mock Provider Usage.

5. Self-Play Testing

Define AI personas to automatically test conversational flows:

apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Persona
metadata:
  name: frustrated-customer
spec:
  system_prompt: |
    You are a frustrated customer...
  max_turns: 8
  goal: "Get reassurance about order delivery and feel heard"

See the Self-Play Guide for details.


Directory Structure

Recommended project layout for PromptArena tests:

my-project/
├── arena.yaml           # Main Arena configuration
├── prompts/
│   ├── support.yaml
│   └── sales.yaml
├── scenarios/
│   ├── smoke-tests/
│   └── regression/
├── providers/
│   ├── mock.yaml
│   └── openai.yaml
├── tools/
│   └── weather.yaml
└── out/                 # Generated reports (add to .gitignore)
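
The layout above can be scaffolded with a few standard commands. This is a minimal sketch: the function name `scaffold_arena_project` is hypothetical and not part of PromptArena, and it creates only the directories, an empty `arena.yaml`, and a `.gitignore` entry for `out/`. The individual YAML files still need to be written by hand.

```shell
#!/bin/sh
# Hypothetical helper (not a PromptArena command): create the recommended
# directory layout under the given root (default: my-project).
scaffold_arena_project() {
  root="${1:-my-project}"
  mkdir -p \
    "$root/prompts" \
    "$root/scenarios/smoke-tests" \
    "$root/scenarios/regression" \
    "$root/providers" \
    "$root/tools"
  # Create an empty arena.yaml only if one does not already exist.
  [ -f "$root/arena.yaml" ] || : > "$root/arena.yaml"
  # Keep generated reports out of version control; avoid duplicate entries.
  grep -qx 'out/' "$root/.gitignore" 2>/dev/null || echo 'out/' >> "$root/.gitignore"
}
```

Running `scaffold_arena_project my-project` a second time is harmless: existing files are left alone and the `.gitignore` entry is not duplicated.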

Version Support

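| PromptPack Version | PromptArena Support | Key Features |
|--------------------|---------------------|--------------|
| v1.0 | ✅ Full | Core specification |
| v1.1 | ✅ Full | Multimodal support (images, audio, video) |
| v1alpha1 | ✅ Full | Kubernetes-style resource format |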

Questions? Visit PromptPack.org or GitHub Discussions