Tutorial 1: Your First Test

Learn the basics of PromptArena by creating and running your first LLM test.

What You’ll Learn

In this tutorial you’ll install PromptArena, create prompt, provider, and scenario configurations, run your first test against OpenAI, and review the results.

Prerequisites

- Homebrew or a recent Go toolchain (to install PromptArena)
- An OpenAI API key
- Basic familiarity with the command line and YAML

Step 1: Install PromptArena

Choose your preferred installation method:

Option 1: Homebrew (Recommended)

brew install promptkit

Option 2: Go Install

go install github.com/AltairaLabs/PromptKit/tools/arena@latest

Verify installation:

promptarena --version

You should see the PromptArena version information.

Step 2: Create Your Test Project

The Easy Way: Use the Template Generator

# Create a complete test project in seconds
promptarena init my-first-test --quick --provider openai

# Navigate to your project
cd my-first-test

That’s it! The init command created everything you need: the prompts/, providers/, and scenarios/ directories with sample configurations (including prompts/assistant.yaml), plus an arena.yaml that ties them together.

The Manual Way: Create Files Step-by-Step

If you prefer to understand each component:

# Create a new directory
mkdir my-first-test
cd my-first-test

# Create the directory structure
mkdir -p prompts providers scenarios

Step 3: Create a Prompt Configuration

If you used promptarena init: You already have prompts/assistant.yaml. Feel free to edit it!

If creating manually: Create prompts/greeter.yaml:

apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: PromptConfig
metadata:
  name: greeter

spec:
  task_type: greeting
  
  system_template: |
    You are a friendly assistant who greets users warmly.
    Keep responses brief and welcoming.

What’s happening here?

- task_type: greeting is the label that scenarios use to select this prompt (you’ll reference it in Step 6).
- system_template is the system prompt Arena sends to the model at the start of every conversation.
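
The template is plain text, so you can change the assistant’s personality just by editing it. For example, a minimal variation (any wording works; this is only an illustration):

spec:
  task_type: greeting

  system_template: |
    You are a concise, professional assistant.
    Greet the user in one short sentence.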

Step 4: Configure a Provider

Create providers/openai.yaml:

apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: openai-gpt4o-mini

spec:
  type: openai
  model: gpt-4o-mini
  
  defaults:
    temperature: 0.7
    max_tokens: 150

What’s happening here?

- type: openai selects the provider backend, and model: gpt-4o-mini chooses which model to call.
- defaults sets the request parameters (temperature, max_tokens) used for every request.
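
Each provider lives in its own file, so adding another one later is just another YAML file. For example, a sketch of a second OpenAI provider pointing at a different model (not needed for this tutorial):

apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: openai-gpt4o

spec:
  type: openai
  model: gpt-4o

  defaults:
    temperature: 0.7
    max_tokens: 150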

Step 5: Set Your API Key

# Set the OpenAI API key
export OPENAI_API_KEY="sk-your-api-key-here"

# Or add to your shell profile (~/.zshrc or ~/.bashrc)
echo 'export OPENAI_API_KEY="sk-your-key"' >> ~/.zshrc
source ~/.zshrc

Step 6: Write Your First Test Scenario

Create scenarios/greeting-test.yaml:

apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Scenario
metadata:
  name: greeting-test
  labels:
    category: basic

spec:
  task_type: greeting  # Links to prompts/greeter.yaml
  
  turns:
    - role: user
      content: "Hello!"
      assertions:
        - type: content_includes
          params:
            patterns: ["hello"]
            message: "Should include greeting"
        
        - type: content_length
          params:
            max: 100
            message: "Response should be brief"
    
    - role: user
      content: "How are you?"
      assertions:
        - type: content_includes
          params:
            patterns: ["good"]
            message: "Should respond positively"

What’s happening here?

- task_type: greeting links this scenario to prompts/greeter.yaml.
- turns is the sequence of user messages Arena sends to the model, in order.
- Each turn’s assertions are checks run against the model’s response: content_includes looks for the listed patterns, and content_length enforces a maximum length.

Step 7: Create Main Configuration

Create arena.yaml in your project root:

apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Arena
metadata:
  name: my-first-test

spec:
  prompt_configs:
    - id: greeter
      file: prompts/greeter.yaml
  
  providers:
    - file: providers/openai.yaml
  
  scenarios:
    - file: scenarios/greeting-test.yaml

This tells Arena which configurations to load and how to connect them.
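
At this point your project should look like this:

my-first-test/
├── arena.yaml
├── prompts/
│   └── greeter.yaml
├── providers/
│   └── openai.yaml
└── scenarios/
    └── greeting-test.yaml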

Step 8: Run Your First Test

promptarena run

You should see output like:

🚀 PromptArena Starting...

Loading configuration...
  ✓ Loaded 1 prompt config
  ✓ Loaded 1 provider
  ✓ Loaded 1 scenario

Running tests...
  ✓ Basic Greeting - Turn 1 [openai-gpt4o-mini] (1.2s)
  ✓ Basic Greeting - Turn 2 [openai-gpt4o-mini] (1.1s)

Results:
  Total: 2 turns
  Passed: 2
  Failed: 0
  Pass Rate: 100%

Reports generated:
  - out/results.json

Step 9: Review Results

View the JSON results:

cat out/results.json
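
The file is plain JSON, so any JSON tooling works. If you have jq installed, you can pretty-print it (the exact fields depend on your PromptArena version):

jq . out/results.json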

Or generate an HTML report:

promptarena run --format html

# Open in browser
open out/report-*.html
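
The open command is macOS-specific; on Linux, use xdg-open instead (assuming a single report file matches the glob):

xdg-open out/report-*.html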

Understanding Your First Test

Let’s break down what just happened:

1. Configuration Loading

Arena loaded your prompt, provider, and scenario files.

2. Prompt Assembly

For each turn, Arena rendered the system_template from your prompt config, appended the turn’s user message, and sent the assembled conversation to the provider (openai-gpt4o-mini).
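
Conceptually, the first request looked something like this (a simplified sketch, not the exact wire format):

messages:
  - role: system
    content: |
      You are a friendly assistant who greets users warmly.
      Keep responses brief and welcoming.
  - role: user
    content: "Hello!"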

3. Response Validation

Arena checked each response against your assertions: content_includes confirmed the reply contained the expected patterns, and content_length confirmed it stayed within the length limit.

4. Report Generation

Arena saved results in multiple formats for analysis.

Experiment: Modify the Test

Add More Assertions

Edit scenarios/greeting-test.yaml to add more checks:

spec:
  turns:
    - role: user
      content: "Hello!"
      assertions:
        - type: content_includes
          params:
            patterns: ["hello"]
        
        - type: content_length
          params:
            max: 100

        - type: latency  # hypothetical assertion type name; check your PromptArena version for the exact one
          params:
            max_seconds: 3

Run again:

promptarena run

Test Edge Cases

Create a new scenario file scenarios/edge-cases.yaml:

apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Scenario
metadata:
  name: edge-cases
  labels:
    category: edge-case

spec:
  task_type: greeting
  
  turns:
    - role: user
      content: ""
      assertions:
        - type: content_length
          params:
            min: 10

Add it to arena.yaml:

spec:
  scenarios:
    - file: scenarios/greeting-test.yaml
    - file: scenarios/edge-cases.yaml

Adjust Temperature

Edit providers/openai.yaml:

spec:
  defaults:
    temperature: 0.2  # More deterministic
    max_tokens: 150

Run and compare:

promptarena run

Lower temperature = more consistent responses.
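
One rough way to see the effect is to run the suite a few times at each temperature and compare the responses recorded in the results file:

# Run three times, inspecting the recorded responses after each run
for i in 1 2 3; do
  promptarena run
  cat out/results.json
done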

Common Issues

"command not found: promptarena"

# Ensure Go bin is in PATH
export PATH=$PATH:$(go env GOPATH)/bin

"API key not found"

# Verify environment variable is set
echo $OPENAI_API_KEY

# Should output: sk-...

"No scenarios found”

Check your arena.yaml paths match your directory structure:

# List your files
ls prompts/
ls providers/
ls scenarios/

"Assertion failed"

This is expected! Assertions validate quality. If one fails:

  1. Check the error message in the output
  2. Review the actual response in out/results.json
  3. Adjust your assertions or prompt as needed

Next Steps

Congratulations! You’ve run your first LLM test.

Continue learning:

- Work through the rest of the tutorial series, starting with Tutorial 2 on multi-provider testing.

Quick wins:

- Add more scenarios and assertions to cover the behaviors you care about.
- Experiment with temperature and max_tokens in your provider config.
- Generate an HTML report with promptarena run --format html and share it with your team.

What’s Next?

In Tutorial 2, you’ll learn how to test the same scenario across multiple LLM providers (OpenAI, Claude, Gemini) and compare their responses.