Tutorial 1: Your First Test
Learn the basics of PromptArena by creating and running your first LLM test.
What You’ll Learn
- Install PromptArena
- Create a basic configuration
- Write your first test scenario
- Configure an LLM provider
- Run tests and review results
Prerequisites
- An OpenAI API key (free tier works)
Step 1: Install PromptArena
Choose your preferred installation method:
Option 1: Homebrew (Recommended)
brew install promptkit
Option 2: Go Install
go install github.com/AltairaLabs/PromptKit/tools/arena@latest
Verify installation:
promptarena --version
You should see the PromptArena version information.
Step 2: Create Your Test Project
The Easy Way: Use the Template Generator
# Create a complete test project in seconds
promptarena init my-first-test --quick --provider openai
# Navigate to your project
cd my-first-test
That’s it! The init command created everything you need:
- ✅ Arena configuration (arena.yaml)
- ✅ Prompt setup (prompts/assistant.yaml)
- ✅ Provider configuration (providers/openai.yaml)
- ✅ Sample test scenario (scenarios/basic-test.yaml)
- ✅ Environment setup (.env)
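Put together, the generated layout should look roughly like this (a sketch based on the files listed above):
my-first-test/
├── arena.yaml
├── .env
├── prompts/
│   └── assistant.yaml
├── providers/
│   └── openai.yaml
└── scenarios/
    └── basic-test.yaml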
The Manual Way: Create Files Step-by-Step
If you prefer to understand each component:
# Create a new directory
mkdir my-first-test
cd my-first-test
# Create the directory structure
mkdir -p prompts providers scenarios
Step 3: Create a Prompt Configuration
If you used promptarena init: You already have prompts/assistant.yaml. Feel free to edit it!
If creating manually: Create prompts/greeter.yaml:
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: PromptConfig
metadata:
  name: greeter
spec:
  task_type: greeting
  system_template: |
    You are a friendly assistant who greets users warmly.
    Keep responses brief and welcoming.
What’s happening here?
- apiVersion and kind: Standard PromptKit resource identifiers
- metadata.name: Identifies this prompt configuration (we’ll reference it later)
- spec.task_type: Categorizes the prompt’s purpose
- spec.system_template: System instructions sent to the LLM
Step 4: Configure a Provider
Create providers/openai.yaml:
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: openai-gpt4o-mini
spec:
  type: openai
  model: gpt-4o-mini
  defaults:
    temperature: 0.7
    max_tokens: 150
What’s happening here?
- apiVersion and kind: Standard PromptKit resource identifiers
- metadata.name: Friendly name for this provider configuration
- spec.type: The provider type (openai, anthropic, gemini)
- spec.model: Specific model to use
- spec.defaults: Model parameters like temperature and max_tokens
- Authentication uses the OPENAI_API_KEY environment variable automatically
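Since spec.type also accepts anthropic and gemini, a second provider would presumably follow the same shape. A sketch (the model name here is purely illustrative; substitute a real one):
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: anthropic-claude
spec:
  type: anthropic   # one of the provider types listed above
  model: claude-sonnet  # illustrative model name — replace with your target model
  defaults:
    temperature: 0.7
    max_tokens: 150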
Step 5: Set Your API Key
# Set the OpenAI API key
export OPENAI_API_KEY="sk-your-api-key-here"
# Or add to your shell profile (~/.zshrc or ~/.bashrc)
echo 'export OPENAI_API_KEY="sk-your-key"' >> ~/.zshrc
source ~/.zshrc
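If you used promptarena init, the generated .env file is another place to keep the key (assuming the tool reads it at startup; the contents below are a guess based on the file the generator creates):
# .env — hypothetical contents
OPENAI_API_KEY=sk-your-api-key-here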
Step 6: Write Your First Test Scenario
Create scenarios/greeting-test.yaml:
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Scenario
metadata:
  name: greeting-test
  labels:
    category: basic
spec:
  task_type: greeting # Links to prompts/greeter.yaml
  turns:
    - role: user
      content: "Hello!"
      assertions:
        - type: content_includes
          params:
            patterns: ["hello"]
          message: "Should include greeting"
        - type: content_length
          params:
            max: 100
          message: "Response should be brief"
    - role: user
      content: "How are you?"
      assertions:
        - type: content_includes
          params:
            patterns: ["good"]
          message: "Should respond positively"
What’s happening here?
- apiVersion and kind: Standard PromptKit resource identifiers
- metadata.name: Identifies this scenario
- spec.task_type: Links to the prompt configuration with the matching task_type
- spec.turns: Array of conversation exchanges
- role: user: Each user turn triggers an LLM response
- assertions: Checks to validate the LLM’s response
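The scenario-to-prompt link is nothing more than the matching task_type value in both files:
# prompts/greeter.yaml
spec:
  task_type: greeting

# scenarios/greeting-test.yaml
spec:
  task_type: greeting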
Step 7: Create Main Configuration
Create arena.yaml in your project root:
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Arena
metadata:
  name: my-first-test
spec:
  prompt_configs:
    - id: greeter
      file: prompts/greeter.yaml
  providers:
    - file: providers/openai.yaml
  scenarios:
    - file: scenarios/greeting-test.yaml
This tells Arena which configurations to load and how to connect them.
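Before running, it’s worth confirming that every path referenced in arena.yaml actually exists. A quick check from the project root:
# Each path below must match a file entry in arena.yaml
ls arena.yaml prompts/greeter.yaml providers/openai.yaml scenarios/greeting-test.yaml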
Step 8: Run Your First Test
promptarena run
You should see output like:
🚀 PromptArena Starting...

Loading configuration...
  ✓ Loaded 1 prompt config
  ✓ Loaded 1 provider
  ✓ Loaded 1 scenario

Running tests...
  ✓ greeting-test - Turn 1 [openai-gpt4o-mini] (1.2s)
  ✓ greeting-test - Turn 2 [openai-gpt4o-mini] (1.1s)

Results:
  Total: 2 turns
  Passed: 2
  Failed: 0
  Pass Rate: 100%

Reports generated:
  - out/results.json
Step 9: Review Results
View the JSON results:
cat out/results.json
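If you have jq installed, pretty-printing makes the JSON easier to scan:
# Pretty-print the results file (the structure depends on your PromptArena version)
jq . out/results.json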
Or generate an HTML report:
promptarena run --format html
# Open in browser
open out/report-*.html
Understanding Your First Test
Let’s break down what just happened:
1. Configuration Loading
Arena loaded your prompt, provider, and scenario files.
2. Prompt Assembly
For each turn, Arena:
- Took the system prompt from greeter.yaml
- Added the user message from the scenario turn
- Sent the complete prompt to OpenAI
3. Response Validation
Arena checked each response against your assertions:
- content_includes: Verified the expected words ("hello", "good") were present
- content_length: Ensured the response wasn’t too long
4. Report Generation
Arena saved results in multiple formats for analysis.
Experiment: Modify the Test
Add More Assertions
Edit scenarios/greeting-test.yaml to add more checks:
spec:
  turns:
    - role: user
      content: "Hello!"
      assertions:
        - type: content_includes
          params:
            patterns: ["hello"]
        - type: content_length
          params:
            max: 100
        # The original snippet omitted the type for this check;
        # "latency" is an assumed name for a response-time assertion —
        # check the PromptArena assertion reference for the exact type
        - type: latency
          params:
            max_seconds: 3
Run again:
promptarena run
Test Edge Cases
Create a new scenario file scenarios/edge-cases.yaml:
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Scenario
metadata:
  name: edge-cases
  labels:
    category: edge-case
spec:
  task_type: greeting
  turns:
    - role: user
      content: ""
      assertions:
        - type: content_length
          params:
            min: 10
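The same pattern extends to other edge cases using only the assertion types you’ve already seen. For example, a gibberish turn appended to spec.turns (what you choose to assert is up to you):
    - role: user
      content: "asdf qwerty zxcv"
      assertions:
        - type: content_length
          params:
            min: 10 # the model should still produce a real reply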
Add it to arena.yaml:
spec:
  scenarios:
    - file: scenarios/greeting-test.yaml
    - file: scenarios/edge-cases.yaml
Adjust Temperature
Edit providers/openai.yaml:
spec:
  defaults:
    temperature: 0.2 # More deterministic
    max_tokens: 150
Run and compare:
promptarena run
Lower temperature = more consistent responses.
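One way to see the effect yourself (assuming each run overwrites out/results.json, as in Step 8):
# Run twice at the same temperature and compare the raw results
promptarena run && cp out/results.json out/results-low-temp.json
promptarena run
diff out/results-low-temp.json out/results.json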
Common Issues
"command not found: promptarena"
# Ensure Go bin is in PATH
export PATH=$PATH:$(go env GOPATH)/bin
"API key not found"
# Verify environment variable is set
echo $OPENAI_API_KEY
# Should output: sk-...
"No scenarios found”
Check your arena.yaml paths match your directory structure:
# List your files
ls prompts/
ls providers/
ls scenarios/
"Assertion failed"
This is expected! Assertions validate quality. If one fails:
- Check the error message in the output
- Review the actual response in out/results.json
- Adjust your assertions or prompt as needed
Next Steps
Congratulations! You’ve run your first LLM test.
Continue learning:
- Tutorial 2: Multi-Provider Testing - Test across OpenAI, Claude, and Gemini
- Tutorial 3: Multi-Turn Conversations - Build complex dialog flows
- How-To: Write Scenarios - Advanced scenario patterns
Quick wins:
- Try different models: gpt-4o, gpt-4o-mini
- Add more test cases to your scenario
- Generate HTML reports: promptarena run --format html
What’s Next?
In Tutorial 2, you’ll learn how to test the same scenario across multiple LLM providers (OpenAI, Claude, Gemini) and compare their responses.