Configure LLM Providers
Learn how to configure and manage LLM providers for testing.
Overview
Providers define how PromptArena connects to different LLM services (OpenAI, Anthropic, Google, etc.). Each provider configuration specifies authentication, model selection, and default parameters.
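Every provider file follows the same rough shape. The snippet below is a minimal sketch for orientation only (the name and values are placeholders); full, working examples appear throughout this page.

```yaml
# providers/<name>.yaml (minimal sketch, placeholder values)
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: my-provider        # how scenarios and CLI flags refer to this provider
spec:
  type: openai             # which backend: openai, anthropic, gemini, azure-openai, ollama, vllm
  model: gpt-4o-mini       # model served by that backend
  defaults:
    temperature: 0.7       # default generation parameters applied to every run
```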
Quick Start with Templates
The easiest way to set up providers is using the project generator:
```bash
# Create project with OpenAI
promptarena init my-test --quick --provider openai

# Or choose during interactive setup
promptarena init my-test
# Select provider when prompted: openai, anthropic, google, or mock
```

This automatically creates a working provider configuration with the following (a rough sketch of the generated layout appears after this list):
- Correct API version and schema
- Recommended model defaults
- Environment variable setup (.env file)
- Ready-to-use configuration
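The generated project looks roughly like this (illustrative only; the exact files depend on the template and provider you choose):

```
my-test/
├── arena.yaml        # wires prompts, providers, and scenarios together (see Arena Configuration below)
├── .env              # provider API key(s), e.g. OPENAI_API_KEY=sk-...
├── prompts/
├── providers/
│   └── openai.yaml   # provider config like the examples in this guide
└── scenarios/
```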
Manual Provider Configuration
For custom setups or advanced configurations, create provider files in the providers/ directory:
```yaml
# providers/openai.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: openai-gpt4o-mini
  labels:
    provider: openai

spec:
  type: openai
  model: gpt-4o-mini

  capabilities:
    - text
    - streaming
    - vision
    - tools
    - json

  defaults:
    temperature: 0.6
    max_tokens: 2000
    top_p: 1.0
```

Authentication uses the OPENAI_API_KEY environment variable automatically.
Provider Capabilities
The capabilities field declares what features a provider supports. Scenarios can then use required_capabilities to only run against providers with matching capabilities.
Available Capabilities:
| Capability | Description |
|---|---|
| text | Basic text completion |
| streaming | Streaming responses |
| vision | Image understanding |
| tools | Function/tool calling |
| json | JSON mode output |
| audio | Audio input understanding |
| video | Video input understanding |
| documents | PDF/document upload support |
| duplex | Real-time bidirectional audio |
Example - Vision-capable provider:
```yaml
spec:
  type: openai
  model: gpt-4o
  capabilities:
    - text
    - streaming
    - vision
    - tools
    - json
```

Example - Audio-only provider:

```yaml
spec:
  type: openai
  model: gpt-4o-audio-preview
  capabilities:
    - audio
    - duplex
```

Example - Local model with limited capabilities:

```yaml
spec:
  type: ollama
  model: llama3.2:3b
  capabilities:
    - text
    - streaming
    - tools
    - json
    # Note: llama3.2:3b does NOT support vision (only 11B/90B models do)
```

When a scenario specifies required_capabilities, only providers with ALL listed capabilities will run that scenario. See Write Scenarios for details.
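For example, a scenario that should only run against vision-capable providers might declare its requirements like the sketch below. This is illustrative: it assumes required_capabilities sits under spec, so check Write Scenarios for the authoritative shape.

```yaml
# scenarios/image-description.yaml (illustrative sketch)
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Scenario
metadata:
  name: image-description-test
spec:
  required_capabilities:
    - vision
    - text
  turns:
    - role: user
      content: "Describe the attached image."
```

With this in place, a provider like the gpt-4o example above would run the scenario, while the audio-only and llama3.2:3b examples would be skipped.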
Supported Providers
OpenAI
```yaml
# providers/openai-gpt4.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: openai-gpt4o
  labels:
    provider: openai

spec:
  type: openai
  model: gpt-4o

  defaults:
    temperature: 0.7
    max_tokens: 4000
```

Available Models: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo
Anthropic Claude
```yaml
# providers/claude.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: claude-sonnet
  labels:
    provider: anthropic

spec:
  type: anthropic
  model: claude-3-5-sonnet-20241022

  defaults:
    temperature: 0.6
    max_tokens: 4000
```

Authentication uses the ANTHROPIC_API_KEY environment variable automatically.
Available Models: claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022, claude-3-opus-20240229
Google Gemini
```yaml
# providers/gemini.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: gemini-flash
  labels:
    provider: google

spec:
  type: gemini
  model: gemini-1.5-flash

  defaults:
    temperature: 0.7
    max_tokens: 2000
```

Authentication uses the GOOGLE_API_KEY environment variable automatically.
Available Models: gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-flash-exp
Azure OpenAI
```yaml
# providers/azure-openai.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: azure-openai-gpt4o
  labels:
    provider: azure-openai

spec:
  type: azure-openai
  model: gpt-4o
  base_url: https://your-resource.openai.azure.com

  defaults:
    temperature: 0.6
    max_tokens: 2000
```

Authentication uses the AZURE_OPENAI_API_KEY environment variable automatically.
Ollama (Local)
```yaml
# providers/ollama.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: ollama-llama
  labels:
    provider: ollama

spec:
  type: ollama
  model: llama3.2:1b
  base_url: http://localhost:11434

  additional_config:
    keep_alive: "5m"  # Keep model loaded for 5 minutes

  defaults:
    temperature: 0.7
    max_tokens: 2048
```

No API key required - Ollama runs locally. Start Ollama with:
```bash
# Install Ollama
brew install ollama  # macOS
# or visit https://ollama.ai for other platforms

# Start Ollama server
ollama serve

# Pull a model
ollama pull llama3.2:1b
```

Or use Docker:
```bash
docker run -d -p 11434:11434 -v ollama:/root/.ollama ollama/ollama
docker exec -it <container> ollama pull llama3.2:1b
```

Available Models: Any model from ollama list - llama3.2:1b, llama3.2:3b, mistral, llava, deepseek-r1:8b, etc.
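If you prefer Docker Compose, a sketch equivalent to the docker run command above (illustrative; see the ollama-local example under Examples for a working local setup):

```yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama   # persist downloaded models between restarts

volumes:
  ollama:
```

Once the container is up, pull a model with docker compose exec ollama ollama pull llama3.2:1b (the Compose equivalent of the docker exec command above).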
vLLM (High-Performance)
```yaml
# providers/vllm.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: vllm-llama
  labels:
    provider: vllm

spec:
  type: vllm
  model: meta-llama/Llama-3.2-3B-Instruct
  base_url: http://localhost:8000

  additional_config:
    use_beam_search: false
    best_of: 1
    # Guided decoding for structured output
    # guided_json: '{"type": "object", "properties": {...}}'

  defaults:
    temperature: 0.7
    max_tokens: 2048
```

No API key required - vLLM runs as a self-hosted service. Start vLLM with Docker:
```bash
# GPU-accelerated (recommended)
docker run --rm --gpus all \
  -p 8000:8000 \
  vllm/vllm-openai:latest \
  --model meta-llama/Llama-3.2-3B-Instruct \
  --dtype half \
  --max-model-len 4096

# CPU-only (for testing, slow)
docker run --rm \
  -p 8000:8000 \
  vllm/vllm-openai:latest \
  --model meta-llama/Llama-3.2-1B-Instruct \
  --max-model-len 2048
```

Or use Docker Compose:
```yaml
services:
  vllm:
    image: vllm/vllm-openai:latest
    ports:
      - "8000:8000"
    volumes:
      - vllm_cache:/root/.cache/huggingface
    command:
      - --model
      - meta-llama/Llama-3.2-3B-Instruct
      - --dtype
      - half
      - --max-model-len
      - "4096"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

volumes:
  vllm_cache:
```

Available Models: Any HuggingFace model supported by vLLM - Llama 3.x, Mistral, Qwen, Phi, LLaVA for vision, etc. See vLLM docs.
Advanced Features:
```yaml
# Guided JSON output
spec:
  additional_config:
    guided_json: |
      {
        "type": "object",
        "properties": {
          "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
          "confidence": {"type": "number", "minimum": 0, "maximum": 1}
        },
        "required": ["sentiment", "confidence"]
      }
```

```yaml
# Regex-constrained output
spec:
  additional_config:
    guided_regex: "^[0-9]{3}-[0-9]{3}-[0-9]{4}$"  # Phone number format
```

```yaml
# Choice selection
spec:
  additional_config:
    guided_choice: ["yes", "no", "maybe"]
```

Arena Configuration
Reference providers in your arena.yaml:
```yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Arena
metadata:
  name: multi-provider-arena

spec:
  prompt_configs:
    - id: support
      file: prompts/support.yaml

  providers:
    - file: providers/openai.yaml
    - file: providers/claude.yaml
    - file: providers/gemini.yaml

  scenarios:
    - file: scenarios/customer-support.yaml
```

Authentication Setup
Environment Variables
Set API keys as environment variables:
```bash
# Add to ~/.zshrc or ~/.bashrc
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."

# Reload shell configuration
source ~/.zshrc
```

.env File (Local Development)
Create a .env file (never commit this):
```
# .env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
```

Load environment variables before running:
```bash
# Load .env and run tests
export $(cat .env | xargs) && promptarena run
```

CI/CD Secrets
For GitHub Actions, GitLab CI, or other platforms:
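As a rough end-to-end sketch for GitHub Actions (illustrative: the checkout and install steps are placeholders, so adapt them to however promptarena is installed in your pipeline):

```yaml
# .github/workflows/test.yml (illustrative sketch)
name: prompt-tests
on: [push]

jobs:
  arena:
    runs-on: ubuntu-latest
    env:
      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
    steps:
      - uses: actions/checkout@v4
      # Placeholder: install promptarena here, using whatever method your project distributes it with
      - run: promptarena run --concurrency 2
```

The essential piece is the env block, which maps repository secrets to the environment variables the providers expect: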
```yaml
# .github/workflows/test.yml
env:
  OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
  ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```

Common Configurations
Multiple Model Variants
Test across different model sizes/versions:
```yaml
# providers/openai-gpt4.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: openai-gpt4
  labels:
    provider: openai
    tier: premium

spec:
  type: openai
  model: gpt-4o
  defaults:
    temperature: 0.6

---
# providers/openai-mini.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: openai-mini
  labels:
    provider: openai
    tier: cost-effective

spec:
  type: openai
  model: gpt-4o-mini
  defaults:
    temperature: 0.6
```

Temperature Variations
```yaml
# providers/openai-creative.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: openai-creative
  labels:
    mode: creative

spec:
  type: openai
  model: gpt-4o
  defaults:
    temperature: 0.9  # More creative/random

---
# providers/openai-precise.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: openai-precise
  labels:
    mode: deterministic

spec:
  type: openai
  model: gpt-4o
  defaults:
    temperature: 0.1  # More deterministic
```

Provider Selection
Run Specific Providers
```bash
# Test with only OpenAI
promptarena run --provider openai-gpt4

# Test with multiple providers
promptarena run --provider openai-gpt4,claude-sonnet

# Test all configured providers (default)
promptarena run
```

Scenario-specific Providers
Use labels to specify provider constraints:
```yaml
# scenarios/openai-only.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Scenario
metadata:
  name: openai-specific-test
  labels:
    provider-specific: openai

spec:
  task_type: support

  turns:
    - role: user
      content: "Test message"
```

Parameter Overrides
Override provider parameters at runtime:
```bash
# Override temperature for all providers
promptarena run --temperature 0.8

# Override max tokens
promptarena run --max-tokens 1000

# Combined overrides
promptarena run --temperature 0.9 --max-tokens 4000
```

Validation
Verify provider configuration:
```bash
# Inspect loaded providers
promptarena config-inspect

# Should show:
# Providers:
#   ✓ openai-gpt4 (providers/openai.yaml)
#   ✓ claude-sonnet (providers/claude.yaml)
```

Troubleshooting
Authentication Errors
```bash
# Verify API key is set
echo $OPENAI_API_KEY
# Should display: sk-...

# Test with verbose logging
promptarena run --provider openai-gpt4 --verbose
```

Provider Not Found
```bash
# Check provider configuration
promptarena config-inspect --verbose

# Verify file path in arena.yaml matches actual file location
```

Rate Limiting
Configure concurrency to avoid rate limits:
```bash
# Reduce concurrent requests
promptarena run --concurrency 2

# For large test suites
promptarena run --concurrency 1  # Sequential execution
```

Next Steps
- Use Mock Providers - Test without API calls
- Validate Outputs - Add assertions
- Integrate CI/CD - Automate testing
- Config Reference - Complete configuration options
Examples
See working provider configurations in:
- examples/customer-support/providers/
- examples/mcp-chatbot/providers/
- examples/ollama-local/providers/ - Local Ollama setup with Docker