This example demonstrates Arena’s duplex streaming capabilities for testing real-time, bidirectional audio conversations with LLMs.

What is Duplex Streaming?

Duplex streaming carries input and output audio at the same time, so the user and the model can speak and listen concurrently, as in a natural voice conversation.

This is ideal for testing voice assistants, customer support bots, and any real-time conversational AI.

Features Demonstrated

| Feature | Description |
| --- | --- |
| Duplex Mode | Bidirectional audio streaming with configurable timeouts |
| VAD Turn Detection | Voice activity detection for natural conversation flow |
| Self-Play with TTS | LLM-generated user messages converted to audio via TTS |
| Multiple Providers | Test across Gemini 2.0 Flash and OpenAI GPT-4o Realtime |
| Mock Mode | CI-friendly testing without API keys |

Prerequisites

For Local Testing (Real Providers)

# Set your API keys
export GEMINI_API_KEY="your-gemini-api-key"
export OPENAI_API_KEY="your-openai-api-key"

For CI Testing (Mock Provider)

No API keys are required. The mock provider returns deterministic responses.
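
As a rough illustration of the two setups above, a wrapper script could choose providers based on which API keys are present and fall back to the mock provider otherwise. This is a minimal sketch, not part of Arena: the function name pick_providers and the key-to-provider mapping are assumptions made for this example. Only the provider IDs and environment variable names come from this README.

```python
import os

# Provider IDs match the files in this example's providers/ directory.
# Pairing each environment variable with a provider is an assumption
# made for illustration.
KEY_TO_PROVIDER = {
    "GEMINI_API_KEY": "gemini-2-flash",
    "OPENAI_API_KEY": "openai-gpt4o-realtime",
}
MOCK_PROVIDER = "mock-duplex"


def pick_providers(env):
    """Return provider IDs whose API key is set, or the mock provider if none are."""
    available = [
        provider
        for key, provider in KEY_TO_PROVIDER.items()
        if env.get(key, "").strip()  # treat empty or whitespace-only keys as unset
    ]
    return available or [MOCK_PROVIDER]


if __name__ == "__main__":
    # Print the commands that would be run for the current environment.
    for provider in pick_providers(os.environ):
        print(f"promptarena run --provider {provider}")
```

Passing the environment in as a plain mapping keeps the selection logic easy to check without touching real credentials.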

Quick Start

Run with Mock Provider (CI Mode)

# Navigate to the example directory
cd examples/duplex-streaming

# Run all scenarios with mock provider
promptarena run --provider mock-duplex

# Run a specific scenario
promptarena run --scenario duplex-basic --provider mock-duplex

Run with Real Providers (Local Testing)

# Run with Gemini 2.0 Flash (requires GEMINI_API_KEY)
promptarena run --provider gemini-2-flash

# Run with OpenAI GPT-4o Realtime (requires OPENAI_API_KEY)
promptarena run --provider openai-gpt4o-realtime

# Run specific scenario
promptarena run --scenario duplex-selfplay --provider gemini-2-flash

Scenarios

1. duplex-basic - Basic Duplex Streaming

Simple scripted conversation to verify duplex functionality:

duplex:
  timeout: "5m"
  turn_detection:
    mode: vad
    vad:
      silence_threshold_ms: 500
      min_speech_ms: 1000

2. duplex-selfplay - Self-Play with TTS

Demonstrates automated conversation testing using self-play:

turns:
  - role: selfplay-user
    persona: curious-customer
    turns: 2
    tts:
      provider: openai
      voice: alloy

3. duplex-interactive - Interactive Technical Support

Extended conversation simulating a support call:

Configuration Reference

Duplex Configuration

spec:
  duplex:
    # Maximum session duration
    timeout: "10m"

    # Turn detection settings
    turn_detection:
      mode: vad  # "vad" or "asm" (provider-native)
      vad:
        # Silence duration to trigger turn end (ms)
        silence_threshold_ms: 500
        # Minimum speech before silence counts (ms)
        min_speech_ms: 1000
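
To make the interaction between the two VAD settings concrete, here is a simplified sketch of how a turn-end decision could be derived from them. It is not Arena's implementation: the function detect_turn_end, the per-frame speech flags, and the 20 ms frame size are assumptions chosen for illustration. It also counts speech cumulatively, which a real detector may not do.

```python
def detect_turn_end(frames, frame_ms=20, silence_threshold_ms=500, min_speech_ms=1000):
    """Return the index of the frame where the turn ends, or None if it never ends.

    frames is an iterable of booleans: True if that frame contains speech.
    """
    speech_ms = 0   # total speech heard so far in this turn
    silence_ms = 0  # length of the current run of silence
    for index, is_speech in enumerate(frames):
        if is_speech:
            speech_ms += frame_ms
            silence_ms = 0  # any speech resets the silence run
        elif speech_ms >= min_speech_ms:
            # Silence only counts once enough speech has been heard.
            silence_ms += frame_ms
            if silence_ms >= silence_threshold_ms:
                return index
    return None


# One second of speech followed by half a second of silence ends the turn
# on the last silent frame.
one_second_speech = [True] * 50
half_second_silence = [False] * 25
print(detect_turn_end(one_second_speech + half_second_silence))  # 74
```

In this model, raising silence_threshold_ms tolerates longer pauses before the turn ends, while raising min_speech_ms makes short noises less likely to count as a turn.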

TTS Configuration (Self-Play)

turns:
  - role: selfplay-user
    persona: curious-customer
    tts:
      provider: openai    # "openai", "elevenlabs", "cartesia"
      voice: alloy        # Provider-specific voice ID

Available TTS Voices

| Provider | Voices |
| --- | --- |
| OpenAI | alloy, echo, fable, onyx, nova, shimmer |
| ElevenLabs | Use voice IDs from your ElevenLabs account |
| Cartesia | Use voice IDs from your Cartesia account |
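
The table above can be turned into a quick pre-flight check for a tts block. The sketch below is an assumption-laden illustration, not something Arena provides: check_tts is a hypothetical helper, and it only knows the OpenAI voices listed in this README. Voice IDs for ElevenLabs and Cartesia are account-specific, so they are only checked for being non-empty.

```python
# Voices listed in this README for the OpenAI TTS provider.
OPENAI_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
# Providers whose voice IDs come from the user's own account.
ACCOUNT_SCOPED_PROVIDERS = {"elevenlabs", "cartesia"}


def check_tts(provider, voice):
    """Return a list of problems found in a tts block (empty if none)."""
    problems = []
    if not voice:
        problems.append("voice is empty")
    if provider == "openai":
        if voice and voice not in OPENAI_VOICES:
            problems.append(f"unknown OpenAI voice: {voice!r}")
    elif provider not in ACCOUNT_SCOPED_PROVIDERS:
        problems.append(f"unknown TTS provider: {provider!r}")
    return problems


print(check_tts("openai", "alloy"))   # []
print(check_tts("openai", "robot"))   # ["unknown OpenAI voice: 'robot'"]
```

Returning a list of problems rather than raising on the first one makes it possible to report every issue in a scenario file at once.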

Audio File Input

For testing with pre-recorded audio files, use the parts field with media content:

turns:
  # Turn 1: Greeting - "Hello, can you hear me?"
  - role: user
    parts:
      - type: audio
        media:
          file_path: audio/greeting.pcm
          mime_type: audio/L16

In duplex mode, the audio from parts is streamed directly to the model. Use comments to document what each audio file contains.
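
Because raw PCM files have no header, their duration has to be derived from file size and an agreed format. The sketch below shows that arithmetic. The defaults (16 kHz, mono, 16-bit samples) are assumptions for illustration only: this README states the audio/L16 MIME type but not the sample rate or channel count of the fixtures. The helper names are hypothetical.

```python
import os


def pcm_duration_ms(num_bytes, sample_rate_hz=16000, channels=1, bytes_per_sample=2):
    """Duration in whole milliseconds of headerless PCM audio of the given size."""
    frame_size = channels * bytes_per_sample  # bytes per sample frame
    if num_bytes % frame_size != 0:
        raise ValueError(
            f"{num_bytes} bytes is not a whole number of {frame_size}-byte frames"
        )
    frames = num_bytes // frame_size
    return frames * 1000 // sample_rate_hz


def fixture_duration_ms(path, **pcm_format):
    """Duration of a raw PCM file on disk, based only on its size."""
    return pcm_duration_ms(os.path.getsize(path), **pcm_format)


# 32,000 bytes of 16-bit mono audio at 16 kHz is exactly one second.
print(pcm_duration_ms(32000))  # 1000
```

Under those assumptions, calling fixture_duration_ms("audio/greeting.pcm") gives a clip length that can be compared against the VAD settings, for example to spot a fixture shorter than min_speech_ms.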

Supported audio formats:

File Structure

duplex-streaming/
├── config.arena.yaml           # Main arena configuration
├── README.md                   # This file
├── mock-responses.yaml         # Mock responses for CI testing
├── audio/                      # Pre-recorded audio fixtures
│   ├── greeting.pcm            # "Hello, can you hear me?"
│   ├── question.pcm            # "What's your name?"
│   └── funfact.pcm             # "Tell me a fun fact"
├── providers/
│   ├── gemini-2-flash.provider.yaml
│   ├── openai-gpt4o-realtime.provider.yaml
│   └── mock-duplex.provider.yaml
├── scenarios/
│   ├── duplex-basic.scenario.yaml
│   ├── duplex-selfplay.scenario.yaml
│   └── duplex-interactive.scenario.yaml
├── prompts/
│   └── voice-assistant.prompt.yaml
├── personas/
│   ├── curious-customer.persona.yaml
│   └── technical-user.persona.yaml
└── out/                        # Test results output

Current Status

Duplex streaming requires providers that support bidirectional audio streaming.

Provider Requirements

Duplex mode requires providers to implement the StreamInputSupport interface, which enables:

Supported providers:

Not supported:

When running with unsupported providers, you’ll see:

Error: provider does not support streaming input

CI/CD Integration

Using Mock Provider

The mock provider fully supports duplex streaming, enabling CI testing without API keys:

# GitHub Actions example - run duplex tests
- name: Run Duplex Streaming Tests
  run: |
    cd examples/duplex-streaming
    promptarena run --provider mock-duplex

For schema validation only:

# GitHub Actions example - validate configuration
- name: Validate Duplex Streaming Config
  run: |
    cd examples/duplex-streaming
    promptarena validate config.arena.yaml

Audio Fixtures

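Pre-recorded PCM audio files are included in the audio/ directory for testing:

  - greeting.pcm: "Hello, can you hear me?"
  - question.pcm: "What's your name?"
  - funfact.pcm: "Tell me a fun fact"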

These can be used to test audio streaming without TTS dependencies.

Troubleshooting

“Provider does not support streaming”

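Ensure you’re using a provider that supports duplex mode. The providers configured in this example are gemini-2-flash, openai-gpt4o-realtime, and mock-duplex.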

“TTS provider not configured”

For self-play scenarios with TTS, ensure:

  1. The TTS provider API key is set (e.g., OPENAI_API_KEY)
  2. The voice ID is valid for the chosen provider

“VAD timeout”

If turn detection isn’t working:

Learn More