Duplex Streaming Example
This example demonstrates bidirectional streaming using OpenDuplex() with the Gemini Live API.
Features
- Interactive Voice Mode: Real-time audio capture from microphone with voice activity detection
- Real-time bidirectional streaming
- Text and audio chunk streaming
- Response handling with streaming chunks
- Duplex session lifecycle management
Requirements
- Gemini API key with Live API access enabled
- Model: gemini-2.0-flash-exp (supports streaming input)
- Microphone (for interactive voice mode)
- PortAudio library (for audio capture)
Note: The Gemini Live API is currently in preview and requires special access. If you encounter authentication errors, visit https://ai.google.dev/ to request Live API access.
- Set your Gemini API key:
```sh
export GEMINI_API_KEY=your-key-here
```
- Install PortAudio (for audio capture):
```sh
# macOS
brew install portaudio

# Ubuntu/Debian
sudo apt-get install portaudio19-dev

# Fedora
sudo dnf install portaudio-devel
```
- Run the example:
```sh
# Interactive voice mode (default)
go run .

# Text streaming only
go run . text

# Multiple chunks example
go run . chunks
```

The example supports three modes:
- interactive (default): Real-time voice input via microphone
  - Captures audio from your microphone continuously
  - Streams audio chunks to Gemini in real time (bidirectional)
  - Receives and plays audio responses through speakers
  - Also displays a text transcription for debugging
- text: Text streaming example
  - Sends a text message
  - Receives a streaming response
- chunks: Multiple chunk sending (see the sketch after this list)
  - Sends the message in multiple chunks
  - Demonstrates incremental content building
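The chunks mode amounts to calling SendChunk repeatedly before reading the response. A minimal sketch, assuming the outgoing text of each chunk travels in the StreamChunk Delta field (that field choice is an assumption for illustration; the shipped example is the reference):

```go
// Send one message as several incremental text chunks.
// Assumption: StreamChunk.Delta carries outgoing text, mirroring how
// incremental text is read on the response side (chunk.Delta).
parts := []string{"Tell me ", "a short story ", "about streaming."}
for _, p := range parts {
    conv.SendChunk(ctx, &providers.StreamChunk{Delta: p})
}
// Then read conv.Response() as shown under API Usage below.
```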
API Usage
Interactive Audio Mode
```go
// Open duplex conversation
conv, err := sdk.OpenDuplex("./duplex.pack.json", "assistant")
if err != nil {
    log.Fatal(err)
}

// Send an audio chunk
audioData := string(pcmBytes) // PCM16 audio data
chunk := &providers.StreamChunk{
    MediaDelta: &types.MediaContent{
        MIMEType: types.MIMETypeAudioWAV,
        Data:     &audioData,
    },
}
conv.SendChunk(ctx, chunk)

// Receive streaming responses
respCh, _ := conv.Response()
for chunk := range respCh {
    fmt.Print(chunk.Delta)
    if chunk.FinishReason != nil {
        break
    }
}
```

Text Streaming
```go
// Open duplex conversation
conv, err := sdk.OpenDuplex(
    "./duplex.pack.json",
    "assistant",
    sdk.WithModel("gemini-2.0-flash-exp"),
    sdk.WithAPIKey(apiKey),
)
if err != nil {
    log.Fatal(err)
}
defer conv.Close()

// Send text
conv.SendText(ctx, "Hello!")

// Get response channel
respCh, _ := conv.Response()

// Receive streaming responses
for chunk := range respCh {
    fmt.Print(chunk.Delta)
    if chunk.FinishReason != nil {
        break
    }
}
```

How It Works
Interactive Voice Mode
- Audio Capture: Uses PortAudio to capture microphone input at 16kHz mono PCM16
- Continuous Streaming: Audio is streamed continuously to Gemini Live API (no turn detection)
- Bidirectional Audio: Gemini ASM model streams audio responses back in real-time
- Audio Playback: Responses are played through speakers at 24kHz
- Text Display: Text transcription also shown for debugging
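Steps 1 and 2 can be sketched roughly as follows, assuming the github.com/gordonklaus/portaudio Go binding; the buffer size and the int16-to-byte packing are illustrative assumptions, and conv and ctx come from the OpenDuplex setup shown above:

```go
// Capture 16 kHz mono PCM16 from the default input device and forward
// each buffer to the duplex conversation.
// Assumes github.com/gordonklaus/portaudio and encoding/binary.
portaudio.Initialize()
defer portaudio.Terminate()

buf := make([]int16, 1024) // ~64 ms at 16 kHz; illustrative size
stream, err := portaudio.OpenDefaultStream(1, 0, 16000, len(buf), buf)
if err != nil {
    log.Fatal(err)
}
defer stream.Close()
stream.Start()
defer stream.Stop()

for {
    if err := stream.Read(); err != nil {
        break
    }
    // Pack little-endian PCM16 samples into bytes for the SDK.
    pcm := make([]byte, len(buf)*2)
    for i, s := range buf {
        binary.LittleEndian.PutUint16(pcm[i*2:], uint16(s))
    }
    audioData := string(pcm)
    conv.SendChunk(ctx, &providers.StreamChunk{
        MediaDelta: &types.MediaContent{
            MIMEType: types.MIMETypeAudioWAV,
            Data:     &audioData,
        },
    })
}
```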
Visual feedback during capture:
█ = Audio detected (high energy)
░ = Low/no audio
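One way to drive that meter is a per-buffer RMS estimate; the threshold below is an arbitrary illustrative value, not the one used by the example:

```go
// levelGlyph returns █ for buffers with noticeable energy, ░ otherwise.
// The threshold is illustrative; tune it for your microphone.
func levelGlyph(buf []int16) string {
    var sum float64
    for _, s := range buf {
        sum += float64(s) * float64(s)
    }
    rms := math.Sqrt(sum / float64(len(buf)))
    if rms > 500 { // arbitrary threshold on 16-bit sample amplitude
        return "█"
    }
    return "░"
}

// Inside the capture loop:
// fmt.Print(levelGlyph(buf))
```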
OpenDuplex vs Stream
- OpenDuplex: Full bidirectional streaming with the model. You can send multiple chunks and receive responses in real time.
- Stream: Unary mode with streaming responses. You send one complete message and receive a streaming response.
Use OpenDuplex when you need:
- Real-time audio/video streaming
- Interactive back-and-forth during model response
- Voice conversation applications
- Live media processing
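To illustrate the interactive back-and-forth point using only the calls shown above: a duplex session lets you keep sending while a reply is still streaming. How the model handles input that arrives mid-response depends on the Live API session, so treat this as a sketch:

```go
// Read the response stream in the background while continuing to send.
respCh, _ := conv.Response()
go func() {
    for chunk := range respCh {
        fmt.Print(chunk.Delta)
    }
}()

// The sending side is not blocked by the in-flight response.
conv.SendText(ctx, "Actually, make it shorter.")
```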