# Realtime Video Streaming Example
This example demonstrates how to stream video frames to an LLM for real-time vision analysis using PromptKit’s duplex session capabilities.
## Features Demonstrated
- `OpenDuplex` with video: Opening a bidirectional streaming session for video
- `WithStreamingVideo`: Configuring frame rate limiting and preprocessing
- `SendFrame()`: Sending individual image frames to the session
- Frame rate limiting: Automatic dropping of excess frames to match LLM processing speed
## Use Cases
- Webcam analysis: Real-time description of what a webcam sees
- Screen sharing: Analyzing screen content as it changes
- Security monitoring: Continuous analysis of camera feeds
- Accessibility: Describing visual content for users with visual impairments
Set your Gemini API key and run the example:

```bash
export GEMINI_API_KEY=your-key
go run .
```
## How It Works

### 1. Open a Duplex Session with Video Config
```go
conv, err := sdk.OpenDuplex(
    "./pack.json",
    "vision-stream",
    sdk.WithStreamingVideo(&sdk.VideoStreamConfig{
        TargetFPS:    1.0,  // Process 1 frame per second
        MaxWidth:     1024, // Resize large frames
        MaxHeight:    1024,
        Quality:      85,   // JPEG quality
        EnableResize: true,
    }),
)
```
### 2. Send Frames

```go
frame := &session.ImageFrame{
    Data:      jpegBytes,
    MIMEType:  "image/jpeg",
    Width:     640,
    Height:    480,
    FrameNum:  frameCount,
    Timestamp: time.Now(),
}
err = conv.SendFrame(ctx, frame)
```
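The `Data` field holds already-encoded image bytes. If your capture source hands you an `image.Image`, one way to produce `jpegBytes` is the standard library encoder (a minimal sketch; the helper name is illustrative):

```go
import (
    "bytes"
    "image"
    "image/jpeg"
)

// encodeJPEG turns an in-memory image into the JPEG bytes
// expected by session.ImageFrame.Data.
func encodeJPEG(img image.Image, quality int) ([]byte, error) {
    var buf bytes.Buffer
    if err := jpeg.Encode(&buf, img, &jpeg.Options{Quality: quality}); err != nil {
        return nil, err
    }
    return buf.Bytes(), nil
}
```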
### 3. Receive Responses

```go
for chunk := range conv.Response() {
    if chunk.Content != "" {
        fmt.Print(chunk.Content)
    }
}
```
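Because frames go out and responses come back on the same duplex session, a common pattern is to drain `conv.Response()` in its own goroutine while the main loop keeps sending frames. A rough sketch, assuming the response channel stays open for the lifetime of the session (the `frames` channel is a placeholder for your capture pipeline):

```go
// Drain responses concurrently so frame sending never blocks on output.
go func() {
    for chunk := range conv.Response() {
        if chunk.Content != "" {
            fmt.Print(chunk.Content)
        }
    }
}()

// Meanwhile, keep pushing frames from the capture source.
for frame := range frames { // frames is a hypothetical chan *session.ImageFrame
    if err := conv.SendFrame(ctx, frame); err != nil {
        log.Printf("send frame: %v", err)
        break
    }
}
```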
## Frame Rate Limiting

The `TargetFPS` setting automatically drops excess frames (see the sketch after this list):
- Webcam at 30 FPS → with `TargetFPS: 1.0`, only ~1 frame per second reaches the LLM
- Freshest frame wins: the most recent frame is kept; older frames are dropped
- Reduces costs: fewer frames = fewer tokens = lower API costs
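In practice the capture loop does not need to throttle itself: push frames at the source's native rate and let the session discard what the LLM cannot keep up with. A sketch of a 30 FPS sender, assuming `captureFrame` is your own (hypothetical) capture helper:

```go
ticker := time.NewTicker(time.Second / 30) // camera delivers ~30 FPS
defer ticker.Stop()

for range ticker.C {
    frame := captureFrame() // hypothetical helper returning *session.ImageFrame
    // With TargetFPS: 1.0 the session forwards roughly one of these
    // 30 frames per second to the LLM and drops the rest, always
    // keeping the most recent frame.
    if err := conv.SendFrame(ctx, frame); err != nil {
        break
    }
}
```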
## Video Chunk Streaming
For encoded video segments (H.264, VP8, etc.), use `SendVideoChunk()`:
```go
chunk := &session.VideoChunk{
    Data:       h264Data,
    MIMEType:   "video/h264",
    ChunkIndex: 0,
    IsKeyFrame: true,
    Timestamp:  time.Now(),
}
err = conv.SendVideoChunk(ctx, chunk)
```
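As a rough usage sketch, a pre-encoded stream can be sliced into chunks and sent in order. The fixed-size split below is only illustrative (real code should split on keyframe/NAL-unit boundaries), and the file name is made up:

```go
data, err := os.ReadFile("clip.h264") // illustrative file name
if err != nil {
    log.Fatal(err)
}

const chunkSize = 64 * 1024
for i, offset := 0, 0; offset < len(data); i, offset = i+1, offset+chunkSize {
    end := offset + chunkSize
    if end > len(data) {
        end = len(data)
    }
    chunk := &session.VideoChunk{
        Data:       data[offset:end],
        MIMEType:   "video/h264",
        ChunkIndex: i,
        IsKeyFrame: i == 0, // simplification: real keyframe detection requires parsing the stream
        Timestamp:  time.Now(),
    }
    if err := conv.SendVideoChunk(ctx, chunk); err != nil {
        log.Fatal(err)
    }
}
```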
## Real-World Integration

In a real application, replace the simulated frames with actual capture:
### Using gocv (OpenCV for Go)
```go
import "gocv.io/x/gocv"

webcam, _ := gocv.VideoCaptureDevice(0)
defer webcam.Close()

mat := gocv.NewMat()
defer mat.Close()

for webcam.Read(&mat) {
    // Encode the captured frame to JPEG
    buf, _ := gocv.IMEncode(".jpg", mat)

    frame := &session.ImageFrame{
        Data:      buf.GetBytes(),
        MIMEType:  "image/jpeg",
        Timestamp: time.Now(),
    }
    conv.SendFrame(ctx, frame)
    buf.Close() // release the native encode buffer
}
```
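For the screen-sharing use case, the same loop works with a screenshot library instead of a webcam. A sketch assuming the `github.com/kbinani/screenshot` package (any capture library that yields an `image.Image` works):

```go
import (
    "bytes"
    "image/jpeg"
    "time"

    "github.com/kbinani/screenshot"
)

for {
    img, err := screenshot.CaptureDisplay(0) // primary display
    if err != nil {
        break
    }

    var buf bytes.Buffer
    if err := jpeg.Encode(&buf, img, &jpeg.Options{Quality: 85}); err != nil {
        break
    }

    frame := &session.ImageFrame{
        Data:      buf.Bytes(),
        MIMEType:  "image/jpeg",
        Width:     img.Bounds().Dx(),
        Height:    img.Bounds().Dy(),
        Timestamp: time.Now(),
    }
    conv.SendFrame(ctx, frame)

    time.Sleep(time.Second) // rough pacing; TargetFPS still governs what the LLM sees
}
```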
## Provider Support

Currently, realtime video streaming works best with providers that support bidirectional streaming:
- Gemini Live API: Full support for real-time vision (when available)
- OpenAI Realtime: Audio-focused, video support may be added later
The SDK is provider-agnostic: video frames flow through whenever the provider supports them.