# Session Recording Architecture
This document explains how PromptKit’s session recording system captures, stores, and replays LLM conversations with full fidelity.
## Overview
Session recording provides a complete audit trail of LLM interactions. Unlike simple logging, recordings capture:
- Precise timing: Millisecond-accurate event timestamps
- Complete data: All messages, tool calls, and media
- Reconstructable state: Enough information to replay conversations exactly
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Live Session   │ ──► │   Event Store   │ ──► │    Recording    │
│   (Emitter)     │     │     (JSONL)     │     │    (Replay)     │
└─────────────────┘     └─────────────────┘     └─────────────────┘
        │                       │                       │
        ▼                       ▼                       ▼
   Real-time               Persistent            Synchronized
     events                  storage               playback
```

## Event-Driven Architecture
### Event Emitter
The Emitter is the heart of session recording. It captures events as they occur:
```go
emitter := events.NewEmitter(eventBus, runID, scenarioID, conversationID)

// Events are emitted automatically during conversation
emitter.ConversationStarted(systemPrompt)
emitter.MessageCreated(role, content)
emitter.AudioInput(audioData)
emitter.ProviderCallStarted(provider, model)
emitter.ToolCallStarted(toolName, args)
// ...
```

Events flow through an EventBus to registered subscribers:
```
Emitter ──► EventBus ──┬──► FileEventStore (persistence)
                       ├──► Metrics Collector
                       └──► Real-time UI
```

### Event Types
Events are categorized by their domain:
| Category | Events | Purpose |
|---|---|---|
| Conversation | started, ended | Session lifecycle |
| Message | created, updated | Content exchange |
| Audio | input, output | Voice data |
| Provider | call.started, call.completed | LLM API calls |
| Tool | call.started, call.completed | Function execution |
| Validation | started, completed | Guardrail checks |
### Event Structure
Each event contains:
```go
type Event struct {
    Type      EventType   // Event category
    Timestamp time.Time   // When it occurred
    SessionID string      // Session identifier
    Sequence  int64       // Ordering guarantee
    Data      interface{} // Event-specific payload
}
```

## Storage Format
### FileEventStore
Events are persisted in JSONL (JSON Lines) format:
```
{"seq":1,"event":{"type":"conversation.started","timestamp":"2024-01-15T10:30:00Z",...}}
{"seq":2,"event":{"type":"message.created","timestamp":"2024-01-15T10:30:00.1Z",...}}
{"seq":3,"event":{"type":"audio.input","timestamp":"2024-01-15T10:30:00.2Z",...}}
```

Benefits:
- Append-only: Safe for concurrent writes
- Streamable: Process without loading entire file
- Human-readable: Easy debugging
### SessionRecording Format
For export/import, recordings use a structured format:
```
{"type":"metadata","session_id":"...","start_time":"...","duration":"..."}
{"type":"event","offset":"100ms","event_type":"message.created",...}
{"type":"event","offset":"200ms","event_type":"audio.input",...}
```

The loader auto-detects both formats:
```go
rec, err := recording.Load("session.jsonl") // Works with either format
```

## Media Timeline
For recordings with audio/video, the MediaTimeline provides synchronized access:
```
Recording
    │
    ▼
MediaTimeline
    ├── TrackAudioInput  ──► User speech segments
    ├── TrackAudioOutput ──► Assistant speech segments
    └── TrackVideo       ──► Video frames (if present)
```

### Track Structure
Each track contains time-ordered segments:
```go
type MediaSegment struct {
    Offset   time.Duration // Start time relative to session
    Duration time.Duration // Segment length
    Data     []byte        // Raw media data
    Format   AudioFormat   // Sample rate, channels, encoding
}
```

### Audio Reconstruction
Audio can be exported as standard WAV files:
```go
timeline := rec.ToMediaTimeline(nil)
timeline.ExportAudioToWAV(events.TrackAudioInput, "user.wav")
timeline.ExportAudioToWAV(events.TrackAudioOutput, "assistant.wav")
```

The export process:
1. Collects all segments for the track
2. Concatenates PCM data in time order
3. Writes a RIFF/WAVE header with format info
4. Outputs a standard 16-bit PCM WAV file
## Replay System
### ReplayPlayer
The ReplayPlayer provides synchronized access to recordings:
```go
player, _ := recording.NewReplayPlayer(rec)

// Seek to any position
player.Seek(5 * time.Second)

// Query state at current position
state := player.GetState()
// state.CurrentEvents     - events at this moment
// state.RecentEvents      - events in last 2 seconds
// state.Messages          - all messages up to this point
// state.AudioInputActive  - is user speaking?
// state.AudioOutputActive - is assistant speaking?
```

### Playback State
At any position, you can access:
| Field | Description |
|---|---|
| Position | Current offset from session start |
| Timestamp | Absolute timestamp |
| CurrentEvents | Events within 50ms of position |
| RecentEvents | Events in last 2 seconds |
| Messages | Accumulated conversation |
| AudioInputActive | User audio present |
| AudioOutputActive | Assistant audio present |
| ActiveAnnotations | Annotations in scope |
### Annotation Correlation
Annotations can be attached to recordings for review:
```go
player.SetAnnotations([]*annotations.Annotation{
    // Session-level annotation
    annotations.ForSession().WithScore("quality", 0.92),

    // Time-range annotation
    annotations.InTimeRange(start, end).WithComment("Good response"),

    // Event-targeted annotation
    annotations.ForEvent(eventSeq).WithLabel("category", "greeting"),
})
```

During playback, active annotations are included in state queries.
### Replay Provider
For deterministic replay, the ReplayProvider simulates the original provider:
```go
provider, _ := replay.NewProviderFromRecording(rec)

// Returns the same responses as the original session
response, _ := provider.Complete(ctx, messages, opts)
```

Use cases:
- Regression testing: Verify behavior against known-good responses
- Debugging: Reproduce exact conversation flows
- Offline testing: No API calls needed
## Data Flow Summary
```
┌─────────────────────────────────────────────────────────────────┐
│                          LIVE SESSION                           │
│                                                                 │
│   User ──► Pipeline ──► Provider ──► Response ──► User          │
│               │                          │                      │
│               ▼                          ▼                      │
│            Emitter ────────────────► EventBus                   │
│                                         │                       │
└─────────────────────────────────────────┼───────────────────────┘
                                          │
                                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                            STORAGE                              │
│                                                                 │
│   EventBus ──► FileEventStore ──► session.jsonl                 │
│                                         │                       │
└─────────────────────────────────────────┼───────────────────────┘
                                          │
                                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                            REPLAY                               │
│                                                                 │
│   session.jsonl ──► SessionRecording ──► ReplayPlayer           │
│                           │                   │                 │
│                           ▼                   ▼                 │
│                     MediaTimeline        Synchronized           │
│                     (WAV export)           Playback             │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

## Performance Considerations
### Recording Overhead
Section titled “Recording Overhead”- CPU: Minimal - events are queued and written asynchronously
- Memory: Bounded - segments are written incrementally
- Disk: Proportional to conversation length and audio duration
### Audio Data Size
Audio is stored as base64-encoded PCM:
- Input: 16kHz, 16-bit mono = ~32KB/second
- Output: 24kHz, 16-bit mono = ~48KB/second
- Base64 encoding adds ~33% overhead
A 5-minute voice conversation generates approximately:
- Raw audio: ~24MB
- JSONL with metadata: ~32MB
### Optimization Tips
- Disable for CI: Skip recording for quick validation runs
- Compress archives: JSONL compresses well (70-80% reduction)
- Retention policies: Auto-delete old recordings
- Selective recording: Only enable for specific scenarios
## Next Steps
- Session Recording How-To - Practical usage guide
- Duplex Architecture - Voice conversation system
- Testing Philosophy - Why test prompts?