Runtime Explanation
Understanding the architecture and design of PromptKit Runtime.
Purpose
Section titled “Purpose”These explanations help you understand why Runtime works the way it does. They cover architectural decisions, design patterns, and the reasoning behind Runtime’s implementation.
Topics
Section titled “Topics”Architecture
Section titled “Architecture”- Pipeline Architecture - How the stage-based streaming pipeline works
- Provider System - LLM provider abstraction and implementation
Integration
Section titled “Integration”- Stage Design - Composable stage patterns
- State Management - Conversation history and persistence
When to Read These
Section titled “When to Read These”Read explanations when you want to:
- Understand architectural decisions
- Learn design patterns used in Runtime
- Extend Runtime with custom components
- Troubleshoot complex issues
- Contribute to Runtime development
Don’t read these when you need to:
- Quickly complete a specific task → See How-To Guides
- Learn Runtime from scratch → See Tutorials
- Look up API details → See Reference
Key Concepts
Section titled “Key Concepts”Stage-Based Architecture
Section titled “Stage-Based Architecture”Runtime uses a stage-based streaming architecture where each stage processes elements in its own goroutine. This provides:
- True Streaming: Elements flow through as they’re produced
- Concurrency: Stages run in parallel
- Composability: Mix and match stages
- Backpressure: Channel-based flow control
- Testability: Test stages in isolation
Provider Abstraction
Section titled “Provider Abstraction”Runtime abstracts LLM providers behind a common interface, enabling:
- Provider independence: Switch providers without code changes
- Fallback strategies: Try multiple providers automatically
- Consistent API: Same code for OpenAI, Claude, Gemini
- Easy testing: Mock providers for tests
Tool System
Section titled “Tool System”Runtime implements function calling through:
- Tool descriptors: Define tools with schemas
- Executors: Pluggable tool execution backends
- MCP integration: Connect external tool servers
- Automatic calling: LLM decides when to use tools
Architecture Overview
Section titled “Architecture Overview”Each stage runs concurrently in its own goroutine, connected by channels.
Pipeline Modes
Section titled “Pipeline Modes”Runtime supports three execution modes:
Text Mode
Section titled “Text Mode”Standard HTTP-based LLM interactions for chat and completion.
VAD Mode
Section titled “VAD Mode”Voice Activity Detection for voice applications using text-based LLMs: Audio → STT → LLM → TTS
ASM Mode
Section titled “ASM Mode”Audio Streaming Mode for native multimodal LLMs with real-time audio via WebSocket.
Design Principles
Section titled “Design Principles”1. Streaming First
- Channel-based data flow
- Concurrent stage execution
- True streaming (not simulated)
2. Composability
- Stage pipeline is flexible
- Mix components as needed
- Build custom pipelines
3. Extensibility
- Custom stages supported
- Custom providers possible
- Custom tools and executors
4. Production-Ready
- Error handling built-in
- Resource cleanup automatic
- Monitoring capabilities
Related Documentation
Section titled “Related Documentation”- Reference: Complete API documentation
- How-To Guides: Task-focused guides
- Tutorials: Step-by-step learning
Contributing
Section titled “Contributing”Understanding these architectural concepts is valuable when contributing to Runtime. See the Runtime codebase for implementation details.