Runtime Explanation

Understanding the architecture and design of PromptKit Runtime.

Purpose

These explanations help you understand why Runtime works the way it does. They cover architectural decisions, design patterns, and the reasoning behind Runtime’s implementation.

Topics

Architecture

Integration

When to Read These

Read explanations when you want to understand the reasoning behind Runtime's design and how its pieces fit together.

Don't read these when you need step-by-step instructions for a specific task; the task-oriented guides cover that.

Key Concepts

Stage-Based Architecture

Runtime uses a stage-based streaming architecture where each stage processes elements in its own goroutine. This provides natural concurrency, backpressure between stages via channels, and clear isolation of each stage's work and failures.

Provider Abstraction

Runtime abstracts LLM providers behind a common interface, enabling providers (OpenAI, Claude, Gemini) to be swapped without changing pipeline code and making it straightforward to substitute mocks in tests.

Tool System

Runtime implements function calling through a tool registry: tools advertise a name and an argument schema to the model, and the runtime executes the model's tool calls and feeds the results back into the conversation.

Architecture Overview

Input

Pipeline
  ├── StateStoreLoad Stage (load conversation)
  ├── PromptAssembly Stage (apply templates)
  ├── Validation Stage (check content)
  └── Provider Stage (call LLM)
      ├── Tool Registry (available tools)
      └── Provider (OpenAI/Claude/Gemini)

Output

Each stage runs concurrently in its own goroutine, connected by channels.

Pipeline Modes

Runtime supports three execution modes:

Text Mode

Standard HTTP-based LLM interactions for chat and completion.

VAD Mode

Voice Activity Detection for voice applications using text-based LLMs: Audio → STT → LLM → TTS
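A single voice turn in this mode can be sketched as a chain of three steps; the `STT`, `LLM`, and `TTS` interfaces below are hypothetical stand-ins for illustration, not Runtime's actual API:

```go
package main

import "fmt"

// Hypothetical single-turn VAD flow: once voice activity detection decides
// an utterance has ended, the captured audio runs through STT -> LLM -> TTS.
type (
	STT interface{ Transcribe(audio []byte) (string, error) }
	LLM interface{ Reply(text string) (string, error) }
	TTS interface{ Synthesize(text string) ([]byte, error) }
)

func voiceTurn(audio []byte, stt STT, llm LLM, tts TTS) ([]byte, error) {
	text, err := stt.Transcribe(audio)
	if err != nil {
		return nil, err
	}
	reply, err := llm.Reply(text)
	if err != nil {
		return nil, err
	}
	return tts.Synthesize(reply)
}

// Trivial fakes so the flow can run without real speech services.
type fakeSTT struct{}
type fakeLLM struct{}
type fakeTTS struct{}

func (fakeSTT) Transcribe(a []byte) (string, error)  { return string(a), nil }
func (fakeLLM) Reply(t string) (string, error)       { return "you said: " + t, nil }
func (fakeTTS) Synthesize(t string) ([]byte, error)  { return []byte(t), nil }

func main() {
	out, _ := voiceTurn([]byte("hi"), fakeSTT{}, fakeLLM{}, fakeTTS{})
	fmt.Println(string(out)) // you said: hi
}
```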

ASM Mode

Audio Streaming Mode for native multimodal LLMs with real-time audio via WebSocket.

Design Principles

1. Streaming First

Results flow through the pipeline as they are produced rather than being buffered until the end.

2. Composability

Pipelines are built from small stages that can be combined and reordered.

3. Extensibility

New providers, stages, and tools plug in behind common interfaces.

4. Production-Ready

Designed for real workloads: concurrent by default, with errors handled explicitly at each stage.

Contributing

Understanding these architectural concepts is valuable when contributing to Runtime. See the Runtime codebase for implementation details.