# Runtime Reference
Complete API reference for the PromptKit Runtime components.
## Overview

The PromptKit Runtime provides the core execution engine for LLM interactions. It handles:
- **Pipeline Execution**: Stage-based processing with streaming support
- **Provider Integration**: Multi-LLM support (OpenAI, Anthropic, Google Gemini)
- **Tool Execution**: Function calling with MCP integration
- **State Management**: Conversation persistence and caching
- **Validation**: Content and response validation
- **Configuration**: Flexible runtime configuration
## Quick Reference

### Core Components
Section titled “Core Components”| Component | Description | Reference |
|---|---|---|
| Pipeline | Stage-based execution engine | pipeline.md |
| Providers | LLM provider implementations | providers.md |
| Tools | Function calling and MCP integration | tools.md |
| MCP | Model Context Protocol support | mcp.md |
| State Store | Conversation persistence | statestore.md |
| Validators | Content validation | validators.md |
| Types | Core data structures | types.md |
| A2A | Client, types, tool bridge, mock | a2a.md |
| Logging | Structured logging with context | logging.md |
| Telemetry | OpenTelemetry trace export | telemetry.md |
### Import Paths

```go
import (
    "github.com/AltairaLabs/PromptKit/runtime/pipeline"
    "github.com/AltairaLabs/PromptKit/runtime/providers"
    "github.com/AltairaLabs/PromptKit/runtime/tools"
    "github.com/AltairaLabs/PromptKit/runtime/mcp"
    "github.com/AltairaLabs/PromptKit/runtime/statestore"
    "github.com/AltairaLabs/PromptKit/runtime/hooks"
    "github.com/AltairaLabs/PromptKit/runtime/hooks/guardrails"
    "github.com/AltairaLabs/PromptKit/runtime/types"
    "github.com/AltairaLabs/PromptKit/runtime/logger"
    "github.com/AltairaLabs/PromptKit/runtime/telemetry"
)
```

## Basic Usage
### Simple Provider Usage

```go
import (
    "context"
    "fmt"
    "log"

    "github.com/AltairaLabs/PromptKit/runtime/providers"
    "github.com/AltairaLabs/PromptKit/runtime/providers/openai"
    "github.com/AltairaLabs/PromptKit/runtime/types"
)

// Create provider
provider := openai.NewProvider(
    "openai",
    "gpt-4o-mini",
    "", // default baseURL (uses env var for API key)
    providers.ProviderDefaults{Temperature: 0.7, MaxTokens: 1500},
    false, // includeRawOutput
)
defer provider.Close()

// Execute prediction
ctx := context.Background()
resp, err := provider.Predict(ctx, providers.PredictionRequest{
    Messages: []types.Message{
        {Role: "user", Content: "Hello!"},
    },
    Temperature: 0.7,
    MaxTokens:   1500,
})
if err != nil {
    log.Fatal(err)
}

fmt.Println(resp.Content)
```

### With MCP Tools
```go
import (
    "context"
    "log"

    "github.com/AltairaLabs/PromptKit/runtime/mcp"
)

// Register MCP server
mcpRegistry := mcp.NewRegistry()
defer mcpRegistry.Close()

mcpRegistry.RegisterServer(mcp.ServerConfig{
    Name:    "filesystem",
    Command: "npx",
    Args:    []string{"-y", "@modelcontextprotocol/server-filesystem", "/allowed"},
})

// Discover tools
ctx := context.Background()
serverTools, err := mcpRegistry.ListAllTools(ctx)
if err != nil {
    log.Fatal(err)
}

for serverName, tools := range serverTools {
    log.Printf("Server %s has %d tools\n", serverName, len(tools))
}
```

### Streaming Execution
```go
// Execute with streaming
streamChan, err := provider.PredictStream(ctx, providers.PredictionRequest{
    Messages:    []types.Message{{Role: "user", Content: "Write a story"}},
    Temperature: 0.7,
    MaxTokens:   1500,
})
if err != nil {
    log.Fatal(err)
}

// Process chunks
for chunk := range streamChan {
    if chunk.Error != nil {
        log.Printf("Error: %v\n", chunk.Error)
        break
    }

    if chunk.Delta != "" {
        fmt.Print(chunk.Delta)
    }

    if chunk.FinishReason != nil {
        fmt.Printf("\n\nStream complete: %s\n", *chunk.FinishReason)
    }
}
```

## Configuration
### Provider Configuration

```go
defaults := providers.ProviderDefaults{
    Temperature: 0.7,
    TopP:        0.95,
    MaxTokens:   2000,
    Pricing: providers.Pricing{
        InputCostPer1K:  0.00015, // $0.15 per 1M tokens
        OutputCostPer1K: 0.0006,  // $0.60 per 1M tokens
    },
}

provider := openai.NewProvider("openai", "gpt-4o-mini", "", defaults, false)
```

### Tool Policy
```go
policy := &pipeline.ToolPolicy{
    ToolChoice:           "auto", // "auto", "required", "none", or a specific tool name
    MaxRounds:            5,      // Max tool execution rounds (default: 50)
    MaxToolCallsPerTurn:  10,     // Max tool calls per LLM response
    MaxParallelToolCalls: 5,      // Max concurrent tool executions (default: 10)
    MaxCostUSD:           1.00,   // Stop after $1 spent (0 = unlimited)
    Blocklist:            []string{"dangerous_tool"}, // Blocked tools
}
```

## Error Handling
### Provider Errors

```go
resp, err := provider.Predict(ctx, req)
if err != nil {
    switch {
    case errors.Is(err, context.DeadlineExceeded):
        log.Println("Request timeout")
    default:
        log.Printf("Provider error: %v", err)
    }
}
```

### Tool Errors
```go
// MCP tool call errors
response, err := client.CallTool(ctx, "read_file", args)
if err != nil {
    log.Printf("Tool execution failed: %v", err)
}
```

## Best Practices
### Resource Management

```go
// Always close resources
defer provider.Close()
defer mcpRegistry.Close()
```

### Context Cancellation
```go
// Use context for cancellation
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

resp, err := provider.Predict(ctx, req)
```

### Streaming Cleanup
```go
// Always drain streaming channels
streamChan, err := provider.PredictStream(ctx, req)
if err != nil {
    return err
}

for chunk := range streamChan {
    // Process chunks
    if chunk.Error != nil {
        break
    }
}
```

## Performance Considerations
### Concurrency Control
Section titled “Concurrency Control”- Configure
MaxConcurrentExecutionsbased on provider rate limits - Use semaphores to prevent overwhelming providers
- Consider graceful degradation under load
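The semaphore pattern above can be sketched with a plain buffered channel. This is a provider-agnostic illustration; `runLimited` and the concrete limit are hypothetical names, not part of the PromptKit API:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runLimited runs n tasks but allows at most limit of them to execute
// concurrently, using a buffered channel as a counting semaphore.
// It returns the peak concurrency actually observed.
func runLimited(n, limit int) int64 {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var inFlight, peak int64

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot (blocks once limit is reached)
			defer func() { <-sem }() // release the slot

			cur := atomic.AddInt64(&inFlight, 1)
			for {
				p := atomic.LoadInt64(&peak)
				if cur <= p || atomic.CompareAndSwapInt64(&peak, p, cur) {
					break
				}
			}
			// ... the provider call would go here ...
			atomic.AddInt64(&inFlight, -1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	fmt.Println("peak concurrency:", runLimited(20, 3))
}
```

Because the channel's capacity bounds the number of goroutines holding a slot, the provider never sees more than `limit` in-flight requests, whatever `n` is.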
### Streaming vs Non-Streaming

- Use streaming for interactive applications (chatbots, UIs)
- Use non-streaming for batch processing, testing, analytics
- Streaming has ~10% overhead but better UX
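The trade-off is about delivery, not content: draining a stream and concatenating the deltas yields the same text a non-streaming call returns. A minimal sketch with a stand-in chunk type (the real chunk struct's fields may differ):

```go
package main

import "fmt"

// Chunk is a stand-in for the provider's stream chunk type.
type Chunk struct {
	Delta string
	Err   error
}

// collect drains a streaming channel into one complete response,
// which is effectively what non-streaming mode gives you directly.
func collect(ch <-chan Chunk) (string, error) {
	var out []byte
	for c := range ch {
		if c.Err != nil {
			return string(out), c.Err
		}
		out = append(out, c.Delta...)
	}
	return string(out), nil
}

func main() {
	ch := make(chan Chunk, 3)
	ch <- Chunk{Delta: "Hello, "}
	ch <- Chunk{Delta: "world"}
	ch <- Chunk{Delta: "!"}
	close(ch)
	s, _ := collect(ch)
	fmt.Println(s) // Hello, world!
}
```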
### Tool Execution

- MCP tools run in separate processes (stdio overhead)
- Consider tool execution timeouts
- Use repository executors for fast in-memory tools
### Memory Management

- Pipeline creates a fresh `ExecutionContext` per call (prevents cross-request contamination)
- Large conversation histories can increase memory usage
- Consider state store cleanup strategies
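One simple cleanup strategy is to keep the system prompt plus a sliding window of recent messages before persisting. The `Message` type and `trimHistory` helper here are local stand-ins for illustration, not part of the `types` or `statestore` packages:

```go
package main

import "fmt"

// Message mirrors the shape of a chat message for illustration.
type Message struct {
	Role    string
	Content string
}

// trimHistory keeps the leading system prompt (if any) plus the last
// keep messages, bounding memory before writing to a state store.
func trimHistory(msgs []Message, keep int) []Message {
	if len(msgs) <= keep {
		return msgs
	}
	out := make([]Message, 0, keep+1)
	if msgs[0].Role == "system" {
		out = append(out, msgs[0])
	}
	return append(out, msgs[len(msgs)-keep:]...)
}

func main() {
	h := []Message{
		{Role: "system", Content: "You are helpful."},
		{Role: "user", Content: "q1"}, {Role: "assistant", Content: "a1"},
		{Role: "user", Content: "q2"}, {Role: "assistant", Content: "a2"},
	}
	for _, m := range trimHistory(h, 2) {
		fmt.Println(m.Role, m.Content)
	}
}
```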
## See Also

- Pipeline Reference - Detailed pipeline API
- Provider Reference - Provider implementations
- Tools Reference - Tool registry and execution
- MCP Reference - Model Context Protocol integration
- Types Reference - Core data structures
## Next Steps

- How-To Guides - Task-focused guides
- Tutorials - Learn by building
- Explanation - Architecture and concepts