The PromptKit provider system offers a unified abstraction for interacting with multiple LLM providers while handling provider-specific details transparently. This architecture enables seamless switching between providers and facilitates multi-provider testing and comparison.

Overview

The provider system abstracts away the differences between various LLM APIs (OpenAI, Anthropic, Google Gemini, etc.) behind a consistent interface. This allows the runtime to work with any provider without changing application code.

graph TB
    subgraph "Application Layer"
        SDK["SDK / Arena"]
    end

    subgraph "Runtime Layer"
        Pipeline["Pipeline"]
        Registry["Provider Registry"]
    end

    subgraph "Provider Implementations"
        OpenAI["OpenAI Provider<br/>GPT-4, GPT-3.5"]
        Anthropic["Anthropic Provider<br/>Claude 3 Family"]
        Gemini["Gemini Provider<br/>Gemini Pro/Ultra"]
        Mock["Mock Provider<br/>Testing"]
    end

    subgraph "External APIs"
        OpenAI_API["OpenAI API"]
        Anthropic_API["Anthropic API"]
        Google_API["Google AI API"]
    end

    SDK --> Pipeline
    Pipeline --> Registry
    Registry --> OpenAI
    Registry --> Anthropic
    Registry --> Gemini
    Registry --> Mock

    OpenAI --> OpenAI_API
    Anthropic --> Anthropic_API
    Gemini --> Google_API

    style Registry fill:#f9f,stroke:#333,stroke-width:3px

Core Interfaces

Provider Interface

The base Provider interface defines the minimal contract all providers must implement:

type Provider interface {
    // Identity
    ID() string

    // Chat operations
    Chat(ctx context.Context, req ChatRequest) (ChatResponse, error)
    ChatStream(ctx context.Context, req ChatRequest) (<-chan StreamChunk, error)
    SupportsStreaming() bool

    // Cost calculation
    CalculateCost(inputTokens, outputTokens, cachedTokens int) types.CostInfo

    // Configuration
    ShouldIncludeRawOutput() bool

    // Lifecycle
    Close() error
}

Key Responsibilities:

- Identity: expose a stable ID() used for registry lookup
- Chat execution: synchronous Chat() and streaming ChatStream(), with SupportsStreaming() for capability checks
- Cost calculation: translate token counts into types.CostInfo via CalculateCost()
- Configuration: report whether raw API payloads should be attached to responses
- Lifecycle: release connections and other resources in Close()

ToolSupport Interface

Providers that support tool/function calling extend the base interface:

type ToolSupport interface {
    Provider // Extends base Provider

    // Convert tool descriptors to provider-native format
    BuildTooling(descriptors []*ToolDescriptor) (interface{}, error)

    // Execute chat with tool support
    ChatWithTools(
        ctx context.Context,
        req ChatRequest,
        tools interface{},
        toolChoice string,
    ) (ChatResponse, []types.MessageToolCall, error)
}

Why separate? Not all providers support tools. Keeping tooling in an optional interface allows capability detection at runtime and graceful fallbacks.
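
Capability detection is a type assertion at the call site. A minimal sketch using the interfaces above (the "auto" tool-choice value and the fallback behavior are illustrative):

// Prefer tool-aware chat when the provider implements ToolSupport;
// otherwise fall back to a plain chat call.
func chatMaybeWithTools(
    ctx context.Context,
    p Provider,
    req ChatRequest,
    descriptors []*ToolDescriptor,
) (ChatResponse, []types.MessageToolCall, error) {
    if ts, ok := p.(ToolSupport); ok {
        tools, err := ts.BuildTooling(descriptors)
        if err != nil {
            return ChatResponse{}, nil, err
        }
        return ts.ChatWithTools(ctx, req, tools, "auto")
    }
    // Graceful fallback: run the chat without tools.
    resp, err := p.Chat(ctx, req)
    return resp, nil, err
}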

StreamingInputSupport Interface

For providers that support streaming input (audio, video, etc.):

type StreamingInputSupport interface {
    Provider // Extends base Provider

    // Execute chat with streaming media input
    ChatStreamWithMedia(
        ctx context.Context,
        req ChatRequest,
        mediaStream <-chan types.MediaContent,
    ) (<-chan StreamChunk, error)
}

Use Cases:

- Real-time audio input (e.g., live transcription or voice conversations)
- Streaming video frames to vision-capable models
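
A sketch of wiring a media source into a capable provider; the capability check mirrors ToolSupport, and nextFrame is a hypothetical frame source:

func chatWithMedia(ctx context.Context, p Provider, req ChatRequest) (<-chan StreamChunk, error) {
    sis, ok := p.(StreamingInputSupport)
    if !ok {
        return nil, fmt.Errorf("provider %s does not support streaming input", p.ID())
    }

    media := make(chan types.MediaContent)
    go func() {
        defer close(media)
        for frame := range nextFrame() { // hypothetical source of audio/video chunks
            media <- frame
        }
    }()

    return sis.ChatStreamWithMedia(ctx, req, media)
}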

Request & Response Types

ChatRequest

Unified request format across all providers:

type ChatRequest struct {
    System      string                 // System prompt
    Messages    []types.Message        // Conversation history
    Temperature float32                // Sampling temperature
    TopP        float32                // Nucleus sampling
    MaxTokens   int                    // Maximum response length
    Seed        *int                   // Reproducibility seed
    Metadata    map[string]interface{} // Provider-specific extras
}

Design Principles:

- One request shape for every provider; adapters translate it to each native format
- Seed is a pointer so "unset" is distinguishable from an explicit zero
- Metadata carries provider-specific extras without widening the core struct

ChatResponse

Unified response format:

type ChatResponse struct {
    Content    string                  // Response text
    ToolCalls  []types.MessageToolCall // Tool invocations
    CostInfo   *types.CostInfo         // Token usage & cost
    Latency    time.Duration           // Response time
    Raw        []byte                  // Raw API response (debug)
    RawRequest interface{}             // Raw API request (debug)
}

Key Features:

- Content and ToolCalls capture the model output in provider-neutral form
- CostInfo and Latency make usage and performance observable per call
- Raw and RawRequest preserve the exact wire payloads for debugging (opt-in via ShouldIncludeRawOutput)

StreamChunk

For streaming responses:

type StreamChunk struct {
    Content      string                  // Accumulated content
    Delta        string                  // New content in this chunk
    TokenCount   int                     // Total tokens so far
    DeltaTokens  int                     // Tokens in this delta
    ToolCalls    []types.MessageToolCall // Tool calls (if any)
    FinishReason *string                 // Completion reason
    Error        error                   // Error (if failed)
    CostInfo     *types.CostInfo         // Final chunk only
}

Streaming Design:

- Delta carries only the new text for each chunk, while Content accumulates the response so far
- Errors are delivered in-band through the Error field rather than a separate channel
- CostInfo is populated only on the final chunk, once total usage is known
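
A minimal consumption sketch, assuming a provider instance and prior messages are in scope (the CostInfo field name TotalCost is an assumption):

req := ChatRequest{
    System:      "You are a helpful assistant.",
    Messages:    messages, // prior conversation history
    Temperature: 0.7,
    MaxTokens:   1024,
}

chunks, err := provider.ChatStream(ctx, req)
if err != nil {
    return err
}

for chunk := range chunks {
    if chunk.Error != nil {
        return chunk.Error
    }
    fmt.Print(chunk.Delta) // render incrementally as deltas arrive

    // Only the final chunk carries cost information.
    if chunk.CostInfo != nil {
        log.Printf("total cost: $%.4f", chunk.CostInfo.TotalCost) // field name assumed
    }
}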

Provider Registry

The Registry manages provider instances and provides discovery:

graph TB
    subgraph "Registry"
        ProviderMap["Provider Map<br/>(ID -> Provider)"]
        FactoryMap["Factory Map<br/>(Type -> Factory)"]
    end

    subgraph "Registration"
        RegFactory["RegisterProviderFactory()"]
        CreateProvider["CreateProviderFromSpec()"]
    end

    subgraph "Usage"
        Register["Register(provider)"]
        Get["Get(id)"]
        List["List()"]
        Close["Close()"]
    end

    RegFactory --> FactoryMap
    CreateProvider --> FactoryMap
    CreateProvider --> ProviderMap
    Register --> ProviderMap
    Get --> ProviderMap
    List --> ProviderMap
    Close --> ProviderMap

    style ProviderMap fill:#9f9
    style FactoryMap fill:#9f9

Provider Lifecycle

sequenceDiagram
    participant App
    participant Registry
    participant Factory
    participant Provider

    App->>Registry: CreateProviderFromSpec(spec)
    Registry->>Factory: factory(spec)
    Factory->>Provider: new(config)
    Provider-->>Factory: provider instance
    Factory-->>Registry: provider instance
    Registry->>Registry: Register(provider)
    Registry-->>App: provider

    Note over App,Provider: Provider is ready for use

    App->>Registry: Get(id)
    Registry-->>App: provider

    App->>Provider: Chat(request)
    Provider-->>App: response

    App->>Registry: Close()
    Registry->>Provider: Close()
    Provider-->>Registry: cleaned up

ProviderSpec

Configuration for creating providers:

type ProviderSpec struct {
    ID               string                 // Unique identifier
    Type             string                 // Provider type (openai, claude, etc.)
    Model            string                 // Model name
    BaseURL          string                 // API endpoint (optional)
    Defaults         ProviderDefaults       // Default parameters
    IncludeRawOutput bool                   // Include raw API responses
    AdditionalConfig map[string]interface{} // Provider-specific config
}

Factory Pattern: Each provider type registers a factory function:

func init() {
    RegisterProviderFactory("openai", func(spec ProviderSpec) (Provider, error) {
        return NewOpenAIProvider(spec)
    })
}
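
End to end, creating and using a provider through the registry might look like this sketch (the spec values and the registry instance are illustrative):

spec := ProviderSpec{
    ID:    "openai-main",
    Type:  "openai",
    Model: "gpt-4",
}

provider, err := registry.CreateProviderFromSpec(spec)
if err != nil {
    return err
}
defer registry.Close() // closes every registered provider

resp, err := provider.Chat(ctx, ChatRequest{
    System:   "You are a helpful assistant.",
    Messages: messages,
})
if err != nil {
    return err
}
fmt.Println(resp.Content)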

Provider Implementations

OpenAI Provider

Supported Models: GPT-4 and GPT-3.5 families.

Anthropic Provider

Supported Models: Claude 3 family.

Google Gemini Provider

Supported Models: Gemini Pro and Gemini Ultra.

Mock Provider

Purpose: Testing and development

Features:

- Canned responses and scripted tool calls (AddResponse, AddToolCall)
- Configurable artificial latency (SetLatency)
- Deterministic output with no network access or API cost

Use Cases:

- Unit tests and CI pipelines (fast, deterministic)
- Local development without API keys

Streaming Architecture

Non-Streaming Flow

sequenceDiagram
    participant Client
    participant Provider
    participant API

    Client->>Provider: Chat(request)
    Provider->>API: HTTP POST
    Note over API: Processing...
    API-->>Provider: Complete Response
    Provider->>Provider: Parse & Transform
    Provider-->>Client: ChatResponse

Characteristics:

- Single request/response round trip with straightforward error handling
- The caller sees no output until generation is complete
- Perceived latency equals the full generation time

Streaming Flow

sequenceDiagram
    participant Client
    participant Provider
    participant API

    Client->>Provider: ChatStream(request)
    Provider->>API: HTTP POST (Accept: text/event-stream)
    Provider-->>Client: StreamChunk channel

    loop For each chunk
        API->>Provider: SSE Event
        Provider->>Provider: Parse Event
        Provider->>Client: StreamChunk
    end

    API->>Provider: [DONE]
    Provider->>Client: Final Chunk (with CostInfo)
    Provider->>Client: Close channel

Characteristics:

- Chunks are emitted as SSE events arrive, so first output appears early
- The final chunk carries CostInfo, after which the channel is closed
- Errors terminate the stream via the chunk's Error field

SSE Parser

PromptKit includes a robust SSE (Server-Sent Events) parser:

// Parse SSE stream and emit events
parser := NewSSEParser(reader)
for {
    event, err := parser.Next()
    if err == io.EOF {
        break
    }
    if err != nil {
        return err
    }

    // Process event.Data
    chunk := parseProviderChunk(event.Data)
    chunkChan <- chunk
}

Features:

- Event-by-event iteration, with io.EOF marking the end of the stream
- Read and parse errors are surfaced immediately to the caller
- Provider-specific payloads are decoded from event.Data after SSE framing is handled

Cost Calculation

Each provider implements cost calculation based on token usage:

graph LR
    subgraph "Input"
        Tokens["Token Counts<br/>Input: 1000<br/>Output: 500<br/>Cached: 200"]
        Pricing["Pricing Model<br/>Input: $0.01/1K<br/>Output: $0.03/1K<br/>Cache: $0.001/1K"]
    end

    subgraph "Calculation"
        Calc["CalculateCost()"]
    end

    subgraph "Output"
        CostInfo["CostInfo<br/>Input: $0.01<br/>Output: $0.015<br/>Cache: $0.0002<br/>Total: $0.0252"]
    end

    Tokens --> Calc
    Pricing --> Calc
    Calc --> CostInfo

    style CostInfo fill:#9f9

Cost Tracking Features:

- Separate rates for input, output, and cached tokens
- Per-1K-token pricing configured per provider
- Per-call totals aggregated into a single CostInfo
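
A minimal sketch of the arithmetic behind the diagram, using its illustrative per-1K rates (the CostInfo field names are assumptions):

// Illustrative per-1K rates from the diagram above.
const (
    inputPer1K  = 0.01
    outputPer1K = 0.03
    cachedPer1K = 0.001
)

func calculateCost(inputTokens, outputTokens, cachedTokens int) types.CostInfo {
    input := float64(inputTokens) / 1000 * inputPer1K
    output := float64(outputTokens) / 1000 * outputPer1K
    cached := float64(cachedTokens) / 1000 * cachedPer1K

    return types.CostInfo{ // field names assumed for illustration
        InputCost:  input,                   // 1000 tokens -> $0.0100
        OutputCost: output,                  // 500 tokens  -> $0.0150
        CachedCost: cached,                  // 200 tokens  -> $0.0002
        TotalCost:  input + output + cached, // $0.0252
    }
}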

Multimodal Support

Content Types

PromptKit supports rich content beyond text:

type MessageContent interface {
    ContentType() string
}

type TextContent struct {
    Text string
}

type MediaContent struct {
    Type     string // "image", "audio", "video"
    MimeType string // "image/jpeg", "audio/mp3", etc.
    Data     []byte // Raw data
    URL      string // Or URL reference
}
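
Adapters typically dispatch on the concrete content type. A sketch of an OpenAI-style encoder (the exact wire format shown is illustrative):

// Encode unified content into a provider-specific message part.
func encodeForOpenAI(c MessageContent) (map[string]any, error) {
    switch v := c.(type) {
    case TextContent:
        return map[string]any{"type": "text", "text": v.Text}, nil
    case MediaContent:
        url := v.URL
        if url == "" {
            // Inline data is base64-encoded into a data URL.
            url = "data:" + v.MimeType + ";base64," +
                base64.StdEncoding.EncodeToString(v.Data)
        }
        return map[string]any{
            "type":      "image_url",
            "image_url": map[string]any{"url": url},
        }, nil
    default:
        return nil, fmt.Errorf("unsupported content type %q", c.ContentType())
    }
}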

Provider-Specific Encoding

graph TB
    subgraph "Unified Content"
        TextContent["TextContent"]
        MediaContent["MediaContent<br/>Image/Audio/Video"]
    end

    subgraph "Provider Adapters"
        OpenAIAdapter["OpenAI Adapter<br/>base64 encoding"]
        AnthropicAdapter["Anthropic Adapter<br/>content blocks"]
        GeminiAdapter["Gemini Adapter<br/>inline data"]
    end

    subgraph "Provider APIs"
        OpenAI_API["OpenAI API"]
        Anthropic_API["Anthropic API"]
        Gemini_API["Gemini API"]
    end

    TextContent --> OpenAIAdapter
    TextContent --> AnthropicAdapter
    TextContent --> GeminiAdapter

    MediaContent --> OpenAIAdapter
    MediaContent --> AnthropicAdapter
    MediaContent --> GeminiAdapter

    OpenAIAdapter --> OpenAI_API
    AnthropicAdapter --> Anthropic_API
    GeminiAdapter --> Gemini_API

Provider Differences:

- OpenAI: media is base64-encoded (or URL-referenced) inside message parts
- Anthropic: content is expressed as typed content blocks
- Gemini: media travels as inline data alongside text parts

Error Handling & Resilience

Error Types

// Provider-specific errors
type ProviderError struct {
    Provider  string
    Type      string // "rate_limit", "auth", "invalid_request", "timeout"
    Message   string
    Retryable bool
}

// Error implements the error interface.
func (e *ProviderError) Error() string {
    return fmt.Sprintf("%s: %s error: %s", e.Provider, e.Type, e.Message)
}

// Network errors
type NetworkError struct {
    Operation string
    Err       error
}

// Unsupported provider
type UnsupportedProviderError struct {
    ProviderType string
}

Retry Strategy

graph TD
    Request["Execute Request"]
    Success["Success"]
    Error["Error Occurred"]
    Retryable{"Retryable?"}
    Attempts{"Max Attempts?"}
    Backoff["Exponential Backoff"]
    Fail["Return Error"]

    Request --> Success
    Request --> Error
    Error --> Retryable
    Retryable -->|Yes| Attempts
    Retryable -->|No| Fail
    Attempts -->|No| Backoff
    Attempts -->|Yes| Fail
    Backoff --> Request

    style Success fill:#9f9
    style Fail fill:#f99

Retryable Errors:

- Rate limiting ("rate_limit")
- Timeouts ("timeout") and transient network failures

Non-Retryable Errors:

- Authentication failures ("auth")
- Malformed requests ("invalid_request")
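
A minimal retry helper implementing the flow above, with illustrative attempt and backoff values:

// Retry a chat request with exponential backoff, honoring the
// Retryable flag on ProviderError.
func chatWithRetry(ctx context.Context, p Provider, req ChatRequest) (ChatResponse, error) {
    const maxAttempts = 3
    backoff := time.Second

    var lastErr error
    for attempt := 0; attempt < maxAttempts; attempt++ {
        resp, err := p.Chat(ctx, req)
        if err == nil {
            return resp, nil
        }
        lastErr = err

        var pe *ProviderError
        if errors.As(err, &pe) && !pe.Retryable {
            return ChatResponse{}, err // e.g. auth, invalid_request: fail fast
        }

        select {
        case <-time.After(backoff):
            backoff *= 2 // exponential backoff between attempts
        case <-ctx.Done():
            return ChatResponse{}, ctx.Err()
        }
    }
    return ChatResponse{}, lastErr
}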

Configuration & Defaults

ProviderDefaults

Each provider has sensible defaults:

type ProviderDefaults struct {
    Temperature float32
    TopP        float32
    MaxTokens   int
    Pricing     Pricing
}

OpenAI Defaults:

ProviderDefaults{
    Temperature: 0.7,
    TopP:        1.0,
    MaxTokens:   2048,
    Pricing: Pricing{
        InputCostPer1K:  0.01,
        OutputCostPer1K: 0.03,
    },
}

Override Hierarchy (see the resolution sketch after this list):

  1. Request-level parameters (highest priority)
  2. Provider defaults (from spec)
  3. Global defaults (lowest priority)
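
A sketch of this resolution for a single parameter, treating the zero value as "unset" (a real implementation might use pointers to distinguish unset from an explicit zero):

// Resolve temperature: the request value wins, then the provider's
// spec-level default, then a global fallback.
func resolveTemperature(req ChatRequest, spec ProviderSpec) float32 {
    if req.Temperature != 0 {
        return req.Temperature
    }
    if spec.Defaults.Temperature != 0 {
        return spec.Defaults.Temperature
    }
    return 0.7 // illustrative global default
}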

Tool Calling Integration

Tool Descriptor Format

Unified tool representation:

type ToolDescriptor struct {
    Name         string          // Function name
    Description  string          // What the tool does
    InputSchema  json.RawMessage // JSON Schema for arguments
    OutputSchema json.RawMessage // JSON Schema for results
}
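
Concretely, a descriptor can be built with an inline JSON Schema (the schema below is illustrative):

weatherTool := &ToolDescriptor{
    Name:        "get_weather",
    Description: "Get weather for a location",
    InputSchema: json.RawMessage(`{
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"}
        },
        "required": ["location"]
    }`),
}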

Provider-Specific Transformation

graph LR
    subgraph "Unified Format"
        ToolDesc["ToolDescriptor<br/>JSON Schema"]
    end

    subgraph "BuildTooling()"
        OpenAIBuild["OpenAI<br/>functions array"]
        AnthropicBuild["Anthropic<br/>tools array"]
        GeminiBuild["Gemini<br/>function_declarations"]
    end

    ToolDesc --> OpenAIBuild
    ToolDesc --> AnthropicBuild
    ToolDesc --> GeminiBuild

    style ToolDesc fill:#9f9

Example Transformation:

// Unified format
tool := &ToolDescriptor{
    Name:        "get_weather",
    Description: "Get weather for a location",
    InputSchema: inputSchema, // json.RawMessage holding the JSON Schema
}

// OpenAI format: wrapped in a "function" object
openaiTool := map[string]any{
    "type": "function",
    "function": map[string]any{
        "name":        "get_weather",
        "description": "Get weather for a location",
        "parameters":  inputSchema,
    },
}

// Anthropic format: flat object with "input_schema"
claudeTool := map[string]any{
    "name":         "get_weather",
    "description":  "Get weather for a location",
    "input_schema": inputSchema,
}

Performance Optimization

Connection Pooling

httpClient := &http.Client{
    Transport: &http.Transport{
        MaxIdleConns:        100,
        MaxIdleConnsPerHost: 10,
        IdleConnTimeout:     90 * time.Second,
    },
}

Benefits:

- TCP and TLS handshakes are amortized across requests
- Lower per-request latency under sustained load
- Idle connections are bounded and reaped by the transport settings

Response Caching

Providers can implement optional response caching, returning stored results for repeated identical requests to save cost and latency.

Batch Processing

Batch operations are planned as a future enhancement.

Testing & Observability

Mock Provider

Full-featured mock for testing:

mock := NewMockProvider("test-provider")
mock.AddResponse("Hello", nil)
mock.AddToolCall("get_weather", args, result)
mock.SetLatency(100 * time.Millisecond)

response, err := mock.Chat(ctx, request)
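
In a unit test this might look like the following sketch (using the mock API shown above):

func TestChatWithMock(t *testing.T) {
    mock := NewMockProvider("test-provider")
    mock.AddResponse("Hello", nil)

    resp, err := mock.Chat(context.Background(), ChatRequest{
        Messages: []types.Message{ /* ... */ },
    })
    if err != nil {
        t.Fatalf("Chat failed: %v", err)
    }
    if resp.Content != "Hello" {
        t.Errorf("got %q, want %q", resp.Content, "Hello")
    }
}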

Logging Integration

Providers integrate with the runtime logger:

logger.LLMCall(provider.ID(), "user", len(messages), temperature)
logger.LLMResponse(provider.ID(), "assistant", inputTokens, outputTokens, cost)
logger.LLMError(provider.ID(), "assistant", err)

Metrics

Key metrics to monitor:

- Request latency (including time to first streamed chunk)
- Token usage and cost per call
- Error and retry rates per provider

Best Practices

  1. Use Registry: Let the registry manage provider instances
  2. Handle Errors Gracefully: Implement retries with exponential backoff
  3. Monitor Costs: Track token usage and spending
  4. Test with Mock: Use mock provider for fast, deterministic tests
  5. Stream When Possible: Provide better user experience
  6. Close Resources: Always call Close() to clean up connections
  7. Respect Rate Limits: Implement proper backoff strategies
  8. Validate Input: Check message formats before sending to provider
