
Provider System

Understanding how Runtime abstracts LLM providers.

Runtime uses a provider abstraction to work with multiple LLM services (OpenAI, Anthropic Claude, Google Gemini) through a unified interface.

All providers implement the same interface:

type Provider interface {
	Complete(ctx context.Context, messages []Message, config *ProviderConfig) (*ProviderResponse, error)
	CompleteStream(ctx context.Context, messages []Message, config *ProviderConfig) (StreamReader, error)
	GetProviderName() string
	Close() error
}

This allows code like:

// Same code works with any provider
result, err := provider.Complete(ctx, messages, config)

Without abstraction:

// Tied to OpenAI
response := openai.ChatCompletion(...)
// Want to switch to Claude? Rewrite everything
response := anthropic.Messages(...)
// Different APIs, different parameters, different response formats

With abstraction:

// Works with any provider
var provider types.Provider
// OpenAI
provider = openai.NewOpenAIProvider(...)
// Or Claude
provider = anthropic.NewAnthropicProvider(...)
// Or Gemini
provider = gemini.NewGeminiProvider(...)
// Same code!
response, err := provider.Complete(ctx, messages, config)

This abstraction enables several benefits:

1. Provider Independence

  • Switch providers without code changes
  • Test with different models easily
  • Compare provider performance

2. Fallback Strategies

  • Try multiple providers automatically
  • Graceful degradation
  • Increased reliability

3. Cost Optimization

  • Route to cheapest provider
  • Use expensive models only when needed
  • Track costs across providers

4. Testing

  • Mock providers for unit tests
  • Predictable test behavior
  • No API calls in tests

Complete performs a synchronous completion:

Complete(ctx context.Context, messages []Message, config *ProviderConfig) (*ProviderResponse, error)

Parameters:

  • ctx: Timeout and cancellation
  • messages: Conversation history
  • config: Model parameters (temperature, max tokens, etc.)

Returns:

  • ProviderResponse: LLM’s response
  • error: Any errors
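
Putting these together, a basic call might look like the following sketch. It assumes a provider constructed as shown earlier and uses the Message and ProviderConfig types described later on this page:

ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

messages := []Message{
	{Role: "system", Content: "You are a helpful assistant."},
	{Role: "user", Content: "Summarize this document in one sentence."},
}
config := &ProviderConfig{MaxTokens: 512, Temperature: 0.7}

result, err := provider.Complete(ctx, messages, config)
if err != nil {
	log.Fatalf("completion failed: %v", err)
}
fmt.Printf("%+v\n", result) // inspect the ProviderResponse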

CompleteStream performs a streaming completion:

CompleteStream(ctx context.Context, messages []Message, config *ProviderConfig) (StreamReader, error)

Returns a stream reader for real-time output.
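
The StreamReader API is not detailed in this section; as a rough sketch only, assuming hypothetical Recv and Close methods and io.EOF signalling the end of the stream, consumption might look like:

stream, err := provider.CompleteStream(ctx, messages, config)
if err != nil {
	return err
}
defer stream.Close() // assumed cleanup method

for {
	chunk, err := stream.Recv() // hypothetical: next text chunk
	if errors.Is(err, io.EOF) {
		break // stream finished
	}
	if err != nil {
		return err
	}
	fmt.Print(chunk) // display tokens as they arrive
}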

GetProviderName() string // Returns "openai", "claude", "gemini"
Close() error // Cleanup resources

Unified config works across all providers:

type ProviderConfig struct {
	MaxTokens     int      // Output limit
	Temperature   float64  // Randomness (0.0-2.0)
	TopP          float64  // Nucleus sampling
	Seed          *int     // Reproducibility
	StopSequences []string // Stop generation
}
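
For example, a request tuned for short, deterministic output might use a config like this (the specific values are illustrative):

seed := 42
config := &ProviderConfig{
	MaxTokens:     256,              // cap the response length
	Temperature:   0.0,              // minimize randomness
	TopP:          1.0,              // consider the full token distribution
	Seed:          &seed,            // request reproducible output where supported
	StopSequences: []string{"\n\n"}, // stop at the first blank line
}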

Each provider has sensible defaults:

// OpenAI defaults
openai.DefaultProviderDefaults() // temperature: 1.0, max_tokens: 4096
// Claude defaults
anthropic.DefaultProviderDefaults() // temperature: 1.0, max_tokens: 4096
// Gemini defaults
gemini.DefaultProviderDefaults() // temperature: 0.9, max_tokens: 8192

OpenAI

Models: GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-3.5-turbo

Features:

  • Function calling
  • JSON mode
  • Vision (image inputs)
  • Streaming
  • Reproducible outputs (seed)

Pricing: Per-token, varies by model

API: REST over HTTPS

Anthropic Claude

Models: Claude 3.5 Sonnet, Claude 3.5 Haiku, Claude 3 Opus

Features:

  • Function calling
  • Vision (image inputs)
  • Long context (200K tokens)
  • Streaming
  • System prompts

Pricing: Per-token, varies by model

API: REST over HTTPS

Google Gemini

Models: Gemini 1.5 Pro, Gemini 1.5 Flash

Features:

  • Function calling
  • Vision and video inputs
  • Long context (1M+ tokens)
  • Streaming
  • Multimodal understanding

Pricing: Per-token, free tier available

API: REST over HTTPS

Runtime uses a common message format:

type Message struct {
	Role       string
	Content    string
	ToolCalls  []MessageToolCall
	ToolCallID string
}

Roles:

  • system: System instructions
  • user: User messages
  • assistant: AI responses
  • tool: Tool execution results
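
A short conversation using these roles might be assembled as below; the tool-call entry and the "call_123" ID are placeholders showing how ToolCallID links a tool result back to the assistant's tool call:

messages := []Message{
	{Role: "system", Content: "You are a weather assistant."},
	{Role: "user", Content: "What's the weather in Paris?"},
	{Role: "assistant", ToolCalls: []MessageToolCall{ /* tool call emitted by the model */ }},
	{Role: "tool", ToolCallID: "call_123", Content: `{"temp_c": 18, "conditions": "cloudy"}`},
	{Role: "assistant", Content: "It's currently 18°C and cloudy in Paris."},
}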

Each provider translates to its native format:

Runtime → OpenAI:

{Role: "user", Content: "Hello"}
{role: "user", content: "Hello"}

Runtime → Claude:

{Role: "user", Content: "Hello"}
{role: "user", content: "Hello"}

Runtime → Gemini:

{Role: "user", Content: "Hello"}
{role: "user", parts: [{text: "Hello"}]}

This translation is invisible to users.

All providers support function calling:

type ToolDef struct {
	Name        string
	Description string
	Parameters  json.RawMessage // JSON schema
}
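
For example, a weather tool matching the provider formats below could be declared like this (the location parameter is illustrative):

weatherTool := ToolDef{
	Name:        "get_weather",
	Description: "Get current weather",
	Parameters: json.RawMessage(`{
		"type": "object",
		"properties": {
			"location": {"type": "string", "description": "City name"}
		},
		"required": ["location"]
	}`),
}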

OpenAI:

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get current weather",
    "parameters": { ... }
  }
}

Claude:

{
  "name": "get_weather",
  "description": "Get current weather",
  "input_schema": { ... }
}

Gemini:

{
  "name": "get_weather",
  "description": "Get current weather",
  "parameters": { ... }
}

Runtime handles conversion automatically.

Decision: All providers implement the same interface

Rationale:

  • Enables provider-agnostic code
  • Simplifies switching and testing
  • Reduces coupling to vendor APIs

Trade-off: Can’t expose provider-specific features directly. Instead, features are added to the interface when widely supported.

Decision: One provider instance per LLM service

Rationale:

  • Clear resource ownership
  • Independent configuration
  • Separate connection pools
  • Explicit lifecycle management

Alternative considered: A multi-provider registry, rejected as too complex.

Decision: Providers translate directly, no adapter layer

Rationale:

  • Simpler implementation
  • Fewer layers
  • Better performance
  • Easier debugging

Trade-off: Translation code lives in each provider. This is acceptable as translation is straightforward.

Try providers in order:

providers := []types.Provider{primary, secondary, tertiary}

for _, provider := range providers {
	result, err := provider.Complete(ctx, messages, config)
	if err == nil {
		return result, nil
	}
	log.Printf("Provider %s failed: %v", provider.GetProviderName(), err)
}
return nil, errors.New("all providers failed")

Distribute across providers:

type LoadBalancer struct {
	providers []types.Provider
	current   int
}

func (lb *LoadBalancer) Execute(...) (*ProviderResponse, error) {
	provider := lb.providers[lb.current%len(lb.providers)]
	lb.current++
	return provider.Complete(...)
}

Route to cheapest provider:

func selectProvider(taskComplexity string) types.Provider {
	switch taskComplexity {
	case "simple":
		return openaiMini // Cheapest
	case "complex":
		return claude // Best quality
	case "long_context":
		return gemini // Largest context
	default:
		return openaiMini
	}
}

Runtime includes a mock provider:

mockProvider := mock.NewMockProvider()
mockProvider.SetResponse("Hello! How can I help?")

pipe := pipeline.NewPipeline(
	middleware.ProviderMiddleware(mockProvider, nil, nil, config),
)

result, _ := pipe.Execute(ctx, "user", "Hi")
// result.Response.Content == "Hello! How can I help?"

Benefits:

  • No API calls
  • Predictable responses
  • Fast tests
  • No cost

Providers maintain HTTP connection pools:

// Good: Reuse provider
provider := openai.NewOpenAIProvider(...)
defer provider.Close()

for _, prompt := range prompts {
	provider.Complete(ctx, messages, config) // Reuses connections
}

Always close providers:

provider := openai.NewOpenAIProvider(...)
defer provider.Close() // Essential!

Without closing:

  • Connection leaks
  • Goroutine leaks
  • Memory growth

Use contexts for timeouts:

ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
result, err := provider.Complete(ctx, messages, config)

The provider interface can be extended to support:

New Features:

  • Audio generation
  • Image generation
  • Embeddings
  • Fine-tuned models

New Providers:

  • Mistral
  • Cohere
  • Local models (Ollama)
  • Custom endpoints
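
Because the contract is just the Provider interface shown at the top of this page, adding a new provider only means implementing its four methods. A minimal skeleton for a hypothetical custom endpoint might look like this (the struct fields and the "custom" name are assumptions, not an existing Runtime package):

type CustomProvider struct {
	baseURL string       // hypothetical endpoint address
	client  *http.Client // reused HTTP connection pool
}

func (p *CustomProvider) Complete(ctx context.Context, messages []Message, config *ProviderConfig) (*ProviderResponse, error) {
	// Translate messages and config into the endpoint's request format,
	// call the API, then map the reply back into a ProviderResponse.
	return nil, errors.New("not implemented")
}

func (p *CustomProvider) CompleteStream(ctx context.Context, messages []Message, config *ProviderConfig) (StreamReader, error) {
	return nil, errors.New("streaming not implemented")
}

func (p *CustomProvider) GetProviderName() string { return "custom" }

func (p *CustomProvider) Close() error { return nil } // nothing to clean up in this sketch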

Provider abstraction provides:

  • Vendor Independence: Switch providers easily
  • Unified API: Same code for all providers
  • Fallback Support: Try multiple providers
  • Testing: Mock providers for tests
  • Extensibility: Add providers without breaking changes

Related patterns:

  • Strategy pattern (Gang of Four)
  • Adapter pattern for API translation
  • Repository pattern for provider management