
State Management

Understanding conversation state and persistence in PromptKit.

State management maintains conversation history across multiple turns, so each request can carry the context of previous interactions.

Context: LLMs need history to understand conversations
Continuity: Users expect the AI to remember
Multi-turn: Enable back-and-forth dialogue
Personalization: Remember user preferences

LLMs are stateless:

// First message
response1 := llm.Predict("What's the capital of France?")
// "Paris"
// Second message - no memory!
response2 := llm.Predict("What about Germany?")
// "What do you mean 'what about Germany'?"

Pass conversation history:

messages := []Message{
    {Role: "user", Content: "What's the capital of France?"},
    {Role: "assistant", Content: "Paris"},
    {Role: "user", Content: "What about Germany?"},
}
response := llm.Predict(messages)
// "The capital of Germany is Berlin"
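The pattern above can be sketched as a small wrapper that resends the accumulated history on every turn. This is only an illustration: `Message`, `Model`, `countingModel`, and `Conversation` are hypothetical local types, not PromptKit APIs, and the fake model simply reports how many messages it received.

```go
package main

import "fmt"

// Message mirrors the shape used in the snippet above; it is a local
// stand-in, not a PromptKit type.
type Message struct {
	Role    string
	Content string
}

// Model is a hypothetical interface for any chat-completion backend.
type Model interface {
	Predict(history []Message) string
}

// countingModel is a fake backend that reports how many messages it saw.
type countingModel struct{}

func (countingModel) Predict(history []Message) string {
	return fmt.Sprintf("saw %d messages", len(history))
}

// Conversation accumulates history and resends all of it on every turn.
type Conversation struct {
	model   Model
	history []Message
}

// Send appends the user message, calls the model with the full history,
// and records the reply so the next turn can see it.
func (c *Conversation) Send(text string) string {
	c.history = append(c.history, Message{Role: "user", Content: text})
	reply := c.model.Predict(c.history)
	c.history = append(c.history, Message{Role: "assistant", Content: reply})
	return reply
}

func main() {
	conv := &Conversation{model: countingModel{}}
	fmt.Println(conv.Send("What's the capital of France?")) // saw 1 messages
	fmt.Println(conv.Send("What about Germany?"))           // saw 3 messages
}
```

The second call sees three messages because the first question and its reply are both replayed; that replay is the only "memory" the model has.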

Each conversation has a session ID:

conv, _ := sdk.Open("./assistant.pack.json", "chat",
    sdk.WithModel("gpt-4o-mini"),
)
defer conv.Close()
response, _ := conv.Send(ctx, "Hello")

Sessions enable:

  • Multi-user support: Separate conversations
  • History isolation: Users don’t see each other’s messages
  • Concurrent access: Multiple requests per session

The Store interface provides Load, Save, and Fork methods:

store := statestore.NewMemoryStore()
// Save conversation state
err := store.Save(ctx, state)
// Load conversation state
state, err := store.Load(ctx, sessionID)
// Fork a conversation (create a branch)
err = store.Fork(ctx, sourceSessionID, newSessionID)
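To make the Fork idea concrete, here is a minimal in-memory sketch. The `convState` and `memStore` types are hypothetical and deliberately simplified (no `context.Context`, plain string messages); they are not PromptKit's statestore implementation. The sketch assumes a fork is an independent copy of the source session, which this page does not confirm for the real store.

```go
package main

import (
	"fmt"
	"sync"
)

// convState is a hypothetical, simplified conversation state.
type convState struct {
	ID       string
	Messages []string
}

// memStore is an illustrative map-backed store guarded by a mutex.
type memStore struct {
	mu     sync.RWMutex
	states map[string]convState
}

func newMemStore() *memStore {
	return &memStore{states: make(map[string]convState)}
}

// cloneState copies the message slice so callers never share backing arrays.
func cloneState(st convState) convState {
	st.Messages = append([]string(nil), st.Messages...)
	return st
}

// Save stores a copy of the state under its ID.
func (s *memStore) Save(st convState) error {
	if st.ID == "" {
		return fmt.Errorf("state has no ID")
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.states[st.ID] = cloneState(st)
	return nil
}

// Load returns a copy of the stored state, or an error if it is unknown.
func (s *memStore) Load(id string) (convState, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	st, ok := s.states[id]
	if !ok {
		return convState{}, fmt.Errorf("session %q not found", id)
	}
	return cloneState(st), nil
}

// Fork copies the source session under a new ID; later writes to either
// session do not affect the other.
func (s *memStore) Fork(sourceID, newID string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	src, ok := s.states[sourceID]
	if !ok {
		return fmt.Errorf("session %q not found", sourceID)
	}
	forked := cloneState(src)
	forked.ID = newID
	s.states[newID] = forked
	return nil
}

func main() {
	store := newMemStore()
	_ = store.Save(convState{ID: "main", Messages: []string{"Hello"}})
	_ = store.Fork("main", "branch")

	branch, _ := store.Load("branch")
	branch.Messages = append(branch.Messages, "What if...?")
	_ = store.Save(branch)

	original, _ := store.Load("main")
	fmt.Println(len(original.Messages), len(branch.Messages)) // 1 2
}
```

Copying on both Save and Load keeps callers from mutating stored history by accident, at the cost of extra allocations.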

Fast, but not persistent:

store := statestore.NewMemoryStore()

Pros: Very fast (~1-10µs)
Cons: Lost on restart, single-instance only
Use for: Development, testing, demos

Persistent and scalable:

redisClient := redis.NewClient(&redis.Options{
    Addr: "localhost:6379",
})
store := statestore.NewRedisStore(redisClient)

Pros: Persistent, multi-instance, TTL support
Cons: Slower (~1-5ms), requires Redis
Use for: Production, distributed systems

Control history size by trimming messages after loading:

messages, _ := store.Load(sessionID)
if len(messages) > 20 {
    messages = messages[len(messages)-20:]
}

Benefits:

  • Lower costs (fewer tokens)
  • Faster loading
  • More relevant context
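The trimming step can be wrapped in a small helper. This is a sketch under assumptions: `Message` is a local stand-in type, `trimHistory` is a hypothetical name, and treating a non-positive limit as "unlimited" is a choice made for this example.

```go
package main

import "fmt"

// Message is a local stand-in for a chat message.
type Message struct {
	Role    string
	Content string
}

// trimHistory returns at most limit of the most recent messages.
// A non-positive limit is treated as "no limit" (an assumption of this sketch).
func trimHistory(messages []Message, limit int) []Message {
	if limit <= 0 || len(messages) <= limit {
		return messages
	}
	// Copy so the result does not pin the older messages in memory.
	trimmed := make([]Message, limit)
	copy(trimmed, messages[len(messages)-limit:])
	return trimmed
}

func main() {
	var history []Message
	for i := 0; i < 25; i++ {
		history = append(history, Message{Role: "user", Content: fmt.Sprintf("msg-%d", i)})
	}
	recent := trimHistory(history, 20)
	fmt.Println(len(recent), recent[0].Content) // 20 msg-5
}
```

A message-count limit is only a rough proxy for token cost; long messages can still exceed a model's context window.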

One session per user:

sessionID := fmt.Sprintf("user-%s", userID)

Use case: Single ongoing conversation per user

Multiple conversations per user:

sessionID := fmt.Sprintf("user-%s-conv-%s", userID, conversationID)

Use case: User can start multiple conversations

Anonymous sessions:

sessionID := uuid.New().String()

Use case: Guest users, no account required
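The three strategies can be combined into one helper that picks a scheme from what is known about the caller. This is a sketch: `sessionIDFor` is a hypothetical name, the `anon-` prefix is an assumption added for readability, and random bytes from `crypto/rand` stand in for `uuid.New().String()` so the example needs only the standard library.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// sessionIDFor picks an ID scheme from what is known about the caller.
func sessionIDFor(userID, conversationID string) (string, error) {
	switch {
	case userID == "":
		// Guest: random 128-bit ID (stands in for uuid.New().String()).
		buf := make([]byte, 16)
		if _, err := rand.Read(buf); err != nil {
			return "", err
		}
		return "anon-" + hex.EncodeToString(buf), nil
	case conversationID == "":
		// One ongoing conversation per user.
		return fmt.Sprintf("user-%s", userID), nil
	default:
		// Several conversations per user.
		return fmt.Sprintf("user-%s-conv-%s", userID, conversationID), nil
	}
}

func main() {
	single, _ := sessionIDFor("42", "")
	multi, _ := sessionIDFor("42", "7")
	guest, _ := sessionIDFor("", "")
	fmt.Println(single)     // user-42
	fmt.Println(multi)      // user-42-conv-7
	fmt.Println(len(guest)) // 37: "anon-" plus 32 hex characters
}
```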

Workflows add a state machine layer on top of conversation state. Each workflow tracks:

  • Current state — Which state the machine is in
  • Transition history — All state transitions with timestamps
  • Per-state conversations — Each state has its own conversation history

Workflow states can control their own persistence behavior:

{
  "states": {
    "intake": {
      "prompt_task": "greeting",
      "persistence": "transient"
    },
    "specialist": {
      "prompt_task": "specialist",
      "persistence": "persistent"
    }
  }
}
  • transient — State is not persisted to the store (ephemeral interactions)
  • persistent — State is saved to the store (important conversations)
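As an illustration of how a host application might read these settings, the sketch below parses the fragment with `encoding/json`. The struct and function names are hypothetical, not PromptKit types. How PromptKit treats a state with no `persistence` field is not covered here, so the helper only reports an explicit `"persistent"` value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stateConfig holds the two fields used in the fragment above.
type stateConfig struct {
	PromptTask  string `json:"prompt_task"`
	Persistence string `json:"persistence"`
}

// workflowConfig maps state names to their configuration.
type workflowConfig struct {
	States map[string]stateConfig `json:"states"`
}

// parseWorkflow decodes a workflow fragment.
func parseWorkflow(data []byte) (workflowConfig, error) {
	var cfg workflowConfig
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

// isPersistent reports whether a state explicitly opts in to persistence.
func isPersistent(cfg workflowConfig, name string) bool {
	return cfg.States[name].Persistence == "persistent"
}

const sample = `{
  "states": {
    "intake":     {"prompt_task": "greeting",   "persistence": "transient"},
    "specialist": {"prompt_task": "specialist", "persistence": "persistent"}
  }
}`

func main() {
	cfg, err := parseWorkflow([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(isPersistent(cfg, "intake"))     // false
	fmt.Println(isPersistent(cfg, "specialist")) // true
}
```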

The workflow.Context captures the full machine state for persistence and resumption:

wf, _ := sdk.OpenWorkflow("./support.pack.json")
// ... interact with the workflow ...
// Get the full workflow context for persistence
wfCtx := wf.Context()
fmt.Println(wfCtx.CurrentState) // "specialist"
fmt.Println(len(wfCtx.History)) // number of transitions
// Resume later
wf, _ = sdk.ResumeWorkflow("workflow-id", "./support.pack.json")

When enabled, summaries from previous states are injected into the next state’s conversation as context, enabling continuity across state transitions:

wf, _ := sdk.OpenWorkflow("./support.pack.json",
    sdk.WithContextCarryForward(true),
)

Each workflow state can declare who holds control after a transition into it:

{
  "states": {
    "triage": {
      "prompt_task": "triage",
      "on_event": { "Routed": "routed" }
    },
    "routed": {
      "prompt_task": "routed",
      "control": "agent",
      "on_event": {
        "ToBilling": "billing",
        "ToTechnical": "technical"
      }
    },
    "billing": { "prompt_task": "billing", "terminal": true },
    "technical": { "prompt_task": "technical", "terminal": true }
  }
}
  • control: user (default) — the agent’s turn ends after the transition. The state machine commits at end-of-pipeline-turn and the user (or selfplay driver) speaks next.
  • control: agent — the agent keeps the turn after the transition. The state machine commits eagerly inside the pipeline tool loop so subsequent LLM calls in the same turn see the new state’s events and can fire further transitions.

In the example above, the agent reads the user’s first message in triage, fires Routed to enter routed (eager commit, agent keeps the turn), then immediately fires ToBilling or ToTechnical to land in the specialist state — two transitions, one pipeline turn, no extra user message required. See the workflow-router example for the full pack.

When to use which:

Scenario                                                  | Use
----------------------------------------------------------|--------------------------
Standard conversational state (intake → reply → next msg) | user (default)
Transient routing state (router → destination)            | agent
Planner → executor chain inside a single turn             | agent on the intermediate
User must reply before the next state can run             | user

The destination state’s control is what determines behavior, not the source. A user-controlled state may transition into an agent-controlled state and vice versa.
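The turn-taking rule can be modeled with a toy state machine. This is a simplified simulation, not PromptKit's pipeline: the `state` type and `runTurn` function are hypothetical, and the events the agent "fires" are supplied as a fixed list. The turn continues only while each destination state is agent-controlled and non-terminal.

```go
package main

import "fmt"

// state is a pared-down model of a workflow state.
type state struct {
	Control  string            // "agent", or "user"/empty for the default
	Terminal bool              // terminal states end the chain
	OnEvent  map[string]string // event name -> destination state
}

// runTurn applies events in order within one pipeline turn and returns the
// final state plus the number of transitions taken. The destination state's
// control decides whether the agent keeps the turn.
func runTurn(states map[string]state, current string, events []string) (string, int) {
	transitions := 0
	for _, ev := range events {
		next, ok := states[current].OnEvent[ev]
		if !ok {
			break // event is not valid in the current state
		}
		current = next
		transitions++
		dest := states[current]
		if dest.Terminal || dest.Control != "agent" {
			break // the user speaks next, or the workflow has ended
		}
	}
	return current, transitions
}

// routerStates mirrors the triage/routed/billing/technical example above.
func routerStates() map[string]state {
	return map[string]state{
		"triage": {OnEvent: map[string]string{"Routed": "routed"}},
		"routed": {Control: "agent", OnEvent: map[string]string{
			"ToBilling": "billing", "ToTechnical": "technical",
		}},
		"billing":   {Terminal: true},
		"technical": {Terminal: true},
	}
}

func main() {
	final, n := runTurn(routerStates(), "triage", []string{"Routed", "ToBilling"})
	fmt.Println(final, n) // billing 2
}
```

If `routed` were user-controlled instead, the same event list would stop after one transition, leaving the workflow in `routed` until the next user message.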

System-prompt scope inside an agent-controlled chain. Because eager commits happen inside the pipeline tool loop, the LLM’s system prompt stays anchored to the source state for the duration of the chain — the new state’s prompt_task, description, and available events are surfaced to the model via the workflow__transition tool result, not by re-templating the system prompt. Treat agent-controlled states as transient: don’t put substantive persona or behavior in their prompt_task. The conversation only opens a fresh prompt for the state the chain finally lands in (the first user-controlled state, or a terminal state).

Use Redis in production

// Production
store := statestore.NewRedisStore(redisClient)
// Development
store := statestore.NewMemoryStore()

Set appropriate limits

// Balance context vs cost
MaxMessages: 20 // 10-20 is a good range for most cases

Set TTL for privacy

TTL: 24 * time.Hour // Delete old conversations

Handle errors gracefully

messages, err := store.Load(sessionID)
if err != nil {
    // Continue with empty history
    messages = []Message{}
}

Don’t store infinite history - Cost and performance
Don’t use in-memory in production - Not persistent
Don’t forget to clean up - Privacy and storage
Don’t ignore errors - Handle store failures

User A → [Instance 1] ↘
                        [Redis Store]
User B → [Instance 2] ↗

Benefits:

  • High availability
  • Horizontal scaling
  • Shared state

Route user to same instance:

User (session-123) → Instance 1 (every time)
User (session-456) → Instance 2 (every time)

Benefits:

  • Local caching
  • Reduced Redis load
  • Lower latency
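One simple way to get this stickiness is to hash the session ID onto the list of instances. The sketch below is illustrative only: `pickInstance` is a hypothetical helper, and in practice this decision is usually made by a load balancer rather than application code.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickInstance maps a session ID to one instance deterministically, so the
// same session always lands on the same instance while the pool is unchanged.
func pickInstance(sessionID string, instances []string) string {
	if len(instances) == 0 {
		return ""
	}
	h := fnv.New32a()
	h.Write([]byte(sessionID)) // Write on a hash.Hash never returns an error
	return instances[h.Sum32()%uint32(len(instances))]
}

func main() {
	instances := []string{"instance-1", "instance-2"}
	first := pickInstance("session-123", instances)
	again := pickInstance("session-123", instances)
	fmt.Println(first == again) // true
}
```

Plain modulo hashing remaps most sessions whenever the pool size changes; consistent hashing is the usual remedy if instances come and go frequently.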

Limit history size:

// Fast: Load 10 messages
MaxMessages: 10
// Slow: Load all messages
MaxMessages: -1 // Unlimited

Load on demand:

// Only load when needed
var messages []Message
if requiresHistory {
    messages, _ = store.Load(sessionID)
}

For large histories:

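// Pseudocode: Go's compress/gzip has no one-shot Compress function;
// wrap a gzip.Writer in a helper to get this shape.
compressed := gzip.Compress(messages)
store.Save(sessionID, compressed)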
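A runnable version of the compression step might look like the sketch below, which serializes messages to JSON and gzips the result using only the standard library. `Message`, `compressMessages`, and `decompressMessages` are hypothetical names; how the bytes are then written to a store depends on the store's API and is left out.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
)

// Message is a local stand-in for a chat message.
type Message struct {
	Role    string
	Content string
}

// compressMessages JSON-encodes the history and gzips the result.
func compressMessages(messages []Message) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if err := json.NewEncoder(zw).Encode(messages); err != nil {
		return nil, err
	}
	// Close flushes the remaining compressed data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompressMessages reverses compressMessages.
func decompressMessages(data []byte) ([]Message, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	var messages []Message
	if err := json.NewDecoder(zr).Decode(&messages); err != nil {
		return nil, err
	}
	return messages, nil
}

func main() {
	history := []Message{
		{Role: "user", Content: "Hello"},
		{Role: "assistant", Content: "Hi, how can I help?"},
	}
	packed, err := compressMessages(history)
	if err != nil {
		panic(err)
	}
	restored, err := decompressMessages(packed)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(restored), restored[1].Role) // 2 assistant
}
```

Compression pays off for long histories; for a handful of short messages the gzip header and footer can make the payload larger than the raw JSON.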

Track metrics:

type StateMetrics struct {
    ActiveSessions int
    AvgHistorySize int
    LoadLatency    time.Duration
    StorageUsed    int64
}

Set alerts:

if metrics.ActiveSessions > 10000 {
    alert.Send("High active session count")
}
if metrics.LoadLatency > 100*time.Millisecond {
    alert.Send("Slow state loading")
}

Unit test:

func TestStateStore(t *testing.T) {
    store := statestore.NewMemoryStore()
    // Save messages
    messages := []types.Message{
        {Role: "user", Content: "Hello"},
    }
    err := store.Save("session-1", messages)
    assert.NoError(t, err)
    // Load messages
    loaded, err := store.Load("session-1")
    assert.NoError(t, err)
    assert.Equal(t, messages, loaded)
}

Integration test:

func TestConversationFlow(t *testing.T) {
    ctx := context.Background()
    pipe := createPipelineWithState()
    sessionID := "test-session"
    // First message
    _, err := pipe.ExecuteWithSession(ctx, sessionID, "user", "My name is Alice")
    assert.NoError(t, err)
    // Second message - should remember
    result2, err := pipe.ExecuteWithSession(ctx, sessionID, "user", "What's my name?")
    assert.NoError(t, err)
    assert.Contains(t, result2.Response.Content, "Alice")
}

State management provides:

Context - LLMs remember conversations
Continuity - Multi-turn dialogue
Scalability - Redis for distributed systems
Performance - Configurable limits
Privacy - TTL-based cleanup