Customer Support - Integrated Tools Example

This example demonstrates a customer support chatbot that uses tools to retrieve customer information, check orders, and manage tickets.

Overview

Unlike the basic customer-support example, this version integrates with mock backend systems through tools:

Structure

customer-support-integrated/
├── arena.yaml                    # Main configuration with tool definitions
├── prompts/
│   └── support-bot.yaml         # Prompt with tool usage guidelines
├── scenarios/
│   ├── billing-question.yaml    # Billing discrepancy scenario
│   ├── order-inquiry.yaml       # Order status inquiry
│   ├── account-info.yaml        # Account information request
│   ├── security-test.yaml       # Security and privacy policy testing
│   ├── social-engineering-selfplay.yaml  # Self-play adversarial security test
│   └── tool-test.yaml           # Simple tool usage verification
├── tools/
│   ├── get-customer-info.yaml
│   ├── get-order-history.yaml
│   ├── check-subscription-status.yaml
│   └── create-support-ticket.yaml
└── providers/
    ├── openai-gpt4o-mini.yaml
    ├── claude-3-5-haiku.yaml
    └── gemini-2-0-flash.yaml

Tool Definitions

All tools are defined in arena.yaml with schemas for inputs and outputs:

get_customer_info

Retrieves customer account details by email address.

Input:

{
  "email": "customer@email.com"
}

Output:

{
  "customer_id": "CUST-12345",
  "name": "John Doe",
  "email": "customer@email.com",
  "account_created": "2023-01-15",
  "tier": "premium"
}

get_order_history

Gets recent orders for a customer.

Input:

{
  "email": "customer@email.com",
  "limit": 5
}

Output:

{
  "orders": [
    {
      "order_id": "ORD-2024-1234",
      "date": "2024-10-15",
      "status": "delivered",
      "total": 99.99,
      "items": ["Product A", "Product B"]
    }
  ]
}

check_subscription_status

Checks subscription and billing details.

Input:

{
  "email": "customer@email.com"
}

Output:

{
  "subscription_id": "SUB-7890",
  "plan": "Pro",
  "status": "active",
  "next_billing_date": "2024-11-15",
  "amount": 49.99,
  "last_payment_date": "2024-10-15"
}

create_support_ticket

Creates a support ticket for escalation.

Input:

{
  "email": "customer@email.com",
  "issue_type": "billing",
  "priority": "high",
  "description": "Duplicate charge on credit card"
}

Output:

{
  "ticket_id": "TICKET-98765",
  "status": "open",
  "created_at": "2024-10-23T10:30:00Z"
}

Running the Example

# Run all scenarios across all providers
promptarena run -c examples/customer-support-integrated/arena.yaml

# Run specific scenario
promptarena run -c examples/customer-support-integrated/arena.yaml \
  --scenario billing-question

# Run self-play scenario
promptarena run -c examples/customer-support-integrated/arena.yaml \
  --scenario social-engineering-selfplay

# Run with specific provider
promptarena run -c examples/customer-support-integrated/arena.yaml \
  --provider claude-3-5-haiku

Self-Play Testing

This example includes a self-play scenario (social-engineering-selfplay.yaml) that uses an LLM to dynamically generate adversarial user messages. This demonstrates PromptKit’s ability to:

  1. Simulate realistic attackers: The social-engineer persona uses tactics like urgency, impersonation, and manipulation
  2. Generate varied conversations: Each test run produces different attack patterns
  3. Test security boundaries: Validates that the agent maintains security policies under pressure
  4. Scale adversarial testing: Create many variations without manually scripting each turn

Self-Play Configuration:

The self-play system automatically:

Expected Behavior

The AI agent should:

  1. Identify when to use tools - Recognize customer inquiries that require data lookup
  2. Call appropriate tools - Select the right tool(s) for each situation
  3. Handle tool responses - Parse and present information clearly to customers
  4. Chain tool calls - Use multiple tools in sequence when needed
  5. Create tickets - Escalate complex issues appropriately

Test Scenarios

billing-question.yaml

Customer reports duplicate billing charges. Agent should:

order-inquiry.yaml

Customer asks about delayed order. Agent should:

account-info.yaml

Customer requests account details. Agent should:

security-test.yaml

Adversarial scenario testing security and privacy boundaries. Agent should:

social-engineering-selfplay.yaml

Self-play security test using the social-engineer persona. This scenario demonstrates advanced adversarial testing where an LLM plays the role of an attacker attempting to gain unauthorized access. The social engineer persona will:

Unlike the scripted security-test.yaml, this scenario uses self-play where the adversarial user messages are dynamically generated by an LLM using the social-engineer persona. This creates more realistic and varied attack patterns. Agent should:

tool-test.yaml

Simple verification that tools are being called correctly. Agent should:

Mock Tool Implementation

Note: Currently, tools return mock data. In Day 2 of the roadmap, these will be replaced with real MCP (Model Context Protocol) server implementations.

Mock responses are defined in the tool registry and return realistic sample data for testing the agent’s ability to:

Next Steps

This example serves as a foundation for:

  1. MCP Integration (Day 2) - Replace mock tools with real MCP servers
  2. Real Backend Integration - Connect to actual customer databases and systems
  3. Tool Chaining - More complex multi-step workflows
  4. Error Handling - Graceful degradation when tools fail
  5. Tool Policy Configuration - Control which tools can be used when

Metrics

The arena will measure: