Prometheus Metrics

PromptKit provides built-in Prometheus metrics for monitoring pipeline performance, LLM provider usage, and costs in production environments.

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"

	"github.com/AltairaLabs/PromptKit/runtime/metrics"
	"github.com/AltairaLabs/PromptKit/sdk"
)

func main() {
	// 1. Create a collector; this registers pipeline metrics once per process
	reg := prometheus.NewRegistry()
	collector := metrics.NewCollector(metrics.CollectorOpts{
		Registerer:  reg,
		Namespace:   "myapp",
		ConstLabels: prometheus.Labels{"env": "prod"},
	})

	// 2. Attach to conversations via sdk.WithMetrics()
	conv, err := sdk.Open("./app.pack.json", "chat",
		sdk.WithMetrics(collector, nil),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer conv.Close()

	// 3. Expose via your own HTTP server
	http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	log.Fatal(http.ListenAndServe(":9090", nil))
}
```
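
Once scraped, the exported series look roughly like the following (values, buckets, and the `provider`/`model` label values are illustrative, not part of the API):

```text
# HELP myapp_pipeline_duration_seconds Total pipeline execution duration
# TYPE myapp_pipeline_duration_seconds histogram
myapp_pipeline_duration_seconds_bucket{env="prod",status="success",le="1"} 40
myapp_pipeline_duration_seconds_bucket{env="prod",status="success",le="+Inf"} 42
myapp_pipeline_duration_seconds_sum{env="prod",status="success"} 37.2
myapp_pipeline_duration_seconds_count{env="prod",status="success"} 42
myapp_provider_requests_total{env="prod",provider="openai",model="gpt-4o",status="success"} 42
```

Note how the `ConstLabels` from the collector (`env="prod"`) appear on every series.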
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| `{ns}_pipeline_duration_seconds` | Histogram | `status` | Total pipeline execution duration |
| `{ns}_provider_request_duration_seconds` | Histogram | `provider`, `model` | LLM API call duration |
| `{ns}_provider_requests_total` | Counter | `provider`, `model`, `status` | Total provider API calls |
| `{ns}_provider_input_tokens_total` | Counter | `provider`, `model` | Input tokens sent to provider |
| `{ns}_provider_output_tokens_total` | Counter | `provider`, `model` | Output tokens received from provider |
| `{ns}_provider_cached_tokens_total` | Counter | `provider`, `model` | Cached tokens in provider calls |
| `{ns}_provider_cost_total` | Counter | `provider`, `model` | Total cost in USD |
| `{ns}_tool_call_duration_seconds` | Histogram | `tool` | Tool call execution duration |
| `{ns}_tool_calls_total` | Counter | `tool`, `status` | Total tool call count |
| `{ns}_validation_duration_seconds` | Histogram | `validator`, `validator_type` | Validation check duration |
| `{ns}_validations_total` | Counter | `validator`, `validator_type`, `status` | Validation results (passed/failed) |

Where {ns} is the configured namespace (default: promptkit).

Pack-defined eval metrics (from EvalDef.Metric) are also recorded through the same collector under the {ns}_eval_ sub-namespace. For example, a metric named response_quality_score with namespace myapp becomes myapp_eval_response_quality_score. This separates eval metrics from pipeline metrics, making it easy to query all evals with a pattern like myapp_eval_.*. See Eval Framework for metric types and label configuration.
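
As a sketch, querying these in PromQL might look like the following (`response_quality_score` is the example metric from above; whether `avg_over_time` is appropriate depends on the metric type configured in the pack):

```promql
# All eval metrics under the myapp namespace
{__name__=~"myapp_eval_.*"}

# Rolling one-hour average of a gauge-style eval metric (illustrative)
avg_over_time(myapp_eval_response_quality_score[1h])
```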

For eval-only consumers (e.g. workers using sdk.Evaluate() without a live pipeline), use NewEvalOnlyCollector and pass it via MetricsCollector on EvaluateOpts:

```go
reg := prometheus.NewRegistry()
collector := metrics.NewEvalOnlyCollector(metrics.CollectorOpts{
	Registerer:     reg,
	Namespace:      "myapp",
	InstanceLabels: []string{"tenant"},
})

results, err := sdk.Evaluate(ctx, sdk.EvaluateOpts{
	PackPath:              "./app.pack.json",
	Messages:              messages,
	MetricsCollector:      collector,
	MetricsInstanceLabels: map[string]string{"tenant": "acme"},
})
```

NewEvalOnlyCollector is equivalent to NewCollector with DisablePipelineMetrics: true — it skips registration of provider, tool, pipeline, and validation metrics.

When multiple conversations share one Prometheus endpoint, use instance labels to distinguish them:

```go
collector := metrics.NewCollector(metrics.CollectorOpts{
	Registerer:     reg,
	Namespace:      "myapp",
	InstanceLabels: []string{"tenant", "prompt_name"},
})

conv1, _ := sdk.Open(pack, "support", sdk.WithMetrics(collector, map[string]string{
	"tenant": "acme", "prompt_name": "support",
}))
conv2, _ := sdk.Open(pack, "sales", sdk.WithMetrics(collector, map[string]string{
	"tenant": "globex", "prompt_name": "sales",
}))
```
| Field | Type | Description |
| --- | --- | --- |
| `Registerer` | `prometheus.Registerer` | Registry to register into (default: `prometheus.DefaultRegisterer`) |
| `Namespace` | `string` | Metric name prefix (default: `"promptkit"`) |
| `ConstLabels` | `prometheus.Labels` | Process-level constant labels (e.g. `env`, `region`) |
| `InstanceLabels` | `[]string` | Label names that vary per conversation (e.g. `tenant`, `prompt_name`). Sorted internally, so `Bind()` label order doesn't matter. |
| `DisablePipelineMetrics` | `bool` | Disable operational metrics (for eval-only consumers; or use `NewEvalOnlyCollector`) |
| `DisableEvalMetrics` | `bool` | Disable eval result metrics |

PromptKit includes a pre-built Grafana dashboard at runtime/metrics/grafana/pipeline-dashboard.json.

  1. Open Grafana and navigate to Dashboards > Import
  2. Upload the pipeline-dashboard.json file or paste its contents
  3. Select your Prometheus data source
  4. Click Import

The dashboard includes:

  • Pipeline Overview: Completion rate, error rate, p95 duration, total cost, total tokens
  • Provider Metrics: API latency percentiles, request rate, token consumption, cost breakdown
  • Tool & Validation: Tool call duration, validation pass/fail rates
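
If you prefer to build panels by hand, queries in the spirit of these dashboard sections might look like the following (a sketch, assuming the default `promptkit` namespace):

```promql
# Completion rate: share of pipeline runs finishing with status="success"
sum(rate(promptkit_pipeline_duration_seconds_count{status="success"}[5m]))
  / sum(rate(promptkit_pipeline_duration_seconds_count[5m]))

# Token consumption by provider and model
sum by (provider, model) (
  rate(promptkit_provider_input_tokens_total[5m])
  + rate(promptkit_provider_output_tokens_total[5m]))
```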

Add to your prometheus.yml:

```yaml
scrape_configs:
  - job_name: 'promptkit'
    static_configs:
      - targets: ['localhost:9090']
    scrape_interval: 15s
```

Define alerting rules in Prometheus; Alertmanager then handles routing and notification:

```yaml
groups:
  - name: promptkit
    rules:
      - alert: HighPipelineErrorRate
        expr: |
          sum(rate(promptkit_pipeline_duration_seconds_count{status="error"}[5m]))
            /
          sum(rate(promptkit_pipeline_duration_seconds_count[5m]))
            > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High pipeline error rate"
          description: "More than 10% of pipelines are failing"
      - alert: HighProviderLatency
        expr: |
          histogram_quantile(0.95,
            sum(rate(promptkit_provider_request_duration_seconds_bucket[5m]))
            by (provider, model, le)
          ) > 30
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High provider latency"
          description: "p95 provider latency exceeds 30 seconds"
      - alert: HighTokenConsumption
        expr: |
          (
            sum(increase(promptkit_provider_input_tokens_total[1h]))
            + sum(increase(promptkit_provider_output_tokens_total[1h]))
          ) > 1000000
        for: 1m
        labels:
          severity: info
        annotations:
          summary: "High token consumption"
          description: "Over 1M tokens consumed in the last hour"
```