Analyze Documents

Send PDF documents to LLMs for analysis, summarization, comparison, and information extraction.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/AltairaLabs/PromptKit/sdk"
)

func main() {
	ctx := context.Background()

	conv, err := sdk.Open("./app.pack.json", "document-analyzer")
	if err != nil {
		log.Fatal(err)
	}
	defer conv.Close()

	// Analyze a PDF document
	resp, err := conv.Send(ctx, "Summarize this document",
		sdk.WithDocumentFile("./report.pdf"),
	)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Text())
}

Currently, the SDK supports PDF documents only.

| Format | MIME Type | Status |
|---|---|---|
| PDF | application/pdf | ✅ Supported |
| Word (.docx) | application/vnd.openxmlformats-officedocument.wordprocessingml.document | ❌ Not yet |
| Text (.txt) | text/plain | ❌ Not yet |
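
Because only PDF is accepted today, it can be useful to reject other formats before calling the SDK. The following is a minimal sketch using only the standard library; `supportedDocumentMIME` and `documentMIME` are hypothetical helpers written for this guide, not part of the SDK.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// supportedDocumentMIME maps file extensions to accepted MIME types.
// Only PDF is listed, matching the table above. This map is a
// hypothetical helper, not an SDK type.
var supportedDocumentMIME = map[string]string{
	".pdf": "application/pdf",
}

// documentMIME returns the MIME type for path if its extension is
// supported, or an error describing the unsupported extension.
func documentMIME(path string) (string, error) {
	ext := strings.ToLower(filepath.Ext(path))
	mime, ok := supportedDocumentMIME[ext]
	if !ok {
		return "", fmt.Errorf("unsupported document format %q: only PDF is supported", ext)
	}
	return mime, nil
}

func main() {
	for _, p := range []string{"report.pdf", "notes.docx", "README.txt"} {
		mime, err := documentMIME(p)
		if err != nil {
			fmt.Println(p, "->", err)
			continue
		}
		fmt.Println(p, "->", mime)
	}
}
```

The check is based on the file extension only, so it catches obvious mistakes but does not verify the file's contents.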

Document analysis support varies by provider:

| Provider | Max Size | Models | Notes |
|---|---|---|---|
| Claude | 32MB | Haiku, Sonnet, Opus | Best for complex documents |
| Gemini | 20MB | Flash, Pro | Good for quick analysis |
| OpenAI | Varies | GPT-4V | Check model documentation |

// Claude Haiku - fast, cost-effective
conv, _ := sdk.Open("./app.pack.json", "doc-analyzer",
	sdk.WithModel("claude-3-5-haiku-20241022"),
)

// Gemini Flash - very fast
conv, _ := sdk.Open("./app.pack.json", "doc-analyzer",
	sdk.WithModel("gemini-2.0-flash"),
)

The most common approach is to load a PDF from disk:

resp, err := conv.Send(ctx, "What are the key findings?",
sdk.WithDocumentFile("./research-paper.pdf"),
)

Load PDF data from memory (for example, from a database or an API response):

pdfBytes, err := os.ReadFile("./document.pdf")
if err != nil {
log.Fatal(err)
}
resp, err := conv.Send(ctx, "Summarize this",
sdk.WithDocumentData(pdfBytes, "application/pdf"),
)
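
When bytes arrive from a database or an API, it may be worth confirming that they really are a PDF before attaching them. The sketch below uses the standard library's `http.DetectContentType`, which sniffs the leading bytes of the data; `isPDF` is a hypothetical helper, not an SDK function.

```go
package main

import (
	"fmt"
	"net/http"
)

// isPDF reports whether data looks like a PDF based on its leading bytes.
// http.DetectContentType considers at most the first 512 bytes and
// recognizes the "%PDF-" signature as application/pdf.
func isPDF(data []byte) bool {
	return http.DetectContentType(data) == "application/pdf"
}

func main() {
	fmt.Println(isPDF([]byte("%PDF-1.7\n...")))   // true
	fmt.Println(isPDF([]byte("just plain text"))) // false
}
```

If `isPDF` returns true, passing `"application/pdf"` to `sdk.WithDocumentData` as shown above is a reasonable next step. Sniffing only checks the file signature, so a truncated or corrupt PDF would still pass.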

Attach multiple PDFs for comparison or analysis:

resp, err := conv.Send(ctx, "Compare these two contracts",
sdk.WithDocumentFile("./contract_v1.pdf"),
sdk.WithDocumentFile("./contract_v2.pdf"),
)

Combine documents with images for comprehensive analysis:

resp, err := conv.Send(ctx, "Analyze the document and diagram",
	sdk.WithDocumentFile("./spec.pdf"),
	sdk.WithImageFile("./architecture.png"),
)

// Summarization
resp, err := conv.Send(ctx,
	"Provide a concise 3-paragraph summary of the key points",
	sdk.WithDocumentFile("./report.pdf"),
)

// Data extraction
resp, err := conv.Send(ctx,
	"Extract all dates, names, and financial figures into a structured format",
	sdk.WithDocumentFile("./invoice.pdf"),
)

// Comparison
resp, err := conv.Send(ctx,
	"List all changes between version 1 and version 2",
	sdk.WithDocumentFile("./v1.pdf"),
	sdk.WithDocumentFile("./v2.pdf"),
)

// Question answering
resp, err := conv.Send(ctx,
	"What is the refund policy described in this document?",
	sdk.WithDocumentFile("./terms.pdf"),
)

// Translation
resp, err := conv.Send(ctx,
	"Translate this document to Spanish, preserving formatting",
	sdk.WithDocumentFile("./contract.pdf"),
)

// Error handling: file too large
resp, err := conv.Send(ctx, "Summarize",
	sdk.WithDocumentFile("./large.pdf"),
)
if err != nil {
	if strings.Contains(err.Error(), "size exceeds") {
		log.Fatal("PDF too large - max 32MB for Claude, 20MB for Gemini")
	}
	log.Fatal(err)
}

// Error handling: file not found
resp, err := conv.Send(ctx, "Analyze",
	sdk.WithDocumentFile("./missing.pdf"),
)
if err != nil {
	// errors.Is also matches when the not-exist error has been wrapped
	if errors.Is(err, os.ErrNotExist) {
		log.Fatal("PDF file not found")
	}
	log.Fatal(err)
}

// Error handling: unsupported format (currently only PDFs are supported)
resp, err := conv.Send(ctx, "Analyze",
	sdk.WithDocumentData(wordBytes, "application/msword"), // ❌ Not supported yet
)

Keep documents under size limits:

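// Check file size before sending
info, err := os.Stat("document.pdf")
if err != nil {
	log.Fatal(err)
}
sizeInMB := float64(info.Size()) / (1024 * 1024)
// 20MB is the smallest documented limit (Gemini); Claude accepts up to 32MB
if sizeInMB > 20 {
	log.Printf("Warning: Large PDF (%.1fMB) - may fail with some providers", sizeInMB)
}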
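
The same check can be made provider-aware. The sketch below encodes the limits quoted in this guide (32MB for Claude, 20MB for Gemini); the provider keys and the `checkDocumentSize` helper are hypothetical names chosen for illustration, not SDK identifiers.

```go
package main

import "fmt"

const mb = 1024 * 1024

// maxDocumentBytes holds the per-provider limits quoted in this guide.
// The keys are illustrative labels, not SDK identifiers.
var maxDocumentBytes = map[string]int64{
	"claude": 32 * mb,
	"gemini": 20 * mb,
}

// checkDocumentSize returns an error if size exceeds the known limit
// for provider, or if no limit is known for that provider.
func checkDocumentSize(provider string, size int64) error {
	limit, ok := maxDocumentBytes[provider]
	if !ok {
		return fmt.Errorf("no known size limit for provider %q", provider)
	}
	if size > limit {
		return fmt.Errorf("document is %.1fMB, over the %dMB limit for %s",
			float64(size)/mb, limit/mb, provider)
	}
	return nil
}

func main() {
	size := int64(25 * mb) // for a real file, use info.Size() from os.Stat
	for _, p := range []string{"claude", "gemini"} {
		if err := checkDocumentSize(p, size); err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s: OK\n", p)
	}
}
```

Returning an error for unknown providers, rather than assuming a default, avoids silently applying the wrong limit (for example with OpenAI, where the limit varies by model).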

Be explicit about what you want:

// ❌ Too vague
conv.Send(ctx, "Tell me about this", sdk.WithDocumentFile("./doc.pdf"))
// ✅ Specific and actionable
conv.Send(ctx, "Extract the table on page 3 and convert to CSV format",
sdk.WithDocumentFile("./doc.pdf"))

For long, detailed analyses, stream the response so output appears as it is generated:

for chunk := range conv.Stream(ctx,
"Provide a detailed analysis of each section",
sdk.WithDocumentFile("./report.pdf"),
) {
if chunk.Error != nil {
log.Fatal(chunk.Error)
}
if chunk.Type == sdk.ChunkDone {
break
}
fmt.Print(chunk.Text)
}

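Documents can consume many tokens, so track cost as you go:

// Subscribe to cost tracking
conv.Subscribe("cost_update", func(e hooks.Event) {
	// Use the comma-ok form so an unexpected payload type does not panic
	cost, ok := e.Data.(float64)
	if !ok {
		return
	}
	log.Printf("Current cost: $%.4f", cost)
})
resp, err := conv.Send(ctx, "Analyze",
	sdk.WithDocumentFile("./long-document.pdf"),
)
if err != nil {
	log.Fatal(err)
}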

Use Arena to test document flows without API costs:

# config.arena.yaml
providers:
- mock-docs.provider.yaml
scenarios:
- document-test.scenario.yaml

Configure document analysis in your pack:

{
  "prompts": {
    "document-analyzer": {
      "id": "document-analyzer",
      "name": "Document Analyzer",
      "system_template": "You are an expert document analyst. Provide clear, structured analysis of PDF documents.",
      "parameters": {
        "temperature": 0.3,
        "max_tokens": 4096
      }
    }
  },
  "provider": {
    "name": "claude",
    "model": "claude-3-5-haiku-20241022"
  }
}

Create automated tests for document analysis:

# scenarios/pdf-summary.scenario.yaml
apiVersion: promptkit.altairalabs.ai/v1alpha1
kind: Scenario
metadata:
  name: pdf-summary
spec:
  variables:
    user_question: "Summarize the key points"
  media:
    - type: document
      source: test-docs/sample.pdf
  assertions:
    - type: contains
      value: "summary"

Run tests:

promptarena run config.arena.yaml --scenario pdf-summary

See the Document Analysis Example for a complete working example.

  • Format: Only PDF documents are currently supported
  • Size: 32MB max for Claude, 20MB for Gemini
  • Text Extraction: OCR quality depends on PDF structure
  • Images in PDFs: Embedded images are processed by the model
  • Formatting: Complex layouts may not be perfectly preserved