
Image Preprocessing Example

This example demonstrates the image preprocessing capabilities of the PromptKit SDK for optimizing images before sending them to vision models.

Features Demonstrated

WithAutoResize: Simple option to automatically resize large images
WithImagePreprocessing: Full control over preprocessing configuration
Quality optimization: Configure JPEG quality for the best balance of size and clarity
Streaming support: Preprocessing works with both Send() and Stream()

Why Image Preprocessing?

Cost reduction: Smaller images generally consume fewer tokens, which lowers API costs
Faster responses: Less data to transmit and process
Consistent quality: Ensure images meet model requirements
Automatic handling: No manual image manipulation needed

Usage

export GEMINI_API_KEY=your-key
go run .

Configuration Options

Simple: WithAutoResize

conv, err := sdk.Open(
    "./pack.json",
    "vision",
    sdk.WithAutoResize(1024, 1024), // Max width and height in pixels
)
if err != nil {
    log.Fatal(err)
}

Advanced: WithImagePreprocessing

conv, err := sdk.Open(
    "./pack.json",
    "vision",
    sdk.WithImagePreprocessing(&stage.ImagePreprocessConfig{
        Resize: stage.ImageResizeStageConfig{
            MaxWidth:  800,
            MaxHeight: 600,
            Quality:   90, // JPEG quality (1-100)
        },
        EnableResize: true,
    }),
)
if err != nil {
    log.Fatal(err)
}

How It Works

When you call Send() or Stream() with an image, the SDK:
1. Downloads or reads the image data
2. Checks if resizing is needed based on your configuration
3. Resizes maintaining aspect ratio if the image exceeds limits
4. Re-encodes as JPEG with the specified quality
5. Sends the optimized image to the vision model
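
The resize and re-encode steps above can be illustrated with a minimal standard-library sketch. This is not the SDK's implementation: fitWithin, resizeNearest, and preprocess are hypothetical helper names, and nearest-neighbor sampling stands in for whatever resampling filter the SDK actually uses.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/jpeg"
	"log"
)

// fitWithin returns dimensions that fit inside maxW x maxH while keeping the
// aspect ratio of w x h. Images already within the limits are left unchanged.
func fitWithin(w, h, maxW, maxH int) (int, int) {
	if w <= maxW && h <= maxH {
		return w, h
	}
	// Compare w/maxW with h/maxH by cross-multiplying to stay in integers.
	if w*maxH >= h*maxW {
		// Width is the limiting side.
		newH := h * maxW / w
		if newH < 1 {
			newH = 1
		}
		return maxW, newH
	}
	// Height is the limiting side.
	newW := w * maxH / h
	if newW < 1 {
		newW = 1
	}
	return newW, maxH
}

// resizeNearest scales src to dstW x dstH using nearest-neighbor sampling.
// (Illustrative only; a production resizer would use a smoother filter.)
func resizeNearest(src image.Image, dstW, dstH int) *image.RGBA {
	sb := src.Bounds()
	dst := image.NewRGBA(image.Rect(0, 0, dstW, dstH))
	for y := 0; y < dstH; y++ {
		sy := sb.Min.Y + y*sb.Dy()/dstH
		for x := 0; x < dstW; x++ {
			sx := sb.Min.X + x*sb.Dx()/dstW
			dst.Set(x, y, src.At(sx, sy))
		}
	}
	return dst
}

// preprocess resizes src only if it exceeds the limits, then encodes it as
// JPEG at the given quality (1-100).
func preprocess(src image.Image, maxW, maxH, quality int) ([]byte, error) {
	b := src.Bounds()
	w, h := fitWithin(b.Dx(), b.Dy(), maxW, maxH)
	out := src
	if w != b.Dx() || h != b.Dy() {
		out = resizeNearest(src, w, h)
	}
	var buf bytes.Buffer
	if err := jpeg.Encode(&buf, out, &jpeg.Options{Quality: quality}); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// A blank 2000x1000 image stands in for a downloaded or local file.
	src := image.NewRGBA(image.Rect(0, 0, 2000, 1000))
	data, err := preprocess(src, 1024, 1024, 85)
	if err != nil {
		log.Fatal(err)
	}
	w, h := fitWithin(2000, 1000, 1024, 1024)
	fmt.Printf("resized to %dx%d, %d bytes of JPEG\n", w, h, len(data))
}
```

Scaling both sides by the same factor is what preserves the aspect ratio: the side that overshoots its limit by the larger ratio is pinned to the limit, and the other side is scaled to match.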

Best Practices

1024x1024 is a good default for most vision models
Quality 85 provides a good balance of size and clarity
For detailed analysis, use higher max dimensions (2048x2048)
For quick classification tasks, smaller sizes (512x512) work well
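
One way to apply these recommendations is to keep them in a small lookup. The sketch below is an assumption for illustration: the preset type, the presetFor function, and the task names are hypothetical and not part of the SDK, while the sizes and the quality value are the ones listed above.

```go
package main

import "fmt"

// preset holds preprocessing limits for one kind of task (hypothetical type).
type preset struct {
	MaxWidth, MaxHeight, Quality int
}

// presetFor maps a task name to the sizes suggested in this section.
// Unknown task names fall back to the 1024x1024 default.
func presetFor(task string) preset {
	switch task {
	case "detailed":
		return preset{MaxWidth: 2048, MaxHeight: 2048, Quality: 85}
	case "classification":
		return preset{MaxWidth: 512, MaxHeight: 512, Quality: 85}
	default:
		return preset{MaxWidth: 1024, MaxHeight: 1024, Quality: 85}
	}
}

func main() {
	for _, task := range []string{"classification", "general", "detailed"} {
		p := presetFor(task)
		fmt.Printf("%-14s max %dx%d, quality %d\n", task, p.MaxWidth, p.MaxHeight, p.Quality)
	}
}
```

The resulting values can then be passed to sdk.WithAutoResize(p.MaxWidth, p.MaxHeight), or to the Resize fields of stage.ImagePreprocessConfig when the quality also needs to be set.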