v0.6.0 Released 🚀

Stop Leaking Vision Tokens.

The LLM-native image optimization middleware. It mathematically snaps your images to Claude, GPT, and Gemini's exact internal grid boundaries to slash token usage by up to 90% — without losing visual detail.

Get started

View on GitHub

cargo run -- data/image.jpg --model gpt4o

// Squeezer simulating OpenAI's short-side algorithm...
Input:  4096×3072  (2.2 MB)
Output: 4095×2048  (1.2 MB)

Tokens Saved: 5,595 (33.3% cheaper)

The Math Behind the Magic

Every provider tokenizes images differently. Squeezer simulates each provider's internal grid math and snaps your images to the cheapest valid boundary.

Claude (Area-Based)

Claude bills strictly by pixel area (W × H / 750). Every pixel of padding costs you. Squeezer aggressively crops solid borders, shaving thousands of tokens instantly.

GPT-4o (Tiling System)

OpenAI forcefully scales the shortest side to 768px, then tiles it. Squeezer simulates this backwards to snap your image right under the exact 512px tile threshold.

Gemini (Massive Tiles)

Gemini uses huge 768×768 blocks. A slightly overlapping image costs you double. We snap images securely down to the nearest tile boundary.

Think in Code (Sandbox)

Let your agent execute custom crops, binarization, or filters locally. Extract only the context you need to save up to 99.9% tokens.

Persistent Analytics

Locally tracks every optimization in a SQLite database. View your cumulative USD savings directly from your terminal or AI agent.

Universal MCP

Works natively with Claude Code, Cursor, Zed, and VS Code. No complex setup — just plug it into your favorite AI tool.

AVIF Output

--format avif encodes ~20–50% smaller than WebP at equal quality, ~3× smaller than JPEG. Same tokens, less bandwidth.

Smart Crop & Auto-Quality

--smart-crop uses edge-energy (Sobel-lite) to keep high-information regions. --auto-quality 0.95 binary-searches quality to hit a perceptual SSIM target.

Batch & JSON Output

Pass a directory + --recursive to squeeze a whole tree at once. --json emits a structured record for pipelines. --dry-run reports without writing.

Interactive Savings Calculator

Real numbers from the Squeezer pipeline. Pick an image source and an optimization target to see token and file-size savings per provider.

Select Image Source

Optimization Target

Optimizing a standard 2400x1670 screenshot. Target: Agnostic.

Tokens Saved (Claude)

1,150 (-21.5%)

5,344 → 4,194 tokens

Tokens Saved (GPT-4o)

340 (-30.8%)

1,105 → 765 tokens

File Size Reduced

28.6% Smaller

0.5MB → 0.3MB

When no target is specified, Squeezer reduces file size and mathematically optimizes boundaries to be generally efficient across all models.

Universal MCP Integration

Select your agent or editor. Thanks to npx -y, zero global installation is required — just paste the configuration.

# Zero-config one-liner for Claude Code

claude mcp add vision-squeezer -- npx -y vision-squeezer

GPT-5: What changes?

GPT-5 handles up to 10.24 Megapixels natively (hard cap 1536 tokens). Because of these massive architectural limits, grid-tiling optimization is rarely needed. However, Squeezer still strips heavy padding and compresses file sizes (MBs → KBs) for much faster API uploads and drastically reduced latency.

Stop paying for padding.

Install VisionSqueezer in one command and start cutting vision token costs across every provider.

Get started View on GitHub