GPT-4o & GPT-5 (Tiling)
GPT-4o / GPT-4.5 (Tiling)
OpenAI processes images in three steps:
- Fit the image within 2048×2048.
- Rescale the short side to 768px.
- Chop the result into 512px tiles — each tile is billed.
The trap is spill-over: a few extra pixels on the long side can push the image into an entire new row or column of tiles, and every tile in it is billed in full for almost no extra information.
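The three steps and the spill-over effect can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not Squeezer's or OpenAI's actual code: `tile_grid` is a hypothetical helper, and it assumes images are only ever downscaled (never upscaled) and uses exact fractions instead of whatever pixel rounding the real pipeline applies.

```python
from fractions import Fraction
import math

def tile_grid(width, height, max_side=2048, short_target=768, tile=512):
    """Return (columns, rows) of tiles after the three steps (hypothetical helper)."""
    # Step 1: fit inside max_side x max_side (assumption: downscale only).
    fit = min(Fraction(1), Fraction(max_side, max(width, height)))
    w, h = width * fit, height * fit
    # Step 2: bring the short side down to short_target (assumption: downscale only).
    shrink = min(Fraction(1), Fraction(short_target) / min(w, h))
    w, h = w * shrink, h * shrink
    # Step 3: count the tiles needed to cover each side; partial tiles count in full.
    return math.ceil(w / tile), math.ceil(h / tile)

cols, rows = tile_grid(1024, 768)
print(cols * rows)  # 4 tiles: exactly two tile widths on the long side

cols, rows = tile_grid(1030, 768)
print(cols * rows)  # 6 tiles: six extra pixels spill into a third column
```

Under these assumptions, six extra pixels on the long side add two billed tiles.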
Optimization strategy
Work backwards from the 768px rescale: compute the internal scale factor, then size the long side so that it falls on a multiple of 512px once that factor is applied. The image then lands exactly on a tile boundary with no spill-over.
vision-squeezer image.png --model gpt4o
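As a rough sketch of that reverse calculation, the function below returns the long-side length, in source pixels, that lands on a tile boundary after internal scaling. It is not Squeezer's implementation: `internal_scale` and `snapped_long_side` are hypothetical names, the sketch assumes downscale-only resizing with exact fractions, and it leaves the long side unchanged whenever snapping would make it shorter than the short side.

```python
from fractions import Fraction
import math

def internal_scale(width, height, max_side=2048, short_target=768):
    """Combined scale factor of the fit and short-side steps (assumed downscale only)."""
    long_, short = max(width, height), min(width, height)
    fit = min(Fraction(1), Fraction(max_side, long_))
    shrink = min(Fraction(1), Fraction(short_target) / (short * fit))
    return fit * shrink

def snapped_long_side(width, height, tile=512):
    """Long side in source pixels that maps onto a tile boundary (hypothetical helper)."""
    long_, short = max(width, height), min(width, height)
    scale = internal_scale(width, height)
    internal_long = long_ * scale
    # Snap down to the nearest multiple of the tile size in internal pixels.
    snapped = (internal_long // tile) * tile
    # Keep the original if it is already aligned, or if snapping would make
    # the long side shorter than the short side.
    if snapped == internal_long or snapped < short * scale:
        return long_
    # Map the snapped internal length back to source pixels.
    return math.floor(snapped / scale)

print(snapped_long_side(2300, 1536))  # 2048
```

Here a 2300×1536 image scales internally to 1150×768, which needs a 3×2 grid. Reducing the long side to 2048 source pixels gives 1024×768 internally and a 2×2 grid. How the long side is reduced (cropping, trimming padding, or another method) is left open in this sketch.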
Example
A 4096×3072 photo (12MP) targeted at GPT-4o snaps perfectly into a contained grid:
| Metric | Before | After |
|---|---|---|
| Tokens | 16,777 | 11,182 |
| Savings | — | 5,595 tokens (-33.3%) |
| File size | 2.2 MB | 1.2 MB |
Pass --model gpt4o so Squeezer accounts for the tiling math instead of optimizing agnostically.
GPT-5 / GPT-5.5 (Tiling, Capped)
GPT-5 raises the limits dramatically:
- 6000px max dimension
- 10.24M total pixel cap
- 512×512 tiles
- 1536 token hard cap
Because billing stops at the 1536-token cap, an image that would otherwise cost 5000 tokens is billed the same as one that costs exactly 1536. Grid-tiling optimization is therefore rarely needed.
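The limits and the cap reduce to a couple of comparisons, sketched below. The constants come from the list above; `within_limits` and `billed_tokens` are hypothetical helpers, the limits are assumed to be inclusive, and the uncapped token count is taken as an input because its formula is not given here.

```python
MAX_DIMENSION = 6000      # px, longest allowed side
MAX_PIXELS = 10_240_000   # 10.24M total pixel cap
TOKEN_CAP = 1536          # hard cap on billed tokens

def within_limits(width, height):
    """True if the image respects both size limits (assumed inclusive)."""
    return max(width, height) <= MAX_DIMENSION and width * height <= MAX_PIXELS

def billed_tokens(uncapped_tokens):
    """Tokens billed once the hard cap is applied."""
    return min(uncapped_tokens, TOKEN_CAP)

print(billed_tokens(5000))  # 1536
print(billed_tokens(1536))  # 1536
```

Once an image reaches the cap, shrinking its tile grid only lowers the bill if the uncapped count drops below 1536.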
Optimization strategy
Snap to 512px boundaries where it helps, but the real win for GPT-5 is stripping heavy padding and compressing file size (MBs → KBs) for faster uploads and lower latency — not token reduction.
vision-squeezer image.png --model gpt5
