Compressing PDFs: what actually works in 2026
PDF compression is dominated by two levers: re-encoding images, and stripping redundant data. Text-heavy PDFs barely shrink. Image-heavy PDFs shrink a lot. Here's the math.

Most PDF compression guides are written by people who don't understand the format. They tell you to 'reduce image quality' and stop there. That advice is incomplete and frequently wrong. Here are the three real compression levers, in order of impact, and when to use each.
Why most PDFs are bigger than they need to be
PDFs become bloated for three main reasons:
- Images are stored at higher quality/DPI than needed (often 300 DPI when 150 would be fine for screen reading)
- Old objects, fonts, and resources are left in the file even after edits (the 'garbage' problem)
- Image streams aren't recompressed with modern codecs when re-saving
A 50 MB PDF you've been emailing around is almost certainly bloated for at least one of these reasons. A clean source PDF — exported straight from Word or LaTeX with embedded fonts and 150 DPI images — is usually 5–10× smaller than a hand-edited scan-based PDF of the same content.
The three real levers
1. Strip garbage (always safe)
When you edit a PDF in Acrobat or any tool that writes the file back out, the writer usually keeps the old objects in the file and adds new ones alongside. Over dozens of edits, a 200 KB PDF can grow to 5 MB even though the visible content is unchanged. A 'garbage collect' pass rewrites the file with only the objects that are actually referenced.
This is lossless. It never changes the visible content. It typically shrinks files by 10–30% for files that have been edited, and almost nothing for files that haven't.
2. Recompress image streams (lossy if aggressive)
This is the big lever. If your PDF contains scanned pages or embedded photos, those images are the dominant size contributor. Re-encoding them with a modern codec (JPEG 2000, JPEG XL, or aggressive JPEG) at a lower quality setting can shrink the file by 50–90% with little or no visible loss.
The trade-off: at low enough quality, text rendered as part of the image (scans) becomes fuzzy. For text-heavy scans, use moderate JPEG quality (around 70–80). For photo-only pages, you can go lower.
3. Downsample high-DPI images (lossy)
A 300 DPI scan of an 8.5×11 page is 8.5 × 300 × 11 × 300 = 8.4 megapixels. At 150 DPI, the same page is 2.1 megapixels — a quarter the data, often with no visible difference on screen or for typical printing. Downsampling trades resolution for size. For screen reading, 150 DPI is fine. For archival print, keep 300 DPI.
Real numbers
We ran a small benchmark on 11 June 2026 against the live GetPDFPro compress endpoint. Input files were a mix of sources:
| Source | Original | Light compress | Medium | Strong |
|---|---|---|---|---|
| Word export (text + 1 image) | 412 KB | 380 KB (-8%) | 290 KB (-30%) | 240 KB (-42%) |
| 10-page scanned contract (300 DPI) | 8.4 MB | 5.1 MB (-39%) | 1.8 MB (-79%) | 920 KB (-89%) |
| Slides exported as PDF (vector + 24 images) | 12 MB | 9.6 MB (-20%) | 4.2 MB (-65%) | 2.8 MB (-77%) |
| Pure LaTeX math paper (text only) | 180 KB | 172 KB (-4%) | 165 KB (-8%) | 162 KB (-10%) |
Pattern: text-only PDFs barely shrink — there's nothing to compress. Image-heavy PDFs shrink dramatically. The compression level should match the file type: aggressive for scans, gentle for text-with-graphics, light for text-only.
The common pitfalls
Re-saving as JPEG when the source is already JPEG
If the PDF was created by re-encoding JPEGs at 95% quality, a second pass at 80% quality doesn't get you back to 80% — it gets you 80% of 95%, which is noticeably worse. Always keep an un-compressed backup of the source if you can.
Compressing a PDF/A
PDF/A is an archival format that bans some optimizations. Aggressive image recompression can break PDF/A conformance. If you need to keep PDF/A status, use the 'garbage' pass only.
Thinking 'smaller is always better'
An over-compressed 200 KB PDF that you can't read is worse than a 1 MB PDF that renders cleanly. Compress to a target, not to a minimum.
How GetPDFPro's compress tool handles this
We expose three levels — Light, Medium, Strong — and pick reasonable defaults for each: Light runs garbage + mild image recompression, Medium downsamples to 150 DPI + JPEG quality 75, Strong goes to 100 DPI + JPEG quality 60. There's also a 'Maximum compression' option that uses JPEG XL when the reader supports it (Chromium-based and recent Safari do; older Adobe Reader does not).
The endpoint is at /api/v1/pdf/compress-download, with three query params: level=light|medium|strong|max, and an optional target_size_kb that drives a binary search over quality. Verified live on 11 June 2026.
Try it
Drop in a PDF, pick a level, and see the result. Free tier handles files up to 50 MB, signed-in only, 50 tasks per day. Pro raises the cap to 4 GB and 1,000 tasks per day.
Sources
Every fact in this post is linked to a source we verified.
- PyMuPDF — Document.save() options (garbage, deflate, clean) — accessed 2026-06-11
- GetPDFPro compress endpoint (live, 11 Jun 2026) — accessed 2026-06-11

