pdftk compression option

Compression Problem Overview


I use pdftk to compress a PDF with the following command line:

pdftk file1.pdf output file2.pdf compress

It works: the size of my file decreased.
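To put a number on that decrease, the two file sizes can be compared directly. The following is only a minimal sketch for a POSIX shell; the helper name `pdf_savings` is made up for illustration and is not part of pdftk.

```sh
# pdf_savings ORIGINAL COMPRESSED   (hypothetical helper name)
# Prints both sizes in bytes and the percentage saved.
# A negative percentage means the "compressed" file is actually larger.
pdf_savings() {
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "usage: pdf_savings ORIGINAL COMPRESSED" >&2
        return 1
    fi
    # Some wc implementations pad the count with spaces, so strip them.
    before=$(wc -c < "$1" | tr -d ' ')
    after=$(wc -c < "$2" | tr -d ' ')
    if [ "$before" -eq 0 ]; then
        echo "original file is empty" >&2
        return 1
    fi
    echo "$before -> $after bytes ($(( ($before - $after) * 100 / $before ))% smaller)"
}
```

Typical use would be to run `pdftk file1.pdf output file2.pdf compress` and then `pdf_savings file1.pdf file2.pdf`.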

Are there options to change the compression level?

Or are there other ways to compress my file? It is large because some graphs contain a great many points. Is there a way to convert these graphs to JPEG, for instance, and adjust the compression?

Compression Solutions


Solution 1 - Compression

I had the same problem and found two different solutions (see this thread for more details). Both reduced the size of my uncompressed PDF dramatically.

  • Pixelated (lossy):

     convert input.pdf -compress Zip output.pdf
    
  • Unpixelated (lossless, but may display slightly differently):

     gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dBATCH  -dQUIET -sOutputFile=output.pdf input.pdf
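Since the gs command line is long and the -dPDFSETTINGS preset is the part people usually change, it can help to wrap it. This is just a sketch, not an official interface: the function name `gs_compress` is hypothetical, and the flags are the ones from the command above.

```sh
# gs_compress PRESET INPUT OUTPUT   (hypothetical wrapper name)
# PRESET is one of Ghostscript's -dPDFSETTINGS values. Roughly:
#   screen   smallest files, images downsampled to about 72 dpi
#   ebook    medium quality, about 150 dpi
#   printer / prepress   about 300 dpi
#   default  general-purpose output
gs_compress() {
    if [ "$#" -ne 3 ]; then
        echo "usage: gs_compress PRESET INPUT OUTPUT" >&2
        return 2
    fi
    case "$1" in
        screen|ebook|printer|prepress|default) ;;
        *) echo "unknown preset: $1" >&2; return 2 ;;
    esac
    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
       -dPDFSETTINGS="/$1" \
       -dNOPAUSE -dBATCH -dQUIET \
       -sOutputFile="$3" "$2"
}
```

For example, `gs_compress ebook input.pdf output.pdf` would trade a bit of size for less pixelation than `screen`.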
    

Edit: I just discovered another option (for lossless compression), which avoids the nasty gs command. qpdf is a neat tool that converts PDFs (compression/decompression, encryption/decryption), and is much faster than the gs command:

qpdf --linearize input.pdf output.pdf
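If --linearize alone does not shrink the file, qpdf has a few more lossless knobs that may be worth trying. The sketch below assumes a reasonably recent qpdf (the --recompress-flate and --compression-level options are not available in old releases, so check `qpdf --help` first); the function name `qpdf_recompress` is made up.

```sh
# qpdf_recompress INPUT OUTPUT   (hypothetical wrapper name)
# Lossless only: page content and images are not resampled.
qpdf_recompress() {
    if [ "$#" -ne 2 ]; then
        echo "usage: qpdf_recompress INPUT OUTPUT" >&2
        return 2
    fi
    # --object-streams=generate  pack PDF objects into compressed object streams
    # --recompress-flate         re-deflate streams that are already Flate-compressed
    # --compression-level=9      use the strongest zlib level when writing streams
    qpdf --object-streams=generate \
         --recompress-flate --compression-level=9 \
         --linearize "$1" "$2"
}
```

How much this saves depends heavily on how the PDF was produced; a file that is already tightly packed may barely change.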

Solution 2 - Compression

I was trying to compress a PDF that I built with tiff2pdf (Zip/Deflate compression) from 400 ppi TIFFs, mostly 8-bit with a few 24-bit, originally PackBits-compressed. One problem applied to every method above: none of them preserved the bookmarks TOC that I had painstakingly created by hand in Acrobat Pro X, not even the recommended ebook setting for gs. I could open a copy of the original with the TOC intact and use Replace Pages, but none of these methods did a satisfactory job to begin with. Either they reduced the size so much that the result was unacceptably pixelated, or they did not reduce the size at all, and in one case the size actually increased despite a loss of quality.

pdftk compress:

no change in size
bookmarks TOC are gone

gs screen:

takes a ridiculously long time and 100% CPU
errors:
	sfopen: gs_parse_file_name failed.                                 ? 
	| ./base/gsicc_manage.c:1651: gsicc_set_device_profile(): cannot find device profile
74.8MB-->10.2MB hideously pixellated
bookmarks TOC are gone

gs printer:

takes a ridiculously long time and 100% CPU
no errors
74.8MB-->66.1MB
light blue background on pages 1-4
bookmarks TOC are gone

gs ebook:

errors:
	sfopen: gs_parse_file_name failed.
	  ./base/gsicc_manage.c:1050: gsicc_open_search(): Could not find default_rgb.ic 
	| ./base/gsicc_manage.c:1651: gsicc_set_device_profile(): cannot find device profile
74.8MB-->32.2MB
badly pixellated
bookmarks TOC are gone

qpdf --linearize:

very fast, a few seconds
no size change
bookmarks TOC are gone

pdf2ps:

took very long time
output_pdf2ps.ps 74.8MB-->331.6MB

ps2pdf:

pretty fast
74.8MB-->79MB
very slightly degraded, with a slightly bluish background
bookmarks TOC are gone
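For the "bookmarks TOC are gone" results, one possible workaround is to copy the bookmarks from the original file back onto the compressed one with pdftk itself. This is a sketch under assumptions, not something tested on the file described above: it assumes a pdftk version whose update_info operation also imports bookmarks (pdftk 1.45 or later, or pdftk-java), and that the compressed file has the same pages in the same order. The function name `restore_bookmarks` is hypothetical.

```sh
# restore_bookmarks ORIGINAL COMPRESSED RESULT   (hypothetical helper name)
restore_bookmarks() {
    meta=$(mktemp) || return 1
    status=0
    # Step 1: dump the Info dictionary and bookmark tree of the original as text.
    pdftk "$1" dump_data output "$meta" || status=$?
    if [ "$status" -eq 0 ]; then
        # Step 2: apply that text to the compressed copy and write a new file.
        pdftk "$2" update_info "$meta" output "$3" || status=$?
    fi
    rm -f "$meta"
    return "$status"
}
```

Usage would look like `restore_bookmarks original.pdf compressed.pdf compressed_with_toc.pdf`.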

Solution 3 - Compression

This procedure works pretty well:

pdf2ps large.pdf very_large.ps

ps2pdf very_large.ps small.pdf

Give it a try.

Solution 4 - Compression

If the file size is still too large, it could help to use ps2pdf to downscale the resolution of the produced PDF file:

pdf2ps input.pdf tmp.ps
ps2pdf -dPDFSETTINGS=/screen -dDownsampleColorImages=true -dColorImageResolution=200 -dColorImageDownsampleType=/Bicubic tmp.ps output.pdf

Adjust the value of the -dColorImageResolution option to get a result that fits your needs (the value is the image resolution in DPI). If your input file is in grayscale, replacing Color with Gray in the option names, or using both sets of options in the above command, could also help. Further fine-tuning is possible by changing the -dPDFSETTINGS option to /default or /printer. For explanations of all the possible options, consult the ps2pdf manual.
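The "use both options" variant can be sketched as a small function that applies the same resolution to color and grayscale images. The name `downsample_pdf` is invented for this example; the Gray options are simply the Color options from the command above with Color replaced by Gray.

```sh
# downsample_pdf DPI INPUT OUTPUT   (hypothetical helper name)
# INPUT is the PostScript file produced by pdf2ps, e.g. tmp.ps.
downsample_pdf() {
    if [ "$#" -ne 3 ]; then
        echo "usage: downsample_pdf DPI INPUT OUTPUT" >&2
        return 2
    fi
    case "$1" in
        ''|*[!0-9]*) echo "DPI must be a whole number" >&2; return 2 ;;
    esac
    # Same resolution and bicubic filter for color and grayscale images.
    ps2pdf -dPDFSETTINGS=/screen \
        -dDownsampleColorImages=true -dColorImageResolution="$1" \
        -dColorImageDownsampleType=/Bicubic \
        -dDownsampleGrayImages=true -dGrayImageResolution="$1" \
        -dGrayImageDownsampleType=/Bicubic \
        "$2" "$3"
}
```

For instance, `pdf2ps input.pdf tmp.ps` followed by `downsample_pdf 150 tmp.ps output.pdf` would target 150 DPI for both kinds of image.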

Solution 5 - Compression

pdf2ps large.pdf small.pdf is enough, instead of the two steps:

pdf2ps large.pdf very_large.ps 
ps2pdf very_large.ps small.pdf

However, ps2pdf large.pdf small.pdf is a better choice.

  • ps2pdf is much faster
  • without additional parameters specified, pdf2ps sometimes produces a larger file.
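Because these conversions sometimes make the file bigger, it may be safer to keep the converted file only when it really is smaller. A minimal sketch, with the made-up names `size_of` and `shrink_or_keep`:

```sh
# size_of FILE: size in bytes, with any padding from wc removed.
size_of() { wc -c < "$1" | tr -d ' '; }

# shrink_or_keep INPUT OUTPUT   (hypothetical helper name)
# Runs ps2pdf into a temporary file. If the conversion succeeds and the
# result is smaller than INPUT, it becomes OUTPUT; otherwise OUTPUT is
# just a copy of INPUT.
shrink_or_keep() {
    tmp=$(mktemp) || return 1
    if ps2pdf "$1" "$tmp" && [ "$(size_of "$tmp")" -lt "$(size_of "$1")" ]; then
        mv "$tmp" "$2"
    else
        rm -f "$tmp"
        cp "$1" "$2"
    fi
}
```

Calling `shrink_or_keep large.pdf small.pdf` then never leaves you with a result larger than the input.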

Solution 6 - Compression

The one-line pdf2ps option (by Lee) actually increased the PDF size. However, the two-step approach did better, and it can be combined into a single command using pipes and redirection through standard input/output:

pdf2ps large.pdf - | ps2pdf - small.pdf

This reduced a PDF generated by xsane from 18 MB to 630 kB!

Links are lost, but in this case that was not a concern, and it was the easiest way to achieve the desired result.

Solution 7 - Compression

After trying qpdf as nullglob suggested, I found that I got the same compression results (a ~900 MB file down to ~30 MB) by just using the cups-pdf printer. This might be easier or preferable if you are already viewing a document and only need to compress one or two files.

In Ubuntu 12.04, you can install it with:

sudo apt-get install cups-pdf

After installation, be sure to check in System Tools > Administration > Printing, then right-click 'PDF' and set it to 'enable'.

By default, the output is saved into a folder named PDF in your home directory.
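The same printer can probably be driven from a terminal as well, which is handy for more than a couple of files. This sketch assumes the queue is called PDF, as in the printing dialog described above (`lpstat -p` lists the queues actually installed); the function name `print_to_pdf` and the `PDF_QUEUE` variable are made up.

```sh
# print_to_pdf FILE...   (hypothetical helper name)
# Sends each file to the cups-pdf queue. Set PDF_QUEUE to use a
# differently named queue.
print_to_pdf() {
    queue=${PDF_QUEUE:-PDF}
    for f in "$@"; do
        lp -d "$queue" "$f" || return 1
    done
}
```

After `print_to_pdf big1.pdf big2.pdf`, the results should show up in the PDF folder mentioned above once the print jobs finish.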

Solution 8 - Compression

I didn't see much reduction in file size using qpdf. The best way I found is, after pdftk is done, to use Ghostscript to convert the PDF to PostScript and then back to PDF. In PHP you would use exec:

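// $pdf (path to the pdftk output) and $save_path are assumed to be set earlier
$ps = $save_path . '/psfile.ps';
// PDF -> PostScript (ps2ps2 is one of Ghostscript's wrapper scripts)
exec('ps2ps2 ' . escapeshellarg($pdf) . ' ' . escapeshellarg($ps));
unlink($pdf);
// PostScript -> PDF, written back to the original path
exec('ps2pdf ' . escapeshellarg($ps) . ' ' . escapeshellarg($pdf));
unlink($ps);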

I used this a few minutes ago to take pdftk output from 490k to 71k.

Solution 9 - Compression

Okular's Print to PDF

I just turned a 140 MB PDF produced with Keynote into 2.8 MB using Okular's Print to PDF. Text was converted to raster, and zooming in too far clearly shows pixels, but images were kept pretty sharp and it's usable for messaging apps.

Solution 10 - Compression

After trying all the answers listed here, the best results I have obtained for a PDF with lots of graphics came from:

pdftocairo input.pdf output.pdf -pdf

I discovered this by opening a pdf with Evince in Gnome and then printing to file. This resulted in better file compression and better file quality compared to all the other answers for my pdf file. It seems cairo graphics is used in the background when printing to a file this way: running pdfinfo on the resulting file reveals

> Producer: cairo 1.16.0 (https://cairographics.org)
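The convert-then-check steps can be combined in one small function. This is only a sketch: `cairo_rewrite` is a made-up name, and it passes -pdf before the file names, which is the order shown in the pdftocairo usage text.

```sh
# cairo_rewrite INPUT OUTPUT   (hypothetical helper name)
# Rewrites INPUT through cairo, then prints the Producer field of the
# result so you can confirm which library generated it.
cairo_rewrite() {
    pdftocairo -pdf "$1" "$2" || return 1
    # pdfinfo prints "Producer:" followed by padding; keep only the value.
    pdfinfo "$2" | sed -n 's/^Producer:[[:space:]]*//p'
}
```

Running `cairo_rewrite input.pdf output.pdf` should print a cairo version string like the one above if the conversion went through cairo.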

Solution 11 - Compression

I had the same issue and used this loop to flatten individual pages, which results in the file size being reduced by up to 1/3 of the original size.

for (int i = 1; i <= theDoc.PageCount; i++)
{
       theDoc.PageNumber = i;
       theDoc.Flatten();
}

Solution 12 - Compression

In case you want to compress a PDF which contains a lot of selectable text, on Windows you can use NicePDF Compressor and choose the "Flate" option. After trying everything else (cpdf, pdftk, gs), it finally helped me compress my 1360-page PDF from 500 MB down to 10 MB.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | RockScience | View Question on Stackoverflow
Solution 1 - Compression | nullglob | View Answer on Stackoverflow
Solution 2 - Compression | hmj6jmh | View Answer on Stackoverflow
Solution 3 - Compression | jortizromo | View Answer on Stackoverflow
Solution 4 - Compression | Dominik | View Answer on Stackoverflow
Solution 5 - Compression | ddzzbbwwmm | View Answer on Stackoverflow
Solution 6 - Compression | E. Curis | View Answer on Stackoverflow
Solution 7 - Compression | ryanjdillon | View Answer on Stackoverflow
Solution 8 - Compression | Tom | View Answer on Stackoverflow
Solution 9 - Compression | Ariel M. | View Answer on Stackoverflow
Solution 10 - Compression | Paul Bryan | View Answer on Stackoverflow
Solution 11 - Compression | Gabbar | View Answer on Stackoverflow
Solution 12 - Compression | solf | View Answer on Stackoverflow