Get a preview JPEG of a PDF on Windows?

Python, Windows, Image, PDF

Python Problem Overview


I have a cross-platform (Python) application which needs to generate a JPEG preview of the first page of a PDF.

On the Mac I am spawning sips. Is there something similarly simple I can do on Windows?

Python Solutions


Solution 1 - Python

ImageMagick delegates the PDF->bitmap conversion to Ghostscript anyway, so here's a command you can use directly (it's based on the actual command listed by the ps:alpha delegate in ImageMagick, just adjusted to produce JPEG output):

gs -q -dQUIET -dPARANOIDSAFER -dBATCH -dNOPAUSE -dNOPROMPT \
-dMaxBitmap=500000000 -dLastPage=1 -dAlignToPixels=0 -dGridFitTT=0 \
-sDEVICE=jpeg -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -r72x72 \
-sOutputFile=$OUTPUT -f$INPUT

where $OUTPUT and $INPUT are the output and input filenames. Adjust -r72x72 to whatever resolution (in dots per inch) you need. (Strip out the backslashes if you're writing the whole command on one line.)
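
As a rough sketch of how this command might be spawned from the Python application, the snippet below builds the same argument list and runs it with subprocess. The function names are hypothetical, and the choice of gswin64c as the executable name on Windows is an assumption about a typical 64-bit Ghostscript install (32-bit installs usually ship gswin32c); adjust it to whatever is on your PATH.

```python
import subprocess
import sys


def ghostscript_executable():
    # Assumption: a 64-bit Windows install provides the console binary
    # "gswin64c"; other platforms are assumed to provide "gs".
    return "gswin64c" if sys.platform == "win32" else "gs"


def build_gs_command(pdf_path, jpeg_path, dpi=72, executable=None):
    """Return the argument list that renders page 1 of pdf_path as a JPEG."""
    return [
        executable or ghostscript_executable(),
        "-q", "-dQUIET", "-dPARANOIDSAFER", "-dBATCH", "-dNOPAUSE", "-dNOPROMPT",
        "-dMaxBitmap=500000000", "-dLastPage=1",
        "-dAlignToPixels=0", "-dGridFitTT=0",
        "-sDEVICE=jpeg", "-dTextAlphaBits=4", "-dGraphicsAlphaBits=4",
        f"-r{dpi}x{dpi}",
        f"-sOutputFile={jpeg_path}",
        f"-f{pdf_path}",
    ]


def render_first_page(pdf_path, jpeg_path, dpi=72):
    # check=True raises CalledProcessError if Ghostscript exits with an error.
    subprocess.run(build_gs_command(pdf_path, jpeg_path, dpi), check=True)
```

Passing the arguments as a list means no shell is involved, so the backslash continuations and the $OUTPUT/$INPUT variables from the shell version are not needed. A call such as render_first_page("report.pdf", "preview.jpg", dpi=96) would then write the preview, assuming Ghostscript is installed.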

This is good for two reasons:

  1. You don't need to have ImageMagick installed anymore. Not that I have anything against ImageMagick (I love it to bits), but I believe in simple solutions.
  2. ImageMagick does a two-step conversion: first PDF->PPM, then PPM->JPEG. Calling Ghostscript directly makes the conversion a single step.

Other things to consider: with the files I've tested, PNG compresses better than JPEG. If you want to use PNG, change the -sDEVICE=jpeg to -sDEVICE=png16m.
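
If the application lets the caller pick the output format, the device switch could be chosen from the output file's extension. This is only a minimal sketch: the helper name is hypothetical and it covers just the two devices mentioned here.

```python
import os

# Ghostscript output devices named in this answer, keyed by file extension.
DEVICES = {".jpg": "jpeg", ".jpeg": "jpeg", ".png": "png16m"}


def device_flag(output_path):
    """Return the -sDEVICE switch matching the extension of output_path."""
    ext = os.path.splitext(output_path)[1].lower()
    if ext not in DEVICES:
        raise ValueError(f"no Ghostscript device known for {ext!r}")
    return "-sDEVICE=" + DEVICES[ext]
```

The returned string would replace the -sDEVICE=jpeg switch in the command above.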

Solution 2 - Python

You can use ImageMagick's convert utility for this; see the examples at http://studio.imagemagick.org/pipermail/magick-users/2002-May/002636.html:

> Convert taxes.pdf taxes.jpg
>
> Will convert a two page PDF file into [2] jpeg files: taxes.jpg.0,
> taxes.jpg.1
>
> I can also convert these JPEGS to a thumbnail as follows:
>
> convert -size 120x120 taxes.jpg.0 -geometry 120x120 +profile '' thumbnail.jpg
>
> I can even convert the PDF directly to a jpeg thumbnail as follows:
>
> convert -size 120x120 taxes.pdf -geometry 120x120 +profile '' thumbnail.jpg
>
> This will result in a thumbnail.jpg.0 and thumbnail.jpg.1 for the two
> pages.
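
Since only the first page is wanted, one possible way to drive this from Python is sketched below. It differs from the quoted commands in a few ways, all of which are assumptions on my part: it appends ImageMagick's [0] page selector to the input name so only page one is rendered, it uses -thumbnail in place of -geometry and +profile, and it defaults to the magick executable from ImageMagick 7 (on Windows a bare convert can resolve to the unrelated system tool of the same name). The function names are hypothetical.

```python
import subprocess


def build_thumbnail_command(pdf_path, jpeg_path, size=120, executable="magick"):
    """Return the argument list for a first-page thumbnail of pdf_path."""
    return [
        executable,
        f"{pdf_path}[0]",               # [0] selects only the first page
        "-thumbnail", f"{size}x{size}",  # fit within size x size pixels
        jpeg_path,
    ]


def make_thumbnail(pdf_path, jpeg_path, size=120):
    # check=True raises CalledProcessError if ImageMagick exits with an error.
    subprocess.run(build_thumbnail_command(pdf_path, jpeg_path, size), check=True)
```

With ImageMagick 6, passing executable="convert" (ideally as a full path on Windows) should give the same argument layout.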

Solution 3 - Python

Is the PC likely to have Acrobat installed? I think Acrobat installs a shell extension so previews of the first page of a PDF document appear in Windows Explorer's thumbnail view. You can get thumbnails yourself via the IExtractImage COM API, which you'll need to wrap. VBAccelerator has an example in C# that you could port to Python.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Gareth Simpson | View Question on Stackoverflow
Solution 1 - Python | Chris Jester-Young | View Answer on Stackoverflow
Solution 2 - Python | Federico Builes | View Answer on Stackoverflow
Solution 3 - Python | Dominic Cooney | View Answer on Stackoverflow