What is the smallest possible valid PDF?

PdfOptimizationPdf Generation

Pdf Problem Overview


Out of simple curiosity, having seen the smallest GIF, what is the smallest possible valid PDF file?

Pdf Solutions


Solution 1 - Pdf

This is an interesting problem. Taking it by the book, you can start off with this:

%PDF-1.0
1 0 obj<</Type/Catalog/Pages 2 0 R>>endobj 2 0 obj<</Type/Pages/Kids[3 0 R]/Count 1>>endobj 3 0 obj<</Type/Page/MediaBox[0 0 3 3]>>endobj
xref
0 4
0000000000 65535 f
0000000010 00000 n
0000000053 00000 n
0000000102 00000 n
trailer<</Size 4/Root 1 0 R>>
startxref
149
%EOF

which is 291 bytes of PDF joy. Acrobat opens it, but it complains somewhat. There is one page in it and it is 3/72" square, the minimum allowed by the spec.

However, Acrobat X doesn't even bother with the cross reference table anymore, so we can take that out:

%PDF-1.0
1 0 obj<</Type/Catalog/Pages 2 0 R>>endobj 2 0 obj<</Type/Pages/Kids[3 0 R]/Count 1>>endobj 3 0 obj<</Type/Page/MediaBox[0 0 3 3]>>endobj
trailer<</Size 4/Root 1 0 R>>

Acrobat complains, but opens it. Now we're at 178 bytes. Turns out that you don't need that /Size in the trailer. Now we're at 172:

%PDF-1.0
1 0 obj<</Type/Catalog/Pages 2 0 R>>endobj 2 0 obj<</Type/Pages/Kids[3 0 R]/Count 1>>endobj 3 0 obj<</Type/Page/MediaBox[0 0 3 3]>>endobj
trailer<</Root 1 0 R>>

Turns out you don't need all those pesky /Type elements in your dictionaries:

%PDF-1.0
1 0 obj<</Pages 2 0 R>>endobj 2 0 obj<</Kids[3 0 R]/Count 1>>endobj 3 0 obj<</MediaBox[0 0 3 3]>>endobj
trailer<</Root 1 0 R>>

Now we're at 138 bytes.

It also turns out that when the spec says "shall be an indirect reference" and /Count is required, and the header "must" be %PDF-1.0, they're making loose suggestions. This is the smallest I could make it and have it openable in Acrobat X:

%PDF-1.
trailer<</Root<</Pages<</Kids[<</MediaBox[0 0 3 3]>>]>>>>>>

70 bytes.

Now, my editor uses Windows newline discipline, but Acrobat accepts Windows, Mac, or Unix conventions, so by using a hex editor, I replaced the \r\n with \r and removed the last newline altogether, which leaves me with 67 bytes

25 50 44 46 2D 31 2E 0D 74 72 61 69 6C 65 72 3C 
3C 2F 52 6F 6F 74 3C 3C 2F 50 61 67 65 73 3C 3C 
2F 4B 69 64 73 5B 3C 3C 2F 4D 65 64 69 61 42 6F 
78 5B 30 20 30 20 33 20 33 5D 3E 3E 5D 3E 3E 3E 
3E 3E 3E 

I tried taking off the last end dictionary (>>), but Acrobat wouldn't have that. The PDF reading built-in to Google Chrome (FoxIt) won't open it.

As a PostScript (HA! See what I did there?), if you consent to Acrobat "repairing" the file, it bumps up to 3550 bytes, most of it optional metadata, but it leaves behind a number of clear spec violations.

Solution 2 - Pdf

I could not get the hello world example to open.

For a small-ish file with text content :

%PDF-1.2 
9 0 obj
<<
>>
stream
BT/ 9 Tf(Test)' ET
endstream
endobj
4 0 obj
<<
/Type /Page
/Parent 5 0 R
/Contents 9 0 R
>>
endobj
5 0 obj
<<
/Kids [4 0 R ]
/Count 1
/Type /Pages
/MediaBox [ 0 0 99 9 ]
>>
endobj
3 0 obj
<<
/Pages 5 0 R
/Type /Catalog
>>
endobj
trailer
<<
/Root 3 0 R
>>
%%EOF

Solution 3 - Pdf

Based on all the answers here, here's the smallest PDF with text:

SMALL_PDF = (
    b"%PDF-1.2 \n"
    b"9 0 obj\n<<\n>>\nstream\nBT/ 32 Tf(  YOUR TEXT HERE   )' ET\nendstream\nendobj\n"
    b"4 0 obj\n<<\n/Type /Page\n/Parent 5 0 R\n/Contents 9 0 R\n>>\nendobj\n"
    b"5 0 obj\n<<\n/Kids [4 0 R ]\n/Count 1\n/Type /Pages\n/MediaBox [ 0 0 250 50 ]\n>>\nendobj\n"
    b"3 0 obj\n<<\n/Pages 5 0 R\n/Type /Catalog\n>>\nendobj\n"
    b"trailer\n<<\n/Root 3 0 R\n>>\n"
    b"%%EOF"
)

As base64. Copy this and test in Chrome:

> data:application/pdf;base64,JVBERi0xLjIgCjkgMCBvYmoKPDwKPj4Kc3RyZWFtCkJULyAzMiBUZiggIFlPVVIgVEVYVCBIRVJFICAgKScgRVQKZW5kc3RyZWFtCmVuZG9iago0IDAgb2JqCjw8Ci9UeXBlIC9QYWdlCi9QYXJlbnQgNSAwIFIKL0NvbnRlbnRzIDkgMCBSCj4+CmVuZG9iago1IDAgb2JqCjw8Ci9LaWRzIFs0IDAgUiBdCi9Db3VudCAxCi9UeXBlIC9QYWdlcwovTWVkaWFCb3ggWyAwIDAgMjUwIDUwIF0KPj4KZW5kb2JqCjMgMCBvYmoKPDwKL1BhZ2VzIDUgMCBSCi9UeXBlIC9DYXRhbG9nCj4+CmVuZG9iagp0cmFpbGVyCjw8Ci9Sb290IDMgMCBSCj4+CiUlRU9G

To make the page bigger, adjust the MediaBox dimensions :)

> /MediaBox [ 0 0 250 50 ]

Solution 4 - Pdf

I thought I'd make a smallest pdf that displays "Hello World". The text is in the lower left corner. Sorry about the 9-point font, any larger would cost an extra byte :)

172 bytes for Adobe Reader X (if saved with linefeed-only newlines and no trailing newline or null-byte):

%PDF-1.
1 0 obj<</Kids[<</Parent 1 0 R/Resources<<>>/Contents 2 0 R>>]>>endobj 2 0 obj<<>>stream
BT/ 9 Tf(Hello World)' ET
endstream
endobj trailer<</Root<</Pages 1 0 R>>>>

120 bytes for Chrome's builtin PDF viewer:

%PDF 1 0 obj<</Pages<</Kids[<</Contents<<>>stream
BT 9 Tf(Hello World)' ET endstream>>]>>>>endobj trailer<</Root 1 0 R>>

To easily see this in Chrome, paste this URI in the address bar (SO won't let me link to it, and it won't work at all in other browsers):

data:application/pdf,%25PDF%201%200%20obj%3C%3C%2FPages%3C%3C%2FKids%5B%3C%3C%2FContents%3C%3C%3E%3Estream%0ABT%209%20Tf(Hello%20World)'%20ET%20endstream%3E%3E%5D%3E%3E%3E%3Eendobj%20trailer%3C%3C%2FRoot%201%200%20R%3E%3E

Solution 5 - Pdf

According to this Ange Albertini lecture, the smallest possible valid PDF is 36 bytes:

%PDF-(NULL)trailer<>>>>>

Where (NULL) is the unprintable ASCII 0 character.

However, as Ange notes, while this PDF is technically valid, most PDF reader apps will regard it as invalid based on the size alone, thus failing to open it.

Solution 6 - Pdf

I was going to give an example of what I thought was the minimal valid "universal" PDF. until I noticed that the whole ethos of using a PDF is to ensure it will render exactly the same on all devices and their PDF readers. However on cross checking my "perfectly small well formed PDF" I spotted this.

enter image description here

So the ground rule was "smallest possible valid PDF" but I consider this shortage should count as an invalid PDF since it does not adhere to the concept of "Fit for Purpose" thus the minimum PDF must itself as a minimum contain a minimum of one means of fixing a working font.

To explain my proposed solution and why its less than perfect here it is in a rough form because of cut and paste.

%PDF-1.0
%µ¶

1 0 obj
<</Type/Catalog/Pages 2 0 R>>
endobj

2 0 obj
<</Kids[3 0 R]/Count 1/Type/Pages/MediaBox[0 0 595 792]>>
endobj

3 0 obj
<</Type/Page/Parent 2 0 R/Contents 4 0 R/Resources<<>>>>
endobj

4 0 obj
<</Length 58>>
stream
q
BT
/ 96 Tf
1 0 0 1 36 684 Tm
(Hello World!) Tj
ET
Q

endstream
endobj

xref
0 5
0000000000 65536 f 
0000000016 00000 n 
0000000062 00000 n 
0000000136 00000 n 
0000000209 00000 n 

trailer
<</Size 5/Root 1 0 R>>
startxref
316
%%EOF

Whilst not defined by the rules of the question I have included some past experience of user problems.

The first difference you might note is media box in 2nd obj is a hybrid MediaBox[0 0 595 792] which is a minimax A4 width and minimax US Letter high, since otherwise the "universal page" in most countries would force a second sheet @ 100% scale printing either for too wide or too high a page definition for the locale defaults.

And the current problem is evidenced in 3rd obj as no fonts have been set for resources, thus in aiming for minimal the PDF, I contest without a font defined, will be Invalid.

Thus none of the answers so far including my own, appear to produce a PDF that will "WORK" as a "VALID" means to produce the same printout, regardless of platform or viewer.

@mkl are you up for producing your best shot ?

Solution 7 - Pdf

I needed a PDF version which is usable by a PDF converter (A4 format issue.. all the above constructs worked with Adobe Reader and Chrome, but not with the PDF converter which required DIN A4). I found this site and this PDF worked fine with the PDF converter I'm using: https://help.callassoftware.com/m/73261/l/798383-how-to-create-a-simple-pdf-file

Solution 8 - Pdf

In Java, use this:

 private static String samplepdf = "255044462D312E0D747261696C65723C3C2F526F6F743C3C2F50616765733C3C2F4B6964735B3C3C2F4D65646961426F785B302030203320335D3E3E5D3E3E3E3E3E3E";

and then

byte[] bytes = hexStringToByteArray(samplepdf);

...

public byte[] hexStringToByteArray(String s) {
    int len = s.length();
    byte[] data = new byte[len / 2];
    for (int i = 0; i < len; i += 2) {
        data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
                + Character.digit(s.charAt(i + 1), 16));
    }
    return data;
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionmeshyView Question on Stackoverflow
Solution 1 - PdfplinthView Answer on Stackoverflow
Solution 2 - PdfAlan RiddellView Answer on Stackoverflow
Solution 3 - PdfkolyptoView Answer on Stackoverflow
Solution 4 - PdfHugh AllenView Answer on Stackoverflow
Solution 5 - PdfPancakeView Answer on Stackoverflow
Solution 6 - PdfK JView Answer on Stackoverflow
Solution 7 - PdfDubZView Answer on Stackoverflow
Solution 8 - PdfMartin ŠimonView Answer on Stackoverflow