Understanding the PDF format: DRM and Wookies
Recently, my friend Dave and I were talking about the Adobe PDF DRM mechanism for eBooks. He was one of many people I’ve talked to who have bought an Adobe eBook without realizing it included DRM sludge. You can’t move it around (easily) or copy excerpts or print any of it. The discussion got me thinking about how Adobe went about their DRM system. I work for a company that does the equivalent for satellite cable systems and these things interest me once in a while. I went to Google and researched Adobes system.
In the end I decided to just take a look at the eBooks I own to see how the DRM mechanism fits with the PDF format. I downloaded the latest free PDF specification and opened up the PDF in vim to follow along. It seems the eBook I was looking at was using a version of the EBX system. After digesting some of that specification, I found a nice (if somewhat old) presentation outlining some reversing done on the various eBook protections. The comments about the passwords protections seem bit out of date, so I’m not sure how accurate this still is. In the end, I got completely distracted by the huge number of features implemented by Adobe. A whole JS engine? embedded Postscript support too? I thought it must be hard to have any confidence in the security of such a sprawling system. Less than a week later, there was a security advisory (some vulns provided remote code execution).
Anyway, I realized (or strongly suspected) that I needed a tcpdump of the eBook buying experience to make any more progress and put the eBook DRM circumvention (for academics! honest) on hold. I decided to play with the PDF format instead. My first exercise was adding a wookie sound (which turned into R2D2 due to lack of good internet accessible wookie sounds) to my (OLD!)resume (obviously something all employers want when opening a PDF document). It will play when opened by Adobe Reader (does anyone know if it works on Preview for OS X?). Adding this was fairly easy; PDFs can be modified by simply appending the modifications to the end of the file. Let’s walk through the robotification of my PDF.
PDF documents are specified by a list of PDF objects, followed by a lookup table giving the byte offsets for each object. These objects are one of: number, array, dictionary, boolean, null, string, stream, name, or indirect reference. You can see the PDF spec for full details, but most are exactly what you imagine. An array is written as “[ obj1 obj2 obj3 … objn ]”. Names are written “/name” and act as identifiers. A dictionary is written “<>” and maps names to objects. A stream is written “dict stream EOL bytes EOL endstream” and is a chunk of data which is not limited (as strings are) to a smaller length. An indirect reference is like a pointer to an identifiable object (an indirect object, which is given a unique pair of id numbers). To modify an existing PDF, you can add another list of objects (which, if the same identifier is used, override previous definitions of objects) and corresponding lookup table.
The PDF format also includes a dictionary after the lookup table. This dictionary includes a reference to the root object of the object tree. The root object is a dictionary containing the key Type of value Catalog. The Catalog object must also contain a entry for the Pages object (usually an indirect reference), but it has many other optional keys. We want to add an event that takes place when the document opens; the Catalog provides a key (OpenAction) we can define for this event. That is the first step.
We will lookup the current Catalog object:
53 0 obj <> endobj
So, starting at the end of the “victim” PDF, we append the new Catalog entry:
53 0 obj <> endobj
I chose to add the event as an indirect reference to object 55 (of generation 0). The number was chosen by looking at the trailer dictionary at the end of the PDF. The Size entry contains the highest unused object number — which was 55 for my document.
Now we must define our action. For this example, I decided to play a sound, but looking at the PDF spec, section 12.6.4 lists a bunch of action types. An action is a dictionary with the Type set to Action. It must contain an S entry giving the action type. For us, that is Sound. A Sound Action also requires a Sound entry giving a Sound object. Let’s add the Sound Action first:
55 0 obj <> endobj
The last object we must add is the actual Sound object. The Sound object is a stream. The stream dictionary (in addition to the usual stream dictionary entries such as Filter and Length) has the Type set to Sound. It is also required to set the Rate (R) — given in samples per second. Optionally, if the sound is not a 8-bit mono sample, bits per sample (B) and channels (C) can be set. Also, the encoding (E) can be set — see the specification for details (you can also just embed a whole WAV or AIFF file). The R2D2 sample was given as:
56 0 obj <>
I left out the 60k hexdump of raw samples. I had a really simple python script to convert from WAV to this, but I can’t find it now, sorry.
Now that all the new objects have been appended, we can write the crossref table and the trailer:
0000000000 65535 f
0000069494 00000 n
0000069564 00000 n
0000069624 00000 n
The xref table is a fixed format to allow quick and easy access for the PDF viewer. It contains a list of entries. Each entry begins with a line specifying the object to start at and the number of objects given. This is followed by a line for each object. The line contains two numbers: the byte offset of the start of that object and the generation of the object. Generations come into play when updates to a PDF delete an object. For us, all our objects are of generation 0. Also, object 0 is always of generation 65535 and is used when keeping the list of deleted object numbers available for reuse. The line must be of a strict format: the ten digit zero padded decimal number specifying the byte offset for the current object, a single space, the 5 digit zero padded decimal generation, a single space, a single character (‘n’ for regular object or ‘f’ for a freed object), and a 2 char end of line sequence (space + carriage return, space + line feed, or CR + LF).
After the xref table, the trailer dictionary is written. The trailer dictionary contains the Root reference, the new Size entry (our highest object was 56, so the Size is 57), and the byte offset to the last xref table (prior to this one — it will be given in the startxref near the end of the original file).
Finally, the startxref is given. startxref on a line by itself, then the byte offset of the new xref table in decimal on a line by itself.
Really finally, the ‘%%EOF’ comment ends the file.
Note: I did this with vim and while it wasn’t too bad, manually adding the byte offsets required by the PDF format was a bit error prone. I ended up writing a Python script to build PDFs to do some further experiments, but I didn’t add any parsing to do modification of existing PDFs. I know these things exist, but it was worth it to work it out myself.
I was going to talk about breaking PathWords on Facebook to give a friend 6200 points. That will have to be next time. The length got away from me.
Other things: I’m reading “Ynot: Reasoning with the Awkward Squad” and I like the writing style. It is surprising to grok this stuff without too much trouble. I’d like to chalk that up to having more experience, but I think it’s just the clarity of the writing. I spent some time on the 0x41414141 reversing challenge. It was fun — I hope they add more. I encourage anyone looking for a job to take a crack at it.
TODO: Clean up the rest of this site (markup and finish the SHA notes). Write up the PathWords experience. Write up my experiences reversing the SNES version of “Zelda: A Link to the Past”.