Something Security Related: PDF Exploit De-Obfuscation

The Gist

In the exploit code mentioned in my previous blog post, there is code that embeds a couple of PDFs into the page, each exploiting a different vulnerability. Since I have a copy of one of those PDFs, I figured I'd show the process of de-obfuscating the exploit and determining the vulnerability. I can provide the malicious PDF, just ask. This is mainly about de-obfuscating the JS, not about analyzing the actual exploit TIFF and what it does.

Dump the Data Stream(s)

The easiest tool to use for analyzing PDFs is PDFStreamDumper from HERE. All you have to do is load the PDF into PDFStreamDumper and it will give you the header info and any data streams in the PDF. In this case we only have one data stream.

Without much effort you can see the opening <script> tag indicating embedded JavaScript. The JavaScript is what builds and executes the exploit, so you basically just grab everything from the opening <script> tag through the closing </script> tag. Then click the Javascript_UI button and it will open a JavaScript decoder right in PDFStreamDumper.
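If you'd rather pull the script out of the dumped stream text by hand, the "grab everything between the tags" step can be sketched roughly like this. The function name extractScripts and the sample stream string are my own placeholders, not anything from the PDF or from PDFStreamDumper, and the regex assumes the tags are plain and not nested.

```javascript
// Minimal sketch: collect the body of every <script>...</script> pair
// found in a dumped stream. extractScripts is a hypothetical helper name.
function extractScripts(streamText) {
  // [\s\S]*? matches across newlines, non-greedy, so each pair is separate.
  const pattern = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  const bodies = [];
  let match;
  while ((match = pattern.exec(streamText)) !== null) {
    bodies.push(match[1]);
  }
  return bodies;
}

// Made-up stream content, only to show the shape of the input.
const sampleStream =
  '<event><script contentType="application/x-javascript">var a = 1;</script></event>';
console.log(extractScripts(sampleStream));
```

This only does the copy-and-paste step; formatting the result is still easier in the Javascript_UI.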

Once the Javascript_UI comes up, you can simply format the code and you have your neatly formatted, yet still obfuscated, exploit code.

De-Obfuscate JS

The nice thing about this PDF is that the JavaScript is obfuscated almost identically to the exploit code that served it. As I stated in my previous blog post, there is one function doing all the decrypting of the obfuscated actions. In this case the function's name is 'ask'. One of the tricks used to throw off JavaScript decompilers is to grab data from XML fields by using rawValue. I'll highlight where the function is grabbing the other data.

All it's doing in this case is grabbing everything in the XML field "rye" and replacing any occurrence of Z, h, P, w, I, v, g, F, l, or P with '' (an empty string). So it's just setting a variable equal to the value of that XML field with those particular letters removed (case sensitive).
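To make the stripping step concrete, here is a small stand-alone sketch of the same idea. It is not the code from the PDF: stripFiller is a name I made up, and the sample string is invented to stand in for whatever rawValue returns from the "rye" field. Note that P shows up twice in the list above, so the character class only needs nine distinct letters.

```javascript
// Sketch of the filler-letter removal, assuming a plain global replace.
// stripFiller is a hypothetical name; the real function in the PDF is 'ask'.
function stripFiller(rawValue) {
  // Case sensitive: only these exact letters are removed.
  return rawValue.replace(/[ZhPwIvgFl]/g, '');
}

// Invented sample value, padded with the filler letters.
const padded = 'ZuhnPewsIcvagpFel';
console.log(stripFiller(padded)); // prints "unescape"
```

Because the match is case sensitive, a lowercase 'p' survives while an uppercase 'P' is dropped, which is why the padded string still decodes cleanly.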

Now to show you where the XML field "rye" is located and what it contains.

With that value, you should now be able to use the function 'ask' to decode a good portion of the javascript.

I basically just continue that process and follow the code until the final exploit code is constructed, at which point it is all concatenated and assigned into the 'rye' XML tags.

Since we're just decoding JavaScript, I simply call document.write(g); to write the exploit code to the window. (I did this in Malzilla because PDFStreamDumper didn't like document.write - screw it.)
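If your tool of choice also chokes on document.write, one workaround is to hand the decoded script a fake document object that just collects whatever gets written. This is only a sketch of that idea, not what I actually ran: makeCapturingDocument is a made-up helper, and the value of g below is a harmless placeholder rather than the real decoded payload.

```javascript
// Sketch: a stand-in "document" whose write() stores text instead of rendering it.
// makeCapturingDocument is a hypothetical helper, not part of any tool.
function makeCapturingDocument() {
  const chunks = [];
  return {
    // document.write accepts several arguments and concatenates them.
    write: (...parts) => { chunks.push(parts.join('')); },
    // Return everything written so far as one string.
    output: () => chunks.join(''),
  };
}

const fakeDocument = makeCapturingDocument();
const g = 'decoded exploit code would be here'; // placeholder value
fakeDocument.write(g);
console.log(fakeDocument.output());
```

The point is just to get the final string out as text so you can read it, without anything trying to render or run it.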