Archive for the dex bytecode Category
Loose Documentation Leads to Easy Disassembler Breakages

As people have seen in the past, I tend to have a fun time finding edge-cases which break tools. Often you can find these types of edge-cases while reading documentation and cross referencing it with the implementation in the systems you're validating. A pretty good example of this is highlighted in my BlackHat 2012 talk, where I was looking at the header size field, which is documented as always having the value 0x70. When looking at the open source tools, some checked to make sure this was true, while others ignored it. The actual code in the Dex Verifier is as follows:

From this we can see the actual implementation doesn’t care what the header size is, as long as it is at least as large as the current structure size, which is 0x70. This allows the verifier to be forward compatible, though anyone creating a tool who only read the documentation might not fully understand or assume this.
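To make the difference concrete, here is a minimal Python sketch (not the verifier's actual code) contrasting the lenient check described above with the strict check a documentation-only tool might write. It assumes a little-endian file and the standard header layout where `header_size` sits at offset 0x24; the function names are made up for illustration.

```python
import struct

HEADER_SIZE_MIN = 0x70     # current size of the header structure, per the post
HEADER_SIZE_OFFSET = 0x24  # assumed position of header_size in the header


def read_header_size(dex: bytes) -> int:
    """Read the little-endian header_size field from the raw file bytes."""
    (header_size,) = struct.unpack_from("<I", dex, HEADER_SIZE_OFFSET)
    return header_size


def lenient_header_size_ok(dex: bytes) -> bool:
    """Accept any header_size that is at least 0x70 (forward compatible)."""
    return len(dex) >= HEADER_SIZE_MIN and read_header_size(dex) >= HEADER_SIZE_MIN


def strict_header_size_ok(dex: bytes) -> bool:
    """Accept only exactly 0x70, as a reading of the documentation suggests."""
    return len(dex) >= HEADER_SIZE_MIN and read_header_size(dex) == HEADER_SIZE_MIN
```

A file whose `header_size` is set to, say, 0x80 passes the lenient check but fails the strict one, which is exactly the kind of gap that trips up tools.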

This leads me to two extremely easy breakages which I never mentioned in my talk, but which I noticed IDA Pro 6.4 and Radare would fail against. The first issue, which broke both IDA Pro and Radare, was a bad file magic. According to the documentation, the magic bytes are the following:

DEX_FILE_MAGIC
embedded in header_item

The constant array/string DEX_FILE_MAGIC is the list of bytes that must appear at the beginning of a .dex file in order for it to be recognized as such. The value intentionally contains a newline ("\n" or 0x0a) and a null byte ("\0" or 0x00) in order to help in the detection of certain forms of corruption. The value also encodes a format version number as three decimal digits, which is expected to increase monotonically over time as the format evolves.

ubyte[8] DEX_FILE_MAGIC = { 0x64 0x65 0x78 0x0a 0x30 0x33 0x35 0x00 }
= "dex\n035\0"

Note: At least a couple earlier versions of the format have been used in widely-available public software releases. For example, version 009 was used for the M3 releases of the Android platform (November–December 2007), and version 013 was used for the M5 releases of the Android platform (February–March 2008). In several respects, these earlier versions of the format differ significantly from the version described in this document.

So one might assume that the only accepted magic bytes are exactly "dex\n035\0", though they would be wrong in assuming this. If we take a look at the code in DexFile.h:

We can see that there are constant magic bytes of "dex\n", but the versioning afterwards, which is loosely explained in the documentation, has multiple options. From API level 14 on, the verifier has accepted both "036\0" and "035\0" as valid version parts of the magic bytes. Since the magic bytes are not part of the checksum or the signature of the dex file, one can simply bump the version number without any specialized tools; a hex editor is enough. This led to Radare failing to load the file and IDA Pro thinking the file was corrupt, with the following dialog and log output:
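As a rough illustration of the check described above, the following Python sketch splits the magic into its fixed prefix and its version part. It is not the DexFile.h code; the constant and function names are made up, and the accepted version list simply mirrors the two values mentioned in this post.

```python
DEX_MAGIC_PREFIX = b"dex\n"
# Version strings the post says are accepted from API level 14 on.
ACCEPTED_VERSIONS = (b"035\0", b"036\0")


def magic_ok(dex: bytes) -> bool:
    """Check the fixed prefix and the version separately, not as one constant."""
    return dex[:4] == DEX_MAGIC_PREFIX and dex[4:8] in ACCEPTED_VERSIONS


def bump_version(dex: bytes, version: bytes = b"036\0") -> bytes:
    """Rewrite only the version bytes; nothing else in the file changes."""
    if len(version) != 4:
        raise ValueError("version must be three digits plus a null byte")
    return dex[:4] + version + dex[8:]
```

A tool that compares the first eight bytes against "dex\n035\0" only would reject the output of `bump_version`, even though `magic_ok` still accepts it.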

Corrupt File Dialog

I originally reported this issue to Hex-Rays on January 22nd, 2013 and received a thank you and a fix back from them only two days later, on the 24th. I’m unsure if they sent this out to all their customers or have it fully bundled into their latest packages, but you should easily be able to request it if not. For Radare I submitted a patch for this issue, which was quickly merged upstream by the extremely proactive author of the tool.

The second breakage, which only directly affected IDA Pro, revolved around the file size as dictated by the dex_header versus the actual file size. IDA Pro was comparing the two, and if they were not exactly equal it assumed the file was corrupt. The documentation states, “size of the entire file (including the header), in bytes”, though the implementation doesn’t actually care, as seen in the DexSwapVerify.cpp file:

As we can see from above, the actual length must be at least as large as the expected length, most likely to reject truncated files. It can easily be larger, though, which just produces a warning while processing of the dex file continues. However, the same corrupt file dialog, with this logging message, comes up when the file is loaded in IDA Pro:
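The behaviour described above can be sketched in a few lines of Python. This is only an approximation of the logic, not the DexSwapVerify.cpp code: it assumes a little-endian file with the `file_size` field at offset 0x20, and the return values are labels invented for this example.

```python
import struct

FILE_SIZE_OFFSET = 0x20  # assumed position of file_size in the header


def check_file_size(dex: bytes) -> str:
    """Compare the header's file_size with the real length of the data."""
    if len(dex) < FILE_SIZE_OFFSET + 4:
        return "error"  # too short to even contain the field
    (expected,) = struct.unpack_from("<I", dex, FILE_SIZE_OFFSET)
    actual = len(dex)
    if actual < expected:
        return "error"    # truncated file, refuse to continue
    if actual > expected:
        return "warning"  # trailing data, complain but keep going
    return "ok"
```

Appending junk bytes to a valid file therefore moves it from "ok" to "warning", whereas a tool that insists on equality would treat it as corrupt.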

Corrupt File Dialog

This was also fixed on the same timeline as the other issue I reported to Hex-Rays, so if you run across any files like this you will be prompted with this dialog;

Extra data

Just two small issues that came about when looking at the implementation of the file format. These edge-cases always seem to exist in every system, especially when creating reversing/disassembling/analyzing tools.

Dexploration: What a default Dex looks like

During the research phase of my Blackhat talk, I was digging into detecting the default layout of a dexfile, as generated by the normal dx tool. Originally, my concept was that I wanted my tool to “stack” things inside the file the same way that the dalvik compiler would, though I couldn’t find any actual resources on what this actually looked like. After a few hours of digging through code on AOSP and tearing apart an actual dex file to look at the innards, I came up with the quick little ASCII diagram below;

The output of APKfuscator actually ended up being quite different from the above mappings. It’s definitely possible to retain the structure, however the sections can easily be interchanged. The resulting sections from my tool look like the following;

The normal dx compiler appears to always lay things out the same way, so if someone has used a post-compilation modification tool (e.g. APKfuscator or (bak)smali), it might be possible to see that a dex file has been “changed”. If someone were to develop a tool to look for patterns in how this data is laid out, it could lead to some interesting results. Being able to detect these changes and patterns, run on a large enough scale, could be an interesting tactic for quickly finding out whether or not someone has messed with a file. Hopefully I’ll have more time to research this area and either prove or disprove this theory. Until then, hopefully the small ASCII layouts might help someone else with whatever work they’re doing on dalvik research.
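A starting point for such a pattern check could look like the Python sketch below. It only looks at the top-level section offsets stored in the header, which is a much coarser view than the layouts discussed above (a fuller tool would walk the map list). The offsets in the table are the standard header field positions as I understand them, the names are made up, and the baseline order has to come from a file you know was produced by dx rather than from this code.

```python
import struct

# Assumed header positions of each section's offset field.
SECTION_OFFSET_FIELDS = {
    "link": 0x30,
    "map": 0x34,
    "string_ids": 0x3C,
    "type_ids": 0x44,
    "proto_ids": 0x4C,
    "field_ids": 0x54,
    "method_ids": 0x5C,
    "class_defs": 0x64,
    "data": 0x6C,
}


def section_order(dex: bytes) -> list:
    """Return section names sorted by where they start in the file."""
    found = {}
    for name, position in SECTION_OFFSET_FIELDS.items():
        (offset,) = struct.unpack_from("<I", dex, position)
        if offset:  # an offset of zero means the section is absent
            found[name] = offset
    return sorted(found, key=found.get)


def layout_differs(dex: bytes, baseline: bytes) -> bool:
    """Flag a file whose section order differs from a known dx-built file."""
    return section_order(dex) != section_order(baseline)
```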

A Lesson in Safe Dex
Presenting at Blackhat 2012

It’s been almost a full week since my talk, Dex Education: Practicing Safe Dex, though I think I’m only now beginning to recover. The past few months have truly been a whirlwind of both dissecting malware at Lookout and putting together a solid presentation for BlackHat. So far I’ve been unable to draw a crowd like Charlie, though maybe someday I’ll have people sitting in the aisles fighting for a seat during a presentation. Until then the people who went will just have to deal with the extra legroom. Overall the presentation seemed to go over pretty well, with some interesting chats afterwards with some smart people. A few people were interested in the slides and proof of concept code, so I told them I would tweet it and also make a blog post about it.

My slides are available here with the proof of concept code being hosted on my github page here. The proof of concept crackme code will be on the same github page shortly as well.

I’ve got some extra content that I wasn’t able to fit into the slide-deck; heck, it was 96 slides as is after trimming some things out. While I didn’t intend to try and cover everything possible to break most analysis tools, I wanted to attempt to cover as much as possible. Over the course of a few days or weeks, I’ll try to roll out details in my blog about how certain things worked, mainly for people who were unable to attend the presentation, hear my explanations or ask me things at the conference. Feel free to reach out to me if there is anything I’ve missed or you would like a better explanation about.

A few people asked me about Blackhat and Defcon, wondering if they’re worth attending. So to step on a soapbox just for a minute, I’ll give the mini speech that I normally tell people. Conferences are only worth what you put into them: go to talks that seem interesting and are outside of your direct field of work. Why attend talks outside your direct field of work? I’ve found it’s a great way to find different perspectives, which can often be related back to your own work and field. It is also quite hard to appreciate a talk on something that you deal with daily, so it’s very important to keep this in mind if you do see those types of talks. As a presenter myself, I found it exceptionally hard to not go too low level while still feeling like I can add value to everyone in the audience. After attending the talks you chose, meet the presenters and pick their brains; this is honestly where you can learn the most. As I have said, it’s really hard to make a presentation accessible for a whole audience, and talking directly with these people will give you so much more information than the slides often do. The people you meet at the bars (for Blackhat @ Caesars, go to the Galleria bar) are often people you talk to online already. Make friends, go outside that comfort zone and buy some people drinks. Most everyone is friendly; if they aren’t, don’t drink with them. Almost all conferences are worth going to, Blackhat and Defcon included, mainly due to the talent they attract, which you can find hanging out at the bars.

Probably the greatest thing about Blackhat for me was meeting some really great people I’d only had the pleasure of talking to online. Talking with Mila, the mind behind Contagio Dump, was really great, and I was able to pay her back a little for all the hard work she does with a beer or two. I got to talk with some of the original DroidSecurity (now AVG) guys, Elad and Oren; it’s never a dull moment talking to an Israeli reverse engineer, just look at Zuk. Another interesting person I got to hang out with was alongside me in the malware talk track, @snare. He did some crazy things with EFI rootkits for OSX, pretty scary and interesting stuff all in the same talk.

People often say it isn’t what you know, but who you know. I’d argue the security space is a yin and yang of both; to be a valuable (reverse) engineer you need to know your stuff and the people to help you succeed.

Enough on this soapbox, hopefully you enjoy the slides and code. If you ever run into me at a conference – let’s have a beer or two and chat.

Dex File Header Dump

I’ve been working on the header a little more – so I figured I’d post some code I just finished throwing together quickly. It’s not all the code, since most of it is experimental and I’m not finished doing it, but this will provide people with the information on how to dump the dex file header information.

This now dumps all the header information from the original file, and will recalculate the signature and checksum in case something has changed. A version should be available shortly to check for differences in all the values, hopefully soon being able to calculate the correct values if something is wrong.
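For readers who want to try something similar, here is a small Python sketch of a header dumper. It is not the code from this post; it assumes a little-endian file and the standard 0x70-byte header layout, and the field names follow the public format documentation as I remember it.

```python
import struct

# magic (8 bytes), checksum (u32), signature (20 bytes), then 20 u32 fields.
HEADER_STRUCT = struct.Struct("<8sI20s20I")
FIELD_NAMES = (
    "file_size", "header_size", "endian_tag", "link_size", "link_off",
    "map_off", "string_ids_size", "string_ids_off", "type_ids_size",
    "type_ids_off", "proto_ids_size", "proto_ids_off", "field_ids_size",
    "field_ids_off", "method_ids_size", "method_ids_off", "class_defs_size",
    "class_defs_off", "data_size", "data_off",
)


def parse_header(dex: bytes) -> dict:
    """Unpack the fixed-size header into a name -> value dictionary."""
    if len(dex) < HEADER_STRUCT.size:
        raise ValueError("too short to contain a dex header")
    magic, checksum, signature, *rest = HEADER_STRUCT.unpack_from(dex, 0)
    header = {"magic": magic, "checksum": checksum, "signature": signature.hex()}
    header.update(zip(FIELD_NAMES, rest))
    return header


def dump_header(dex: bytes) -> None:
    """Print every header field, integers in hex and the rest as-is."""
    for name, value in parse_header(dex).items():
        if isinstance(value, int):
            print(f"{name:16} 0x{value:08x}")
        else:
            print(f"{name:16} {value!r}")
```

Calling `dump_header(open("classes.dex", "rb").read())` on a file of your own prints one line per field; the file name here is just a placeholder.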

Maybe this will be useful for someone? Otherwise, oh well it’s just here in case I delete my files. Working on functions to find the new values after patching and to allow patching/injection of code. I’ll have to write up more later as I don’t have an overwhelming amount of time right now, busy day and I’m exhausted. Saw Sara play some volleyball, finished up solo campaign in COD5, spent a few hours reading and researching some dex related things and trying to get some more injection to work. Tomorrow I probably won’t have time to post – but trust me, this stuff will be up sooner or later. It’s a big puzzle I’m chipping away at, and it’s bugging the heck out of me not having the answers.

Promising result for injecting code…

It’s coming along, but it doesn’t seem to be as easy as I’d have hoped. I sort of have a working example, but I don’t want to release it until I can definitely identify what needs to be patched, why, and by exactly how much for things to be injected. Here is a little output of some of my notes from the tests I’ve been running. Nothing too mind blowing, but some notes in case someone is interested, slash in case I lose the piece of paper;

Things you must patch to successfully inject code:

Length of file in bytes (0x20)
Absolute offset of string table (0x34)
type of checksum? (0x38)
number of fields in field table (0x44)
Absolute offset to field table (0x48)
number of methods in method table (0x4C)
absolute offset of method table (0x50)
another checksum? (0x54)
absolute offset of class definition? (0x58)
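Patching any of the fields above comes down to overwriting a 32-bit little-endian value at a fixed offset. The Python sketch below shows that for the one entry in these notes I am sure about, the file length at 0x20; the helper names are hypothetical, and a real injection would also need the other offsets and counts adjusted plus the checksum and signature recalculated afterwards.

```python
import struct

FILE_SIZE_OFFSET = 0x20  # "Length of file in bytes" from the notes above


def patch_u32(dex: bytearray, offset: int, value: int) -> None:
    """Overwrite a little-endian 32-bit field in place."""
    struct.pack_into("<I", dex, offset, value)


def append_and_fix_length(dex: bytes, extra: bytes) -> bytes:
    """Append bytes to the file and update the length field to match."""
    patched = bytearray(dex) + extra
    patch_u32(patched, FILE_SIZE_OFFSET, len(patched))
    return bytes(patched)
```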

Injecting code into DEX Files

Injecting code plausible and possible!


Success! It seems completely possible, though quite a pain, to inject new code into existing dex files. This does not appear like it would easily be done ON a device, though in a development setting it seems perfectly possible and completely do-able.

I’m working on a nice proof-of-concept example to show, though I don’t think this is a “backdoor” to malware. Android has been set up well enough that properly injecting things requires many steps, making it in my opinion extremely hard to do on the fly on the device. I had to inject the code directly into the dex, recompute both the signature and the checksum for the file, then re-sign the whole package before reinstalling it onto the device (after a complete uninstall, since we don’t have the same keys as the original package). This is a long way away from actually being able to do nasty things with it, which is clearly a good thing, since we don’t want that to happen. This does have practical uses of course, though it seems Google has done security rather well, so that this process would most likely have to be done by the actual developer for a user to not notice an injected file… Otherwise they would have to allow unknown sources, packages would complain about the key, and so on and so on…

Hopefully more to come on this subject soon!

Inlining an Android program?

Been doing some experimentation with some extremely interesting results. Looks like inlining a program is possible, though it does get a little messy… I’ve been doing mostly everything by hand and guessing, but it looks like I might be able to write up a program to do it for me. I don’t have a whole lot of time right now as I’m time crunched with some exams, so I can’t go into explicit detail, though if you understand the DEX file format and the Android OS it’s rather similar to injecting into normal Java VMs. This process is well described here.

Hopefully I’ll have some time later to post the tests and results of what I’ve been doing and how it’s being done.

Dalvik VM Internals presentation

A great wealth of information I just stumbled upon, given by Dan Bornstein, one of the creators of the Dalvik VM. It’s a long watch, but it definitely explains some things better than I could;

The slides for further review and handouts can also be found here;

http://sites.google.com/site/io/dalvik-vm-internals

Coming up soon I will have the patched dex file and the tutorial to go along with it!

Ah-ha! Success in patching DEX file!
Successfully patching an android application

It had been bugging me tons since the application kept crashing. I knew the signature and checksum were correct since it wasn’t barfing on installation of the .apk file. So I kept thinking and thinking, and finally I decided to do something useful… Look at the traces log! Here we can clearly see that an exception is being thrown… But why?

I decided to do yet another smart thing that I should’ve done earlier: I redumped the dex file to see if it was making any sense… Of course! I edited the wrong opcode. Apparently in my overwhelming dumbness I had changed the registers, which caused an exception to be thrown for the statement. This is something the Dalvik VM did not agree with, thus the barfing.

I’m going to recreate a nice little example with source code of a simple patch performed on a dex file, and I’ll outline the process used to do so. Hopefully I’ll have this posted sometime tomorrow!

DEX File signature and checksums

DEX files are the compiled bytecode files that run on the Android OS. Essentially they are like Java bytecode, except they run on a modified VM called “Dalvik” that was developed for the Android OS.

So long story short – I want to know the structure of these files, and how to edit or patch them. Why? I enjoy reverse engineering things! Anyway there is a wealth of information that I could find on the following sites;

Shane Isbell’s Weblog : http://www.jroller.com/random7/entry/android_s_dex_class_structure
RetroCode, Dex File Format : http://www.retrodev.com/android/dexformat.html

These sites both have great information, though more so the second link. What intrigued me was that dex files must have a ‘checksum’ and a ‘SHA-1’ signature. The best information I could find only alluded to how these are calculated (Retrocode);

[quote]Notes: All non-string fields are stored in little-endian format. It would appear that the checksum and signature fields are assumed to be zero when calculating the checksum and signature.[/quote]

Well that doesn’t really help me if I’d like to patch a DEX file and then redo the signature and checksum. Especially since we don’t know what exactly is being used to calculate either or what checksum is being used to well, calculate the checksum!

Good thing we can extract .jar files and decompile class files, and even more so, thank goodness Google hasn’t used obfuscation on any of the classes. An excerpt from “dx\com\android\dx\dex\file\DexFile.class” after being decompiled into DexFile.jar.

Ah hah! Now we know how to calculate the signature and the checksum. Essentially it reads in all the bytes of the file, disregards the first 32 bytes and calculates the signature over the rest using SHA-1. Then it calculates the checksum, disregarding the first 12 bytes (so it includes the signature). Excellent, let’s drop this code into our own program so we can recalculate those values;
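The same two steps can be sketched in Python with the standard library. This is a stand-in for illustration, not the Java code referred to above. It assumes the checksum algorithm is Adler-32 (which is what the dex format specifies, though the excerpt is not reproduced here), that the checksum is stored at offset 8 and the signature at offset 12, and the function names are made up.

```python
import hashlib
import struct
import zlib

CHECKSUM_OFFSET = 8    # 4-byte checksum follows the 8 magic bytes
SIGNATURE_OFFSET = 12  # 20-byte SHA-1 signature follows the checksum
SIGNED_DATA_START = 32  # everything after magic + checksum + signature


def compute_signature(dex: bytes) -> bytes:
    """SHA-1 over the file, skipping the first 32 bytes."""
    return hashlib.sha1(dex[SIGNED_DATA_START:]).digest()


def compute_checksum(dex: bytes) -> int:
    """Adler-32 over the file, skipping the first 12 bytes."""
    return zlib.adler32(dex[SIGNATURE_OFFSET:]) & 0xFFFFFFFF


def fix_signature_and_checksum(dex: bytes) -> bytes:
    """Write the signature first, since the checksum covers it."""
    out = bytearray(dex)
    out[SIGNATURE_OFFSET:SIGNED_DATA_START] = compute_signature(bytes(out))
    struct.pack_into("<I", out, CHECKSUM_OFFSET, compute_checksum(bytes(out)))
    return bytes(out)
```

The order matters: because the checksum is calculated over the signature bytes, the signature has to be written into the header before the checksum is computed.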

Excellent, now I can easily recalculate the signature and the checksum of the dex files. Note that the checksum is actually in little-endian so you need to reverse it when entering it in a hex editor.
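To avoid reversing the bytes by hand, a tiny Python helper (hypothetical, not part of the original program) can print the checksum in the order it should be typed into a hex editor:

```python
import struct


def checksum_bytes_for_hex_editor(checksum: int) -> str:
    """Format a 32-bit checksum as little-endian bytes, lowest byte first."""
    return " ".join(f"{b:02x}" for b in struct.pack("<I", checksum))
```

For example, a checksum of 0x12345678 comes out as `78 56 34 12`.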

So, now the Android OS will allow you to install this dex file, after signing it in the correct package (apk). Though as of right now the program still crashes upon launching the file… Hhhmmmmm we’ll have to look into this more I guess.
