Posts Tagged with dex bytecode
Dexploration: What a default Dex looks like

During the research phase of my Blackhat talk, I was digging into detecting the default layout of a dexfile, as generated by the normal dx tool. Originally, my concept was that I wanted my tool to “stack” things inside the file the same way that the dalvik compiler would, though I couldn’t find any actual resources on what this actually looked like. After a few hours of digging through code on AOSP and tearing apart an actual dex file to look at the innards, I came up with the quick little ASCII diagram below;

The result of the APKfuscator actually ended up being quiet different than the above mappings. It’s definitely possibly to retain the structure, however the sections can easily be interchanged. The resulting sections from my tool look like the following;

The patterns for the normal dx compiler appear to always lay out the same, so if someone has developed a post-compilation modification tool (i.e. – APKfuscator or (bak)smali), it might be possible to see that a dex file has been “changed”. If someone was to develop a tool to look for patterns about how this data is laid out, it could lead to some interesting results. Being able to detect these changes and patterns, run on a large enough scale, could be an interesting tactic to finding out whether or not someone has messed with a file quickly. Hopefully I’ll have more time to research this area and either prove or disprove this theory. Though, until then – hopefully the small ASCII layouts might help someone else with whatever work they’re doing on dalvik research.

A Lesson in Safe Dex
Presenting at Blackhat 2012

Presenting at Blackhat 2012

It’s been almost a full week since my talk, Dex Education: Practicing Safe Dex, though I think I’m only now beginning to recover. The past few months have truly been a whirlwind of both working on dissecting malware at Lookout and working on putting together a solid presentation for BlackHat. So far I’ve been unable to draw a crowd like Charlie, though maybe someday I’ll have people sitting in the aisles fighting for a seat during a presentation. Until then the people who went will just have to deal with the extra legroom. Over all the presentation seemed to go over pretty well, some interesting chats afterwards with some smart people. A few people where interested in the slides and proof of concept code, so I told them I would tweet it and also make a blog post about it.

My slides are available here with the proof of concept code being hosted on my github page here. The proof of concept crackme code on the same github page as well shortly.

I’ve got some extra content that I wasn’t able to fit into the slide-deck, heck it was 96 slides as is after trimming some things out. While I didn’t intend to try and cover everything possible to break most analysis tools, I wanted to attempt to cover as much as possible. Over the course of a few days or weeks, I’ll try to roll out details in my blog about how certain things worked, mainly for people who where unable to attend the presentation, hear my explanations or ask me things at the conference. Feel free to reach out to me if there is anything I’ve missed or you would live a better explanation about.

A few people asked me about Blackhat and Defcon – wondering if it’s worth attending. So to step on a soap box just for a minute, I’ll give the mini speech that I normally tell people. Conferences are only worth what you put into them, go to talks that seem interesting and are outside of your direct field of work. Why attend talks outside the direct field of work? I’ve found it’s a great way to try and find different perspectives, which often can be related back into your own work and field. It is also quiet hard to appreciate a talk on something that you deal with daily, definitely very important to try and keep this in mind if you do see those types of talks. As a presenter myself, I found it exceptionally hard to not go too low level while still feeling like I can add value to everyone in the audience. After attending the talks you chose, meet the presenters and pick their brains, this is honestly where you can learn the most. As I have said, it’s really hard to make a presentation accessible for a whole audience, talking directly with these people will give you so much more information than the slides often do. The people you meet at the bars (for Blackhat @ Caesars goto the Galleria bar) are often people you talk to online already. Make friends, go outside that comfort zone and buy some people drinks. Most everyone is friendly, if they aren’t – don’t drink with them. Almost all conferences are worth going to, Blackhat and Defcon included, mainly due to the talent it attacks that you can find hanging out at the bars.

Probably the greatest thing about Blackhat for me was to meet some really great people I’ve only had the pleasure of talking to online. Talking with Mila, the mind behind Contagio Dump, was really great – able to pay her back a little for all the hard work she does with a beer or two. Got to talk with some of the original DroidSecurity (now AVG) guys, Elad and Oren, it’s never a dull moment talking to an Israeli reverse engineer – just look at Zuk. Another interesting person who I got to hang out with was along side me in the malware talk track, @snare. He did some crazy things with EFI rootkits for OSX, pretty scary and interesting stuff all in the same talk.

People often say it isn’t what you know, but who you know. I’d argue the security space is a ying and yang of both; to be a valuable (reverser) engineer you need to know your stuff and the people to help you succeed.

Enough on this soapbox, hopefully you enjoy the slides and code. If you ever run into me at a conference – let’s have a beer or two and chat.

Android Turrent Defense “Badge System” under the hood

Pew pew! Take that green circles!

Pew pew! Take that green circles!

A recent addition to the android market has been ATD, Android Turret Defense. This is a Plox-like game, though it has the “maze” strategy element combined in it. Strangely — it reminds me of a few old maps I used to play with friend for starcraft… Anyway I finally got around to beating it which isn’t too difficult once you get the hang of placing turrets and a get a decent strategy. At the end it awards you with a “badge code” — not sure exactly what the author intends to use this for, but I decided to take a look at how these are created. I was interested in how they where generated, and to see if people could easily replicate them, or if there would be any deterrents to keep people from just sharing them. Again, this is possibly completely useless information, since we have no idea what these codes will be used for. The could be used for tournaments, downloads, prizes – or maybe to just “give” you an image of a badge… As of right now we just don’t know.

Below is a dump of the function we will be analyzing with my comments in it (highlighted green), they should be pretty easy to follow:

.method private createBadgeCode()Ljava/lang/String;
// Date now = New Date();
new-instance v2,java/util/Date
invoke-direct {v2},java/util/Date/ ; ()V

// SimpleDateFormat dateFormat = new SimpleDateFormat(“yyMMddhhmm”);
new-instance v5,java/text/SimpleDateFormat
const-string v7,”yyMMddhhmm”
invoke-direct {v5,v7},java/text/SimpleDateFormat/ ; (Ljava/lang/String;)V

// StringBuilder raw = new StringBuilder();
new-instance v7,java/lang/StringBuilder
invoke-direct {v7},java/lang/StringBuilder/ ; ()V

// raw.append(dateFormat.format(now));
invoke-virtual {v5,v2},java/text/SimpleDateFormat/format ; format(Ljava/util/Date;)Ljava/lang/String;
move-result-object v8
invoke-virtual {v7,v8},java/lang/StringBuilder/append ; append(Ljava/lang/String;)Ljava/lang/StringBuilder;
move-result-object v7

// raw.append(difficulty);
iget v8,v12,tx/games/atd_world.difficulty I
invoke-virtual {v7,v8},java/lang/StringBuilder/append ; append(I)Ljava/lang/StringBuilder;
move-result-object v7

// raw.append(“tensaix2j”);
const-string v8,”tensaix2j”
invoke-virtual {v7,v8},java/lang/StringBuilder/append ; append(Ljava/lang/String;)Ljava/lang/StringBuilder;
move-result-object v7

// Bytes[] rawbytes = raw.toString.getBytes;
invoke-virtual {v7},java/lang/StringBuilder/toString ; toString()Ljava/lang/String;
move-result-object v4
invoke-virtual {v4},java/lang/String/getBytes ; getBytes()[B
move-result-object v0

/* Below code refined;
int sum = 0;

for(int i = 0; i < rawbytes.length(); i++) sum += rawbytes[i]; */
const/4 v6,0
const/4 v3,0
// length = rawbytes.length();
array-length v7

// if( v3 > v7 ) goto: l3c30
if-ge v3,v7,l3c30

// v7 = rawbytes(v0);
aget-byte v7,v0,v3

// v6 += v7;
add-int/2addr v6,v7

// v3 ++;
add-int/lit8 v3,v3,1
goto l3c1e


// StringBuilder badge = new StringBuilder();
new-instance v7,java/lang/StringBuilder
invoke-direct {v7},java/lang/StringBuilder/ ; ()V

// v8 = Math.random();
invoke-static {},java/lang/Math/random ; random()D
move-result-wide v8

// v10 = 4652007308841189376;
const-wide v10,4652007308841189376 ; 0x408f400000000000

// v8 = Math.round(v8*v10);
mul-double/2addr v8,v10

// I thought it only took one variable??
invoke-static {v8,v9},java/lang/Math/round ; round(D)J
move-result-wide v8

// v10 = 1000
const-wide/16 v10,1000

// v8 += v10;
add-long/2addr v8,v10

// badge.append(v8);
invoke-virtual {v7,v8,v9},java/lang/StringBuilder/append ; append(J)Ljava/lang/StringBuilder;
move-result-object v7

// badge.append(dateFormat.format(now));
invoke-virtual {v5,v2},java/text/SimpleDateFormat/format ; format(Ljava/util/Date;)Ljava/lang/String;
move-result-object v8
invoke-virtual {v7,v8},java/lang/StringBuilder/append ; append(Ljava/lang/String;)Ljava/lang/StringBuilder;
move-result-object v7

// badge.append(difficulty);
iget v8,v12,tx/games/atd_world.difficulty I
invoke-virtual {v7,v8},java/lang/StringBuilder/append ; append(I)Ljava/lang/StringBuilder;
move-result-object v7

// badge.append(sum);
invoke-virtual {v7,v6},java/lang/StringBuilder/append ; append(I)Ljava/lang/StringBuilder;
move-result-object v7

// return badge.toString();
invoke-virtual {v7},java/lang/StringBuilder/toString ; toString()Ljava/lang/String;
move-result-object v1
return-object v1
.end method

An example of the output of this function is; 1310090403121501473

Broken down the output looks like this;

1310090403121501473, (round(random * const)+1000

1310090403121501473, Date in yyMMddhhmm format.

1310090403121501473, “0” Difficulty, Noob = 0, Normal = 1, Pro = 3

1310090403121501473, sum of bytes (date + difficulty + “tensaix2”)

I’ll post more later if the “badge system” is every finished and released. Hopefully this serves as a decent example on how to reverse simple android programs… Enjoy!

Dexdump alternatives…

Thanks to my friend Gabor, over at has created a really well done dex file dissembler. The direct link for the post is here and the source code is all free and located at

It’s nice as it outputs the format in jasmin like the following;

Opposed to the normal;

Great work Gabor, and keep up the good work!

Dex File Header Dump

I’ve been working on the header a little more – so I figured I’d post some code I just finished throwing together quickly. It’s not all the code, since most of it is experimental and I’m not finished doing it, but this will provide people with the information on how to dump the dex file header information.

This now dumps all the header information from the original file, and will recalculate the signature and checksum in case something has changed. A version should be available shortly to check for differences in all the values, hopefully soon being able to calculate the correct values if something is wrong.

Maybe this will be useful for someone? Otherwise, oh well it’s just here in case I delete my files. Working on functions to find the new values after patching and to allow patching/injection of code. I’ll have to write up more later as I don’t have an overwhelming amount of time right now, busy day and I’m exhausted. Saw Sara play some volleyball, finished up solo campaign in COD5, spent a few hours reading and researching some dex related things and trying to get some more injection to work. Tomorrow I probably won’t have time to post – but trust me, this stuff will be up sooner or later. It’s a big puzzle I’m chipping away at, and it’s bugging the heck out of me not having the answers.

Updated Dalvik VM Dex File Format
(lame dex file photoshopping joke huh?)

(lame dex file photoshopping joke huh?)

In my quest to writing a successful injector I’ve had to do a ton of digging into the dex file format. While mostly everything is open source, it’s not exactly easy to find all of the information – let alone understand it. A great resource I’ve mentioned previously was the “Dalvik VM Dex File Format” over at This resource is sadly out dated and no longer updated by pavone, but it does provide a wealth of information. I figured I’d post my results just as pavone has done so that anyone looking for the information will hopefully find it. Note that pavone’s version of the dex file he was examining was ‘dex 009’ according to the magic. The current one as of this posting is ‘dex 035’. I’ll repost this data as I figure out more about it and exactly how it is modified.

Magic – 8 bytes – “dex\n035\0”
Checksum – 4 bytes – Adler32 checksum from bytes offset 12 and on
Signature – 20 bytes – SHA-1 of bytes from 32 on
File Size – 4 bytes – Exactly what it sounds like, the file size
Header Size – 4 bytes – Will always be “70”
Endian Tag – 8 bytes – Will always be “78563412”
Zeros – 8 bytes – Exactly that, eight bytes of zeros
Map Offset – 4 bytes – Leads to below, need more research on this though
String Table Size – 4 bytes – Size of the string’s table
String Table Offset – 4 bytes – Offset to the string table
TypeTable Size – 4 bytes – Size of the type’s table
Type Table Offset – 4 bytes – Offset to the type table
Prototype Table Size – 4 bytes – Size of the prototype’s table
Prototype Table Offset – 4 bytes – Offset to the prototype table
Field Table Size – 4 bytes – Size of the field’s table
Field Table Offset – 4 bytes – Offset to the field table
Method Table Size – 4 bytes – Size of the method’s table
Method Table Offset – 4 bytes – Offset to the method table
Class Table Size – 4 bytes – Size of the class’s table
Class Table Offset – 4 bytes – Offset to the class table

You can easily note that all the sizes of these fields end up adding up to 0x70, which is the “Header Size”. Also if above isn’t clear enough, after a dex file is created, the signature is applied – which is a SHA-1 digest of all the bytes below it’s position. The checksum is an Alder32 hash of all the bytes below itself, including the signature. I actually discussed this in a previous post where I posted the code for “ReDEX”, the post was entitled “DEX File signature and checksums“.

I’m actually revamping the “ReDEX” code to check and spit out this relevant information and more, though it’s not fully done. I’m also doing more research into the “Map” field and will hopefully be able to explain more about what is store, how it is stored and what not – more like the information originally presented on retrodev. Until then, this information will have to suffice, enjoy!