Entry Language – DefCamp CTF Quals 2015 (RE100)

–[ Introduction ]

Hello all! It has been a while since I wrote anything so I thought I’d write about a quick challenge I solved for DefCamp qualifiers 2015. Unfortunately I had only a few hours to spare, so I only managed to solve exploit1 and re1.

This was also a really good opportunity to get my PIN-fu in action, as I’ve been wanting for a while to fire a PINtool at something!

–[ The challenge ]

We’re given the following:

$ file ./r100
  r100: ELF 64-bit LSB  executable, x86-64, version 1 (SYSV), 
  dynamically linked (uses shared libs), for GNU/Linux 2.6.24, 
  BuildID[sha1]=0f464824cc8ee321ef9a80a799c70b1b6aec8168, stripped

Huh, a stripped file. I think the first order of business should be to find main and take it from there! The document at [1] will come in pretty handy now, as it explains how a program is loaded into memory under Linux ([2] is an even more exhaustive discussion), but let’s review it here as well.

As a programmer you get taught that main is the first procedure that gets called when your program runs. Well, yes and no. It is the first user-defined piece of code that will run, but in order to bring our program into a state in which it is able to run, the compiler toolchain needs to insert some initialization code first. The OS needs to allocate memory for our binary and load it, and the dynamic loader needs to resolve any external dependencies of our binary, and all of this happens before main.

But the question remains: where is main? We could start searching from our binary’s entry point [2], which is the first location in the binary that gets executed once the OS has finished loading it! Let’s do that!

$ readelf --headers ./r100 | grep Entry
Entry point address: 0x400610
$ objdump -M intel -d ./r100 | grep -A 10 -e "400610"
  400610:    31 ed                    xor    ebp,ebp
  400612:    49 89 d1                 mov    r9,rdx
  400615:    5e                       pop    rsi
  400616:    48 89 e2                 mov    rdx,rsp
  400619:    48 83 e4 f0              and    rsp,0xfffffffffffffff0
  40061d:    50                       push   rax
  40061e:    54                       push   rsp
  40061f:    49 c7 c0 00 09 40 00     mov    r8,0x400900
  400626:    48 c7 c1 90 08 40 00     mov    rcx,0x400890
  40062d:    48 c7 c7 e8 07 40 00     mov    rdi,0x4007e8
  400634:    e8 97 ff ff ff           call   4005d0 <__libc_start_main@plt>

What is __libc_start_main? It is reached through the PLT, which means it’s an external reference. [4] tells us that it’s the function called before main, and that it transfers control to main! main is its first argument, and on x86-64 the first argument is passed in rdi rather than pushed on the stack. The mov rdi,0x4007e8 right before the call __libc_start_main instruction therefore gives it away, and we can conclude that main lives at 0x4007e8!
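The same lookup can be sketched in a few lines of Python. The bytes are copied from the objdump listing above; the helper name find_main is my own, and the plain byte search is only a stand-in for a real disassembler (it assumes the stub uses the `48 c7 c7` encoding of mov rdi, imm32, as this one does).

```python
import struct

# Tail of the entry stub (0x40061f up to the call), copied from the
# objdump listing above.
STUB = bytes.fromhex(
    "49c7c000094000"   # mov r8,  0x400900
    "48c7c190084000"   # mov rcx, 0x400890
    "48c7c7e8074000"   # mov rdi, 0x4007e8
    "e897ffffff"       # call __libc_start_main@plt
)

# Opcode bytes of "mov rdi, imm32" as they appear in the listing.
MOV_RDI_IMM32 = bytes.fromhex("48c7c7")


def find_main(stub):
    """Return the immediate loaded into rdi (hypothetical helper)."""
    pos = stub.find(MOV_RDI_IMM32)
    if pos < 0:
        raise ValueError("no 'mov rdi, imm32' found in stub")
    # The 32-bit immediate follows the three opcode bytes, little-endian.
    (imm,) = struct.unpack_from("<I", stub, pos + len(MOV_RDI_IMM32))
    return imm


print(hex(find_main(STUB)))  # 0x4007e8
```

A naive search like this can misfire if the pattern happens to show up inside another instruction, so treat it as a quick helper for this particular stub only.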

–[ Main analysis ]

Below is a graph of the main function, which looks quite straightforward:

Figure 1: The main function.

It looks like all it’s doing is taking our input, storing it in [rbp + s], passing that to a function (renamed here to check_pw), and then printing the appropriate success or failure message based on the result. So what does check_pw look like?

Figure 2: Check password function.

So it looks like check_pw loops over our string and compares it to some strings defined in the function, but these strings of course do not make any sense!

Now if you are like me, you will have noticed that inside the loop’s body there are some hard-to-follow manipulations going on. However, we don’t really have to understand any of it, as we notice that towards the end of the loop rax is reloaded with the pointer to our string and the character it points at is then compared against another register, without our string having gone through any sort of transformation. In particular, the instruction sequence:

mov eax, [rbp+index]
movsxd rcx, eax
mov rax, [rbp+our_string]
add rax, rcx
movzx eax, byte ptr [rax]
movsx eax, al
sub edx, eax
mov eax, edx
cmp eax, 1
jz increment_index_jump

will take a character from their string, take a character from our string, subtract the two, and increment the loop index only if the difference is 1. This means that the binary checks the difference between the secret and our input, that difference must be 1, and so we can read off the correct character (the value in edx minus 1) at the sub edx, eax instruction! Let’s gdb this!
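To make the arithmetic concrete, here is a small Python model of that check. It is only a sketch: the binary’s own table lookup is not reproduced, so the edx values are simply derived from the flag reported later in this post, and the function names are mine.

```python
FLAG = "Code_Talkers"  # the flag reported later in this write-up

# Stand-in for the values the binary holds in edx at the sub instruction.
# The real lookup is not modelled; these are derived from the known flag.
expected_edx = [ord(c) + 1 for c in FLAG]


def check_pw(guess, edx_values):
    """Model of the loop: every position must satisfy edx - input == 1."""
    if len(guess) < len(edx_values):
        return False
    for secret, ch in zip(edx_values, guess):
        if secret - ord(ch) != 1:
            return False  # stop at the first mismatch
    return True


def recover(edx_values):
    """What a debugger can read off at the sub: the wanted char is edx - 1."""
    return "".join(chr(v - 1) for v in edx_values)


print(recover(expected_edx))
print(check_pw("wrfsfser", expected_edx))
```

The point is that the input never has to be right: as long as we can observe edx at the subtraction, each expected character falls out directly.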

–[ Anti-debug Tricks ]

Loading this binary in gdb makes it behave strangely. The binary loads fine and we can start it, but it never prompts us for the password. Could it be that it is refusing to run inside a debugger? It is entirely likely that there are anti-debugging mechanisms, but I had been tracing the binary’s every action since main and didn’t see any calls to strange functions that look like they are trying to detect their environment.

How does it know then? At this point I started looking at all the other functions in the binary, not just the ones that could be reached from main’s trace. Not long after, this came up:

Figure 3: An unexpected surprise!

Huh, this looks interesting. It’s trying to detect if LD_PRELOAD is in the environment variables and also trying to detect ptrace. LD_PRELOAD is usually there to enable programmers to hook library functions and do all sorts of fun things ([7]), and ptrace is a means to control another process’s execution (mostly used by debuggers). The catch is that only one process is allowed to ptrace another process at a time. While debugging, that tracer is the debugger, so the binary’s own ptrace call fails, which implies the presence of a debugger [8]. If either of those checks fires, our binary enters an infinite loop, a behavior that we did observe during execution in gdb! So I think the author intended to detect whether we are running in a debugger and also to prevent us from hooking ptrace (through LD_PRELOAD) and returning a fake result! Nice one 🙂
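For illustration, the logic of those two checks looks roughly like this in Python. This is a sketch of the idea, not a decompilation: the function names are mine, the ctypes call assumes Linux with glibc, and the demo lines at the bottom use fake callables so that nothing is actually traced.

```python
import ctypes
import os

PTRACE_TRACEME = 0  # request number from <sys/ptrace.h> on Linux


def traceme():
    """Ask to be traced by our parent (Linux/glibc assumed).

    Returns -1 when a tracer, such as gdb, is already attached.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    return libc.ptrace(PTRACE_TRACEME, 0, 0, 0)


def looks_instrumented(environ=None, ptrace_traceme=traceme):
    """Model of the two checks: an LD_PRELOAD hook or a failing ptrace."""
    environ = os.environ if environ is None else environ
    if "LD_PRELOAD" in environ:
        return True  # someone may be hooking library calls
    return ptrace_traceme() < 0  # a failed request means we are being traced


# Demo with fake ptrace results, so no real tracing request is made.
print(looks_instrumented({"LD_PRELOAD": "hook.so"}, lambda: 0))  # True
print(looks_instrumented({}, lambda: -1))                        # True
print(looks_instrumented({}, lambda: 0))                         # False
```

Checking LD_PRELOAD first is what makes the pair awkward to bypass: the obvious way to fake a successful ptrace is a preloaded library, and that is exactly what the first check looks for.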

But the question still remains. How are these tricks getting loaded? Who calls them? Grabbing the cross references of the anti-debugging routine, we end up in the following section of the binary:

Figure 4: Init segment with pointer to the anti-debugging routines.

What is .init_array (and, by extension, the .init section)? It turns out that, as [2] informs us, you can have code from your binary run during the start-up loading phase of the binary, just before main is called. This is why I couldn’t find it! This is all really cool! We can defeat the anti-debugging mechanism by breaking just before the second test rax, rax instruction and setting rax to 0 (on failure ptrace returns -1, which sets the sign flag that the jns instruction checks)!

–[ Solving with GDB ]

Solving this with GDB turned out to be really easy and simple to script up. The following script will dump out the flag. The script’s first breakpoint deals with bypassing the anti-debug tricks and the second breakpoint deals with setting the registers to the appropriate values so that execution carries on as it would normally:

break *0x4007DF
break *0x400787

commands 1
    set $rax=0
    continue
end

commands 2
    set $eax=$edx-1
    printf "%c",$edx-1
    continue
end

run

And then running it:

% gdb -x solver.gdb -q ./r100
Reading symbols from ./r100...(no debugging symbols found)...done.
Breakpoint 1 at 0x4007df
Breakpoint 2 at 0x400787

Breakpoint 1, 0x00000000004007df in ?? ()
Enter the password: wrfsfser

[Inferior 1 (process 3571) exited normally]

The flag is Code_Talkers! Woop woop 🙂 But there’s a more exciting trick in store! And certainly more general purpose, involving side-channels!

–[ Solving with PIN ]

The main advantage of this method is that we can just let it run and we don’t need to be bothered with the anti-debug checks! The main disadvantage is that, well, PIN. Fortunately I didn’t have to write a full pintool, but just modified inscount2.cpp to look like this:

VOID Fini(INT32 code, VOID *v)
{
    printf("%lu\n", icount);
}

This way we can use the tool’s output to determine the quality of our guess with the following Python script. At a high level, the script tries to infer the correct character at each position based on the number of instructions executed. Since the password is checked character by character, if the character at position n is correct then the binary moves on to check the character at position n+1 and hence executes more instructions. Similar approaches have been followed in [5] and [6].
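Before the real script, a toy model of the idea may help. Nothing below touches PIN or the binary: fake_icount is a made-up instruction counter with invented constants, and SECRET is just the flag from this post used as a stand-in target.

```python
SECRET = "Code_Talkers"  # stand-in target; the real oracle is the PIN count
ALPHABET = [chr(c) for c in range(0x20, 0x7F)]


def fake_icount(guess):
    """Toy instruction counter: a fixed start-up cost plus a cost for every
    loop iteration reached. The constants are invented for illustration."""
    count = 100000
    # A short guess is padded with NULs, like reading past the input's end.
    padded = guess.ljust(len(SECRET), "\0")
    for want, got in zip(SECRET, padded):
        count += 40      # one pass through the loop body
        if want != got:
            break        # the checker gives up at the first wrong character
    return count


def recover(length):
    """Greedy recovery: keep the character that maximizes the count."""
    known = ""
    for _ in range(length):
        best = max(ALPHABET, key=lambda ch: fake_icount(known + ch))
        known += best
    return known


# In this toy model the last position produces a tie (every candidate runs
# the same number of iterations), so stop one character short.
print(recover(len(SECRET) - 1))
```

In this model a correct character buys exactly one extra loop iteration, which is the whole signal. It also shows a limitation: when there is no further iteration to reach, the count alone cannot separate candidates for the final position.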

The script loops over 12 characters, as that’s the length of our flag (given away by the comparison at location 0x40079B):


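import operator
import subprocess
from subprocess import PIPE

# Placeholders: point these at your own PIN install and compiled pintool.
PINDIR = "/path/to/pin/"
TOOLDIR = "/path/to/pintool/dir/"
PINTOOL = "inscount2.so"  # assumed name of the built tool

string_so_far = ""  # e.g. "Code_Talker" to resume a partial run
flag_len = 0xC


def invoke_pin(string):
    # Run the binary under PIN, feed the guess on stdin, return its stdout.
    p1 = subprocess.Popen([PINDIR + 'pin', '-t', TOOLDIR + PINTOOL, '--', './r100'],
                          stdout=PIPE, stdin=PIPE)
    return p1.communicate(input=string.encode())[0].decode()


def run_through_letters(current_str):
    counts = []
    for i in range(0x20, 0x7E):
        output = invoke_pin(current_str + chr(i))
        # The second line of output is the count printed by Fini.
        counts.append((int(output.split("\n")[1]), chr(i)))
    return counts


for i in range(len(string_so_far), flag_len):
    res = run_through_letters(string_so_far)
    # Keep the candidate whose run executed the most instructions.
    max_index, max_value = max(enumerate(res), key=operator.itemgetter(1))
    string_so_far += max_value[1]
    print(string_so_far)

print(string_so_far)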

Running this produces the following output:


For some reason the } at the end doesn’t look very convincing! Turns out that, after some manual analysis, the correct flag is Code_Talkers 🙂 Indeed:

% ./r100 
Enter the password: Code_Talkers

–[ The End ]

This was a fun challenge. Easy, but I really enjoyed the anti-debugging tricks! I was not aware of the internals and sequence involved in running an ELF file, so that was really handy to learn! Thanks to the creator 🙂

–[ Interesting Reading ]

[1] http://bottomupcs.sourceforge.net/csbu/x3564.htm

[2] http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html

[3] http://bottomupcs.sourceforge.net/csbu/x3300.htm

[4] http://refspecs.linuxbase.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/baselib---libc-start-main-.html

[5] http://gaasedelen.blogspot.co.uk/2014/09/solving-fireeyes-flare-on-six-via-side.html

[6] http://blog.trailofbits.com/2015/09/09/flare-on-reversing-challenges-2015/

[7] http://haxelion.eu/article/LD_NOT_PRELOADED_FOR_REAL/

[8] https://www.aldeid.com/wiki/Ptrace-anti-debugging
