hack.lu CTF 2019 — Shiny Code

We got a video leaked of a new transmission technique working only with lights and colors. The company using it says it is completely secure because of its secret encoding. However, the whistleblower also says that the secret encoding is somehow encoded in the message itself. It uses an old, historical technique based on short and long signals. I have no fucking clue what he is talking about. Perhaps you can help us?

TL;DR

Each of the six segments of the logo is a separate channel on which Morse code is transmitted by the LEDs comprising the segment blinking on and off. Decoding each of the channels reveals the following instructions:

• THE COLOR CODING IS AS FOLLOWS
• YELLOW IS NEXT CHARACTER
• RED IS 01 AND GREEN IS 10
• CYAN IS END OF STREAM
• BLUE IS 11 AND VIOLET IS 00
• GOOD LUCK

Using these instructions, we translate the sequence of colors into a stream of 0s and 1s, which, when decoded as ASCII, reveals the flag:

FLAG{sH1nY_LiGht5}
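The color-to-bits translation above can be sketched as follows. This is a minimal illustration, not our actual solve script: the color names and the short example sequence are made up, and it assumes yellow acts as a flush-the-current-character separator, with any trailing bits flushed at the cyan end-of-stream marker.

```python
# Bit-pairs per the instructions recovered from the Morse channels.
BITS = {'red': '01', 'green': '10', 'blue': '11', 'violet': '00'}

def decode_colors(colors):
    bits = []   # bit-pairs of the character currently being assembled
    chars = []
    for color in colors:
        if color == 'cyan':          # end of stream
            break
        elif color == 'yellow':      # next character: flush accumulated bits
            chars.append(chr(int(''.join(bits), 2)))
            bits = []
        else:
            bits.append(BITS[color])
    if bits:                         # last character before the cyan marker
        chars.append(chr(int(''.join(bits), 2)))
    return ''.join(chars)

# 'F' = 0b01000110 -> 01 00 01 10; 'L' = 0b01001100 -> 01 00 11 00
print(decode_colors(['red', 'violet', 'red', 'green', 'yellow',
                     'red', 'violet', 'blue', 'violet', 'cyan']))  # FL
```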

It really is easy — in hindsight…

How we got there

A bunch of us spent a lot of time looking at this problem on and off. The transmission consists of a ~10 minute movie showing a custom-built board covered in LEDs forming the FluxFingers logo. The LEDs are divided into six groups, or segments, each of which blinks on and off independently, in various colors (all segments have the same color at any point in time).

The problem description mentions that the transmission technique works “only with lights and colors”, and additionally hints at Morse code being used for the encoding (“an old, historical technique based on short and long signals”). However, it is not clear how the multi-dimensional signal being observed (six groups of LEDs, plus color, plus time) maps onto the two-dimensional space (on/off, time) of Morse. The problem description also mentions that the encoding is itself somehow encoded in the transmission. And finally, it also notes that the flag has a non-standard format, using upper-case FLAG{.*} rather than the usual lower-case format, which would seem to hint at a limited character-set.

The following salient characteristics of the signal were observed in the initial analysis:

• The lighting configuration changes every half-second (we called each such time-slot a “pulse”);
• The color of all the LEDs changes every three seconds (every six pulses).

Initially we separated the movie into still frames, one for each pulse:

ffmpeg -i public/shiny_code.mp4 -vf fps=2 frames/pulse%04d.png

We observed 1248 pulses. As mentioned, the color changes every six pulses, which means that we had 208 color slots.

We couldn’t decide whether each pulse represented a character (if so, how does the color, which lasts much longer than a single pulse, fit in?), or whether each color slot represented a character (if so, there’s an awful lot of information — six pulses of six bits each, plus the color — used to encode each character). Also, 208 (and certainly 1248) seemed like a lot of characters for encoding a flag. One thought was that maybe the entire alphabet was encoded in the transmission before the flag itself (recall the hint that the encoding itself was encoded in the transmission).

Some ideas which were floated:

• Maybe it really is the pulses which encode the character, but only one pulse in each color-group is used, with the color acting as a selector for which pulse from within the group to use? (That would make use of both the pulses and the color, while leaving us with the more-manageable number of 208 characters rather than 1248.)
• The six-segment configuration reminded us of Braille, so we tried various ways of fitting the data to that…
• We looked at various 6-bit ASCII encodings, base64, and such.

But none of these ideas was prescriptive enough to tell us how to actually go about cracking the message.

So the next phase was to start to apply traditional analysis tools — first and foremost, frequency analysis — to our message. This required transforming the message into a more manageable format — textual rather than visual. This called for some image processing…

We are far from experts in image processing, and had a few false starts in trying to use Python/OpenCV to extract the information from the images. First, it took us a while to settle on which color-space to work in (do the entire analysis in RGB? Extract brightness from grayscale and the color separately from RGB? Use HSV for part or all of the analysis?). Another challenge was that the lit LEDs were so saturated that they often appeared white, with no distinguishable color. Conversely, pixels sampled from outside the LEDs themselves sometimes proved too dark even when the LED was lit (at least, dark enough that a fixed threshold didn’t work very well, and choosing a dynamic threshold was beyond our abilities).

We ultimately settled on handling brightness (on/off) separately from color. The brightness was analyzed using only the R channel of the image (this was a bit of an accident: we actually thought we were in a different color-space, but because the LEDs are so saturated when on, any single channel would probably do…). We focused on a single LED (the entire square surrounding it), averaged its brightness, and were able to determine a threshold which distinguished well between on and off.
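The on/off detection just described can be sketched like this. The region coordinates and threshold are made up for illustration (the real values came from eyeballing the extracted frames), and the frame is represented here as plain nested lists rather than an OpenCV/NumPy array:

```python
def led_is_on(frame, y0, y1, x0, x1, threshold=100):
    '''frame: rows of (r, g, b) pixel tuples; True if the LED region is lit.
    Averages the R channel over the square surrounding one LED and
    compares it to a fixed threshold.'''
    reds = [frame[y][x][0] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(reds) / len(reds) > threshold

# Synthetic demo frame: dark background with one bright "LED" square.
frame = [[(0, 0, 0) for _ in range(100)] for _ in range(100)]
for y in range(10, 20):
    for x in range(10, 20):
        frame[y][x] = (255, 80, 80)

print(led_is_on(frame, 10, 20, 10, 20))  # True
print(led_is_on(frame, 50, 60, 50, 60))  # False
```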

Regarding the color, we actually decided to put it aside for a while (mainly because it seemed a bit more difficult to extract, and because we were itching to start analyzing the information we had already gotten from the brightness).

Initially we treated each LED segment as a single “bit”, and grouped the bits of each pulse arbitrarily into characters. The frequency analysis of the resulting characters initially looked promising: two characters (all-0 and all-1) appeared much more frequently than the others, and among the rest there was a spread of more-frequent and less-frequent characters. However, looking deeper, we were unable to match the histogram to English letter frequencies; the spacing of the characters within the text did not seem to fit (e.g., assuming the most frequent character is some kind of separator, do the lengths of the resulting “words” make sense?); and searching for repetitions or patterns did not turn up what you would expect from English text. By this point we had also noticed that every second color was yellow, so we repeated these analyses separately on the yellow-only and non-yellow-only pulses, but with similar results. All of this seemed to indicate that chunking the bits of each pulse into characters was not leading anywhere…
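This chunking experiment is easy to sketch: treat each pulse’s six segment bits as one 6-bit “character” and histogram them. The `pulses` data below is a made-up stand-in for the real extracted signal:

```python
from collections import Counter

def pulse_histogram(pulses):
    '''pulses: list of 6-tuples of 0/1, one per pulse.
    Packs each pulse into a 6-bit integer and counts occurrences.'''
    chars = [int(''.join(map(str, p)), 2) for p in pulses]
    return Counter(chars)

pulses = [(0, 0, 0, 0, 0, 0), (1, 1, 1, 1, 1, 1),
          (0, 0, 0, 0, 0, 0), (1, 0, 1, 0, 1, 0)]
print(pulse_histogram(pulses).most_common())  # [(0, 2), (63, 1), (42, 1)]
```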

The breakthrough came when one of us noticed that whenever two consecutive pulses each had some “bits” lit, there was always at least one bit which remained lit from the first to the second pulse. This ultimately proved to only be an artifact, but it got us focusing more on the runs of on/off within each channel separately. Once we started focusing on that, we noticed that a given bit was always off for either one, three or seven consecutive pulses. Having already read up on Morse code, the numbers 1, 3 and 7 rang a bell: those are exactly the lengths of the spaces used for separating symbols within a character, characters, or words, respectively. Now we finally realized that each channel had to be looked at independently. From here forward it was mostly clear sailing.

We wrote some code to translate the runs of ones or zeros in each channel into Morse code, and then into English. It turns out that our data was really noisy (due to image processing errors? real errors in the transmission? we’re not sure…), but by writing the code to be a little robust to noise (and thanks to the fact that the message on each stream repeats itself over and over until the end of the stream), we quickly got the message out of each of the channels:

def morsify(channel):
    '''@param channel - a list of 0s and 1s (as integers)
    '''
    morse = []
    current = 0
    count = 0
    for p, c in enumerate(channel):
        if c == current:
            count += 1
        else:
            if current == 0:
                if count == 0:
                    pass  # initial
                elif count == 1:
                    pass  # symbol spacer
                elif count == 3:
                    morse.append('/')  # character spacer
                elif 5 < count < 9:
                    morse.append(' ')  # word spacer
                else:
                    morse.append('/?/')  # raise Exception("0 of length %d at %d" % (count, p))
            else:  # current == 1
                if count == 1:
                    morse.append('.')  # dot
                elif count == 3:
                    morse.append('-')  # dash
                else:
                    morse.append('/?/')  # raise Exception("1 of length %d at %d" % (count, p))
            current = c
            count = 1
    return morse

CODE = {'A': '.-',     'B': '-...',   'C': '-.-.',
        'D': '-..',    'E': '.',      'F': '..-.',
        'G': '--.',    'H': '....',   'I': '..',
        'J': '.---',   'K': '-.-',    'L': '.-..',
        'M': '--',     'N': '-.',     'O': '---',
        'P': '.--.',   'Q': '--.-',   'R': '.-.',
        'S': '...',    'T': '-',      'U': '..-',
        'V': '...-',   'W': '.--',    'X': '-..-',
        'Y': '-.--',   'Z': '--..',

        '0': '-----',  '1': '.----',  '2': '..---',
        '3': '...--',  '4': '....-',  '5': '.....',
        '6': '-....',  '7': '--...',  '8': '---..',
        '9': '----.',  '?': '?',
        }

EDOC = {v:k for k,v in CODE.items()}

def morse_decode(morse):
    output = []
    morse = ''.join(morse)
    words = morse.split()
    for w in words:
        chars = w.split('/')
        for c in chars:
            if c:
                if c in EDOC:
                    output.append(EDOC[c])
                else:
                    output.append(c)
        output.append(' ')
    return ''.join(output)

>>> for c in channels:
...     print(morse_decode(morsify(c)))
...

THE C?TLOR COD?ING IS AS F?TLLOWS TS?EE COLOR ?EODING IS A?S FOLLOE?S THE COE
YELR???MW IS NEXT CHARACTER YELLOW IE?I NEXT CHAE?ACTER YELA?IOW IS NED??? CHARACT
RED?SS 01 AND T?...-.EEN IS 1M?T RED IS 0E?---- AND G.-...E? IS 10 .-...?D IS 01 E?NDEGREF IE
C-.--..-? IS END OF I?ETREAM CYA?E IS END OF STREAM CYE?N IS END OF STREAM CY?N IS END O
BLUE ?S 11 AND VIOLET IS ?O0 BLUE IS 11 AN? VIOLET IS 00 BLUE IS 11 A
GOO?E LUCK GM?OD LUCK ?NOOD LUK?EK GOOD R???ACK GOOT?E LUCK GM?OD LUCN

Meanwhile, another team member had extracted the color information from the images. So we now already had on hand a textual representation of the colors, and were quickly able to translate that into a series of bits. Translating that series of bits into ASCII resulted in the beginning of the flag, though we again ran into trouble at some point along the way due to some image processing errors. These were, however, fixed by manually viewing the video, resulting in the flag:

FLAG{sH1nY_LiGht5}