CVE: CVE-2021-2321

Tested Versions:

  • Oracle VirtualBox 6.1.18 revision r142142

Product URL(s):

Description of the vulnerability

When the e1000 driver sends data to the e1000 device, it does so one descriptor at a time (this write-up calls them frames). There are context frames and data frames; a packet usually consists of one context frame followed by one or more data frames. The guest prepares a transfer by setting the TDH (Transmit Descriptor Head), TDBAL (low 32 bits of the physical base address of the descriptor ring) and TDBAH (high 32 bits of that address) registers. Writing the TDT (Transmit Descriptor Tail) register then makes the device start the transfer, which ends up calling e1kXmitPending.

/**
 * Transmit pending descriptors.
 *
 * @returns VBox status code.  VERR_TRY_AGAIN is returned if we're busy.
 *
 * @param   pDevIns             The device instance.
 * @param   pThis               The E1000 state.
 * @param   fOnWorkerThread     Whether we're on a worker thread or on an EMT.
 */
static int e1kXmitPending(PPDMDEVINS pDevIns, PE1KSTATE pThis, bool fOnWorkerThread)
{
    ...
        while (!pThis->fLocked && e1kTxDLazyLoad(pDevIns, pThis))
        {
            while (e1kLocateTxPacket(pThis))
            {
                fIncomplete = false;
                /* Found a complete packet, allocate it. */
                rc = e1kXmitAllocBuf(pThis, pThisCC, pThis->fGSO);
                /* If we're out of bandwidth we'll come back later. */
                if (RT_FAILURE(rc))
                    goto out;
                /* Copy the packet to allocated buffer and send it. */
                rc = e1kXmitPacket(pDevIns, pThis, fOnWorkerThread);
                /* If we're out of bandwidth we'll come back later. */
                if (RT_FAILURE(rc))
                    goto out;
            }
            ...
        }
    ...
}

e1kTxDLazyLoad loads the frames from guest physical memory, e1kLocateTxPacket gathers the packet and frame information, and e1kXmitDesc processes each frame.
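For orientation, the registers mentioned earlier (TDBAL, TDBAH, TDH, TDT) describe a ring of 16-byte descriptors in guest memory. The sketch below is only a rough model of that relationship; the struct and the helpers tx_desc_addr and tx_pending are hypothetical names made up for this illustration and are not VirtualBox code.

```c
#include <assert.h>
#include <stdint.h>

#define TX_DESC_SIZE 16u   /* size of one e1000 TX descriptor in bytes */

/* Hypothetical snapshot of the guest-programmed TX ring registers. */
struct tx_ring_regs {
    uint32_t tdbal;  /* low 32 bits of the ring's guest-physical base  */
    uint32_t tdbah;  /* high 32 bits of the ring's guest-physical base */
    uint32_t tdlen;  /* ring size in bytes                             */
    uint32_t tdh;    /* head: next descriptor the device will process  */
    uint32_t tdt;    /* tail: one past the last descriptor queued      */
};

/* Guest-physical address of descriptor number idx. */
static uint64_t tx_desc_addr(const struct tx_ring_regs *r, uint32_t idx)
{
    uint64_t base = ((uint64_t)r->tdbah << 32) | r->tdbal;
    return base + (uint64_t)idx * TX_DESC_SIZE;
}

/* Number of descriptors waiting between head and tail, with wraparound. */
static uint32_t tx_pending(const struct tx_ring_regs *r)
{
    uint32_t count = r->tdlen / TX_DESC_SIZE;
    assert(count != 0 && r->tdh < count && r->tdt < count);
    return (r->tdt + count - r->tdh) % count;
}
```

In this model, writing a new tail value is what makes tx_pending non-zero, which corresponds to the point where the device starts walking the ring.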

The first logic bug is in e1kFallbackAddSegment, which is reached through e1kXmitPacket->e1kXmitDesc->e1kFallbackAddToFrame->e1kFallbackAddSegment.

static int e1kFallbackAddSegment(PPDMDEVINS pDevIns, PE1KSTATE pThis, RTGCPHYS PhysAddr, uint16_t u16Len, bool fSend, bool fOnWorkerThread)
{
    int rc = VINF_SUCCESS;
    PE1KSTATECC pThisCC = PDMDEVINS_2_DATA_CC(pDevIns, PE1KSTATECC);
    /* TCP header being transmitted */
    struct E1kTcpHeader *pTcpHdr = (struct E1kTcpHeader *)(pThis->aTxPacketFallback + pThis->contextTSE.tu.u8CSS);
    /* IP header being transmitted */
    struct E1kIpHeader *pIpHdr = (struct E1kIpHeader *)(pThis->aTxPacketFallback + pThis->contextTSE.ip.u8CSS);

    E1kLog3(("%s e1kFallbackAddSegment: Length=%x, remaining payload=%x, header=%x, send=%RTbool\n",
             pThis->szPrf, u16Len, pThis->u32PayRemain, pThis->u16HdrRemain, fSend));
    AssertReturn(pThis->u32PayRemain + pThis->u16HdrRemain > 0, VINF_SUCCESS);

    if (pThis->u16TxPktLen + u16Len <= sizeof(pThis->aTxPacketFallback))
        PDMDevHlpPCIPhysRead(pDevIns, PhysAddr, pThis->aTxPacketFallback + pThis->u16TxPktLen, u16Len);
    else
        E1kLog(("%s e1kFallbackAddSegment: writing beyond aTxPacketFallback, u16TxPktLen=%d(0x%x) + u16Len=%d(0x%x) > %d\n",
                pThis->szPrf, pThis->u16TxPktLen, pThis->u16TxPktLen, u16Len, u16Len, sizeof(pThis->aTxPacketFallback)));
    E1kLog3(("%s Dump of the segment:\n"
             "%.*Rhxd\n"
             "%s --- End of dump ---\n",
             pThis->szPrf, u16Len, pThis->aTxPacketFallback + pThis->u16TxPktLen, pThis->szPrf));
    pThis->u16TxPktLen += u16Len; // [1]
    ...
}

There is a bounds check to prevent a buffer overflow, but even when we pass a large value and take the else branch, u16TxPktLen is still incremented by our controlled value (u16Len) at [1].
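The pattern can be reduced to a few lines. The following is a simplified model, not the VirtualBox code: struct tx_state, FALLBACK_SIZE and both functions are hypothetical, and the "checked" variant is just one possible hardening for comparison, not the vendor's patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FALLBACK_SIZE 64u  /* illustrative; the real buffer is much larger */

/* Hypothetical stand-in for the relevant E1KSTATE fields. */
struct tx_state {
    uint8_t  fallback[FALLBACK_SIZE];
    uint16_t pkt_len;
};

/* Mirrors the flawed pattern: the copy is guarded, the length update is not. */
static void add_segment_flawed(struct tx_state *s, const uint8_t *src, uint16_t len)
{
    if ((uint32_t)s->pkt_len + len <= sizeof(s->fallback))
        memcpy(s->fallback + s->pkt_len, src, len);
    /* else: the copy is skipped (the original only logs here) */
    s->pkt_len = (uint16_t)(s->pkt_len + len);   /* runs on both paths, like [1] */
}

/* One possible hardening: reject the segment and leave the length untouched. */
static int add_segment_checked(struct tx_state *s, const uint8_t *src, uint16_t len)
{
    if ((uint32_t)s->pkt_len + len > sizeof(s->fallback))
        return -1;
    memcpy(s->fallback + s->pkt_len, src, len);
    s->pkt_len = (uint16_t)(s->pkt_len + len);
    return 0;
}
```

After one oversized call to add_segment_flawed, pkt_len no longer describes data that is actually inside the buffer, which is the state the rest of this write-up relies on.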

At first glance, however, there seems to be no way to pass a u16Len larger than sizeof(pThis->aTxPacketFallback), as the caller of e1kFallbackAddSegment below shows.

static int e1kFallbackAddToFrame(PPDMDEVINS pDevIns, PE1KSTATE pThis, E1KTXDESC *pDesc, bool fOnWorkerThread)
{
#ifdef VBOX_STRICT
    PPDMSCATTERGATHER pTxSg = PDMDEVINS_2_DATA_CC(pDevIns, PE1KSTATECC)->CTX_SUFF(pTxSg);
    Assert(e1kGetDescType(pDesc) == E1K_DTYP_DATA);
    Assert(pDesc->data.cmd.fTSE);
    Assert(!e1kXmitIsGsoBuf(pTxSg));
#endif

    uint16_t u16MaxPktLen = pThis->contextTSE.dw3.u8HDRLEN + pThis->contextTSE.dw3.u16MSS;
    /* We cannot produce empty packets, ignore all TX descriptors (see @bugref{9571}) */
    if (u16MaxPktLen == 0)
        return VINF_SUCCESS;

    /*
     * Carve out segments.
     */
    int rc = VINF_SUCCESS;
    do
    {
        /* Calculate how many bytes we have left in this TCP segment */
        uint16_t cb = u16MaxPktLen - pThis->u16TxPktLen; // [2]
        if (cb > pDesc->data.cmd.u20DTALEN)
        {
            /* This descriptor fits completely into current segment */
            cb = (uint16_t)pDesc->data.cmd.u20DTALEN; /* u20DTALEN at this point is guarantied to fit into 16 bits. */
            rc = e1kFallbackAddSegment(pDevIns, pThis, pDesc->data.u64BufAddr, cb, pDesc->data.cmd.fEOP /*fSend*/, fOnWorkerThread);
        }
        else
        {
            rc = e1kFallbackAddSegment(pDevIns, pThis, pDesc->data.u64BufAddr, cb, true /*fSend*/, fOnWorkerThread);
            /*
             * Rewind the packet tail pointer to the beginning of payload,
             * so we continue writing right beyond the header.
             */
            pThis->u16TxPktLen = pThis->contextTSE.dw3.u8HDRLEN;
        }
        pDesc->data.u64BufAddr    += cb;
        pDesc->data.cmd.u20DTALEN -= cb;
    } while (pDesc->data.cmd.u20DTALEN > 0 && RT_SUCCESS(rc));
    ...
}

There is a check that makes sure the cb passed to e1kFallbackAddSegment is not larger than u16MaxPktLen. pDesc->data.cmd.u20DTALEN is user controlled, but pThis->contextTSE.dw3.u8HDRLEN + pThis->contextTSE.dw3.u16MSS is never larger than 0x3fa0 (this is properly checked in the e1kUpdateTxContext function).

It turns out we can cause an integer underflow at [2]: it happens if u16TxPktLen is larger than u16MaxPktLen. On the first call, however, u16TxPktLen always seems to be zero, because every sequence of frames must end with a data frame that has the End Of Packet flag (the fEOP field) set, and the e1000 code always does the following when it sees that flag.

    if (pDesc->data.cmd.fEOP)
    {
       ...
       pThis->u16TxPktLen = 0;
    }

As a result, u16TxPktLen is normally zero every time e1kXmitPacket is called again. It turns out this can be bypassed by setting both fEOP and fDD on the last data frame; a set fDD flag makes the device skip processing of the current frame. This is what happens when e1kXmitDesc is called on the last frame with both fEOP and fDD set.

static int e1kXmitDesc(PPDMDEVINS pDevIns, PE1KSTATE pThis, PE1KSTATECC pThisCC, E1KTXDESC *pDesc,
                       RTGCPHYS addr, bool fOnWorkerThread)
{
    int rc = VINF_SUCCESS;

    e1kPrintTDesc(pThis, pDesc, "vvv");

    if (pDesc->legacy.dw3.fDD)
    {
        E1kLog(("%s e1kXmitDesc: skipping bad descriptor ^^^\n", pThis->szPrf));
        e1kDescReport(pDevIns, pThis, pDesc, addr);
        return VINF_SUCCESS;
    }
    ...
}

If we set fDD, execution returns immediately without running the common code that checks fEOP and clears u16TxPktLen.

static int e1kXmitPacket(PPDMDEVINS pDevIns, PE1KSTATE pThis, bool fOnWorkerThread)
{
    ...
    while (pThis->iTxDCurrent < pThis->nTxDFetched)
    {
        E1KTXDESC *pDesc = &pThis->aTxDescriptors[pThis->iTxDCurrent];
        E1kLog3(("%s About to process new TX descriptor at %08x%08x, TDLEN=%08x, TDH=%08x, TDT=%08x\n",
                 pThis->szPrf, TDBAH, TDBAL + TDH * sizeof(E1KTXDESC), TDLEN, TDH, TDT));
        rc = e1kXmitDesc(pDevIns, pThis, pThisCC, pDesc, e1kDescAddr(TDBAH, TDBAL, TDH), fOnWorkerThread);
        if (RT_FAILURE(rc))
            break;
        if (++TDH * sizeof(E1KTXDESC) >= TDLEN)
            TDH = 0;
        ...
        if (e1kGetDescType(pDesc) != E1K_DTYP_CONTEXT && pDesc->legacy.cmd.fEOP) //[3]
            break;
    }

    LogFlow(("%s e1kXmitPacket: RET %Rrc current=%d fetched=%d\n",
             pThis->szPrf, rc, pThis->iTxDCurrent, pThis->nTxDFetched));
    return rc;
}

After e1kXmitDesc returns for our last frame, which has fDD set, execution reaches [3], which checks whether fEOP is set and, if so, leaves the loop and returns.

            while (e1kLocateTxPacket(pThis))
            {
                fIncomplete = false;
                /* Found a complete packet, allocate it. */
                rc = e1kXmitAllocBuf(pThis, pThisCC, pThis->fGSO);
                /* If we're out of bandwidth we'll come back later. */
                if (RT_FAILURE(rc))
                    goto out;
                /* Copy the packet to allocated buffer and send it. */
                rc = e1kXmitPacket(pDevIns, pThis, fOnWorkerThread);
                /* If we're out of bandwidth we'll come back later. */
                if (RT_FAILURE(rc))
                    goto out;
            }

e1kXmitPacket then returns, and the loop goes on to process the next context frame (followed by its data frames). That later call to e1kXmitPacket now runs with a non-zero u16TxPktLen.
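The effect of the fDD early return can be modelled in isolation. This is a heavily simplified sketch under our own assumptions: struct desc, xmit_desc and xmit_packet are hypothetical stand-ins that keep only the length bookkeeping and the two flags, and ignore everything else the real functions do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified data descriptor. */
struct desc {
    uint16_t len;   /* bytes this descriptor contributes */
    bool     eop;   /* End Of Packet */
    bool     dd;    /* Descriptor Done, normally set by the device */
};

/* Simplified e1kXmitDesc: returns early on DD, otherwise accumulates
 * the length and resets it when the packet ends. */
static void xmit_desc(uint16_t *tx_pkt_len, const struct desc *d)
{
    if (d->dd)
        return;                 /* skipped: the EOP reset below never runs */
    *tx_pkt_len = (uint16_t)(*tx_pkt_len + d->len);
    if (d->eop)
        *tx_pkt_len = 0;        /* packet finished, start from scratch */
}

/* Simplified e1kXmitPacket: walk descriptors until one carries EOP,
 * whether or not that descriptor was actually processed (like [3]).
 * Returns the number of descriptors consumed. */
static size_t xmit_packet(uint16_t *tx_pkt_len, const struct desc *ring, size_t n)
{
    size_t i = 0;
    while (i < n) {
        const struct desc *d = &ring[i++];
        xmit_desc(tx_pkt_len, d);
        if (d->eop)
            break;
    }
    return i;
}
```

With a well-formed packet the length is back at zero when xmit_packet returns. If the final descriptor carries both flags, the packet still ends at that descriptor, but the length contributed by the earlier descriptor is left behind for the next packet.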

So our plan is:

First, we send a packet containing a context frame followed by two data frames; the last data frame has fDD and fEOP set. This packet makes u16TxPktLen non-zero when the device processes the next packet (our second packet).

Second, we send a packet containing a context frame followed by one data frame. This packet triggers the integer underflow (by making u16MaxPktLen smaller than u16TxPktLen in e1kFallbackAddToFrame), which lets us pass a large value to e1kFallbackAddSegment.
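The arithmetic behind the second step is plain unsigned 16-bit wraparound and can be checked on its own. The helper names below are hypothetical, the numbers are arbitrary examples, and the guarded variant is only an illustration of how the subtraction could be protected, not the actual fix.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as [2]: the difference is truncated to 16 bits. */
static uint16_t remaining_in_segment(uint16_t max_pkt_len, uint16_t tx_pkt_len)
{
    return (uint16_t)(max_pkt_len - tx_pkt_len);
}

/* Hypothetical guarded variant: report "no room" instead of wrapping. */
static uint16_t remaining_in_segment_checked(uint16_t max_pkt_len, uint16_t tx_pkt_len)
{
    return tx_pkt_len >= max_pkt_len ? 0 : (uint16_t)(max_pkt_len - tx_pkt_len);
}
```

For example, with a maximum packet length of 0x40 and a stale u16TxPktLen of 0x100, cb wraps to 0xFF40. The following comparison against u20DTALEN then no longer limits anything useful, so the length taken from the descriptor goes through to e1kFallbackAddSegment.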

static int e1kFallbackAddSegment(PPDMDEVINS pDevIns, PE1KSTATE pThis, RTGCPHYS PhysAddr, uint16_t u16Len, bool fSend, bool fOnWorkerThread)
{
    ...
    if (pThis->u16TxPktLen + u16Len <= sizeof(pThis->aTxPacketFallback))
        PDMDevHlpPCIPhysRead(pDevIns, PhysAddr, pThis->aTxPacketFallback + pThis->u16TxPktLen, u16Len);
    else
        E1kLog(("%s e1kFallbackAddSegment: writing beyond aTxPacketFallback, u16TxPktLen=%d(0x%x) + u16Len=%d(0x%x) > %d\n",
                pThis->szPrf, pThis->u16TxPktLen, pThis->u16TxPktLen, u16Len, u16Len, sizeof(pThis->aTxPacketFallback)));
    E1kLog3(("%s Dump of the segment:\n"
             "%.*Rhxd\n"
             "%s --- End of dump ---\n",
             pThis->szPrf, u16Len, pThis->aTxPacketFallback + pThis->u16TxPktLen, pThis->szPrf));
    pThis->u16TxPktLen += u16Len; // [1]
    ...
    if (fSend)
    {
        /* Leave ethernet header intact */
        /* IP Total Length = payload + headers - ethernet header */
        pIpHdr->total_len = htons(pThis->u16TxPktLen - pThis->contextTSE.ip.u8CSS);
        E1kLog3(("%s e1kFallbackAddSegment: End of packet, pIpHdr->total_len=%x\n",
                 pThis->szPrf, ntohs(pIpHdr->total_len)));
        /* Update IP Checksum */
        pIpHdr->chksum = 0;
        e1kInsertChecksum(pThis, pThis->aTxPacketFallback, pThis->u16TxPktLen, //[4]
                          pThis->contextTSE.ip.u8CSO,
                          pThis->contextTSE.ip.u8CSS,
                          pThis->contextTSE.ip.u16CSE);

        /* Update TCP flags */
        /* Restore original FIN and PSH flags for the last segment */
        if (pThis->u32PayRemain == 0)
        {
            pTcpHdr->hdrlen_flags = pThis->u16SavedFlags;
            E1K_INC_CNT32(TSCTC);
        }
        /* Add TCP length to partial pseudo header sum */
        uint32_t csum = pThis->u32SavedCsum
                      + htons(pThis->u16TxPktLen - pThis->contextTSE.tu.u8CSS);
        while (csum >> 16)
            csum = (csum >> 16) + (csum & 0xFFFF);
        Assert(csum < 65536);
        pTcpHdr->chksum = (uint16_t)csum;
        /* Compute final checksum */
        e1kInsertChecksum(pThis, pThis->aTxPacketFallback, pThis->u16TxPktLen, // [5]
                          pThis->contextTSE.tu.u8CSO,
                          pThis->contextTSE.tu.u8CSS,
                          pThis->contextTSE.tu.u16CSE);
        ...
    }
    ...
}

Now we can make u16TxPktLen larger than the size of aTxPacketFallback, and at [4] and [5] we can use it for an out-of-bounds access in e1kInsertChecksum by giving a cse larger than the size of aTxPacketFallback.

static void e1kInsertChecksum(PE1KSTATE pThis, uint8_t *pPkt, uint16_t u16PktLen, uint8_t cso, uint8_t css, uint16_t cse, bool fUdp = false)
{
    RT_NOREF1(pThis);

    ...
    if (cse == 0 || cse >= u16PktLen) // [6]
        cse = u16PktLen - 1;
    ...
    uint16_t u16ChkSum = e1kCSum16(pPkt + css, cse - css + 1);
    if (fUdp && u16ChkSum == 0)
        u16ChkSum = ~u16ChkSum;     /* 0 means no checksum computed in case of UDP (see @bugref{9883}) */
    E1kLog2(("%s Inserting csum: %04X at %02X, old value: %04X\n", pThis->szPrf,
             u16ChkSum, cso, *(uint16_t*)(pPkt + cso)));
    *(uint16_t*)(pPkt + cso) = u16ChkSum;
    ...
}

By repeatedly increasing cse by 2 bytes, we can leak the memory behind the packet buffer.
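One way to see why stepping cse by two bytes discloses memory is to compare two consecutive checksums. The sketch below rests on two assumptions that are not shown in the excerpts above: that e1kCSum16 behaves like an ordinary ones' complement Internet checksum, and that the guest can observe the checksum value that gets inserted. All function names are hypothetical, and the model ignores the other writes the real code makes to the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum >> 16) + (sum & 0xFFFFu);
    return (uint16_t)sum;
}

/* Internet-style checksum over cb bytes, assumed to behave like e1kCSum16. */
static uint16_t csum16(const uint8_t *p, size_t cb)
{
    uint32_t sum = 0;
    for (; cb > 1; p += 2, cb -= 2) {
        uint16_t w;
        memcpy(&w, p, sizeof(w));   /* host-order load without alignment issues */
        sum += w;
    }
    if (cb)
        sum += *p;
    return (uint16_t)~fold16(sum);
}

/* Checksum over [css, cse] with the same clamp as [6]: cse is only
 * limited by pkt_len, not by the real size of the buffer. */
static uint16_t checksum_range(const uint8_t *pkt, uint16_t pkt_len,
                               uint8_t css, uint16_t cse)
{
    assert(pkt_len > css);
    if (cse == 0 || cse >= pkt_len)
        cse = (uint16_t)(pkt_len - 1);
    assert(cse >= css);
    return csum16(pkt + css, (size_t)cse - css + 1);
}

/* Recover the 16-bit word that was added between two checksums.
 * Ones' complement has two zeros, so 0x0000 and 0xFFFF both yield 0xFFFF. */
static uint16_t recover_word(uint16_t csum_short, uint16_t csum_long)
{
    uint16_t sum_short = (uint16_t)~csum_short;
    uint16_t sum_long  = (uint16_t)~csum_long;
    return fold16((uint32_t)sum_long + (uint16_t)~sum_short);
}
```

If pkt_len honestly reflects the buffer size, the clamp pins cse to the last valid byte and both checksums are identical. Once pkt_len has been inflated, each two-byte step of cse folds one more word from beyond the buffer into the checksum, and subtracting the previous result recovers that word (up to the 0x0000/0xFFFF ambiguity noted in the comment).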

Timeline

  • 2021-02-27 Reported to Vendor
  • 2021-04-19 Vendor assigned CVE