Mach-O (Mach Object) is a binary file format used by macOS and iOS for executables, libraries, and object code. A Mach-O file consists of a header, followed by a series of load commands, and then a series of segments.
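The header-then-load-commands layout described above can be walked with a short loop. The following is only a minimal sketch: the `sketch_` structs are hypothetical stand-ins whose field layout mirrors `mach_header_64` and `load_command` from mach-o/loader.h, declared locally so the code builds without Apple headers. It assumes the buffer holds a thin 64-bit Mach-O image in the host's byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins mirroring mach_header_64 and load_command
 * from mach-o/loader.h, declared here so the sketch builds anywhere. */
struct sketch_header_64 {
    uint32_t magic, cputype, cpusubtype, filetype;
    uint32_t ncmds;      /* number of load commands */
    uint32_t sizeofcmds; /* total size of all load commands */
    uint32_t flags, reserved;
};

struct sketch_load_command {
    uint32_t cmd;     /* type of load command */
    uint32_t cmdsize; /* total size of this command in bytes */
};

#define SKETCH_MH_MAGIC_64 0xfeedfacfu /* same value as MH_MAGIC_64 */

/* Count the load commands of type `wanted` in an in-memory image.
 * Returns -1 if the buffer is not a well-formed 64-bit Mach-O header
 * or a load command runs past the end of the buffer. */
static int sketch_count_commands(const uint8_t *buf, size_t len, uint32_t wanted)
{
    struct sketch_header_64 hdr;
    if (len < sizeof hdr)
        return -1;
    memcpy(&hdr, buf, sizeof hdr);
    if (hdr.magic != SKETCH_MH_MAGIC_64)
        return -1;

    size_t pos = sizeof hdr; /* load commands start right after the header */
    int found = 0;
    for (uint32_t i = 0; i < hdr.ncmds; i++) {
        struct sketch_load_command lc;
        if (len - pos < sizeof lc)
            return -1;
        memcpy(&lc, buf + pos, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > len - pos)
            return -1;
        if (lc.cmd == wanted)
            found++;
        pos += lc.cmdsize; /* cmdsize covers the whole command */
    }
    return found;
}
```

Passing 0x2 (the value of LC_SYMTAB) or 0x19 (the value of LC_SEGMENT_64) as `wanted` counts the corresponding commands. A real parser would also handle byte-swapped and 32-bit images, which this sketch deliberately leaves out.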
The 4GB Limitations
The original 32-bit format used 32-bit offsets, which can address a maximum of 2^32 bytes, or 4GB of memory. The 64-bit variant uses 64-bit addresses and sizes, vastly increasing the addressable memory space. However, somewhat surprisingly, the on-disk size of a Mach-O file is still effectively limited to 4GB, because several of its structures continue to store file offsets as 32-bit values.
Section Commands
Directly following the segment load command is an array of section data structures. Notably, the 64-bit variant of this structure still uses 32 bits to encode its file offset.
From mach-o/loader.h:
struct section_64 { /* for 64-bit architectures */
    char sectname[16];  /* name of this section */
    char segname[16];   /* segment this section goes in */
    uint64_t addr;      /* memory address of this section */
    uint64_t size;      /* size in bytes of this section */
    uint32_t offset;    /* file offset of this section */
    uint32_t align;     /* section alignment (power of 2) */
    uint32_t reloff;    /* file offset of relocation entries */
    uint32_t nreloc;    /* number of relocation entries */
    uint32_t flags;     /* flags (section type and attributes)*/
    uint32_t reserved1; /* reserved (for offset or index) */
    uint32_t reserved2; /* reserved (for count or sizeof) */
    uint32_t reserved3; /* reserved */
};
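To make the consequence of the 32-bit `offset` field concrete: a tool that lays out section data has to check that each planned file offset is representable before storing it. A minimal sketch of that check follows. `sketch_store_section_offset` is a hypothetical helper for illustration and is not part of any Apple API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: try to store a 64-bit planned file offset into a
 * 32-bit field such as section_64.offset. Returns false, leaving *field
 * untouched, when the value would be truncated, i.e. when the section
 * data would start at or beyond the 4GB mark. */
static bool sketch_store_section_offset(uint64_t planned, uint32_t *field)
{
    if (planned > UINT32_MAX)
        return false;
    *field = (uint32_t)planned;
    return true;
}
```

Because `addr` and `size` are 64-bit, the structure can describe a very large section in memory. It is only the starting position of the data in the file that has to stay below 4GB.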
The Symbol Table
The symbol table load command (LC_SYMTAB) contains the offsets and sizes of
the link-edit stab-style symbol table. Both the symbol table and string table
offsets are encoded as 32-bit values.
From mach-o/loader.h:
struct symtab_command {
    uint32_t cmd;     /* LC_SYMTAB */
    uint32_t cmdsize; /* sizeof(struct symtab_command) */
    uint32_t symoff;  /* symbol table offset */
    uint32_t nsyms;   /* number of symbol table entries */
    uint32_t stroff;  /* string table offset */
    uint32_t strsize; /* string table size in bytes */
};
Fat Binaries
Multiple Mach-O files can be combined into a multi-architecture binary,
commonly referred to as a “universal binary” or “fat binary.” A fat binary
consists of a fat_header structure followed by multiple fat_arch
structures.
Notably, these structures still use 32-bit file offsets.
From mach-o/fat.h:
struct fat_arch {
    cpu_type_t cputype;       /* cpu specifier (int) */
    cpu_subtype_t cpusubtype; /* machine specifier (int) */
    uint32_t offset;          /* file offset to this object file */
    uint32_t size;            /* size of this object file */
    uint32_t align;           /* alignment as a power of 2 */
};
The individual architecture-specific components of a fat binary are called slices. Because fat_arch stores both offset and size as 32-bit values, a universal Mach-O file cannot contain a slice that starts beyond the 4GB boundary. For example, if you have three binaries, each 3GB in size, you can create a fat binary with two slices: one starting near offset 0 and the second near 3GB. A third slice would have to start at approximately 6GB, which does not fit in the 32-bit offset field and would overflow.
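The three-slice example can be checked with a little arithmetic. The sketch below lays slices out back to back and reports the first one a 32-bit fat_arch cannot describe. The function names are hypothetical. The 2^14 alignment and 68-byte header area used in the note afterwards are assumptions chosen for illustration, and "3GB" is taken to mean 3 * 2^30 bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Round `value` up to the next multiple of 2^align_log2. */
static uint64_t sketch_align_up(uint64_t value, uint32_t align_log2)
{
    uint64_t mask = ((uint64_t)1 << align_log2) - 1;
    return (value + mask) & ~mask;
}

/* Lay out `count` slices back to back after `header_bytes` of fat header
 * data, writing each slice's 64-bit start offset into `offsets`.
 * Returns the index of the first slice that a 32-bit fat_arch cannot
 * describe, or `count` if every slice fits. */
static size_t sketch_first_unrepresentable(const uint64_t *sizes, size_t count,
                                           uint64_t header_bytes,
                                           uint32_t align_log2,
                                           uint64_t *offsets)
{
    size_t first_bad = count;
    uint64_t cursor = header_bytes;
    for (size_t i = 0; i < count; i++) {
        cursor = sketch_align_up(cursor, align_log2);
        offsets[i] = cursor;
        /* Both the offset and the size must fit in uint32_t. */
        bool fits = cursor <= UINT32_MAX && sizes[i] <= UINT32_MAX;
        if (!fits && first_bad == count)
            first_bad = i;
        cursor += sizes[i];
    }
    return first_bad;
}
```

With three sizes of `3ull << 30`, a header area of 68 bytes, and `align_log2` of 14, the computed offsets are 16384, 3GB + 16384, and 6GB + 16384. The first two fit in 32 bits and the third does not, so the function returns index 2.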
Debug Info
In practice, binaries rarely exceed the 4GB limit. The main exception is dSYMs.
A dSYM is a bundle that contains a “Mach-O companion file”, a stripped-down
binary that retains only the debug info sections. The .debug_info section, in
particular, can be very large. As mentioned earlier, this issue is further
compounded when building for multiple architectures.
To address this, dsymutil, the DWARF linker on macOS, uses a 64-bit variant
of the fat header:
struct fat_arch_64 {
    cpu_type_t cputype;       /* cpu specifier (int) */
    cpu_subtype_t cpusubtype; /* machine specifier (int) */
    uint64_t offset;          /* file offset to this object file */
    uint64_t size;            /* size of this object file */
    uint32_t align;           /* alignment as a power of 2 */
    uint32_t reserved;        /* reserved */
};
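A reader can tell which of the two layouts follows by inspecting the magic number at the start of the file. The sketch below assumes the values of FAT_MAGIC and FAT_MAGIC_64 from mach-o/fat.h (redeclared under hypothetical `SKETCH_` names so it builds without Apple headers) and relies on fat headers being stored big-endian on disk.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FAT_MAGIC    0xcafebabeu /* same value as FAT_MAGIC */
#define SKETCH_FAT_MAGIC_64 0xcafebabfu /* same value as FAT_MAGIC_64 */

/* Fat headers are stored big-endian on disk, so decode byte by byte
 * instead of relying on the host's byte order. */
static uint32_t sketch_read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 32 for a classic fat header (fat_arch entries), 64 for the
 * 64-bit variant (fat_arch_64 entries), and 0 if the buffer does not
 * start with either magic. */
static int sketch_fat_offset_width(const uint8_t *buf, size_t len)
{
    if (len < 4)
        return 0;
    switch (sketch_read_be32(buf)) {
    case SKETCH_FAT_MAGIC:    return 32;
    case SKETCH_FAT_MAGIC_64: return 64;
    default:                  return 0;
    }
}
```

A result of 0 usually means the file is a thin Mach-O image or not a Mach-O file at all. Telling those cases apart is outside the scope of this sketch.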
Older versions of dsymutil required passing the -fat64 flag to enable this
format. In newer versions, a warning is issued when a binary exceeds 4GB
(unless -fat64 is explicitly provided), and the tool automatically switches
to the 64-bit fat header.
Starting with macOS Sequoia and Xcode 16, both LLDB (the debugger) and CoreSymbolication fully support dSYMs with a 64-bit fat header.