Enhanced Memory Tagging Extension (EMTE) is a hardware-based security technology that protects against memory corruption vulnerabilities. Memory allocations are tagged with a secret key. When memory is accessed, the hardware validates the tag and, if it doesn’t match, stops the process. EMTE is the foundation of Apple’s Memory Integrity Enforcement (MIE). It’s available on A19 and M5 processors or later.
Using MTE for Finding Bugs
(E)MTE is not just valuable in a security context. It was originally designed as a hardware-assisted tool for finding memory corruption bugs. Compared to other tools for detecting memory safety issues, such as AddressSanitizer (ASan), MTE has several benefits. Most notably, its CPU and memory overhead is significantly smaller than that of software-based tools, usually somewhere in the single-digit percentage range. For comparison, ASan typically incurs a 2x slowdown and a 3x memory overhead. MTE also requires no recompilation and can be turned on and off dynamically at runtime.
While MTE isn’t as comprehensive as ASan, the value proposition is often high enough that it’s worth enabling by default on supported hardware, especially for testing. That’s why we adopted it for the LLDB test suite.
Enabling MTE
Signing a binary with the com.apple.security.hardened-process.checked-allocations entitlement enables memory tagging.
This approach works well for binaries that should always run with MTE enabled.
For example, in LLDB, when LLDB_ENABLE_MTE is enabled, we sign the command line driver with this entitlement.
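As a rough illustration of the signing step, the sketch below generates an entitlements file and builds an ad-hoc `codesign` invocation. The entitlement key comes from the text above; the Boolean value, the helper names, and the file paths are assumptions made for this example, and this is not how the LLDB build actually performs the signing.

```python
import plistlib

# Entitlement key quoted in the text. Treating it as a Boolean set to
# true is an assumption of this sketch.
ENTITLEMENT = "com.apple.security.hardened-process.checked-allocations"


def entitlements_plist() -> bytes:
    """Return an XML entitlements plist requesting checked allocations."""
    return plistlib.dumps({ENTITLEMENT: True}, fmt=plistlib.FMT_XML)


def codesign_command(binary: str, plist_path: str) -> list[str]:
    """Build an ad-hoc codesign command line (paths are hypothetical)."""
    return [
        "codesign", "--force", "--sign", "-",
        "--entitlements", plist_path,
        binary,
    ]
```

Writing the bytes returned by `entitlements_plist()` to a file and running the command from `codesign_command()` on a supported machine would re-sign the binary with the entitlement.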
However, a large part of our test suite is written in Python and uses it as a library through import lldb.
While we could re-sign the Python binary, that seemed suboptimal, so instead we took advantage of a posix_spawn attribute already used by LLDB.
A new tool named darwin-mte-launcher is used to launch another binary with MTE enabled. When enabled, the test suite is configured to use the launcher to invoke Python. When reproducing a crash, the launcher can be dropped easily from the shell invocation to compare a run with and without memory tagging.
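The idea of an optional launcher prefix can be sketched as follows. This assumes the launcher accepts the program to run and its arguments as trailing arguments, which the text implies but does not spell out; the function name and the example paths are hypothetical.

```python
import shlex


def build_test_command(python, args, launcher=None):
    """Return the argv for a test run, optionally prefixed by a launcher.

    Assumption: the launcher takes the target program and its arguments
    as its own trailing arguments.
    """
    cmd = [python, *args]
    if launcher:
        # Prefixing keeps the rest of the command untouched, so removing
        # the launcher again gives the equivalent untagged run.
        cmd = [launcher, *cmd]
    return cmd


if __name__ == "__main__":
    tagged = build_test_command(
        "python3", ["dotest.py", "-v"], launcher="darwin-mte-launcher"
    )
    print(shlex.join(tagged))
```

Because the launcher is only a prefix, the same command with the first element removed reproduces the run without memory tagging.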
Python & MTE
The Python allocator (pymalloc) is not aware of MTE, so allocations it serves from its own arenas bypass the system allocator.
Although the choice of allocator is a compile-time option, it’s still possible to switch to the system allocator at runtime by setting the PYTHONMALLOC=malloc environment variable.
Python may eventually adopt MTE in its allocator or detect this automatically, but for now the launcher explicitly sets PYTHONMALLOC to work around the issue.
To support the case where the driver is built with memory tagging, LLDB sets the variable programmatically as well.
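A minimal sketch of the workaround, seen from a script that spawns Python itself: copy the environment, force PYTHONMALLOC=malloc, and pass the result to the child. The helper name is hypothetical, and this is an illustration rather than the code used by the launcher or by LLDB.

```python
import os
import subprocess
import sys


def env_with_system_malloc(base=None):
    """Return a copy of the environment with PYTHONMALLOC=malloc set."""
    env = dict(os.environ if base is None else base)
    # Route Python's allocations through the system allocator instead
    # of pymalloc.
    env["PYTHONMALLOC"] = "malloc"
    return env


if __name__ == "__main__":
    # The child reports the allocator setting it was started with.
    subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ['PYTHONMALLOC'])"],
        env=env_with_system_malloc(),
        check=True,
    )
```

Copying the environment rather than mutating `os.environ` keeps the setting scoped to the child process.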
Results
LLDB has a CI job that runs our test suite against an ASanified LLDB, so the expectation was that enabling MTE wouldn’t find any new memory issues. That indeed turned out to be the case.
An interesting consequence of the launcher approach is that the MTE property is inherited by child processes. This means that when running the suite with MTE, not just LLDB, but also the binaries being debugged by LLDB, have MTE enabled. We had to update a few tests to account for the tags in the top byte of tagged addresses, for example when a test was performing pointer arithmetic.
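The kind of adjustment a test needs can be sketched like this. On AArch64 the top byte of a 64-bit pointer is bits 56 to 63, and the MTE logical tag occupies bits 56 to 59. The helper names and the example address are hypothetical, and the masking shown is only meant for user-space addresses; it is not taken from the actual test changes.

```python
TOP_BYTE_SHIFT = 56
ADDRESS_MASK = (1 << TOP_BYTE_SHIFT) - 1  # keeps bits 0..55


def strip_top_byte(addr: int) -> int:
    """Clear the top byte of a user-space pointer value."""
    return addr & ADDRESS_MASK


def logical_tag(addr: int) -> int:
    """Extract the 4-bit MTE logical tag from bits 56..59."""
    return (addr >> TOP_BYTE_SHIFT) & 0xF


def same_location(a: int, b: int) -> bool:
    """Compare two pointer values while ignoring their top bytes."""
    return strip_top_byte(a) == strip_top_byte(b)


if __name__ == "__main__":
    tagged = 0x0A00_0001_0000_4000  # made-up pointer carrying tag 0xA
    print(hex(strip_top_byte(tagged)))  # 0x100004000
    print(hex(logical_tag(tagged)))     # 0xa
```

A test that compares or subtracts raw pointer values can normalize both sides with `strip_top_byte` first, so the result no longer depends on which tag the allocator happened to assign.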
Conclusion
Adopting memory tagging for the LLDB test suite was relatively straightforward. In the process, it uncovered a handful of issues where our tests didn’t account for Top Byte Ignore (TBI) on AArch64. Going forward, this will allow us to catch memory issues at our desks with no recompilation, rather than having to wait on the sanitized bot.
This effort was tracked by https://github.com/llvm/llvm-project/issues/186318. The issue contains links to the relevant PRs.