LLVM | Jonas Devlieghere

dsymutil's Lockstep Algorithm

I was recently answering questions about dsymutil’s multi-threading model and lockstep algorithm. I decided to write it down here for future reference. This article focuses on types and the .debug_info section. Background As a reminder, dsymutil is an optimizing DWARF linker. It only retains debug info for elements that appear in the final executable. It uses the One Definition Rule (ODR) to unique C++ types. As will become clear, both of these heavily shaped its design....

Mach-O File Format 4GB Limit

Mach-O (Mach Object) is a binary file format used by macOS and iOS for executables, libraries, and object code. A Mach-O file consists of a header, followed by a series of load commands, and then a series of segments. The 4GB Limitations The original 32-bit format used 32-bit offsets, which can address a maximum of 2^32 bytes, or 4GB of memory. The 64-bit variant uses 64-bit offsets, vastly increasing the addressable memory space....

LLDB Debug Adapter Protocol (DAP)

Debug Adapter Protocol The Debug Adapter Protocol (DAP) defines a generic protocol for editors to talk to a debugger. Popular editors with DAP support include Visual Studio Code, Sublime Text, (Neo)vim and Emacs. If you’re familiar with the Language Server Protocol (LSP) you can think of DAP like LSP, but for debugging. Implementations of the debug adapter protocol generally come in two forms: An extension tailored to a specific editor. A standalone DAP server that can be used with any editor that supports the protocol....

Rich Disassembler for LLDB

LLVM is once again participating in Google Summer of Code (GSoC). For 2024 we have an exciting project to enrich the disassembler in LLDB. The project consists of using the variable location information from the debug info (DWARF) to annotate LLDB’s disassembler (and register read) output with the location and lifetime of source variables. You can find all the details on LLVM’s Google Summer of Code Ideas & Projects page....

Google Summer of Code 2021

Today Google announced the list of open-source organizations participating in the 2021 Google Summer of Code program. Together with Raphael and Pedro, I’ll be mentoring the following two projects: A structured approach to diagnostics in LLDB, and Lua scripted watchpoints in LLDB. If you’re interested in either of these projects or have questions, feel free to reach out. For more information about GSoC itself, check out the Summer of Code website.

Statistics in dsymutil

To make incremental builds fast on macOS, the static linker (ld) ignores the debug information. It can easily be an order of magnitude bigger than the rest of the program and slow down link time. Instead, the linker emits a debug map which contains the location of all the object files it relocated so that debug info consumers (such as the debugger) know where to find the DWARF debug info. This approach works great during development and greatly speeds up the build-debug cycle....

LLDB Column Breakpoints

If you’ve ever used the debugger, chances are you’ve used a file and line number to set a breakpoint. Most of the time this provides enough granularity. Sometimes, though, more fine-grained control would be helpful. Consider the following example: int foo() { return 1; } int bar() { return 2; } int baz() { return 3; } int main(int argc, char** argv) { return foo() + bar() + baz(); } Line Breakpoints Let’s say we want to step into the function baz on line 11 and can’t set a breakpoint on baz itself....

Lua Scripting in LLDB

LLDB is the debugger developed as part of the LLVM project. It is probably best known as the debugger in Xcode, but many use it as an alternative to GDB. Scripting in LLDB One thing that makes LLDB really powerful is how scriptable it is. It has a stable C++ API, called the SB API or Scripting Bridge API, which is accessible through Python. Following LLVM’s model of reusable components, most of LLDB constitutes a debugger library and the SB API is how tools like the command line driver interface with it....

Sanitizing C++ Python Modules

Python has great interoperability with C and C++ through extension modules. There are many reasons to do this, such as improving performance, accessing APIs not exposed by the language, or interfacing with libraries written in C or C++. Unlike Python, however, C and C++ are not memory safe. Luckily, great tools exist to help diagnose these kinds of issues. One of those tools is ASan (Address Sanitizer) which uses compiler instrumentation to detect memory errors at runtime....

One Year At GuardSquare

Today marks my one-year anniversary at GuardSquare. On the one hand it feels like yesterday (as cliché as that might sound) but on the other hand it feels like ages ago, considering how much we have been able to do in just one year’s time. For those who don’t know GuardSquare yet, allow me to quickly introduce our company. If you have ever used Java, you probably know ProGuard, our open source optimizer for Java bytecode....