Since its inception, scripting has been an integral part of LLDB. At the time,
Python was chosen as the primary, and for a long time the only supported,
scripting language. Python is used within LLDB to automate things, write custom
data formatters, and extend core concepts like Processes and Threads with an
implementation written in Python. Another powerful way to use LLDB is as a
debugger library in Python, by importing the lldb module.
We can distinguish two ways LLDB and Python are used together:
- Python in LLDB: When using LLDB, either directly (e.g. from the command line or with lldb-dap), or as a library, it comes with an embedded Python interpreter. LLDB drops you into the embedded interpreter when you use the script command, and it’s what powers LLDB’s ability to execute Python code for things like data formatters and breakpoint callbacks. In this mode, LLDB loads Python.
- LLDB in Python: When using LLDB from Python, via import lldb, Python loads LLDB. In this scenario, LLDB uses the existing interpreter so it can share state between the two.
Both use cases are critical and apply equally to other scripting languages such as Lua. Each introduces its own trade-offs that have shaped LLDB’s design.
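As a rough illustration of the LLDB in Python direction, the sketch below imports the lldb module from a regular interpreter and asks it for its version string. The helper name lldb_version is made up for this example. It assumes the lldb module may not be on the import path, in which case it returns None instead of raising.

```python
def lldb_version():
    """Return LLDB's version string, or None if the lldb module is unavailable."""
    try:
        # "LLDB in Python": the already-running interpreter loads LLDB.
        import lldb
    except ImportError:
        # Assumption for this sketch: a missing module is not an error.
        return None
    # SBDebugger.GetVersionString() is part of LLDB's public SB API.
    return lldb.SBDebugger.GetVersionString()


if __name__ == "__main__":
    print(lldb_version() or "lldb module not found on sys.path")
```

Running the same import from the script command inside LLDB exercises the opposite direction, where LLDB has already loaded Python.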
The Source of the LLDB/Python Revlock
LLDB’s dependency on Python came with a long-standing caveat: the Python that LLDB ran against had to be the exact one it was built against.
Load-Time Dependency
Both use cases mentioned earlier need the Python library mapped into the process, which is why LLDB has historically linked against it at build time.
LLDB (specifically libLLDB) has a load-time dependency on its build-time
Python at a specific install name. If the Python library doesn’t exist at
runtime, the dynamic loader fails to load LLDB, often resulting in a crash.
Since this happens before LLDB gets a chance to run, there’s no way to fail
gracefully.
When you import lldb in an existing Python interpreter, it has already loaded
its own copy of the Python library. If that library doesn’t match the one LLDB
linked against, the dynamic loader will pull in a second copy. Having two copies
of the same library is generally dangerous, and not something Python supports.
Unstable Python C API
Besides the load-time issue, there’s a second problem. LLDB’s use of Python predates the Limited API, which was introduced by PEP 384 in Python 3.2. This is a subset of Python’s C API that can be compiled once and loaded on multiple versions of Python.
LLDB relies on SWIG to generate the Python bindings for its public Scripting Bridge API. Until SWIG version 4.2, it couldn’t limit the generated code to the Limited API. Later versions generate Limited-API-compatible code by default.
Adopting the Python Limited C API
The first step towards breaking the revlock was adopting the Python Limited C API in LLDB. This work was tracked by issue #151617 and was relatively unglamorous. Most replacements were mechanical, but some APIs needed real surgery, and a few had to be guarded by a minimum Python version when their stable equivalent only landed in a later release.
Adopting the Limited API also requires picking a floor: the Py_LIMITED_API
macro sets the oldest Python version the resulting binary can load against. We
kept the existing minimum: Python 3.8.
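To make the floor concrete: Py_LIMITED_API takes a value laid out like PY_VERSION_HEX, with the major version in the top byte and the minor version in the next one. The sketch below computes that value for 3.8 and compares it against the running interpreter. The helper names limited_api_hex and meets_floor are hypothetical and not part of LLDB.

```python
import sys


def limited_api_hex(major, minor):
    """Encode a version floor using the PY_VERSION_HEX major/minor layout."""
    return (major << 24) | (minor << 16)


def meets_floor(major, minor, hexversion=sys.hexversion):
    """True if an interpreter reporting `hexversion` is at or above the floor."""
    return hexversion >= limited_api_hex(major, minor)


if __name__ == "__main__":
    # The value a build would define Py_LIMITED_API to for a 3.8 floor.
    print(hex(limited_api_hex(3, 8)))
    # Whether the interpreter running this script satisfies that floor.
    print(meets_floor(3, 8))
```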
We also had to make a small change to the LLDB Python module’s native extension.
The .so file is a symlink to libLLDB.
PEP 3149 defines an ABI versioning scheme
for this file. For example, on Darwin, when building against Python 3.14, you
would end up with _lldb.cpython-314-darwin.so. When building against the
Python Limited API, we use the abi3 ABI tag.
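One way to see which file names a given interpreter will accept for a native extension is to ask the interpreter itself. The sketch below lists the recognized suffixes and checks whether an abi3-tagged one is among them. The helper names are made up for illustration, and the exact suffixes vary by platform (Windows, for example, uses .pyd names).

```python
import importlib.machinery
import sysconfig


def extension_suffixes():
    """Suffixes this interpreter accepts for native extension modules."""
    return list(importlib.machinery.EXTENSION_SUFFIXES)


def accepts_abi3():
    """True if an abi3-tagged suffix is among the accepted ones."""
    return any(suffix.startswith(".abi3") for suffix in extension_suffixes())


if __name__ == "__main__":
    # The version-specific suffix a regular (non-Limited-API) build would use.
    print(sysconfig.get_config_var("EXT_SUFFIX"))
    print(extension_suffixes())
    print(accepts_abi3())
```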
The result is LLDB_ENABLE_PYTHON_LIMITED_API, a CMake option that’s now on by
default when SWIG is recent enough. By itself, this changes nothing
user-visible. LLDB still hard-links the Python library at the build-time path
(this is what the next section fixes).
Loading Python at Runtime
With the ABI guarantee in hand, we can now safely load a different version of Python than we linked against at build time. There are several potential ways to support that:
- We can continue to have the dynamic loader load Python and rely on search paths to find different versions of the library. Since libraries are identified by name, this requires a known install name. It also requires either knowing the potential search paths (e.g. RPATHs on Darwin) up front, or being able to set them before the LLDB library gets loaded (e.g. by using a shim that sets (DY)LD_LIBRARY_PATH).
- We can rely on runtime loading using dlopen and dlsym to load a library after the LLDB library has already been loaded. This gives a lot of flexibility, but requires intrusive code changes to cast the result of dlsym for every symbol.
- We can pursue a hybrid between (1) and (2): runtime loading with symbol resolution. This approach eliminates the need for dlsym by using normal header files at compile time and telling the linker to allow unresolved symbols. At runtime, the symbols remain unresolved until the library is loaded with dlopen.
The first solution is the safest because both the compiler and linker verify that all symbols exist. It’s also the least flexible. The main reason it was discarded, though, is that the install name of the Python library can vary significantly.
The second option is by far the most flexible, but also the least safe (no
compiler or linker checking) and the most intrusive (all symbols need to go
through dlsym). It was discarded because the code generated by SWIG is outside
of our control. A common solution to this problem is to build a generated shim
library that exports the same symbols, but resolves them lazily at runtime. This
works for function symbols, but Python also exports data symbols, which can’t be
shimmed this way.
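The per-symbol burden of the second option can be felt even from Python, where ctypes performs the same by-name lookup. The sketch below is only an analogy, not LLDB code: it looks up Py_GetVersion at runtime and has to declare the signature by hand, with nothing verifying that the declaration is right. The function name python_version_via_lookup is hypothetical.

```python
import ctypes
import sys


def python_version_via_lookup():
    """Look up Py_GetVersion by name at runtime and call it."""
    # ctypes.pythonapi is a handle to the Python library already in the
    # process. Attribute access performs the by-name symbol lookup.
    func = ctypes.pythonapi.Py_GetVersion
    # No compiler or linker checks this signature. It has to be declared
    # manually, once per symbol, which is what makes the approach intrusive.
    func.restype = ctypes.c_char_p
    func.argtypes = []
    return func().decode()


if __name__ == "__main__":
    print(python_version_via_lookup())
    print(sys.version)
```

Getting restype wrong here would not fail at build time. It would silently misinterpret the return value, which is the kind of safety loss described above.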
That leaves us with the third option, which offers a nice balance between the
other two. It requires no code changes, and you still benefit from compiler
checking. However, it poses a major risk: lazy binding crashes. If you call a
symbol that was not successfully loaded by dlopen, the dynamic loader will
fail to resolve it and crash your program. In other words, what was previously a
link-time failure now becomes a runtime crash.
The Script Interpreters as Dynamic Libraries
To limit the blast radius of the hybrid approach, we decided to minimize the
surface area that’s built with undefined symbols. The solution is to build the
ScriptInterpreter plugins (the parts of LLDB that actually depend on Python)
as shared libraries. Despite the name, plugins in LLDB are linked statically,
and are more about abstraction than modularity.
This poses its own set of challenges. There is no plugin interface, and all existing plugins build on top of various LLDB and LLVM libraries. Making a plugin a dynamic library means it needs access to these symbols. However, LLDB also needs to call into the script interpreter plugin, creating a cycle. Static libraries paper over this problem, but for dynamic libraries we need to break the cycle.
Luckily, LLDB already supports loading plugins at runtime (i.e. with dlopen).
This lets us break the cycle.
Here’s what that ends up looking like:
- A solid edge represents a link-time dependency
  - ScriptInterpreterPython depends on libLLDB. The plugin uses symbols from LLDB and LLVM, resolved at load time via an exports list.
- A dotted edge represents a runtime dependency
  - The PluginManager dlopens the plugin instead of pulling it in as a link-time dependency. This runtime edge replaces the link-time dependency that the dynamic-plugin work breaks.
  - The ScriptInterpreter depends on Python. Python symbols are deliberately left undefined at link time and bind to whichever Python library is loaded into the process.
Building the script interpreters as dynamic libraries is tracked by
#183791 and controlled by
LLDB_ENABLE_DYNAMIC_SCRIPTINTERPRETERS.
The Exports List
As mentioned earlier, the ScriptInterpreter plugins make use of LLVM and LLDB
private symbols. When built as a dynamic library, we want to continue using the
same symbols, rather than linking in another copy, which would lead to duplicate
global state like option registries and llvm::Error type IDs. This requires
re-exporting the LLVM and LLDB symbols used by the script interpreters. We want
to limit the number of exported symbols, and maintaining an export list by hand
is tedious and error-prone. Instead, we use llvm-nm to collect the necessary
symbols at build time. New dependencies get picked up automatically.
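The idea can be sketched as a set intersection over nm-style output: take the symbols the plugin leaves undefined and keep the ones the host library defines. This is an illustrative sketch only. The actual build logic is not shown here, the symbol names in the sample data are invented, and the parser assumes the default BSD-style output format (address, type letter, name).

```python
def parse_nm(output):
    """Split BSD-style nm output into (defined, undefined) symbol sets."""
    defined, undefined = set(), set()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "U":
            # "U name": the object needs this symbol from somewhere else.
            undefined.add(fields[1])
        elif len(fields) == 3 and fields[1] in {"T", "D", "B", "S", "R"}:
            # "address TYPE name": a defined global symbol.
            defined.add(fields[2])
    return defined, undefined


def exports_list(plugin_nm, host_nm):
    """Symbols the plugin needs that the host library can provide, sorted."""
    _, needed = parse_nm(plugin_nm)
    provided, _ = parse_nm(host_nm)
    return sorted(needed & provided)


# Hypothetical sample data standing in for real llvm-nm output.
PLUGIN_NM = """\
                 U _host_helper
                 U _PyList_New
0000000000001000 T _plugin_entry
"""
HOST_NM = """\
0000000000002000 T _host_helper
0000000000002040 T _host_private
"""

if __name__ == "__main__":
    print(exports_list(PLUGIN_NM, HOST_NM))
```

In the sample, the Python symbol is needed by the plugin but not provided by the host, so it stays out of the list and remains undefined, in line with the design described earlier.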
Delay Load on Windows
So far, the discussion has focused primarily on Unix-like systems, though the
revlock isn’t unique to them. Windows has a feature called
delay loading
that lets you programmatically change the loader search path before a symbol is
used. This gives us another hybrid solution that looks more like (1) and
eliminates the complexity of dlopen’ing Python and the dynamic
ScriptInterpreter libraries.
The Runtime Loader
The ScriptInterpreterRuntimeLoader, specifically its Python implementation, is
the cross-platform abstraction that loads the Python library.
By default, the loader first tries the Python library LLDB was built against, falling back to platform-specific search only if that path is unavailable:
- On Darwin, we use dlopen to load Python3.framework from Xcode (located via DEVELOPER_DIR, the Command Line Tools, or xcrun), or Python.framework installed from python.org or Homebrew.
- On Linux, we use dlopen to load libpython3.so, falling back through stable-ABI SONAMEs in descending order.
- On Windows, we keep hard-linking Python and rely on the existing delay-load support.
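The fallback search on Linux can be sketched as "try each candidate in order and stop at the first one that loads". The sketch below covers only that fallback step, not the initial attempt at the build-time library. The versioned file names it generates (libpython3.N.so.1.0) and both helper names are assumptions made for illustration, since the exact candidate list is not spelled out here.

```python
import ctypes


def linux_candidates(newest_minor, oldest_minor=8):
    """Hypothetical candidates: generic name first, then versioned, newest first."""
    names = ["libpython3.so"]
    for minor in range(newest_minor, oldest_minor - 1, -1):
        names.append(f"libpython3.{minor}.so.1.0")
    return names


def load_first(candidates, loader=ctypes.CDLL):
    """Return (name, handle) for the first library that loads, else (None, None)."""
    for name in candidates:
        try:
            return name, loader(name)
        except OSError:
            # Not installed or not loadable: move on to the next candidate.
            continue
    return None, None


if __name__ == "__main__":
    name, handle = load_first(linux_candidates(14))
    print(name or "no Python library found")
```

Returning (None, None) instead of raising lets the caller report a missing runtime as an ordinary error.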
It’s worth noting that loading Python happens in libLLDB, not in the Python
ScriptInterpreter plugin. One reason is that the delay-load setup must happen
before we load the shared library that uses the symbols. But even in the Unix
scenario, there were several reasons to centralize this in LLDB proper,
including layering, better error reporting, and avoiding two instances of Python
in the same process.
Before searching, the loader checks whether Python is already mapped into the
process by probing for Py_IsInitialized. If it is, we skip the search entirely
and bind to the existing runtime. This is what makes the LLDB in Python case
work: when import lldb runs, libpython is already loaded, and LLDB simply
latches onto it.
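The probe itself amounts to a symbol lookup against what is already in the process. The sketch below mimics it from Python using ctypes. Because it runs inside a Python interpreter, the answer is trivially yes, so it only illustrates the mechanism. The helper names symbol_present and python_already_loaded are hypothetical.

```python
import ctypes


def symbol_present(handle, name):
    """Return True if `name` resolves in `handle`, without calling it."""
    try:
        getattr(handle, name)
        return True
    except AttributeError:
        # ctypes raises AttributeError when the lookup fails.
        return False


def python_already_loaded(handle=None):
    """Probe the current process for an existing Python runtime."""
    if handle is None:
        # Handle to the Python library in this process.
        handle = ctypes.pythonapi
    return symbol_present(handle, "Py_IsInitialized")


if __name__ == "__main__":
    print(python_already_loaded())
```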
The Final Result
With the dynamic plugins enabled, the revlock is broken: a single LLDB binary
now works against any Python from 3.8 up, regardless of which one it was built
against. It’s now possible to import lldb into any Python interpreter (>=
3.8), as well as use it from within LLDB. When no Python runtime can be found,
LLDB now reports an error instead of crashing.
Python in LLDB
Here’s what the new design looks like for the Python in LLDB use case, for
example, when running lldb -o 'script'.
LLDB in Python
Here’s what the new design looks like for the LLDB in Python use case, for
example, when running python3 -c 'import lldb'.
As always, this wouldn’t have been possible without the help of our amazing community. Thank you to everyone who chimed in on the RFC, reviewed the patches and provided feedback.
A variant of this was posted as an Update on Breaking the LLDB/Python Revlock on LLVM’s Discourse.