Hi,
Last year there was an effort led by Tom Tromey to add Rust language support to LLDB. He implemented a fairly complete language plugin, but it was not accepted into mainline because of supportability concerns. I guess these concerns had some merit, because the change did not survive even in Rust’s private branch, due to the difficulty of rebasing on top of LLVM 9.
I am wondering if there’s a more limited version of this that can be merged into mainline:
In terms of its memory model, Rust is not that far off from C++, so treating Rust types as if they were C++ types basically works. There is only one major problem: currently LLDB cannot deal with tagged unions, which Rust code uses quite heavily. When such a type is encountered, LLDB just emits an empty struct, which makes it impossible to examine the contents.
My tentative proposal is to modify LLDB’s DWARFASTParserClang to handle DW_TAG_variant et al. and create a C++ approximation of these types, e.g. as a polymorphic class, or just an untagged union. This would provide at least a minimal level of functionality for Rust (and possibly other languages) and be a much smaller maintenance burden on the LLDB core team.
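To make the "untagged union" idea a bit more concrete, here is a rough sketch of the kind of C++ shape such an approximation might take. Everything in it is an assumption for illustration: the Rust enum in the comment, the names Shape, ShapeCircle, and ShapeRect, and especially the tag-first layout. Real Rust enum layouts vary (niche optimizations, for instance), and this is not what DWARFASTParserClang produces today.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical Rust source being approximated:
//   enum Shape { Circle(f32), Rect { w: f32, h: f32 } }

// One plain struct per variant payload (names are made up for this sketch).
struct ShapeCircle { float radius; };
struct ShapeRect { float w; float h; };

// C++ approximation: an explicit discriminant plus an untagged union.
// Assumption: the tag is a leading u8; actual layouts come from DWARF.
struct Shape {
  std::uint8_t discriminant;
  union {
    ShapeCircle circle;
    ShapeRect rect;
  } variants;
};

// Maps the discriminant to a variant name, roughly what a debugger
// would need to do in order to pick which union member to display.
inline std::string ActiveVariantName(const Shape &s) {
  switch (s.discriminant) {
    case 0: return "Circle";
    case 1: return "Rect";
    default: return "<invalid>";
  }
}

// Checked accessor: only valid when the Rect variant is active.
inline const ShapeRect &AsRect(const Shape &s) {
  assert(s.discriminant == 1);
  return s.variants.rect;
}
```

With a plain untagged union, the debugger would show every member and leave it to the user to read the discriminant; the helper above is the extra step a smarter presentation would have to perform.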
What would y’all say?
So if Rust actually uses llvm and clang and Rust is supported by llvm and clang, this shouldn’t be an issue and should already work. But if you are having problems, then I am guessing that you have a compiler that isn’t based on llvm and clang? If that is the case, the best thing you can do is write a new TypeSystem subclass. Everywhere in LLDB, anytime we want to get type information or run an expression, we grab a TypeSystem for a given language enumeration. When we are stopped in a Rust stack frame, we will ask for the type system for the Rust language and hopefully we get something back.
For viewing types in a variable view, you can go the route of letting LLDB convert DWARF into clang AST types and letting that infrastructure display those types. But you can often run into issues, like you have seen with your DW_TAG_variant. If a user then types “p foo->bar”, it will invoke the clang expression parser, which will then work with the types that you have created. Clang has a lot of asserts and other things that can crash your debug session if you do anything too weird in your clang AST context.
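As a rough illustration of the "grab a TypeSystem for a given language enumeration" lookup, here is a minimal sketch. It is not LLDB's actual code: Language, TypeSystemSketch, and TypeSystemRegistry are hypothetical names invented for this example, and the real plugin lookup is more involved.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a language enumeration.
enum class Language { C, CPlusPlus, Rust };

// Hypothetical stand-in for a type system plugin interface.
class TypeSystemSketch {
public:
  virtual ~TypeSystemSketch() = default;
  virtual std::string Name() const = 0;
};

class ClangTypeSystemSketch : public TypeSystemSketch {
public:
  std::string Name() const override { return "clang-sketch"; }
};

class RustTypeSystemSketch : public TypeSystemSketch {
public:
  std::string Name() const override { return "rust-sketch"; }
};

// Maps a language to whichever type system plugin was registered for it.
class TypeSystemRegistry {
public:
  void Register(Language lang, std::shared_ptr<TypeSystemSketch> ts) {
    assert(ts != nullptr);
    by_language_[lang] = std::move(ts);
  }

  // Returns nullptr when no plugin handles the language, which is the
  // "hopefully we get something back" case.
  std::shared_ptr<TypeSystemSketch> GetForLanguage(Language lang) const {
    auto it = by_language_.find(lang);
    return it == by_language_.end() ? nullptr : it->second;
  }

private:
  std::map<Language, std::shared_ptr<TypeSystemSketch>> by_language_;
};
```

In this sketch, a frame whose language is Rust would get back nothing until a Rust type system is registered, at which point the same lookup starts returning it.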
So if Rust doesn’t use clang in its compiler, the steps would be to:
- create a new TypeSystem for Rust that converts DWARF into AST types native to your Rust compiler, reusing as much of the Rust compiler sources as possible
- write a native Rust expression parser, which hopefully uses your Rust compiler sources to evaluate and run your expression
It is worth noting that the Swift language decided to do things differently. Swift has the compiler/linker generate a blob of data, embedded in the executable or in standalone debug information, that contains a serialized AST of the program. When you debug your program, LLDB hands this serialized blob back to the compiler. The DWARF information for Swift doesn’t need to encode the full type information in this case; it just has mangled names that uniquely identify the types. LLDB can then pass a mangled name to the compiler and say “please give me a type for ‘_SC3FooS3Bar’”.
One benefit of this approach is that the compiler can rapidly change language features and the debugger can keep up just by recompiling. Any new language features or types are encoded in the data blob and the compiler can then extract them. The drawback is that the serialized Swift AST contexts are not portable between compiler versions, so LLDB must be perfectly in sync with the tools that produce the binaries.
Another benefit is that the entire AST of all types gets encoded. Many compilers limit the amount of DWARF debug info they emit, which means they don’t emit every type; they try to emit only the types that are used. DWARF also doesn’t have great template support, so any templates that aren’t used, or code that is only inlined (std::vector::size() for example), won’t be callable in an expression. If you have the entire AST, you can synthesize these inlined functions and use all types that your program knew about when it was compiled. If you convert reduced DWARF into ASTs, you only have the information that is represented by the DWARF itself and nothing more.
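The trade-off in the Swift-style approach (look types up by mangled name in a serialized AST, but only if the debugger and compiler versions match) can be sketched roughly as follows. All names here (SerializedAstBlob, AstBackedResolver, the integer version, the string "type description") are hypothetical simplifications, not Swift's or LLDB's real data structures.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the serialized AST blob found in the binary
// or its debug info. A real blob holds compiler data, not strings.
struct SerializedAstBlob {
  int compiler_version = 0;
  std::map<std::string, std::string> types_by_mangled_name;
};

// Resolves mangled names against a blob, but only accepts blobs written
// by the same compiler version the debugger was built against.
class AstBackedResolver {
public:
  explicit AstBackedResolver(int debugger_compiler_version)
      : version_(debugger_compiler_version) {}

  // Rejects blobs from a different compiler version: this models the
  // "must be perfectly in sync" drawback.
  bool Load(const SerializedAstBlob &blob) {
    if (blob.compiler_version != version_)
      return false;
    blob_ = blob;
    loaded_ = true;
    return true;
  }

  // DWARF only needs to carry the mangled name; the full type comes
  // from the blob. Returns false if nothing is loaded or name is unknown.
  bool Resolve(const std::string &mangled_name, std::string *out) const {
    assert(out != nullptr);
    if (!loaded_)
      return false;
    auto it = blob_.types_by_mangled_name.find(mangled_name);
    if (it == blob_.types_by_mangled_name.end())
      return false;
    *out = it->second;
    return true;
  }

private:
  int version_;
  bool loaded_ = false;
  SerializedAstBlob blob_;
};
```

A debugger built against a different compiler version would fail at Load and have no type information at all, which is the cost paid for never having to teach DWARF about new language features.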
All other languages convert DWARF back into clang AST types and then let the clang compiler evaluate expressions using native clang AST types. The C and C++ languages have been pretty stable, so this approach works well for C/C++/ObjC and more.
So the right answer depends on what the Rust community wants. Is the language changing so rapidly that the debugger must stay in sync to take advantage of the latest language features? Or is it stable?
The other nice thing about creating a new TypeSystem is that it is a plugin and you don’t need to compile it in. cmake can be taught to conditionally compile in your type system if you want it. It would also allow you to have more than one Rust type system, if needed for different Rust language versions, each of which could be exclusively compiled in. Having your sources in a new TypeSystem plug-in also ensures easy merging across different repositories.
Let us know which approach sounds better to the Rust community and we can proceed from there!