Supporting new source languages

Hi everyone,
I was wondering if it is a goal of LLDB to allow for supporting programming languages other than C/C++ and Objective-C? I've been browsing the source code a bit and found that references to clang::Type are hardwired into lldb::Type. Is there an extension path that allows one to sidestep Clang and use one's own type representation? Is it even feasible to try to support non-Clang languages within LLDB's architecture? That would be very interesting to me.

Thanks for reading!
-Michael

Nobody care to comment? Let me elaborate a bit on my reasons for asking. Since last summer I've been working on debuginfo generation for the Rust compiler (www.rust-lang.org). As the Rust compiler is LLVM-based, this worked out pretty well and we produce DWARF good enough to satisfy basic debugging needs. However, debuggers obviously don't recognize Rust yet and, among other things, print out values in the wrong syntax. Some of this can be alleviated with Python extensions in LLDB and GDB, but the information available through those APIs seems to be limited (e.g. it seems hard to get modifiers such as 'const' or 'volatile'). It would also be great to allow for parsing Rust expressions in the debugger and calling functions; in short, to make Rust a first-class citizen of the given debugger. I'm currently trying to find out what the possibilities in this area are.

GDB seems to have a story for supporting new source languages but for LLDB I couldn't find anything about the topic yet. It would be great if somebody could elaborate on this, even if only saying "not a goal for LLDB" or "too early to ask for something like this".

Thanks again,
Michael

This is something that we'll be interested in as well in the Dylan (http://opendylan.org) community sometime in 2014 once we get our LLVM backend functional.

- Bruce

We haven’t done any research on it yet, but at some point we were going to look at some minimal support for showing useful backtrace info when on Android and the back trace hops over a JNI boundary (i.e. into/out of Java, possibly multiple times), particularly when lldb is stopped in native C/C++ code. I’m not sure how much (if any) Java language handling we’ll do, since the most basic application of back-trace context probably only needs to handle looking up symbols on the JVM side. If we wanted to handle variable lookup and similar handling that needs to deal with expressions, we’ll probably start caring about this as well.

_______________________________________________
lldb-dev mailing list
lldb-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev

Nobody care to comment? Let me elaborate a bit on my reasons for asking. Since last summer I've been working on debuginfo generation for the Rust compiler (www.rust-lang.org). As the Rust compiler is LLVM based this worked out pretty well and we produce DWARF good enough to satisfy basic debugging needs.

You say you have LLVM creating Rust binaries, did you write a whole new backend?

We currently use clang as our expression parser, but that doesn't mean we can't use a different expression parser for other languages.

However, debuggers obviously don't recognize Rust yet and, among other things, print out values in the wrong syntax. Some of this can be alleviated with Python extensions in LLDB and GDB, but the information available through those APIs seems to be limited (e.g. it seems hard to get modifiers such as 'const' or 'volatile'). It would also be great to allow for parsing Rust expressions in the debugger and calling functions; in short, to make Rust a first-class citizen of the given debugger. I'm currently trying to find out what the possibilities in this area are.

You would need to do the following:

1 - Modify ClangASTType() to support different types. Right now it contains two member variables:

    lldb::clang_type_t m_type;
    clang::ASTContext *m_ast;

But this could easily be expanded to contain other things. The "m_ast" could be switched over to a pointer union so it could be a "clang::ASTContext" or a "rust::ASTContext" (or however you want to represent your type). Then "m_type" can be any "void *" that can represent types in your new type system.

2 - Modify the expression parser to recognize the language of the current compile unit and use your new expression parser, which knows how to interact with the new type in ClangASTType().

#1 is pretty easy (few weeks), #2 is a big job (few months) unless the expression parser is still the clang expression parser we are using today.
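As a rough illustration of step 1, here is a minimal sketch of a type handle whose context is a tagged union of pointers. Everything in it is an assumption made for illustration: CompilerTypeSketch, the Kind enum, and the clang_stub and rust_stub contexts are hypothetical stand-ins, not LLDB or Clang classes, and a real change would more likely use an existing pointer-union utility than a hand-rolled one.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for per-language type-system contexts.
namespace clang_stub { struct ASTContext { std::string name = "clang"; }; }
namespace rust_stub  { struct ASTContext { std::string name = "rust"; }; }

// Hypothetical generalization of the two members described above: a tag plus
// a union of context pointers, and an opaque type pointer that only the
// owning context knows how to interpret.
class CompilerTypeSketch {
public:
    enum class Kind { None, Clang, Rust };

    CompilerTypeSketch() : m_kind(Kind::None), m_type(nullptr) {
        m_ast.clang_ctx = nullptr;
    }
    CompilerTypeSketch(clang_stub::ASTContext *ast, void *type)
        : m_kind(Kind::Clang), m_type(type) {
        assert(ast != nullptr);
        m_ast.clang_ctx = ast;
    }
    CompilerTypeSketch(rust_stub::ASTContext *ast, void *type)
        : m_kind(Kind::Rust), m_type(type) {
        assert(ast != nullptr);
        m_ast.rust_ctx = ast;
    }

    bool IsValid() const { return m_kind != Kind::None && m_type != nullptr; }
    Kind GetKind() const { return m_kind; }
    void *GetOpaqueType() const { return m_type; }

    // Name of the type system that owns this type, or "none".
    std::string TypeSystemName() const {
        switch (m_kind) {
        case Kind::Clang: return m_ast.clang_ctx->name;
        case Kind::Rust:  return m_ast.rust_ctx->name;
        case Kind::None:  break;
        }
        return "none";
    }

private:
    Kind m_kind;  // says which union member is active
    union {
        clang_stub::ASTContext *clang_ctx;
        rust_stub::ASTContext *rust_ctx;
    } m_ast;
    void *m_type; // opaque, context-specific type handle
};
```

The tag is what lets generic code hold such a handle without knowing which language produced it; only code that checks the tag ever touches the context-specific pointer.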

GDB seems to have a story for supporting new source languages but for LLDB I couldn't find anything about the topic yet. It would be great if somebody could elaborate on this, even if only saying "not a goal for LLDB" or "too early to ask for something like this".

LLDB is ready for this, we just haven't done it yet. Let me know if you have any other questions.

Greg

Over in the Julia (julialang.org) community, we’re interested in this too. Our currently planned approach is to have a custom frontend to LLDB that uses LLDB’s C++ API to do the heavy lifting, so I’m not sure how much of this is relevant. Nevertheless, it would be great to coordinate any efforts on this front.

To follow up:

ClangASTType should really be renamed CompilerType. And then everyone can place their types into this class and all type inspection will just work.

Each new language that doesn't use clang for its compiler would need a new expression parser that will need to know how to play with the CompilerType. We will probably need to add accessors to ClangASTType like:

bool ClangASTType::IsClangType() const;
bool ClangASTType::IsJuliaType() const;
bool ClangASTType::IsRustType() const;

Then the expression parsers would need to make sure to check any types they find via lookups to make sure the ClangASTType is the correct type.
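A minimal sketch of that check, assuming a much-reduced type handle: TypeHandle, TypeSystem, and KeepRustTypes are hypothetical names used only for illustration, and only the three Is...Type() accessor names come from the message above.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class TypeSystem { Clang, Julia, Rust };

// Hypothetical minimal type handle carrying the proposed accessors.
struct TypeHandle {
    std::string name;
    TypeSystem system;

    bool IsClangType() const { return system == TypeSystem::Clang; }
    bool IsJuliaType() const { return system == TypeSystem::Julia; }
    bool IsRustType()  const { return system == TypeSystem::Rust; }
};

// A Rust expression parser keeps only the lookup results it can interpret
// and ignores types that belong to another language's type system.
std::vector<TypeHandle> KeepRustTypes(const std::vector<TypeHandle> &found) {
    std::vector<TypeHandle> usable;
    for (const TypeHandle &t : found) {
        if (t.IsRustType())
            usable.push_back(t);
    }
    return usable;
}
```

A lookup by name can return types from several languages at once, so each parser would filter in this way before trying to interpret a result.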

You say you have LLVM creating Rust binaries, did you write a whole new backend?

I am not sure I understand the question. The Rust compiler (github.com/mozilla/rust) uses LLVM's C API to generate LLVM IR internally and then lets LLVM produce machine code from that. I don't know when they started using LLVM; I've only been following the project for a year now.

We currently use clang as our expression parser, but that doesn't mean we can't use a different expression parser for other languages.

OK, that sounds very promising. Thanks for the pointers!

LLDB is ready for this, we just haven't done it yet. Let me know if you have any other questions.

Great. I'm sure there'll be a bunch of questions in the future.

Ignoring time and effort, do you see any conceptual obstacles in supporting different source languages (expression parsers, type representations) as plugins like new object file or symbol parsers?

Given all of these pretty much use DWARF, and even if they used something like CodeView, it would be the same idea. Wouldn't it make sense for the "caller side" of LLDB to have access to the debug data structures, and remote symbol lists, and build "expressions" from that that lldb turns into IR and runs?

Not sure if it's feasible to do that, but at the moment it seems the clang evaluator works by:
- setting up a dummy .c file with the expression embedded inside it
- compiling that to IR with clang for the right triple
- JITing that
- transferring it to the "other side", evaluating it, then getting back the result

Maybe it would be cleaner to have the evaluator be able to explore the dwarf debug info, build a simple expression tree from that, and get back the result?
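To make the expression-tree idea concrete, here is a minimal sketch under heavy assumptions: VariableValues stands in for values a debugger would read from the target using locations from the debug info, and Expr, Constant, Variable, Add, and BuildXPlusTwo are hypothetical names, not LLDB classes.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for variable values read from the debugged process (hypothetical).
using VariableValues = std::map<std::string, long>;

struct Expr {
    virtual ~Expr() = default;
    virtual long Evaluate(const VariableValues &vars) const = 0;
};

struct Constant : Expr {
    long value;
    explicit Constant(long v) : value(v) {}
    long Evaluate(const VariableValues &) const override { return value; }
};

struct Variable : Expr {
    std::string name;
    explicit Variable(std::string n) : name(std::move(n)) {}
    long Evaluate(const VariableValues &vars) const override {
        return vars.at(name); // throws std::out_of_range for unknown names
    }
};

struct Add : Expr {
    std::unique_ptr<Expr> lhs, rhs;
    Add(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
        : lhs(std::move(l)), rhs(std::move(r)) {
        assert(lhs && rhs);
    }
    long Evaluate(const VariableValues &vars) const override {
        return lhs->Evaluate(vars) + rhs->Evaluate(vars);
    }
};

// Builds the tree for the expression "x + 2".
std::unique_ptr<Expr> BuildXPlusTwo() {
    return std::unique_ptr<Expr>(new Add(
        std::unique_ptr<Expr>(new Variable("x")),
        std::unique_ptr<Expr>(new Constant(2))));
}
```

A tree like this is evaluated entirely on the debugger side, so nothing has to be compiled or transferred to the target; the cost is that every operator and type rule has to be written by hand.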

Just an idea anyway. I've been trying to see how to fit my (Pascal) language into lldb, but due to the way the compiler is designed, it's not feasible to compile a single expression/method into IR and transfer it over (besides that, it's written in a language other than C++).

No, that is feasible. When running an expression we should be calling a static function that is something like:

ExpressionParser *parser = ExpressionParser::FindPlugin(exe_ctx);

The execution context contains a target, process, thread and frame, and the frame contains the current location so we can check the language of the compile unit.
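A minimal sketch of such a lookup, assuming a registry keyed by language: ExpressionParserSketch, ExecutionContextSketch, RegisterPlugin, and RegisterSketchParsers are hypothetical names, and the execution context is reduced to the one field the lookup needs. Only the idea of a static FindPlugin that takes an execution context comes from the message above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

enum class Language { C, Rust, Pascal };

// Hypothetical, heavily reduced execution context: only the language of the
// compile unit that contains the current frame.
struct ExecutionContextSketch {
    Language frame_language;
};

class ExpressionParserSketch {
public:
    using Factory = std::function<std::unique_ptr<ExpressionParserSketch>()>;

    virtual ~ExpressionParserSketch() = default;
    virtual std::string Name() const = 0;

    static void RegisterPlugin(Language lang, Factory factory) {
        Registry()[lang] = std::move(factory);
    }

    // Returns a parser for the frame's language, or nullptr if none is registered.
    static std::unique_ptr<ExpressionParserSketch>
    FindPlugin(const ExecutionContextSketch &exe_ctx) {
        auto it = Registry().find(exe_ctx.frame_language);
        if (it == Registry().end())
            return nullptr;
        return it->second();
    }

private:
    static std::map<Language, Factory> &Registry() {
        static std::map<Language, Factory> registry;
        return registry;
    }
};

class ClangParserSketch : public ExpressionParserSketch {
public:
    std::string Name() const override { return "clang"; }
};

class RustParserSketch : public ExpressionParserSketch {
public:
    std::string Name() const override { return "rust"; }
};

// Registers the two example parsers; Pascal is left without a parser.
void RegisterSketchParsers() {
    ExpressionParserSketch::RegisterPlugin(Language::C, [] {
        return std::unique_ptr<ExpressionParserSketch>(new ClangParserSketch());
    });
    ExpressionParserSketch::RegisterPlugin(Language::Rust, [] {
        return std::unique_ptr<ExpressionParserSketch>(new RustParserSketch());
    });
}
```

With this shape, the code that runs an expression never names a concrete parser; a frame in a language with no registered parser simply gets a null result and the caller can report that.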

Greg

That means creating your own expression parser that won't be language compliant, and is the whole reason we are generating clang AST types from DWARF: to avoid having to write an expression parser when the compiler is the best expression parser we could ever have.

Things we can do in our expression parser that other debuggers can't:
- create expression local variables ("int i = 12; int j = 23; i+j")
- use flow control (if/then/else, while, do/while, switch, etc)
- create types and use them in the expression ("struct foo { int x; float y; }; foo f = { 123, 2.3 }; f")
- much much more!

So the one great thing we can do is just re-compile clang and we get all the new language features (C++11, etc) for free. If we maintain a separate expression parser that isn't in the compiler, then you do all this work manually. We also use the clang expression parser to avoid running into errors when calling functions, as it is very complex to figure out which arguments go where in expressions. When we call a function in the JIT, we give clang a structure in memory that contains all arguments and let it call the function correctly. So there is way too much stuff we would need to re-code, and the hard fight to translate DWARF into compiler types ends up being well worth it in the long run.
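The argument-structure idea can be sketched as follows. This is only an illustration of the general technique, not the code LLDB generates: scale_and_add, CallArgs, and call_wrapper are hypothetical names, and the sketch runs in one process rather than writing the structure into a debugged program's memory.

```cpp
#include <cassert>

// Hypothetical target function that the user wants to call from an expression.
static double scale_and_add(int count, double factor, double offset) {
    return count * factor + offset;
}

// All arguments, plus a slot for the result, laid out in one structure that
// a debugger could write into the target's memory before the call.
struct CallArgs {
    int count;
    double factor;
    double offset;
    double result;
};

// Wrapper with a fixed, trivial signature. Because the compiler builds this
// wrapper, the compiler (not the debugger) decides which argument goes in
// which register or stack slot for the real call.
void call_wrapper(void *raw_args) {
    assert(raw_args != nullptr);
    CallArgs *args = static_cast<CallArgs *>(raw_args);
    args->result = scale_and_add(args->count, args->factor, args->offset);
}
```

The debugger then only has to call one function that takes a single pointer and read the result field back afterwards, whatever the real function's signature looks like.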

Greg

Hrmm true yes. Maybe there could be a way to generate clang expressions from code (not string) and accessing the "clang" type info tables the clang expressions have access to? So languages could start with (or stay with?) using clang structures as a fairly simple api for expression evaluators?