Writing my own debugger... use __builtin_frame_address or is there something better?

Hi LLVM list,

Thanks for having me here.

I'm writing my own debugger for my (secret) language...

I don't know anything about LLVM beyond the general "big picture"... I don't have any real practical experience working with it... beyond just using Xcode...

So... really the problem is... I'm generating some functions... by "compiling to C"... so my compiler just writes a plain ".cpp" text file. I've tried debugging the output in Xcode but it's a horrible experience. It's like a C++ developer having to step through ASM instead of C...

So I thought "it won't be too hard to make my own debugger"... "all I need is the ability to get the current variables off the current function... and probably the variables from the calling functions... as long as I can do THAT... I can do the rest myself".

Something like this:

int Func(Type1* self, Type2* P) {
    DB_GetStackPointer(1); // "DB_" means it's a function for debugging the code
                           // So we save the current stack pointer to a global variable
                           // and tell the debugger that we are on line 1 of the current file's source code.
    int N = 0;
    DB_Line(2); // Tell the debugger that we have advanced to line 2 of our source code.
    Type2* Curr = GetFirst(P);
    DB_Line(3);
    while (Curr) {
        DB_Line(4);
        Type3* Tmp = SubFunc(Curr, self, nil);
        if (!Tmp) {
            PrintError("Error");
            return 0;
        }
        DB_Line(5);
        N++;
        DB_Line(6);
        Curr = GetNext(Curr);
        DB_Line(7);
    }

    DB_Line(8);
    return N + 1;
}

All I need to do then... is implement DB_Line and DB_GetStackPointer. The idea is that DB_GetStackPointer will save the current stack pointer... and DB_Line will go to a func that lets me read the current variables off that stack pointer, and then send them via a socket/TCP-connection... to my debugger.

However... I've fooled around with __builtin_frame_address and... I can't figure out how to properly use it.

    int* FA1 = (int*)__builtin_frame_address(1);
    int P0_ = FA1[0];
    int P1_ = FA1[1];
    int P2_ = FA1[2];
    int P3_ = FA1[3];

Something like this... but NONE of these int variables contain the actual pointers stored in the calling function!

I'm looking for the value "Type2* Value = 0x25b1160"

But I just see values like this:

P0_ int 0xbffff758 0xbffff758
P1_ int 0x00008ec3 0x00008ec3
P2_ int 0x004040f8 0x004040f8
P3_ int 0x00426638 0x00426638

Any ideas?

Or is there some kind of built-in LLVM facility to help me write my own debugger? Something better than __builtin_frame_address?

I don't mind relying on LLVM... It doesn't need to be used outside of LLVM...

Finding the functions on the stack is easy. Finding which auto variables
are where is *much* harder.

Joerg
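
To illustrate the point about automatic variables, here is a minimal sketch (not from the thread) assuming GCC or Clang. `__builtin_frame_address(0)` returns the frame address of the current function, but where a local sits relative to that address is the compiler's choice, and an optimized local may live only in a register. On x86 builds that keep a frame pointer, the first words at a frame address are typically the saved frame pointer and the return address rather than locals, which would be consistent with values such as 0xbffff758 and 0x00008ec3 above. The helper name `OffsetOfLocalFromFrame` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the thread): returns the distance in bytes
// between one local variable and the frame address of its own function.
// Assumes GCC or Clang, which provide __builtin_frame_address.
__attribute__((noinline)) std::ptrdiff_t OffsetOfLocalFromFrame() {
    volatile int local = 42;  // volatile forces the variable into memory
    auto frame = reinterpret_cast<std::uintptr_t>(__builtin_frame_address(0));
    auto addr  = reinterpret_cast<std::uintptr_t>(&local);
    assert(frame != 0);  // level 0 is the current function's own frame
    // The result depends on target, compiler version, and optimization level.
    return static_cast<std::ptrdiff_t>(addr) - static_cast<std::ptrdiff_t>(frame);
}
```

Printing the result at different optimization levels, or on different targets, is a quick way to see that the offset is not something generated code can hard-wire. That is the gap debug info is meant to fill.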

That explains why I couldn't find anything.

I was hoping there might be an LLVM compiler switch to say "allocate these variables in a linear straightforward fashion"...

If there's no such switch then... what is considered the "right approach with LLVM" to creating your own debugger?

Or anything that helps with my original question.

Depends on what problem you are trying to solve. Emitting appropriate
#line data and naming the C variables as in your original language
would help improve the experience. If that is not good enough, I'm not
sure there is any alternative to emitting IR directly, which gives you
tighter control over the debug data that gets created.
Alternatively, you could try to hook into lldb.

Joerg
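
A minimal sketch of the #line suggestion, assuming the generator writes one `#line` directive ahead of the code it emits for a source construct. The file name "demo.src" and the functions `CountItems`, `ReportedLine` and `ReportedFileIsRemapped` are hypothetical. The directive is standard C and C++, so the compiler's debug info, and any debugger reading it, attributes the generated code to the original source file and line.

```cpp
#include <cassert>
#include <cstring>

// Pretend this file was generated from "demo.src" (a hypothetical name).
// After the directive, the compiler attributes the following lines to that
// file, starting at line 10.
#line 10 "demo.src"
int CountItems(int itemCount) {
    assert(itemCount >= 0);
    int n = 0;  // named as in the source language, not Tmp1 or similar
    for (int i = 0; i < itemCount; ++i) {
        n++;
    }
    return n + 1;
}

// __LINE__ and __FILE__ follow the remapping, so the effect is easy to check.
#line 20 "demo.src"
int ReportedLine() { return __LINE__; }
bool ReportedFileIsRemapped() { return std::strcmp(__FILE__, "demo.src") == 0; }
```

Compiled with `-g`, stepping through `CountItems` in LLDB or GDB should then show lines of demo.src instead of the generated ".cpp" file, provided the debugger can find that file.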

Is there an LLDB file that can help me figure out where my variables are located?

Like a DWARF file or something?

I've never used DWARF explicitly so I don't know if that's the right thing...

If I could parse a DWARF file... and get the stack pointer... could that be used to get the variables?

I would strongly suggest you read this:
http://llvm.org/docs/SourceLevelDebugging.html

This should give you an idea of what LLVM is expecting at the IR level and how everything would work there. Once you understand that, you can try to map it back to the original C code. I don't know if the appropriate builtins have been added to clang, but in principle*, that shouldn't be too hard to do.

I would recommend you take a small hand-written C or IR fragment and get that working in LLDB or GDB. Then try to map that back to what your source language needs to do.

* Note: Debug information is an area where 'in principle' and 'in practice' differ substantially. I don't know enough to give you any better guidance.

p.s. Don't underestimate how much information you can get by simply naming your generated C variables reasonably and providing good line info. This may be more than sufficient for most of your users.

p.p.s. Fair warning, debug info has not settled down yet. There are still substantial changes happening on a regular basis. This means that a) you'll need to adapt going forward, b) you'll really want to stay current with TOT (top of tree) to get fixes and c) any documentation you find may be out of date.

Philip

Doing this will significantly impact the performance of the register allocator and various mid-level optimisations. If you are willing to take this hit, you can do it relatively easily by generating a single C structure containing all of your local variables. You can also follow the approach taken in the 'garbage collection in uncooperative environments' paper and create a linked list of these structures that can be walked by your debugger as well as by the garbage collector (if you need one).

Your best bet, however, is to ditch the idea of generating C and generate LLVM IR, with the relevant debug info, directly. This will also allow you to use all of the existing infrastructure in lldb for debugging, simply requiring you to write language-specific parts.

David
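
A minimal sketch of the struct-of-locals idea described above, under assumptions of my own: every generated function keeps its locals in one struct and links a frame record into a global list that a debugger hook can walk. The names `DB_Frame`, `DB_Scope`, `DB_Top`, `DB_Depth` and `DB_DeepestSeen` are hypothetical; only `DB_Line` comes from the original post. A real `DB_Line` would check breakpoints and send the locals over the socket, where this one only records the line.

```cpp
#include <cassert>

// One record per active generated function.
struct DB_Frame {
    DB_Frame*   caller;    // record of the calling function, or nullptr
    const char* function;  // function name in the source language
    int         line;      // current source line, updated by DB_Line
    void*       locals;    // points at that function's struct of locals
};

DB_Frame* DB_Top = nullptr;  // innermost active frame
int DB_DeepestSeen = 0;      // demo bookkeeping only

// Walks the linked list the way a debugger hook would.
int DB_Depth() {
    int depth = 0;
    for (DB_Frame* f = DB_Top; f != nullptr; f = f->caller) depth++;
    return depth;
}

// RAII guard: links the frame on entry, unlinks it on every exit path.
struct DB_Scope {
    DB_Frame frame;
    DB_Scope(const char* function, void* locals)
        : frame{DB_Top, function, 0, locals} { DB_Top = &frame; }
    ~DB_Scope() { DB_Top = frame.caller; }
    DB_Scope(const DB_Scope&) = delete;
    DB_Scope& operator=(const DB_Scope&) = delete;
};

void DB_Line(int line) {
    assert(DB_Top != nullptr);
    DB_Top->line = line;
    int depth = DB_Depth();
    if (depth > DB_DeepestSeen) DB_DeepestSeen = depth;
}

// What the generator might emit for two small functions.
struct Count_Locals { int limit; int n; int i; };

int Count(int limit) {
    Count_Locals L = {limit, 0, 0};
    DB_Scope scope("Count", &L);
    DB_Line(1);
    for (L.i = 0; L.i < L.limit; L.i++) {
        DB_Line(2);
        L.n++;
    }
    DB_Line(3);
    return L.n + 1;
}

struct Outer_Locals { int result; };

int Outer() {
    Outer_Locals L = {0};
    DB_Scope scope("Outer", &L);
    DB_Line(1);
    L.result = Count(3);
    DB_Line(2);
    return L.result;
}
```

Because the address of each locals struct escapes into the list, the compiler has to keep those variables in memory. That is the register-allocation and optimisation cost mentioned above, so this layout probably only makes sense for debug builds.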

"So I thought “it won’t be too hard to make my own debugger”… "

From someone who has spent most of his career writing debuggers: “Errrr, no”.

I am currently working on adding debug information generation to the front end of an LLVM-based compiler, and that has taken over a year of full-time work so far (*) - and that’s just to generate the information for use by an existing debugger based on gdb. If you are planning on developing a new debugger from scratch, be prepared to devote several years of your life to it.

(*) though in large part that’s because I’ve had to implement all aspects of functions in the generated IR, as the LLVM DIBuilder API does not support inlined functions, and the front end inlines all calls.