Source file information.

Hi,

I am new to LLVM and need to find the line number and C++ source file name for each instruction in a .bc file. I suppose the LLVM debugger might have that feature, but I could not find any documentation on it. Could you please give me some help on how to do it?

Thanks,
::Saman Zonouz
University of Illinois

> From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu] On Behalf Of Saman Aliari Zonouz
> Sent: Thursday, July 09, 2009 11:44 AM
> To: llvmdev@cs.uiuc.edu
> Subject: [LLVMdev] Source file information.
>
> Hi,
>
> I am new to LLVM, and need to find the line number and cpp source file
> name for each instruction in a .bc file. I suppose llvm debugger might
> have that feature but there is no documentation on it. Would you please
> give me some help how to do it?

Compile the original .cpp file with clang's -g option.
The file and line are maintained in SDNodes, in the DebugLoc field.

- Sanjiv

Can you also get this information in LLVM?

And what about with llvm-gcc?

Many thanks in advance,

Aaron

> Can you also get this information in LLVM?

See Analysis/DebugInfo.h, and opt -print-dbginfo for an example of how
to use it.

> And what about with llvm-gcc?

Yes, if compiled with -g.

Best regards,
--Edwin

Thanks Edwin.

I will look into this when I get some spare time.

Aaron

Thanks for your reply. Is it not possible to do this with llvm-g++ -g?
Furthermore, where are the SDNode and DebugLoc fields stored? Are they in a file that I have to parse myself? If so, is there a library I can use to get the file/line information for each instruction? I am writing a pass for the opt tool that manipulates the call graph, and I want to get the line number information in the runOnModule() function.

Thanks,
::Saman

> Thanks for your reply. Is it not possible to do with llvm-g++ -g?

Yes.

> and furthermore, where are SDNode and DebugLoc fields stored?

They are probably classes in the Clang API.

> are they in a file which I have to parse myself? if so, is there any way that I use a library to get the file/line information for each instruction?

Yes, the Clang API:

http://clang.llvm.org/doxygen/classes.html
http://llvm.org/viewvc/llvm-project/cfe/trunk/

> since, I am writing a pass for opt tool that manipulates the callgraph and want to get the line number information in runOnModule() function.

That's LLVM. As Edwin says, “See Analysis/DebugInfo.h, and opt -print-dbginfo for an example of how
to use it.”

http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/DebugInfo.h?revision=74920&view=markup
http://llvm.org/doxygen/classes.html#letter_D

Anyway, there are some pointers.

Aaron

> Thanks,
> ::Saman

Dear All,

To add to this, what you want to do is find the appropriate debug stop
point intrinsic and then use it to look up the information for that
instruction.

Here is some sample code from SAFECode that finds the debug information
associated with a CallInst (LLVM call instruction) held in the variable
CI. It uses the findStopPoint() function in llvm/Analysis/DebugInfo.h:

  //
  // Get the line number and source file information for the call.
  // Both values stay null if no stop point is found for CI.
  //
  const DbgStopPointInst * StopPt = findStopPoint (CI);
  Value * LineNumber = 0;
  Value * SourceFile = 0;
  if (StopPt) {
    LineNumber = StopPt->getLineValue();
    SourceFile = StopPt->getFileName();
  }

-- John T.
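
The idea behind the lookup above can be shown without LLVM at all. What follows is a minimal sketch, assuming a stop point marks the source location for the instructions that come after it in the same block. `ToyInst` and `findToyStopPoint` are hypothetical names invented for this illustration; they are not LLVM classes or functions, and the real findStopPoint() works on LLVM IR rather than on a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR instruction. A stop point carries a
// source location; every other instruction has no location of its own.
struct ToyInst {
    std::string text;
    bool isStopPoint;
    unsigned line;     // meaningful only when isStopPoint is true
    std::string file;  // meaningful only when isStopPoint is true
};

// Scan backwards from position idx (inclusive) for the closest stop point
// in the block. Returns nullptr if no stop point precedes the instruction.
const ToyInst *findToyStopPoint(const std::vector<ToyInst> &block,
                                std::size_t idx) {
    assert(idx < block.size() && "index must refer to an instruction");
    for (std::size_t i = idx + 1; i-- > 0;) {
        if (block[i].isStopPoint)
            return &block[i];
    }
    return nullptr;
}
```

Called on the index of a call instruction, the function returns the stop point that governs it, from which the line and file can be read, or a null pointer when the block was built without debug information.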


Hi John,

What I am after is to be able to emit line number information for COFF (Common Object File Format) object module files. Basically, it comes down to pairing line numbers with virtual address offsets.

I have not really set out to look at this yet, just feeling ahead, and was prompted by Saman's question to have a look.

So any pointers or help are most welcome,

Aaron

So you are digging into the code generator issues. That's beyond my ken.

The original question seemed to be interested in getting debug
information corresponding to an LLVM instruction at the LLVM IR level.
That's what the code above does. In your case, you need additional
information about what the code generator is doing. Unfortunately, I
can't help you with that, but hopefully someone else can.

Sorry.

-- John T.

Okay :-)

What I need to find or write is something that maps MachineBasicBlocks to line numbers.

Aaron

Every MachineInstruction holds a DebugLoc accessible through
MachineInstr::getDebugLoc().
http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/MachineInstr.h?view=markup

Then MachineFunction::getDebugLocTuple(DebugLoc)
(http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/MachineFunction.h?view=markup)
will transform that into a DebugLocTuple
(http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/DebugLoc.h?view=markup)
which gives you the CompileUnit, line number, and column number. Use
DebugInfo.h:DICompileUnit to interpret the CompileUnit.
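
One way to picture the handle-to-tuple step described above is a small side table. The sketch below is a toy model under stated assumptions: `ToyDebugLoc`, `ToyLocTuple`, and `ToyLocTable` are hypothetical names, the compile unit is reduced to a file name string, and nothing here is LLVM's actual implementation. It only illustrates why an instruction can carry a compact handle while the full (compile unit, line, column) record lives elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical full record: compile unit (here just a file name), line, column.
struct ToyLocTuple {
    std::string compileUnit;
    unsigned line;
    unsigned col;
};

inline bool operator==(const ToyLocTuple &a, const ToyLocTuple &b) {
    return a.compileUnit == b.compileUnit && a.line == b.line && a.col == b.col;
}

// Hypothetical handle stored on each instruction: nothing but an index.
struct ToyDebugLoc {
    std::size_t index;
};

// Hypothetical per-function table that owns the tuples.
class ToyLocTable {
public:
    // Return the handle for t, adding t to the table on first use.
    ToyDebugLoc getOrCreate(const ToyLocTuple &t) {
        for (std::size_t i = 0; i < tuples_.size(); ++i) {
            if (tuples_[i] == t)
                return ToyDebugLoc{i};
        }
        tuples_.push_back(t);
        return ToyDebugLoc{tuples_.size() - 1};
    }

    // Expand a handle back into the full tuple.
    const ToyLocTuple &getTuple(ToyDebugLoc loc) const {
        assert(loc.index < tuples_.size() && "handle not from this table");
        return tuples_[loc.index];
    }

private:
    std::vector<ToyLocTuple> tuples_;
};
```

In this model, two instructions from the same source position share one handle, and expanding a handle is a single indexed lookup. The linear search in `getOrCreate` is kept for brevity; a real table would likely use a hash map.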

I've been looking at the AsmPrinter to figure out how debug info works
so I can implement it in the JITEmitter (for oprofile). The AsmPrinter
inserts a label anywhere the DebugLoc changes
(AsmPrinter::processDebugLoc). The JITEmitter has the option of using
getCurrentPCValue() to record line boundaries, so I don't know whether
recording addresses directly or also using labels for the JIT will be
better.
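
The "record a boundary wherever the location changes" approach, which also yields the line number and address pairs asked about earlier in the thread, can be sketched in a few lines. This is a standalone illustration, not AsmPrinter or JITEmitter code: `ToyMachineInstr`, `LineEntry`, and `buildLineTable` are hypothetical names, addresses are assumed to be already known, and line 0 is assumed to mean "no location".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical machine instruction: its address and source line (0 = unknown).
struct ToyMachineInstr {
    std::uint64_t address;
    unsigned line;
};

// One row of the resulting table: code from this address onwards
// belongs to this source line, until the next row.
struct LineEntry {
    std::uint64_t address;
    unsigned line;
};

// Walk the instructions in address order and emit a row each time the
// source line differs from the previously recorded one.
std::vector<LineEntry> buildLineTable(const std::vector<ToyMachineInstr> &code) {
    std::vector<LineEntry> table;
    unsigned prevLine = 0;
    for (const ToyMachineInstr &mi : code) {
        if (mi.line == 0)
            continue;  // no location: stays attributed to the previous row
        if (mi.line != prevLine) {
            assert(table.empty() || table.back().address <= mi.address);
            table.push_back(LineEntry{mi.address, mi.line});
            prevLine = mi.line;
        }
    }
    return table;
}
```

Because a row is written only on a change, runs of instructions from the same line collapse into a single entry, and a line that reappears later (for example after scheduling) gets a second entry at its new address.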

As mentioned earlier in the thread, each MachineInstruction has
DebugLoc to record line number info.

> > Thanks for your reply. Is it not possible to do with llvm-g++ -g?
>
> Yes
>
> > and furthermore, where are SDNode and DebugLoc fields stored?
>
> They are probably classes in the Clang API

Nope. SDNode is part of the LLVM code generator. DebugLoc is defined in a
standalone LLVM support header (see
llvm/include/llvm/Support/DebugLoc.h).

> > are they in a file which I have to parse myself? if so, is there any way
> > that I use a library to get the file/line information for each instruction?
>
> Yes Clang API :-

The Clang API is the wrong answer here for Saman's needs.

> http://clang.llvm.org/doxygen/classes.html
> http://llvm.org/viewvc/llvm-project/cfe/trunk/
>
> > since, I am writing a pass for opt tool that manipulates the callgraph and
> > want to get the line number information in runOnModule() function.

Saman, for now use findStopPoint() as suggested by John in the thread
for your needs. Just curious, why does your pass need line number info
while manipulating the call graph?

Ick. So line number information is only available at debug stop points?
That's bad when compiling optimized code since the debug stops kill
optimization, don't they?

We have our own privately-maintained way of tracking line information even
in optimized code but it's a big hack and not something suitable for
pushing upstream.

I thought I saw something on the list recently about someone working on
integrating line number information directly into LLVM Instructions, SDNodes,
MachineInstructions, etc. Am I mis-remembering? If not, what's the status
of that? We'd like to switch over to a proper scheme if we can.

                              -Dave

> > Dear All,
> >
> > To add to this, what you want to do is find the appropriate debug stop
> > point intrinsic and then use it to look up the information for that
> > instruction.
>
> Ick. So line number information is only available at debug stop points?
> That's bad when compiling optimized code since the debug stops kill
> optimization, don't they?

Optimizers explicitly ignore the debug intrinsics.

> I thought I saw something on the list recently about someone working on
> integrating line number information directly into LLVM Instructions, SDNodes,
> MachineInstructions, etc. Am I mis-remembering? If not, what's the status
> of that? We'd like to switch over to a proper scheme if we can.

I'll let Devang describe the plan,

-Chris

Hi David,

> > Dear All,
> >
> > To add to this, what you want to do is find the appropriate debug stop
> > point intrinsic and then use it to look up the information for that
> > instruction.
>
> Ick. So line number information is only available at debug stop points?
> That's bad when compiling optimized code since the debug stops kill
> optimization, don't they?

We worked hard to avoid this. Now if a debug stop point threatens to
kill optimization then the llvm optimizer will kill the debug stop
point itself! We have updated the optimizer in all cases we could find.
(Try running nightly tester using TEST=dbgopt and TEST=ipodbgopt and
report failures.) However this means, in optimized code you may lose
location info here and there.
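
To make the trade-off above concrete, here is a toy model of one such case. It is a sketch under stated assumptions, not code from any LLVM pass: `OptInst`, `realSize`, and `stripIfOnlyDebug` are hypothetical names, and the block is assumed to end in a terminator that is never itself a stop point. A block that contains nothing but stop points and its terminator is treated as empty, and the stop points are deleted so the block can be folded away, which is exactly where location info gets lost.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical instruction: either a debug stop point or a real instruction.
struct OptInst {
    std::string text;
    bool isStopPoint;
};

// Count only real instructions; stop points are ignored by the "optimizer".
std::size_t realSize(const std::vector<OptInst> &block) {
    return static_cast<std::size_t>(
        std::count_if(block.begin(), block.end(),
                      [](const OptInst &i) { return !i.isStopPoint; }));
}

// If the terminator is the only real instruction, erase the stop points so
// the block can be folded into its successor. Returns true when that happened.
bool stripIfOnlyDebug(std::vector<OptInst> &block) {
    assert(!block.empty() && "a block must end in a terminator");
    if (realSize(block) != 1)
        return false;  // real work remains; keep the stop points
    block.erase(std::remove_if(block.begin(), block.end(),
                               [](const OptInst &i) { return i.isStopPoint; }),
                block.end());
    return true;
}
```

A block with real instructions keeps its stop points untouched, so only the locations attached to code that was optimized away disappear.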

> We have our own privately-maintained way of tracking line information even
> in optimized code but it's a big hack and not something suitable for
> pushing upstream.

Would it be possible to describe the work here?

> I thought I saw something on the list recently about someone working on
> integrating line number information directly into LLVM Instructions, SDNodes,
> MachineInstructions, etc. Am I mis-remembering? If not, what's the status
> of that? We'd like to switch over to a proper scheme if we can.

Now, each MI has line number info. The work is in progress to put
debug info into LLVM instructions. The first step is to move away from
all those pesky llvm.dbg.* GVs and use metadata.

> Ick. So line number information is only available at debug stop points?
> That's bad when compiling optimized code since the debug stops kill
> optimization, don't they?

> We worked hard to avoid this. Now if a debug stop point threatens to
> kill optimization then the llvm optimizer will kill the debug stop
> point itself! We have updated the optimizer in all cases we could find.
> (Try running nightly tester using TEST=dbgopt and TEST=ipodbgopt and
> report failures.) However this means, in optimized code you may lose
> location info here and there.

Great! Then this should work for us.

> We have our own privately-maintained way of tracking line information
> even in optimized code but it's a big hack and not something suitable for
> pushing upstream.

> Would it be possible to describe the work here?

To avoid altering every single constructor signature in LLVM, we maintain a
global variable with the current line number information and have the
constructors look at that when creating instructions during translation
to LLVM.

Yucky. :-/
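
For readers unfamiliar with the pattern, the global-variable approach described above looks roughly like this. It is a minimal sketch, not the private implementation mentioned in the message: `SrcLoc`, `g_currentLoc`, and `TrackedInst` are hypothetical names, and the `LocScope` RAII guard is an addition of this sketch to keep the global from going stale.

```cpp
#include <cassert>
#include <string>

// Hypothetical source location record.
struct SrcLoc {
    std::string file;
    unsigned line;
};

// The hack: one global holding the location currently being translated.
static SrcLoc g_currentLoc = {"", 0};

// RAII guard (an addition in this sketch): sets the global for the duration
// of a scope and restores the previous value afterwards.
class LocScope {
public:
    explicit LocScope(const SrcLoc &loc) : saved_(g_currentLoc) {
        g_currentLoc = loc;
    }
    ~LocScope() { g_currentLoc = saved_; }
    LocScope(const LocScope &) = delete;
    LocScope &operator=(const LocScope &) = delete;

private:
    SrcLoc saved_;
};

// Hypothetical instruction: its constructor signature carries no location,
// it simply reads the global at construction time.
struct TrackedInst {
    std::string opcode;
    SrcLoc loc;
    explicit TrackedInst(const std::string &op)
        : opcode(op), loc(g_currentLoc) {}
};
```

The appeal is that no constructor signature has to change. The cost is hidden state: anything that creates instructions outside the translation loop silently picks up whatever location happens to be current, and the scheme is not thread-safe as written.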

> Now, each MI has line number info. The work is in progress to put
> debug info into LLVM instructions. The first step is to move away from
> all those pesky llvm.dbg.* GVs and use metadata.

Perfect. This should work great for us. Thanks for the explanation.

                              -Dave

Thanks a lot for the helpful info.
Devang, I need line number information for a project in which we want to find a feasible path through an application based on a given bug signature. The signature includes line number info for the frames on the stack.

One question regarding the mailing list: I have subscribed to the list but still don’t receive the mails. Could someone please let me know why?

Thanks a lot,
::Saman

Hi,

Thanks John, it solved my problem.
In particular, I need to find the file/line info from the .bc file for every location where a specific Function is called. Is there a better way than searching all the instructions in runOnModule() during a pass?

Thanks a lot,
::Saman