Can I establish a relation between a SCoP in LLVM-IR and the corresponding line number in C-code?

Hi all,

I am using LLVM with Clang as the front end. I use Clang to translate a C file into LLVM-IR. Now I'm wondering whether I can establish a relation between the SCoPs in this IR and the C code. In other words, given a SCoP (e.g. a loop nest) in LLVM-IR, I want its line number in the C code.

Is this possible, and if so, how do I obtain this?

Cheers,
Pieter

Hi Pieter,

It is not supported out of the box, but you may try to use the debugging information to recover it. Keep in mind, however, that debugging information is best effort and may not be available in all cases. If you really need accurate feedback based on the C code, a source analysis tool like pet [1] may better fit your needs. pet uses Clang and isl to extract a polyhedral model directly from the source code.

Tobi

[1] Public Git Hosting - pet.git/summary
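
As a rough illustration of the debug-information route, the sketch below scans textual LLVM-IR (of the kind `clang -g -S -emit-llvm` emits) and collects, per basic block, the range of source lines named by its `!dbg` attachments. The IR excerpt, the function name `block_line_ranges`, and the regular expressions are assumptions made for this example; they are not part of LLVM or Polly. Inside an actual pass, the same information is reachable through `Instruction::getDebugLoc()`.

```python
import re

# Hypothetical, heavily trimmed IR excerpt; real output contains far more.
SAMPLE_IR = """\
for.body:
  %0 = load i32, ptr %i, align 4, !dbg !20
  %add = add nsw i32 %0, 1, !dbg !21
  br label %for.inc, !dbg !22

!20 = !DILocation(line: 7, column: 12, scope: !15)
!21 = !DILocation(line: 8, column: 5, scope: !15)
!22 = !DILocation(line: 7, column: 3, scope: !15)
"""

DBG_USE = re.compile(r"!dbg !(\d+)")
DBG_DEF = re.compile(r"^!(\d+) = !DILocation\(line: (\d+)", re.MULTILINE)
LABEL = re.compile(r"^([A-Za-z0-9_.$-]+):")


def block_line_ranges(ir_text):
    """Return {basic block label: (lowest source line, highest source line)}."""
    # Metadata node id -> source line, taken from the !DILocation definitions.
    line_of = {int(n): int(line) for n, line in DBG_DEF.findall(ir_text)}
    ranges = {}
    current = None
    for text in ir_text.splitlines():
        label = LABEL.match(text)
        if label:
            current = label.group(1)
            continue
        use = DBG_USE.search(text)
        if current is None or use is None:
            continue
        src = line_of.get(int(use.group(1)))
        if src is None:
            continue  # debug info is best effort; skip unresolved locations
        lo, hi = ranges.get(current, (src, src))
        ranges[current] = (min(lo, src), max(hi, src))
    return ranges


print(block_line_ranges(SAMPLE_IR))
```

A SCoP spans several basic blocks, so its source extent would be the union of the ranges of its blocks. Blocks without any `!dbg` attachment simply do not appear in the result, which is the "may not be available in all cases" caveat in practice.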

I now see two possibilities:

  1. pet would be okay to use, but then I lose the ability to transform loops into a canonical form (I currently use an LLVM pass for that). Right?

  2. I am looking into using the Clang APIs from a C program to construct the AST and try ‘Mapping between cursors and source code’ [2].

Cheers, Pieter

[2] http://clang.llvm.org/doxygen/group__CINDEX__CURSOR__SOURCE.html

I now see two possibilities:

1) pet would be okay to use, but then I lose the ability to transform
loops into a canonical form (I currently use an LLVM pass for that). Right?

Yes. This is the drawback of source-to-source tools. Canonicalization and induction variable analysis are a lot simpler on LLVM-IR, especially if you want to optimize through pointers, C++ iterators and C++ templates. Still, for direct user feedback, working on the source code is a valuable option.
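
To make "canonical form" concrete, here is a minimal sketch, assuming a loop of the shape `for (i = lower; i < upper; i += step)` with a positive constant step. It rewrites the loop as a counter `k` running from 0 in steps of 1 and keeps the mapping back to the original variable `i`. The helper name `canonicalize` is made up for this example; it does not reflect how LLVM or pet implement canonicalization.

```python
def canonicalize(lower, upper, step):
    """Describe `for (i = lower; i < upper; i += step)` as
    `for (k = 0; k < trip_count; k++)` with i = lower + step * k."""
    if step <= 0:
        raise ValueError("this sketch only handles positive steps")
    # Number of iterations, rounded up, and never negative.
    trip_count = max(0, (upper - lower + step - 1) // step)
    return trip_count, lambda k: lower + step * k


trip_count, original_iv = canonicalize(3, 20, 4)
values = [original_iv(k) for k in range(trip_count)]
print(trip_count, values)
```

The returned mapping is what ties the canonical counter back to the variable the programmer wrote, which is the relationship a source-level tool wants to preserve.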

2) I am looking into using the Clang APIs from a C program to construct the AST
and try 'Mapping between cursors and source code' [2].

That is basically what pet does (though using the C++ API). It should allow you to map every statement and expression directly to the relevant source line. It can even highlight which part of the source code prevents a loop from being represented in a polyhedral model.

Cheers
Tobi

No. pet tries very hard not to canonicalize loops in order
to maintain a direct relationship between the original loop
variables and the elements of the iteration domains, but sometimes
it is forced to do so anyway. If you really need your loops to
be in some canonical form, then it shouldn't be too hard
to add an option to pet to always canonicalize.

Now, there may be other reasons why you may prefer
working on IR over working at the source level, but loop
normalization should not be one of them.

skimo

Well, it's not that I really need them in a canonical form. That is needed for Polly to output information in polyhedral form. What I need is polyhedral information about the loops and loop nests in the source code in order to analyze them. The analyzed information must stay related to the source code, because it is the input to a tool that performs source-to-source compilation on that code. Can I use pet for this?

Pieter

Sure. If you want to do source-to-source compilation, pet is probably the tool you want to use.

Tobi

If you're doing source-to-source, then pet probably is the best
choice at this moment. At least that's what we argue in
http://impact.gforge.inria.fr/impact2012/workshop_IMPACT/verdoolaege.pdf

If you want to talk more about pet, then we should probably
take the discussion off cfe-dev.

skimo