How to break/iterate over nested instructions.

Dear All,

I wish to iterate over all instructions (where opCode == desired_opCode). I can iterate over all the instructions except the nested instructions like:

%add.ptr229 = getelementptr inbounds i8* getelementptr inbounds ([4096 x i8]* @_Func1, i32 0, i32 0), i64 %idx.ext228

I wish to break this nested instruction into two instructions. Please let me know if there is an existing method in LLVM to do the job.

Thanks,
Manish

Hi Manish,

I wish to iterate over all instructions (where opCode == desired_opCode). I
can iterate over all the instructions except the nested instructions like:

  %add.ptr229 = getelementptr inbounds i8* getelementptr inbounds ([4096 x i8]*
@_Func1, i32 0, i32 0), i64 %idx.ext228

this is not a nested instruction. The inner getelementptr is a ConstantExpr
(a constant) not an instruction.

I wish to break this nested instruction in two instructions.

Why?

Ciao, Duncan.


Hi Duncan,

Why break it? I wish to analyse all the operands of getelementptr instructions. To do that, I iterate over the code and pick out the instructions of interest by checking that the opcode is getelementptr. But getelementptrs of the form shown in my last mail escape my pass, so I wish to break them up. I have a workaround if breaking is not possible, but I think this may be a common requirement for other passes too.

Thanks!
Manish

Manish Gupta wrote:

Hi Duncan,

Why to break?
I wish to analyse all the operands of getelementptr instructions.

The outer one is a getelementptr instruction. The inner one is not, as it is a constant expression, not an instruction. Why do you want to visit constant expressions? Constant expressions are known to be constant through the life of the program, but aren't known *yet* at compile time, for example the address of a function.

If you really want to visit instructions and constant expressions, you should:
   1. Iterate over the BasicBlock to find all Instructions, cast them to llvm::Operator* and visit them.
   2. The visitor receives Operator* O and checks O->getOpcode() to detect GEP instructions *and* GEP constant expressions. Do your work on them.
   3. The visitor then iterates over O's operands, dyn_casts each one to Operator*, and visits it if the cast succeeds.

But before you do that work, ask yourself whether visiting something that's fundamentally an integer number (merely one whose concrete value isn't known yet) is what your pass needs to do. Thus far, nothing in LLVM has needed to break apart constant expressions into instructions.
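The three numbered steps above can be sketched as follows. This is only a toy model in standard C++, not LLVM code: `Value`, `Kind`, `Opcode`, `isOperator`, `collectGEPs` and `gepNamesInExample` are hypothetical stand-ins invented for illustration. In a real pass, the role of `isOperator` would be played by dyn_cast to llvm::Operator*, and the opcode test by O->getOpcode().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for LLVM's value hierarchy (hypothetical, not the real classes).
enum class Kind { Instruction, ConstantExpr, Other };
enum class Opcode { GetElementPtr, Other };

struct Value {
    Kind kind;
    Opcode opcode;
    std::string name;
    std::vector<const Value*> operands;
};

// Plays the role of a successful dyn_cast to Operator*: only instructions
// and constant expressions carry an opcode and operands worth visiting.
bool isOperator(const Value& v) {
    return v.kind == Kind::Instruction || v.kind == Kind::ConstantExpr;
}

// Steps 2 and 3: record the value if it is a GEP (instruction or constant
// expression), then recurse into its operands.
void collectGEPs(const Value& v, std::vector<const Value*>& out) {
    if (!isOperator(v)) return;
    if (v.opcode == Opcode::GetElementPtr) out.push_back(&v);
    for (const Value* op : v.operands) {
        assert(op != nullptr);
        collectGEPs(*op, out);
    }
}

// Models the instruction from the thread: a GEP instruction whose first
// operand is a GEP constant expression. The leaves are treated as opaque.
std::vector<std::string> gepNamesInExample() {
    Value global{Kind::Other, Opcode::Other, "@_Func1", {}};
    Value zero{Kind::Other, Opcode::Other, "i32 0", {}};
    Value idx{Kind::Other, Opcode::Other, "%idx.ext228", {}};
    Value inner{Kind::ConstantExpr, Opcode::GetElementPtr,
                "inner constant GEP", {&global, &zero, &zero}};
    Value outer{Kind::Instruction, Opcode::GetElementPtr,
                "%add.ptr229", {&inner, &idx}};

    std::vector<const Value*> found;
    collectGEPs(outer, found);  // step 1 would call this for every instruction

    std::vector<std::string> names;
    for (const Value* v : found) names.push_back(v->name);
    return names;
}
```

For the modelled instruction, `gepNamesInExample()` reports the outer instruction first and the nested constant expression second, so an analysis written this way sees both without anything being rewritten.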

Nick


Why to break?

Hi Manish,

I wish to analyse all the operands of getelementptr instructions.

why? Also, these nested geps are *not* getelementptr instructions, they are
getelementptr constant expressions. To be sure of finding them all you will
need to recursively visit all constants, examining their operands if they have
any. But since they are computed at build time, not when the program runs
(they are constants), maybe they don't matter for you?

Ciao, Duncan.


Dear Manish,

First, in answer to your original question: yes, there is a pass that will convert constant expression GEPs used within LLVM instructions into GEP instructions. SAFECode has a pass called BreakConstantGEPs in safecode/trunk/lib/ArrayBoundChecks/BreakConstantGEPs.cpp. It works with LLVM 2.7; updating it to LLVM mainline should be trivial.

Second, to reiterate what others have said, you need to determine whether doing this conversion is a good idea. Converting these constant expressions into instructions may very well hurt performance. Blindly doing the conversion is really a quick hack (SAFECode for LLVM 2.7 employed this hack more often than it should have; its performance was probably hurt as a result). If all you’re writing is an analysis pass, then you should make your analysis smart enough to analyze GEP constant expressions as Nick advised.

Third, for the curious, SAFECode had to convert constant expression GEPs into GEP instructions due to its pointer rewriting feature. Basically, I needed to perform a run-time check on the constant expression GEP to make sure it was in bounds; if it was out-of-bounds, the check had to return a rewritten pointer value that would cause a fault on dereference (this is how SAFECode permits out-of-bound pointers that are never dereferenced; see the paper by Ruwase and Lam for more details). SAFECode converted all constant expression GEPs into GEP instructions. It really should have converted only those that went out of bounds (which were relatively few in real programs). That’s something I’ll have to fix when I reintroduce BreakConstantGEPs into mainline SAFECode. :)

-- John T.
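The conversion described in this message can be illustrated with a small toy model. It is a sketch in standard C++ under stated assumptions and is not the SAFECode pass: `Value`, `BasicBlock`, `breakConstantGEPs` and `brokenExampleOrder` are hypothetical names, and the generated `%gep.N` names are invented for illustration. The idea shown is only the general one: each GEP constant expression used as an operand is replaced by a new GEP instruction inserted just before its user.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy value: either an instruction or a non-instruction (constant, etc.).
struct Value {
    bool isInstruction;
    bool isGEP;
    std::string name;
    std::vector<Value*> operands;
};

struct BasicBlock {
    std::vector<Value*> instructions;           // program order
    std::vector<std::unique_ptr<Value>> owned;  // storage for new instructions
};

// Replaces every constant-expression GEP operand with a new GEP instruction
// placed immediately before the user. Returns how many were created.
std::size_t breakConstantGEPs(BasicBlock& bb) {
    std::size_t created = 0;
    std::size_t i = 0;
    while (i < bb.instructions.size()) {
        Value* user = bb.instructions[i];
        std::size_t inserted = 0;
        for (Value*& op : user->operands) {
            assert(op != nullptr);
            if (op->isInstruction || !op->isGEP) continue;
            bb.owned.push_back(std::make_unique<Value>(Value{
                true, true, "%gep." + std::to_string(created), op->operands}));
            Value* inst = bb.owned.back().get();
            auto pos = bb.instructions.begin() +
                       static_cast<std::ptrdiff_t>(i + inserted);
            bb.instructions.insert(pos, inst);
            op = inst;  // the user now refers to the new instruction
            ++inserted;
            ++created;
        }
        // If anything was inserted, position i now holds the first new
        // instruction; revisit it so nested constant GEPs are broken too.
        if (inserted == 0) ++i;
    }
    return created;
}

// Applies the rewrite to a model of the instruction from the thread and
// returns the resulting instruction names in program order.
std::vector<std::string> brokenExampleOrder() {
    Value global{false, false, "@_Func1", {}};
    Value zero{false, false, "i32 0", {}};
    Value idx{false, false, "%idx.ext228", {}};
    Value inner{false, true, "constant GEP", {&global, &zero, &zero}};
    Value outer{true, true, "%add.ptr229", {&inner, &idx}};

    BasicBlock bb;
    bb.instructions.push_back(&outer);
    breakConstantGEPs(bb);

    std::vector<std::string> names;
    for (const Value* v : bb.instructions) names.push_back(v->name);
    return names;
}
```

In this model the single instruction from the thread becomes two: a new `%gep.0` followed by `%add.ptr229`, which now uses it. Note that the sketch creates one new instruction per use, so a constant shared by several users is duplicated, which is one way such a blanket conversion can add run-time work.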

Hey all,

Thanks all for such comprehensive feedback, and apologies for the delayed thanks :) Every time I post on this mailing list I learn so much more about LLVM.

Thanks again!!
Manish