The GEP formats when generating IR.

Hi everyone,

I am learning LLVM IR from the official documentation, in particular the GEP instruction.

Now I have a question about the GEP instructions generated by LLVM.

As the example in the documentation says, for the function "foo" in the following code:

struct RT {
  char A;
  int B[10][20];
  char C;
};

struct ST {
  int X;
  double Y;
  struct RT Z;
};

int *foo(struct ST *s) {
  return &s[1].Z.B[5][13];
}

the IR could be:

define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
  %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
  ret i32* %arrayidx
}

or:

define i32* @foo(%struct.ST* %s) {
  %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
  %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
  %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
  %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
  %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
  ret i32* %t5
}

I wonder when LLVM will generate the former one, and when it will generate the latter one?

Thank you very much!


Hi Shulin,

Sorry for the late reply. The quick answer is: the first one is generated when optimizations are on, and the second one when they are off. What follows peels off the levels of detail in case you want to know more.

> I wonder when the llvm will generate the former one, and when it will generate the latter one?

“llvm” is a very vague term here. We should define what high-level component we’re talking about.

I guess you mean Clang, i.e., the C/C++ front-end, whose job we can assume ends with the generation of LLVM IR. Then comes the “middle-end”, which does a bunch of, let’s say, target-independent optimizations on this LLVM IR, and then comes the back-end, which does a bunch more target-dependent optimizations and finally generates assembly.

Clang generally generates the simplest IR possible. So, Clang will generate the second version because it is simple, and it goes something like this. The expression &s[1].Z.B[5][13] is actually broken into many small expressions in the Abstract Syntax Tree (AST). Every subscript or member access is basically one expression. When Clang generates code, it emits one getelementptr (GEP) for every sub-expression, which is used as input to the next expression, and that’s about it. So, Clang doesn’t go one step further to understand all these small sub-expressions as one big expression and create only one GEP.
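
To make the sub-expression idea concrete, here is a rough sketch in C (not what Clang literally does internally). The function foo_stepwise is a hypothetical name I made up; it spells out one pointer per sub-expression, mirroring %t1 to %t5 in the second IR version, and should return the same address as foo.

```c
struct RT {
  char A;
  int B[10][20];
  char C;
};

struct ST {
  int X;
  double Y;
  struct RT Z;
};

/* The original function: the whole address is one C expression. */
int *foo(struct ST *s) {
  return &s[1].Z.B[5][13];
}

/* Hypothetical hand-written equivalent: one pointer per sub-expression,
   in the same order as %t1..%t5 in the multi-GEP IR. */
int *foo_stepwise(struct ST *s) {
  struct ST *t1 = &s[1];        /* s[1]  : step to the second struct ST   */
  struct RT *t2 = &t1->Z;       /* .Z    : field 2 of struct ST           */
  int (*t3)[10][20] = &t2->B;   /* .B    : field 1 of struct RT           */
  int (*t4)[20] = &(*t3)[5];    /* [5]   : row 5 of the 10 x 20 array     */
  int *t5 = &(*t4)[13];         /* [13]  : element 13 of that row         */
  return t5;
}
```

Each local pointer plays the role of one GEP result, and each one feeds the next step, just like in the unoptimized IR.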

Now, when optimizations are turned on, all these GEPs are folded into one [1].
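
The reason the folding is legal is that every index here is a compile-time constant, so the whole chain is just "base pointer plus one fixed byte offset". Below is a small sketch in C of that arithmetic; folded_offset and foo_single_offset are hypothetical names for illustration only, and the actual offset value depends on the target's layout rules, so I am not quoting a number.

```c
#include <stddef.h>

struct RT {
  char A;
  int B[10][20];
  char C;
};

struct ST {
  int X;
  double Y;
  struct RT Z;
};

/* Sum of the byte offsets contributed by each index of the single GEP:
   i64 1, i32 2, i32 1, i64 5, i64 13. */
size_t folded_offset(void) {
  return 1 * sizeof(struct ST)      /* i64 1  : skip one whole struct ST */
       + offsetof(struct ST, Z)     /* i32 2  : field Z inside struct ST */
       + offsetof(struct RT, B)     /* i32 1  : field B inside struct RT */
       + 5 * sizeof(int[20])        /* i64 5  : skip five rows of B      */
       + 13 * sizeof(int);          /* i64 13 : skip thirteen ints       */
}

/* Hypothetical equivalent of foo that applies the offset in one step. */
int *foo_single_offset(struct ST *s) {
  return (int *)((char *)s + folded_offset());
}
```

Calling foo_single_offset on an array of at least two struct ST should give the same pointer as &s[1].Z.B[5][13], which is why one GEP with five indices and five chained GEPs describe the same address.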

You can go one step further and find which transformation does the job by using opt, the tool for experimenting with optimizations. Just copy and paste the LLVM IR you got from Clang back into Godbolt, add -O1 and add the argument -print-changed=quiet [2]. This shows you the LLVM IR only after the passes that actually changed something. You can see that the real work is done by Instruction Combining (instcombine). You can run that pass on its own to see the effect [3].

Finally, you can go yet another step further and see where in the code of the pass this happens. That would require downloading LLVM, inspecting the debug output and setting a bunch of breakpoints, but after doing all that, you could find this [4].

Hope this helps!



On Sun, 30 May 2021 at 10:13 AM, 周书林 via llvm-dev <> wrote:

Hi Stefanos,

Thank you very much!

With your detailed explanation of the difference and the examples you provided, I now have a much clearer view of how the Clang front-end and the transformation passes turn the source code into IR.

Thank you again!

Best regards,

On Tue, 1 Jun 2021 at 4:50 AM, Stefanos Baziotis <> wrote:

Hi Shulin,

No problem, glad it helped! Let us know if you have more questions.


On Tue, 1 Jun 2021 at 3:48 PM, 周书林 <> wrote: