Obtaining an array size(?) from the GEP instruction

Sorry for the repost, I accidentally pressed submission before typing the question, so I immediately removed that post and made this new one.

Let me first provide an example and explain what I am trying to do:

So for a code that looks like this:

#include <string.h>

int main() {
  char buffer[500] = {0};  /* matches the [500 x i8] alloca in the IR below */
  char *c = "Hello World!\n";
  strcpy(buffer, c);
  return 0;
}

Clang produces the following IR (I am omitting many parts for the sake of brevity):

@.str.2 = private unnamed_addr constant [14 x i8] c"Hello World!\0A\00", align 1
define dso_local i32 @main() #0 {
1: %2 = alloca [500 x i8], align 1
2: %3 = alloca i8*, align 8
3: store i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str.2, i64 0, i64 0), i8** %3, align 8
4: %4 = bitcast [500 x i8]* %2 to i8*
5: call void @llvm.memset.p0i8.i64(i8* align 1 %4, i8 0, i64 500, i1 false)
6: %6 = getelementptr inbounds [500 x i8], [500 x i8]* %2, i64 0, i64 0
7: %7 = load i8*, i8** %3, align 8
8: %8 = call i8* @strcpy(i8* %6, i8* %7) #4
}

My goal at the moment is to obtain the array size from the GEP at line 3, that is, its first operand, the source element type [14 x i8]. More specifically, I'm interested in the number 14:

store i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str.2, i64 0, i64 0), i8** %3, align 8;
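For reference, the 14 in `[14 x i8]` is the number of bytes the string literal occupies, including the terminating NUL, so it is one more than what `strlen` would report. A minimal sketch in plain C++ (the names `kLiteralBytes` and `literalChars` are made up for illustration):

```cpp
#include <cstddef>
#include <cstring>

// Bytes occupied by the literal, including the terminating '\0'.
// This is the element count that shows up in the IR array type.
constexpr std::size_t kLiteralBytes = sizeof("Hello World!\n");

// Characters before the terminator, as strlen counts them.
std::size_t literalChars() { return std::strlen("Hello World!\n"); }
```

So any API that reports a "string length" for this global may return either 13 or 14 depending on whether it counts the terminator.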

Interestingly, because the getelementptr is an operand of the store instruction rather than an instruction of its own, the following never satisfies the if condition:

void InstVisitor::visitStoreInst(Instruction &I){
  if (dyn_cast<GetElementPtrInst>(I.getOperand(0))) {
     // should be here if operand is a GEP instruction
  } else {
     errs() << *I.getOperand(0) << "\n";
     errs() << *I.getOperand(0)->stripPointerCasts() << "\n";
  }
}

Therefore, I played around a bit with this instruction and got it to print the following (using the code in the else branch above):

i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str.2, i64 0, i64 0)
@.str.2 = private unnamed_addr constant [14 x i8] c"Hello World!\0A\00", align 1

However, from here I am unable to get the value I need, which is either the type [14 x i8] or the integer value 14.

I am a bit lost at this point. I read up on how the GEP instruction works, but this does not seem to be a GEP instruction, since it appears directly as an operand of the store instruction. I also looked through the API documentation, but to no avail.

Thank you for any suggestions in advance,

You are absolutely correct about that detail.

The GEP is written inline here because it is a constant expression. Constant expressions in LLVM are not instructions; they are constants and are therefore printed inline as operands. That is also why your dyn_cast<GetElementPtrInst> failed.

Instead you’ll want to do a dyn_cast to the corresponding ConstantExpr class.
For GEP that would be https://llvm.org/doxygen/classllvm_1_1GetElementPtrConstantExpr.html.
You should then be able to get the array size by looking at its source element type.

The usual way to handle both instruction and constexpr GEP is to cast to GEPOperator.

However, you should be aware that the GEP source element type is entirely arbitrary, and it is illegal to use array sizes it contains for optimization purposes. If you are interested in strings in particular, you may be interested in the llvm::GetStringLength() and llvm::getConstantStringInfo() helpers. If you are interested in more general object sizes, there is llvm::getObjectSize().

Thank you so much!!! Wow, if I didn’t ask this question, I would’ve never found out that inlined GEP is a ConstantExpr.

I couldn’t exactly get it to work with your proposed method, but for other people, here is what I have done:

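if (auto *cnst_Expr = dyn_cast<ConstantExpr>(I.getOperand(0))) {
    // Operand 0 of the GEP constant expression is the base pointer (@.str.2).
    Value *cnst_Expr_V = cnst_Expr->getOperand(0);
    if (auto *cnst_GV = dyn_cast<GlobalVariable>(cnst_Expr_V)) {
        // getValueType() is the global's own type; unlike getOperand(0),
        // it does not require the global to have an initializer.
        if (auto *cnst_AT = dyn_cast<ArrayType>(cnst_GV->getValueType()))
            errs() << cnst_AT->getNumElements() << "\n";
    }
}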

I know there is definitely a better way to do it, but I was quite desperate for a solution, so this helped me a lot.

Thank you again for your great suggestion!

@nikic Thank you so much for your response. I will definitely keep what you stated in mind. I tried searching for how to use llvm::GetStringLength() and llvm::getConstantStringInfo() as you suggested, but I could not figure out which library provides those helper functions, and it is hard for me to integrate them into my current code.

Thank you, everyone, again for your help,
Sincerely,
