How to calculate the offset obtained via a GEP instruction

Hi all,
I’m trying to understand how to manually calculate the byte offset computed by a GEP instruction. The same question was asked over 6 years ago on Stack Overflow [1], but it never got a real answer.

Since I need exactly the same thing, could anyone help me understand how to calculate the offset?

Thanks

[1] https://stackoverflow.com/questions/32444497/determine-byte-offset-of-getelementptr

In my experience, the easiest way is to replace the GEP's base operand with a null pointer of the same type and then convert the result of the GEP to an integer using ptrtoint. That gives you the offset in bytes, and a pass that has the target info (the data layout) will constant fold it as well.
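The reason a null base works: a GEP computes "base address + offset", so with a base of 0 the resulting address is the offset itself. Here is a minimal standalone C++ sketch of the same idea, not LLVM code. The struct and the helper name byteOffset are made up for illustration, and it subtracts a real base address instead of using a null pointer, because going through a null pointer is not allowed in C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Example struct, chosen only for illustration.
struct Point {
  int x;
  char y;
  long z;
};

// Hypothetical helper: byte distance from the start of `base` to `field`.
// This is "address of field minus base address", which is exactly what
// remains of a GEP result once the base is zero.
template <typename Base, typename Field>
std::size_t byteOffset(const Base &base, const Field &field) {
  auto baseAddr = reinterpret_cast<std::uintptr_t>(&base);
  auto fieldAddr = reinterpret_cast<std::uintptr_t>(&field);
  return static_cast<std::size_t>(fieldAddr - baseAddr);
}

// Self-check: the computed distances agree with the compiler's offsetof.
void checkPointOffsets() {
  Point p{};
  assert(byteOffset(p, p.x) == offsetof(Point, x));
  assert(byteOffset(p, p.y) == offsetof(Point, y));
  assert(byteOffset(p, p.z) == offsetof(Point, z));
}
```

Calling checkPointOffsets() from main() should pass silently. In IR the null-base version has the advantage that the constant folder does the subtraction for you.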

Hi Markus,
I’m working on an LLVM plugin and have access to the GEP object, but I’m not sure how to do what you described. Could you show me with a few lines of code?

Thanks

You can use GEPOperator::accumulateConstantOffset(). Or more generically, there is Value::stripAndAccumulateConstantOffsets(), which can look through multiple GEPs, bitcasts, etc.

Regards,
Nikita

Thanks Nikita,
I’ll try it and get back to you if I still have problems.

Thanks a lot for your help

Alberto

Hi Nikita,
I think I made some progress, but I’m not quite there yet.

The GEP instructions that I’m interested in analyzing are:

%3 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 0
%4 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 1
%5 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 2

and the Point struct is declared in the following way:

struct Point
{
    int x;
    char y;
    long z;
};

After reading some documentation online, I tried the following:

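Module *M = I.getModule();
const DataLayout &DL = M->getDataLayout();
I.dump();
// The APInt must be as wide as the index type of the GEP's pointer (usually 64 bits).
APInt ap_offset(DL.getIndexTypeSizeInBits(I.getType()), 0);
std::cout << "ap_offset: " << ap_offset.getSExtValue() << "\n";
// accumulateConstantOffset() returns a bool (true if all indices are constant);
// the byte offset itself is added to ap_offset.
std::cout << "Accumulated offset: " << I.accumulateConstantOffset(DL, ap_offset) << "\n";
std::cout << "ap_offset: " << ap_offset.getSExtValue() << "\n";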

The output is something like:
%3 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 0
ap_offset: 0
Accumulated offset: 1
ap_offset: 0
%4 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 1
ap_offset: 0
Accumulated offset: 1
ap_offset: 4
%5 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 2
ap_offset: 0
Accumulated offset: 1
ap_offset: 8

I think the output is almost correct, because the offset increases by 4 each time. However, the struct has char and long members, so the offsets do not look right to me. I think this is due to the fact that the GEP refers only to i32.

How should I fix this? A few lines of code would be very helpful.

Thanks
Alberto
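A note on the i32 in those GEPs: in "i32 0, i32 1" the i32 is the type of the index operand, not the type of the field being addressed. The field types come from %struct.Point, and the offsets follow from their sizes and alignments. Below is a rough standalone C++ sketch of that calculation. It does not use LLVM. Ty, scalar, structOf, arrayOf and gepOffset are made-up names, and the sizes and alignments (i32 = 4/4, i8 = 1/1, i64 = 8/8) are an assumption that matches a typical 64-bit data layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Simplified stand-in for a type; these are NOT LLVM's classes.
struct Ty {
  enum Kind { Scalar, Struct, Array } kind;
  int64_t size;           // Scalar only: size in bytes
  int64_t align;          // Scalar only: ABI alignment in bytes
  int64_t count;          // Array only: number of elements
  std::vector<Ty> elems;  // Struct: the fields; Array: the single element type
};

Ty scalar(int64_t size, int64_t align) { return Ty{Ty::Scalar, size, align, 0, {}}; }
Ty structOf(std::vector<Ty> fields) { return Ty{Ty::Struct, 0, 1, 0, std::move(fields)}; }
Ty arrayOf(Ty elem, int64_t n) { return Ty{Ty::Array, 0, 1, n, {std::move(elem)}}; }

// Round v up to the next multiple of a.
int64_t alignTo(int64_t v, int64_t a) { return (v + a - 1) / a * a; }

// Aggregates take the largest alignment of their members.
int64_t alignOf(const Ty &t) {
  if (t.kind == Ty::Scalar) return t.align;
  int64_t a = 1;
  for (const Ty &e : t.elems) a = std::max(a, alignOf(e));
  return a;
}

// Size including tail padding, i.e. the stride between consecutive objects.
int64_t allocSize(const Ty &t) {
  if (t.kind == Ty::Scalar) return alignTo(t.size, t.align);
  if (t.kind == Ty::Array) return t.count * allocSize(t.elems[0]);
  int64_t end = 0;
  for (const Ty &f : t.elems) end = alignTo(end, alignOf(f)) + allocSize(f);
  return alignTo(end, alignOf(t));
}

// Offset of field idx: lay out the earlier fields, then pad to this field's alignment.
int64_t fieldOffset(const Ty &s, std::size_t idx) {
  int64_t off = 0;
  for (std::size_t i = 0; i < idx; ++i)
    off = alignTo(off, alignOf(s.elems[i])) + allocSize(s.elems[i]);
  return alignTo(off, alignOf(s.elems[idx]));
}

// Byte offset of a GEP with constant indices over source element type `source`.
int64_t gepOffset(const Ty &source, const std::vector<int64_t> &idx) {
  // The first index is plain pointer arithmetic over whole `source` objects.
  int64_t off = idx[0] * allocSize(source);
  const Ty *cur = &source;
  for (std::size_t i = 1; i < idx.size(); ++i) {
    assert(cur->kind != Ty::Scalar && "cannot index into a scalar");
    if (cur->kind == Ty::Struct) {
      off += fieldOffset(*cur, static_cast<std::size_t>(idx[i]));
      cur = &cur->elems[static_cast<std::size_t>(idx[i])];
    } else {  // Array: scale by the element stride
      off += idx[i] * allocSize(cur->elems[0]);
      cur = &cur->elems[0];
    }
  }
  return off;
}

// %struct.Point = type { i32, i8, i64 } under the assumed layout.
Ty examplePoint() { return structOf({scalar(4, 4), scalar(1, 1), scalar(8, 8)}); }
```

With examplePoint(), gepOffset gives 0, 4 and 8 for the indices {0, 0}, {0, 1} and {0, 2}. The char sits right after the int at offset 4, and the long is padded up to its 8-byte alignment, so 0, 4, 8 is what the struct layout predicts.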

Hello,
I did some more digging and I think that those numbers were indeed correct:

➜ /tmp cat struct.c
#include <stdio.h>

struct Point
{
    int x;
    char y;
    char z;
    long w;
};

int main()
{
    struct Point p1;

    // Accessing members of point p1
    p1.x = 1;
    p1.y = 2;
    p1.z = 3;
    p1.w = 4;

    // %p expects a void *; printing a pointer with %x is undefined behavior
    printf("p1: %p\n", (void *)&p1);
    printf("p1.x: %p\n", (void *)&p1.x);
    printf("p1.y: %p\n", (void *)&p1.y);
    printf("p1.z: %p\n", (void *)&p1.z);
    printf("p1.w: %p\n", (void *)&p1.w);
    return 0;
}
➜ /tmp

➜ /tmp ./struct
p1: 0x9791b30
p1.x: 0x9791b30
p1.y: 0x9791b34
p1.z: 0x9791b35
p1.w: 0x9791b38
➜ /tmp


%3 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 0
ap_offset: 0
Accumulated offset: 1
ap_offset: 0
%4 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 1
ap_offset: 0
Accumulated offset: 1
ap_offset: 4
%5 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 2
ap_offset: 0
Accumulated offset: 1
ap_offset: 5
%6 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 3
ap_offset: 0
Accumulated offset: 1
ap_offset: 8


Let me know if I'm missing anything.

Thanks
Alberto
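For what it's worth, the same check can be done without printing addresses at run time, using offsetof. This is a small sketch, assuming the four-member struct from struct.c and a mainstream ABI where char has alignment 1. The helper printPointLayout is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Same four-member struct as in struct.c.
struct Point {
  int x;
  char y;
  char z;
  long w;
};

// Checked at compile time; assumes char members need no padding.
static_assert(offsetof(Point, x) == 0, "x starts the struct");
static_assert(offsetof(Point, y) == sizeof(int), "y follows x directly");
static_assert(offsetof(Point, z) == sizeof(int) + 1, "z follows y directly");
// w is padded up to the alignment of long.
static_assert(offsetof(Point, w) % alignof(long) == 0, "w is aligned for long");

// Hypothetical helper that prints the byte offset of each member.
void printPointLayout() {
  std::printf("x: %zu\ny: %zu\nz: %zu\nw: %zu\n",
              offsetof(Point, x), offsetof(Point, y),
              offsetof(Point, z), offsetof(Point, w));
}
```

On a target with 4-byte int, such as x86-64 Linux, this should agree with the 0, 4, 5, 8 reported by accumulateConstantOffset.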