What is the first parameter to IRBuilder::CreateGEP()? (or: do not attempt to create LLVM objects without a basic block!)

I’ve seen different things being passed for this parameter!

Sometimes it is the type the second argument is pointing to (in case it is pointing to a structure).
Sometimes it is the type of an element of the array the second argument is pointing to.
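
Both usages can be valid, because the first parameter is the source element type: the type that the first index steps over. The following is only an illustrative model in plain C++ (no LLVM involved) of the byte offset a GEP would compute in each case. The function names are hypothetical, and the sketch assumes an i32 occupies 4 bytes with no padding.

```cpp
#include <cstdint>

// Assumption: i32 is 4 bytes, so [3 x i32] is 12 bytes.
constexpr std::int64_t kI32Size = 4;
constexpr std::int64_t kArraySize = 3 * kI32Size;

// Models: getelementptr i32, ptr %p, i64 index
// The single index steps over whole i32 values.
constexpr std::int64_t offsetWithI32Source(std::int64_t index) {
    return index * kI32Size;
}

// Models: getelementptr [3 x i32], ptr %p, i64 first, i64 second
// The first index steps over whole arrays, the second over elements inside one.
constexpr std::int64_t offsetWithArraySource(std::int64_t first, std::int64_t second) {
    return first * kArraySize + second * kI32Size;
}
```

With the array as source type and a leading index of 0, the second index reaches the same address as a single index with i32 as source type, which is why both forms show up in other people's code.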

Currently I’m getting a crash when trying to extract an element of a global/static array.

Looking at IR code generated from C++ code is of no help, as there are no types.
The IR code generated from C++ code even stores a pointer parameter on the stack and reloads it instead of using it directly.

Here is the code ChatGPT produced for me; as I said, it crashes.
I already tried passing arrayType instead of Type::getInt32Ty(context) to IRBuilder::CreateGEP().

#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"

using namespace llvm;

int main()
{
        LLVMContext context;
        Module *module = new Module("my_module", context);

        // Create a static const array variable
        ArrayType *arrayType = ArrayType::get(Type::getInt32Ty(context), 3);
        Constant *arrayValues[] =
        {       ConstantInt::get(Type::getInt32Ty(context), 1),
                ConstantInt::get(Type::getInt32Ty(context), 2),
                ConstantInt::get(Type::getInt32Ty(context), 3)
        };
        Constant *array = ConstantArray::get(arrayType, arrayValues);
        GlobalVariable *myArray = new GlobalVariable(
                *module, arrayType, true, GlobalValue::ExternalLinkage, array, "myArray");

        // Create an IRBuilder
        IRBuilder<> builder(context);

        // Create a GetElementPtrInst instruction to access the first element of myArray
        Value *zero = ConstantInt::get(Type::getInt32Ty(context), 0);
        Value *indices[] = {zero, zero};
        Value *firstElementPtr = builder.CreateGEP(Type::getInt32Ty(context), myArray, indices);
        Value *firstElement = builder.CreateLoad(Type::getInt32Ty(context), firstElementPtr);
}

The problem applies to both LLVM-15 (on Ubuntu) and LLVM-16 (on Windows).

Here is the stack trace for Windows:

Type parameter to checkGEPType() is a nullptr.

|>|global_array.exe!llvm::checkGEPType(llvm::Type * Ty) Line 933|C++|
| |global_array.exe!llvm::GetElementPtrInst::getGEPReturnType(llvm::Type * ElTy, llvm::Value * Ptr, llvm::ArrayRef<llvm::Value *> IdxList) Line 1090|C++|
| |global_array.exe!llvm::ConstantFoldGetElementPtr(llvm::Type * PointeeTy, llvm::Constant * C, bool InBounds, std::optional<unsigned int> InRangeIndex, llvm::ArrayRef<llvm::Value *> Idxs) Line 2025|C++|
| |global_array.exe!llvm::ConstantExpr::getGetElementPtr(llvm::Type * Ty, llvm::Constant * C, llvm::ArrayRef<llvm::Value *> Idxs, bool InBounds, std::optional<unsigned int> InRangeIndex, llvm::Type * OnlyIfReducedTy) Line 2423|C++|
| |global_array.exe!llvm::ConstantFolder::FoldGEP(llvm::Type * Ty, llvm::Value * Ptr, llvm::ArrayRef<llvm::Value *> IdxList, bool IsInBounds) Line 119|C++|
| |global_array.exe!llvm::IRBuilderBase::CreateGEP(llvm::Type * Ty, llvm::Value * Ptr, llvm::ArrayRef<llvm::Value *> IdxList, const llvm::Twine & Name, bool IsInBounds) Line 1797|C++|
| |global_array.exe!main() Line 31|C++|
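
The null type in checkGEPType() is consistent with the index list {zero, zero} being applied to the source type i32: the first index steps over the source type, and the second one would have to step into it, but an i32 has nothing inside. The sketch below is a plain C++ model of that type walk, not LLVM's implementation; `TypeModel` and `indexedType` are hypothetical names.

```cpp
#include <cstddef>

// Minimal stand-in for a type: arrays have an element type, scalars have none.
struct TypeModel {
    const TypeModel *element;  // nullptr for scalars such as i32
};

// Returns the type the resulting pointer would point to, or nullptr when the
// index list tries to step into a type that has no elements.
inline const TypeModel *indexedType(const TypeModel *source, std::size_t indexCount) {
    const TypeModel *current = source;
    // The first index only steps over the source type; each further index
    // descends one level into the current type.
    for (std::size_t i = 1; i < indexCount; ++i) {
        if (current == nullptr || current->element == nullptr) {
            return nullptr;  // e.g. a second index applied to i32
        }
        current = current->element;
    }
    return current;
}
```

In this model, two indices on an array of i32 yield i32, while two indices on i32 (or three on the array) yield nullptr.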

I also tried using only a single index for the array:

Value *indices[] = {zero};

When doing this, IRBuilder::CreateGEP() succeeds but the following CreateLoad() fails due to a nullptr access:

|>|global_array.exe!llvm::BasicBlock::getParent() Line 112|C++|
| |global_array.exe!llvm::BasicBlock::getModule() Line 147|C++|
| |global_array.exe!llvm::BasicBlock::getModule() Line 123|C++|
| |global_array.exe!llvm::IRBuilderBase::CreateAlignedLoad(llvm::Type * Ty, llvm::Value * Ptr, llvm::MaybeAlign Align, bool isVolatile, const llvm::Twine & Name) Line 1749|C++|
| |global_array.exe!llvm::IRBuilderBase::CreateAlignedLoad(llvm::Type * Ty, llvm::Value * Ptr, llvm::MaybeAlign Align, const llvm::Twine & Name) Line 1744|C++|
| |global_array.exe!llvm::IRBuilderBase::CreateLoad(llvm::Type * Ty, llvm::Value * Ptr, const llvm::Twine & Name) Line 1725|C++|
| |global_array.exe!main() Line 32|C++|

I doubt ChatGPT can replace reading the documentation or googling problems:
Googling "LLVM GEP" gives me "The Often Misunderstood GEP Instruction" in the LLVM 17.0.0git documentation.


My previous problem was that LLVM cannot deal with return types of arrays larger than a certain size.
Please show me where this is mentioned in the documentation!

Something that is not mentioned and something that is clearly mentioned are two clearly different things :slight_smile:

This seems to mostly duplicate what I told you in How to write C++ code for creating IR of a function returning a struct containing an array of real values? - #2 by jrtc27


I figured out that there is a dump() method attached to most objects:

pFirstElementPtr->dump();

But this is of no help either.
I tried all permutations: passing integerType or arrayType to CreateGEP(),
and passing more or fewer indices.
I cannot get a pointer to an integer returned from CreateGEP().
I forgot to mention that I reformatted the code to suit my taste:

#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"

using namespace llvm;

int main()
{
	LLVMContext sContext;
	Module *const pModule = new Module("my_module", sContext);
	IRBuilder<> sBuilder(sContext);
	const auto pIntType = Type::getInt32Ty(sContext);
	// Create a static const array variable
	ArrayType *const pArrayType = ArrayType::get(pIntType, 3);
	Constant *const aArrayValues[] =
	{	ConstantInt::get(pIntType, 1),
		ConstantInt::get(pIntType, 2),
		ConstantInt::get(pIntType, 3)
	};
	Constant *const pArray = ConstantArray::get(pArrayType, aArrayValues);
	GlobalVariable *const pArrayVar = new GlobalVariable(
		*pModule,
		pArrayType,
		true,
		GlobalValue::ExternalLinkage,
		pArray,
		"pArrayVar"
	);

	// Create a GetElementPtrInst instruction to access the first element of pArrayVar
	Value *const pZero = ConstantInt::get(pIntType, 0);
	Value *const aIndices[] = {pZero, pZero, pZero};
	Value *const pFirstElementPtr = sBuilder.CreateGEP(pArrayType, pArrayVar, aIndices);
	pFirstElementPtr->dump();
	//Value *const pFirstElementValue = sBuilder.CreateLoad(pIntType, pFirstElementPtr);
	pModule->print(outs(), nullptr);
}

	Value *const aIndices[] = {pZero, pZero};
	Value *const pFirstElementPtr = sBuilder.CreateGEP(pArrayType, pArrayVar, aIndices);
	Value *const pFirstElementValue = sBuilder.CreateLoad(pIntType, pFirstElementPtr);

is what I would expect to work, mirroring Compiler Explorer (using the deprecated non-opaque pointer syntax to get the type checking you’re having issues with using the C++ API).

I went through this again and I did not find anything which might explain the effects I’m seeing.
No permutation of arguments is working.
And almost no assert() in the code fires when one does something really stupid.

This crashes in CreateLoad() on both Windows and Ubuntu.
On Ubuntu, using gdb (I cannot use lldb due to issues with WSL2 and Cisco AnyConnect):

(gdb) where
#0  0x00007ffff9752ce0 in llvm::BasicBlock::getModule() const () from /lib/x86_64-linux-gnu/libLLVM-15.so.1
#1  0x00000000080044d5 in llvm::BasicBlock::getModule() ()
#2  0x0000000008004383 in llvm::IRBuilderBase::CreateAlignedLoad(llvm::Type*, llvm::Value*, llvm::MaybeAlign, bool, llvm::Twine const&) ()
#3  0x0000000008004303 in llvm::IRBuilderBase::CreateAlignedLoad(llvm::Type*, llvm::Value*, llvm::MaybeAlign, llvm::Twine const&) ()
#4  0x0000000008002d82 in llvm::IRBuilderBase::CreateLoad(llvm::Type*, llvm::Value*, llvm::Twine const&) ()
#5  0x000000000800279f in main ()
(gdb)

Well, you haven’t created a basic block for your IRBuilder to put code into, so yes, it’s going to crash trying to look up that basic block. Create a function and a basic block within it. You can’t just have random free-standing instructions lying around outside a function; that doesn’t make any sense.


thanks!

Works now!

Learned something new!

Code to create the IR:

#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"

using namespace llvm;

int main()
{
	LLVMContext sContext;
	Module *const pModule = new Module("my_module", sContext);
	IRBuilder<> sBuilder(sContext);
	const auto pIntType = Type::getInt32Ty(sContext);
	const auto pFunction = Function::Create(
		FunctionType::get(
			pIntType,
			{	pIntType
			},
			false
		),
		Function::ExternalLinkage,
		"get",
		pModule
	);
	sBuilder.SetInsertPoint(
		BasicBlock::Create(
			sContext,
			"entry",
			pFunction
		)
	);
	// Create a static const array variable
	ArrayType *const pArrayType = ArrayType::get(pIntType, 3);
	Constant *const aArrayValues[] =
	{	ConstantInt::get(pIntType, 1),
		ConstantInt::get(pIntType, 2),
		ConstantInt::get(pIntType, 3)
	};
	Constant *const pArray = ConstantArray::get(pArrayType, aArrayValues);
	GlobalVariable *const pArrayVar = new GlobalVariable(
		*pModule,
		pArrayType,
		true,
		GlobalValue::ExternalLinkage,
		pArray,
		"pArrayVar"
	);

	// Create a GEP that addresses the element of pArrayVar selected by the function argument
	Value *const pZero = ConstantInt::get(pIntType, 0);
	Value *const aIndices[] = {pZero, pFunction->getArg(0)};
	Value *const pFirstElementPtr = sBuilder.CreateGEP(pArrayType, pArrayVar, aIndices);
	//pFirstElementPtr->dump();
	Value *const pFirstElementValue = sBuilder.CreateLoad(pIntType, pFirstElementPtr);
	sBuilder.CreateRet(pFirstElementValue);
	pModule->print(outs(), nullptr);
}

test code:

#include <iostream>
#include <cstdlib>
extern "C" int get(int);


int main(int argc, char**argv)
{       for (argv++; *argv; ++argv)
                std::cout << get(std::atoi(*argv)) << std::endl;
}

Build and run:

clang++-15 -ggdb `llvm-config-15 --cppflags --ldflags --libs all` global_array.cpp
./a.out > global_array.ll
clang++-15 -ggdb `llvm-config-15 --cppflags --ldflags --libs all` global_array.ll test_global_array.cpp
./a.out 1 2 0
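
For comparison, a plain C++ counterpart of the generated `get` function might look like the sketch below. `getReference` is a hypothetical name that does not appear in the code above; it only illustrates what the IR is meant to compute. With the arguments 1 2 0, the test program should therefore print 2, 3 and 1.

```cpp
#include <cstdint>

// Hypothetical reference version of the generated function.
// Like the generated IR, it performs no bounds check on the index.
std::int32_t getReference(std::int32_t index) {
    static const std::int32_t table[3] = {1, 2, 3};
    return table[index];
}
```

An index outside 0..2 is out of bounds in both versions, so the test program should only be fed valid indices.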

To be clear, there isn’t a question here, this is your way of saying that you’ve got something that now works for you?

To help other people running into the same problem.
This is also the reason for adding to the title of the question.

Thanks again!