Can I add a GlobalVariable in a MachineFunctionPass?


I want to add a global variable of array type in my MachineFunctionPass.
However, I only get a const Module from MachineFunction.getMMI().getModule(),
and I can't add a global variable to a const Module.
Another way is to add the global variable in doInitialization in my
MachineFunctionPass, but I can't determine the size of the array type
at that point.

Is there any suggestion that can help me achieve this?

Thanks in advance.
Antony Yu

What are you trying to accomplish in this case? I did something very similar in the AMDIL backend, but it was not the cleanest solution, and you are correct that it has to be done at the doInitialization stage and not in runOnMachineFunction.


As you expected, I am trying to create local memory, but in the NVPTX backend. It’s really inconvenient that I can’t create local memory in runOnMachineFunction.
Since I have to do it at the doInitialization stage, I also need some tricks in the global variable and the AsmPrinter to resize it.
Did you use a similar approach?


I originally did it in a similar way, but that was before many of the more modern LLVM Machine structures existed.

You can see how I eventually did it here:

You don’t need to put the information in a global variable as you can store it in the MachineModuleInfo and then query/modify it where needed from the various locations.


If you’re running a MachineFunctionPass, then the code has already been lowered to machine instructions, and modifying the original IR will most likely not do what you want. You should only be using the IR as an immutable object for reference. If you want to change the IR, I would suggest using a FunctionPass instead of a MachineFunctionPass, unless you need codegen data.

At the MachineInstr level, to allocate local memory you can use the MachineFrameInfo interface. This provides methods like CreateStackObject to allocate a new stack slot (which will be lowered to local memory in PTX). The return value of these methods is an integer that represents a FrameIndex. You can treat this as a pointer to your allocated object. You will also need to emit the proper MachineInstr-level loads and stores to access the object.
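To make the frame-index idea concrete, here is a small standalone model. It is not LLVM code: ToyFrameInfo, ToyStackObject, createStackObject and frameSize are names made up for this sketch, and the layout rule (slots packed in index order with alignment padding) is an assumption. It only shows that allocation hands back an integer index, and that the index is later used to look up the object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a per-function frame table. This is NOT the LLVM
// MachineFrameInfo class; it only mimics "allocate a slot, get an index".
struct ToyStackObject {
  std::uint64_t Size;      // size in bytes
  std::uint64_t Alignment; // alignment in bytes, must be a power of two
};

class ToyFrameInfo {
  std::vector<ToyStackObject> Objects;

public:
  // Hypothetical analogue of CreateStackObject: records a new slot and
  // returns its frame index (0, 1, 2, ... in allocation order).
  int createStackObject(std::uint64_t Size, std::uint64_t Alignment) {
    assert(Size > 0 && "empty stack object");
    assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0 &&
           "alignment must be a power of two");
    Objects.push_back({Size, Alignment});
    return static_cast<int>(Objects.size()) - 1;
  }

  // Looks up the size of the slot identified by a frame index.
  std::uint64_t getObjectSize(int FrameIndex) const {
    return Objects.at(static_cast<std::size_t>(FrameIndex)).Size;
  }

  // Total bytes needed if slots are laid out in index order, padding each
  // slot up to its own alignment (an assumed layout for this sketch).
  std::uint64_t frameSize() const {
    std::uint64_t Offset = 0;
    for (const ToyStackObject &O : Objects) {
      Offset = (Offset + O.Alignment - 1) & ~(O.Alignment - 1);
      Offset += O.Size;
    }
    return Offset;
  }
};
```

In a real pass the index would be attached to load and store instructions as a frame-index operand rather than used to read a size back, but the handle-like nature of the return value is the same.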


Thanks for your help. I will study that code.


Sorry for my misleading wording. Local memory in OpenCL is the same as shared memory in CUDA. What I mean is shared memory, so MachineFrameInfo is not suitable for me.
And I need codegen data, so a FunctionPass is not suitable either.
Anyway, thanks for the suggestion.


Can you tell us a bit more about what you’re trying to accomplish?

Changes to the IR performed during MachineFunctionPass::doInitialization will likely propagate down through code generation, but at that point what is the purpose of using a MachineFunctionPass? You won’t have any analysis or instruction information available until runOnMachineFunction.

I want to create shared memory in my MachineFunctionPass and insert load/store instructions for it. The way to create shared memory is to add global variables in the shared memory address space (I’m not sure if it is the only way). Therefore, I have to add global variables with a fixed size in doInitialization and record their real size somewhere else, such as MachineModuleInfo. I would then modify or query the real size from that place instead of using the size of the variable.
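A rough standalone sketch of that bookkeeping, for illustration only. SharedSizeTable and its methods are hypothetical names, not LLVM API; in the scheme described above this table would live somewhere reachable from both the pass and the printer (MachineModuleInfo was the suggestion), and the placeholder global itself is not modelled here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical side table mapping a placeholder global's name to the size
// the analysis actually wants. Not an LLVM class.
class SharedSizeTable {
  std::map<std::string, std::uint64_t> RealSize; // size in bytes

public:
  // Called where the fixed-size placeholder is created (doInitialization
  // in the scheme above). The real size starts out as 0, i.e. "unused".
  void declare(const std::string &Name) { RealSize.emplace(Name, 0); }

  // Called from the machine pass once the analysis knows the real size.
  void resize(const std::string &Name, std::uint64_t Bytes) {
    auto It = RealSize.find(Name);
    assert(It != RealSize.end() && "resize of an undeclared buffer");
    It->second = Bytes;
  }

  // Called when the analysis decides the buffer is not needed after all.
  void cancel(const std::string &Name) { resize(Name, 0); }

  // Queried by the printer: the size to emit instead of the placeholder's.
  std::uint64_t sizeFor(const std::string &Name) const {
    auto It = RealSize.find(Name);
    return It == RealSize.end() ? 0 : It->second;
  }

  // A buffer whose real size is 0 would simply not be printed.
  bool shouldEmit(const std::string &Name) const {
    return sizeFor(Name) != 0;
  }
};
```

The point of the indirection is that the IR is never touched after doInitialization; only the table changes, and the printer trusts the table over the declared array size.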


Yes, global variables are the only way to access shared memory.

I’m just trying to get an idea of what you’re aiming to accomplish to see if we can improve on the interface here. A MachineFunctionPass runs after instruction selection and relying on doInitialization to run before instruction selection is an implementation detail that I do not believe is guaranteed anywhere (I could be wrong!). And modifying the IR does in fact violate the rules for a MachineFunctionPass (see bullet 1 in

If you explain what you’re trying to accomplish, I can try to help figure out a good approach here. There very well may be limitations to the current infrastructure that need to be fixed.

OK. I know what you mean…

Simply put, I want to do some optimizations for PTX, and the information I need is similar to what a register allocator needs. I know PTX is a virtual ISA, but I will use the PTX as input to the simulator gpgpu-sim, so it makes sense.
Whether to insert shared memory depends on an analysis that needs LiveAnalysis, PTX InstrInfo, PTX RegisterInfo, etc. That’s why I need to add global variables in a MachineFunctionPass.


Is there any way you could approximate the register/instruction usage and perform live-range analysis in a higher-level LLVM IR pass? I’m not sure how useful NVPTXRegisterInfo would be anyway. Unlike backends that target “real” ISAs, these structures do not contain any special properties about registers or instructions, like cost or scheduling information. Are you trying to figure out the total number of PTX registers that will be emitted?

Yes, the total number of PTX registers that will be emitted is exactly what I need. It’s hard to figure this out at the LLVM IR level.

Does this count have to be exact, or just an accurate approximation? The back-end may add/remove registers fairly late in the codegen process, so if you need an exact count you may need to run just before the assembly printer.

Perhaps we could introduce a special machine node that represents a shared memory allocation. The node’s value would be the shared address space pointer of the allocation, and the assembly printer would turn that into a “.shared .bX …” variable. Would that solve your issue? Or do you need to change other parts of the IR as well?
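As a thought experiment only, the printing half of that idea might look like the standalone sketch below. SharedAllocNode and emitSharedDecl are invented names, nothing like them exists in the NVPTX backend as far as this thread says, and the choice to align to the element size is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical description of one shared-memory allocation, standing in
// for the proposed "special machine node". Not an LLVM type.
struct SharedAllocNode {
  std::string Symbol;     // name the loads and stores would refer to
  unsigned ElemBits;      // element width in bits: 8, 16, 32 or 64
  std::uint64_t NumElems; // number of elements in the array
};

// Builds the PTX declaration that a printer could emit for the node.
// Alignment is assumed to equal the element size in bytes.
std::string emitSharedDecl(const SharedAllocNode &N) {
  assert((N.ElemBits == 8 || N.ElemBits == 16 || N.ElemBits == 32 ||
          N.ElemBits == 64) &&
         "unsupported element width");
  assert(N.NumElems > 0 && "empty allocation");
  return ".shared .align " + std::to_string(N.ElemBits / 8) + " .b" +
         std::to_string(N.ElemBits) + " " + N.Symbol + "[" +
         std::to_string(N.NumElems) + "];";
}
```

For example, a node {"spill_buf", 32, 64} would print as ".shared .align 4 .b32 spill_buf[64];". How such a node would be represented inside the MachineFunction is left open here, as it is in the proposal.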

The count needs to be exact, so I implement my analysis in preEmitPass.

I can’t imagine how this special node will be implemented. Will this node be metadata, a special instruction, a special register class or another class?
I will use load and store to access shared memory. Besides, the shared memory is allocated dynamically, and may be deleted when resizing or canceling occurs.

I won’t change other parts of IR.