Transforming ConstantExprs to Instructions

Hi,

I've been struggling with constantexprs for a bit. I'm working on a pass that
transforms global variables to local variables, and in particular the
GetElementPtrConstantExpr is a bit troublesome. For my transformation to
properly work, a global value should only be used by Instructions, not by
ConstantExprs.

I was thinking to add a ConstantExpr::replaceWithInstr() virtual method, which
translates all of the uses of a constantexpr with their Instruction
equivalents, whenever possible. Each of the subclasses of ConstantExpr can
implement this as appropriate. Or, thinking of it, it's probably better to
have a protected and virtual replaceUseWithInstr method which is called for
each use by replaceWithInstr().

Is this a useful addition? Is this a sane approach?

Gr.

Matthijs

I've been struggling with constantexprs for a bit. I'm working on a pass that
transforms global variables to local variables, and in particular the
GetElementPtrConstantExpr is a bit troublesome. For my transformation to
properly work, a global value should only be used by Instructions, not by
ConstantExprs.

Ok, this is not possible in general though, global variable initializers have to be constants, not instructions.

I was thinking to add a ConstantExpr::replaceWithInstr() virtual method, which
translates all of the uses of a constantexpr with their Instruction
equivalents, whenever possible. Each of the subclasses of ConstantExpr can
implement this as appropriate. Or, thinking of it, it's probably better to
have a protected and virtual replaceUseWithInstr method which is called for
each use by replaceWithInstr().

Is this a useful addition? Is this a sane approach?

Is it possible to design the pass to work with both? The general approach is to make stuff handle "User"s instead of Instructions. It is much more compile time efficient to just handle the two forms rather than converting them back and forth.

-Chris

Hi,

I've been struggling with constantexprs for a bit. I'm working on a pass that
transforms global variables to local variables, and in particular the
GetElementPtrConstantExpr is a bit troublesome. For my transformation to
properly work, a global value should only be used by Instructions, not by
ConstantExprs.

This is a relatively difficult transformation to prove safe... you
have to prove that the global is only used by instructions (directly
or indirectly) in one function (or possibly a group of functions with
a single entry point), the address of the global doesn't escape, the
value from the previous call to a function isn't used (by either
ensuring the value is set before used, or ensuring that the value is
set to a known constant when the function returns), and that there
isn't any unsafe recursion.

I was thinking to add a ConstantExpr::replaceWithInstr() virtual method, which
translates all of the uses of a constantexpr with their Instruction
equivalents, whenever possible. Each of the subclasses of ConstantExpr can
implement this as appropriate. Or, thinking of it, it's probably better to
have a protected and virtual replaceUseWithInstr method which is called for
each use by replaceWithInstr().

Is this a useful addition? Is this a sane approach?

I can't imagine any uses for a method like that besides a pass like
yours. (This method is essentially the opposite of
ConstantFoldInstruction, and is therefore usually the opposite of what
we want to do.) Anyone else have any ideas for uses?

That said, it should be pretty simple to write recursively: first,
transform all the constants that use the current constant (this must
be possible due to the constraint that the global is only used by
instructions and transformable constants), second, build the value to
replace the constant (inserted in the entry block immediately after
all the allocas), and third, call replaceAllUsesWith on the constant.

The bulk of the code, of course, is building the value. In general,
it can take an arbitrary number of instructions to build a constant
(for example, if the constant is a ConstantStruct), so be careful not
to assume that a constant derivation maps to a single instruction.
The case for ConstantExpr is probably the longest, but there aren't
actually that many cases: you just have GetElementPtrInst, CastInst,
CmpInst, SelectInst, InsertValueInst, and ExtractValueInst (assuming
you wouldn't try to prove the legality for anything involving
ptrtoint). (Okay, that list ended up a bit longer than I expected,
but it still isn't that long.)

-Eli

Hi Chris,

> [ Snip replacing constantexprs with instructions ]
Ok, this is not possible in general though, global variable initializers
have to be constants, not instructions.

Yeah, so if not all uses can be replaced, my pass will just have to skip the
variable.

Is it possible to design the pass to work with both? The general approach
is to make stuff handle "User"s instead of Instructions. It is much more
compile time efficient to just handle the two forms rather than converting
them back and forth.

With both I assume you mean both GEP instrs and GEP constantexprs? If so then
yes, I am intending to make it work with both, and that's exactly why I need
to convert the constantexprs to instructions (Since, unless I'm very much
mistaken, a gepconstantexpr won't work with an alloca, so I can't leave them
intact).

Gr.

Matthijs

Hi Eli,

> [ Snip replacing globals with local vars ]
This is a relatively difficult transformation to prove safe... you
have to prove that the global is only used by instructions (directly
or indirectly) in one function (or possibly a group of functions with
a single entry point), the address of the global doesn't escape, the
value from the previous call to a function isn't used (by either
ensuring the value is set before used, or ensuring that the value is
set to a known constant when the function returns), and that there
isn't any unsafe recursion.

In general, I'm transforming global variables to struct alloca'd in the run()
function (which is our equivalent of the main() function). All other
functions will get an additional struct argument and return value, that
contains all the global variables. This seems to work pretty ok in general.
I've implemented most of the checks you mention, which means that variables
used in a gepconstantexpr aren't transformed (which is why I want to replace
those).

The only check that I didn't implement is to check that the run function does
in fact dominate all uses of each global, but since we are only using this
pass after internalizing every function except run, this should work out.

I can't imagine any uses for a method like that besides a pass like
yours. (This method is essentially the opposite of
ConstantFoldInstruction, and is therefore usually the opposite of what
we want to do.) Anyone else have any ideas for uses?

Yeah, I couldn't find any other pass that does something similar either.

That said, it should be pretty simple to write recursively: first,
transform all the constants that use the current constant (this must
be possible due to the constraint that the global is only used by
instructions and transformable constants), second, build the value to
replace the constant (inserted in the entry block immediately after
all the allocas), and third, call replaceAllUsesWith on the constant.

Yeah, I was thinking along exactly those lines. However, I might not do a
replaceAllUsesWith, but just replace each individual use and return false if
not all uses could be replaced.

Or would it be better to first see if the constexpr is replaceable (by
recursively calling a CanReplaceWithInst method or similar) and only replace
something if it is possible?

The bulk of the code, of course, is building the value. In general,
it can take an arbitrary number of instructions to build a constant
(for example, if the constant is a ConstantStruct), so be careful not
to assume that a constant derivation maps to a single instruction.

Hmm, I hadn't though about constantstructs yet, but for consistency those
should probably be replaced by a chain of insertvalues as well.

The case for ConstantExpr is probably the longest, but there aren't
actually that many cases: you just have GetElementPtrInst, CastInst,
CmpInst, SelectInst, InsertValueInst, and ExtractValueInst (assuming
you wouldn't try to prove the legality for anything involving
ptrtoint). (Okay, that list ended up a bit longer than I expected,
but it still isn't that long.)

AFAICS all of those are pretty simple (only a single instruction). Only
ConstantStruct would end up emitting more than one instruction, but I've been
getting quite adept at emitting insertvalue chains by now :-p

I'll probably not get around to creating a patch until later this week, mainly
because we can live without supporting globals that result in gepconstantexprs
for at least a while.

Gr.

Matthijs

What do you mean? A constantexpr gep can be used as the size of an alloca.

-Chris

A constantexpr as an argument to an alloca is legal, but a
constantexpr can't use an alloca.

Suppose you have a global struct, and you load the third member. The
IL for this is something like "load i32* getelementptr ( %struct @x,
i32 0, i32 2)". Now you want to replace the global @x with a local
alloca %x (assuming all the appropriate safety checks are done).
"load i32* getelementptr (%struct %x, i32 0, i32 2)" is illegal; it
has to be split into "%loadaddr = getelementptr %struct %x, i32 0, i32
2" and "load i32* %loadaddr".

-Eli

Hi Chris,

What do you mean? A constantexpr gep can be used as the size of an
alloca.

I meant the other way around. Ie,

@A = internal global { i32, i32 }
%B = load getelementptr { i32, i32 }* @A, i32 0, i32 0

would trivially translate to

%A = alloca { i32, i32 }
%B = load getelementptr { i32, i32 }* %A, i32 0, i32 0

which is not legal AFAICS (since %A is not a Constant).

Gr.

Matthijs

Ah, I see. Right, you'd need to change the constexprs into instructions.

-Chris