merging globals

Hello, Tatu

Is that correct? I think it's just something to be aware of.

Currently we're aggressively merging globals by default. Do you think it
would be better to provide a special flag to control this behavior?

Please no flag. If we want to fix this problem, let's do it right. To me this consists of some flag on GlobalVariable that says that it is 'mergeable' or something like that. Then -constmerge would be changed to only merge globals with this flag on it, or globals that are provably not "address exposed" (which could be modeled by having the optimizer set the bit). We'd also have to change the code generator to stop sending globals without the flag into the various sections that the linker merges.

It is important to model this so that we don't lose string merging. We need to distinguish between the string globals in these two cases:

char x[] = "foo";
char *y = "foo";

Both will end up with a 4-byte string global variable in LLVM IR, but only one of those is mergeable.

-Chris

Hi Anton,

> Currently we're aggressively merging globals by default. Do you think it
> would be better to provide a special flag to control this behavior?

I think that (once again from the user POV) the "Principle of Least Surprise" should apply here - meaning that this behavior should never be on by default.

It is interesting that the GCC documentation for -fmerge-all-constants says this:
"Languages like C or C++ require each non-automatic variable to have distinct location, so using this option will result in non-conforming behavior."

> Wouldn't it be slightly cleaner to mark the distinct objects in the
> LLVM intermediate representation? This would make the default case the
> one resulting in best code, and seems to me to follow the principle of
> structural equivalence of types (and values) used elsewhere.

That is what I'm suggesting. Each llvm IR global variable would have its own flag.

-Chris

Hi Chris,

> Wouldn't it be slightly cleaner to mark the distinct objects in the
> LLVM intermediate representation?
> That is what I'm suggesting. Each llvm IR global variable would have
> its own flag.

I think the suggestion is to have a "notmergeable" flag instead of a
"mergeable" flag. In practice, it doesn't really matter, but I guess you'll
see the difference in the IR. I agree here that it would be more logical to
have a notmergeable (or "distinct"?) flag instead of a mergeable flag, though
it's not critical.

Gr.

Matthijs

I'm pretty sure we want mergeable to be opt-in. If it defaults to
being allowed, unaware frontends (like the current llvm-gcc and clang)
will introduce logical contradictions by assuming that globals aren't
mergeable, and optimization passes that mark globals const will have
to make sure to mark them unmergeable.

-Eli