AA and external globals

I’m working on an AA implementation, and I’m using abstract blocks to model the memory used to store the values of program variables.

To be sound, my analysis must ensure that all program objects which might possibly occupy the same chunk of runtime memory are modeled with the same abstract memory block. I’m trying to understand if “external” linkage is a problem for me.

More concretely, suppose I have a Module which includes these declarations:

@Bar = external global i32
@Foo = external global i32

Is there any reasonable possibility that the linker will cause @Bar and @Foo to name the same piece of runtime memory?

FWIW, my AA algorithm considers only one LLVM Module at a time, so from its perspective the final program is divided into two parts: what it sees in the current Module, and the mysterious, unknowable everything else. Also, I need my algorithm to be sound at least for ELF targets, but ideally for all LLVM targets.

Thanks,
Christian
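
For readers following along, the soundness requirement in the question can be sketched with a small union-find over program objects, where each set is one abstract memory block. This is only an illustration under stated assumptions: the names `block_parent`, `init_blocks`, `merge_blocks`, `merge_external_declarations`, and `may_share_memory` are hypothetical, objects are plain integer ids, and lumping every external declaration into one block is just the most conservative policy one could pick, not something the thread prescribes.

```c
#include <stdbool.h>

#define MAX_OBJECTS 16

/* block_parent[i] == i means object i is the representative of its block. */
int block_parent[MAX_OBJECTS];

/* Start with every object in its own abstract memory block. */
void init_blocks(int n) {
    for (int i = 0; i < n; i++)
        block_parent[i] = i;
}

/* Find the representative of x's block, shortening the path as we go. */
int find_block(int x) {
    while (block_parent[x] != x) {
        block_parent[x] = block_parent[block_parent[x]];
        x = block_parent[x];
    }
    return x;
}

/* Record that objects a and b might occupy the same runtime memory. */
void merge_blocks(int a, int b) {
    int ra = find_block(a);
    int rb = find_block(b);
    if (ra != rb)
        block_parent[rb] = ra;
}

/* Hypothetical conservative policy: all external declarations share one
 * block, because the current module cannot see how they are defined. */
void merge_external_declarations(const bool *is_external, int n) {
    int first = -1;
    for (int i = 0; i < n; i++) {
        if (!is_external[i])
            continue;
        if (first < 0)
            first = i;
        else
            merge_blocks(first, i);
    }
}

/* Two objects may share memory iff they ended up in the same block. */
bool may_share_memory(int a, int b) {
    return find_block(a) == find_block(b);
}
```

With ids 0 and 1 standing for @Bar and @Foo and id 2 for a locally defined object, calling `merge_external_declarations` puts 0 and 1 in one block and leaves 2 alone. Whether that much conservatism is actually needed is exactly what the question asks.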

I’m curious why you don’t use the existing CFL-AA implementation?
Or is this just an experiment/learning experience/whatever?

Is there any reasonable possibility that the linker will cause @Bar and
@Foo to name the same piece of runtime memory?

Yes, @Bar could be defined as a GlobalAlias to @Foo in another module.
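
To make the alias case concrete, here is a minimal C sketch of what the "other module" might look like. It assumes the GNU `alias` attribute, which GCC and Clang accept on ELF targets but which is not available everywhere; the helper names `write_bar_read_foo` and `same_address` are made up for the demonstration. A separate translation unit that only says `extern int Foo; extern int Bar;` would see two unrelated-looking external globals.

```c
/* Definition of Foo lives in this (hypothetical) other module. */
int Foo = 0;

/* Bar is a second symbol for the same storage (GNU extension, ELF). */
extern int Bar __attribute__((alias("Foo")));

/* A store through Bar is visible through Foo because they are one object. */
int write_bar_read_foo(int value) {
    Bar = value;
    return Foo;
}

/* Compare the addresses at run time; volatile keeps the comparison from
 * being settled purely at compile time. */
int same_address(void) {
    int *volatile p = &Foo;
    int *volatile q = &Bar;
    return p == q;
}
```

On a toolchain that supports the attribute, both symbols resolve to the same address, so an analysis that treats the two declarations as distinct would be unsound for a program linked against this module.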

The constant folder assumes @Bar and @Foo are distinct via
areGlobalsPotentiallyEqual
<http://llvm.org/docs/doxygen/html/ConstantFold_8cpp_source.html#l01382>

I'm not entirely sure if this is correct in the case where @Bar happens to
be an alias for @Foo elsewhere... Nick, any ideas?

I'm curious why you don't use the existing CFL-AA implementation?
Or is this just an experiment/learning experience/whatever?

The issue came up as part of me trying out some AA ideas that have no
prayer of getting into the mainline LLVM. But in addition to that, now I'm
troubled by the mere fact that I don't know the answer :)

Do you want "what's legal in some particular language" or "what may ever
happen"?

I guess both answers are interesting. Pragmatically speaking, I need an
answer that's pretty trustworthy for any not-intentionally-evil C or C++
code compiled by clang 3.7+, and linked with the GNU linker on a modern
Linux distro, on x86-64.

But I'm also curious if there's something in the LLVM IR spec or ELF spec
that should have given me the answer. I couldn't tell if such docs don't
exist, or if my Google-fu was too weak to find the answer.

The constant folder assumes @Bar and @Foo are distinct via
areGlobalsPotentiallyEqual
<http://llvm.org/docs/doxygen/html/ConstantFold_8cpp_source.html#l01382>

I'm not entirely sure if this is correct in the case where @Bar happens to
be an alias for @Foo elsewhere... Nick, any ideas?

Thanks David, it sounds like areGlobalsPotentiallyEqual may provide the
exact logic I want, based on your explanation.

My only reservation, aside from the one you mentioned, is that I don't
notice anything in that method looking for the unnamed_addr flag. I
suspect my AA code would give wrong answers regarding two globals which got
unified because of that flag.
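
A rough sketch of the kind of conservative check discussed here, written against a toy model rather than LLVM's real classes. Everything in it is an assumption for illustration: `ToyGlobal` and `toy_may_be_same_memory` are hypothetical names, the three flags are a simplification of what an IR global carries, and the rules are deliberately pessimistic. It is not a transcription of areGlobalsPotentiallyEqual.

```c
#include <stdbool.h>

/* Toy stand-in for a global value; not an LLVM type. */
typedef struct {
    const char *name;
    bool is_declaration; /* defined elsewhere, e.g. "external global" */
    bool is_alias;       /* this symbol is an alias for another object */
    bool unnamed_addr;   /* address not significant, so merging is allowed */
} ToyGlobal;

/* Returns false only when the two globals are provably distinct objects
 * under the (assumed) rules below; true means "might be the same memory". */
bool toy_may_be_same_memory(const ToyGlobal *a, const ToyGlobal *b) {
    if (a == b)
        return true;
    /* An alias can name whatever its aliasee names. */
    if (a->is_alias || b->is_alias)
        return true;
    /* A declaration could be resolved to an alias in another module. */
    if (a->is_declaration || b->is_declaration)
        return true;
    /* Globals whose address is not significant may be unified. */
    if (a->unnamed_addr || b->unnamed_addr)
        return true;
    return false;
}
```

Under these rules two external declarations such as @Bar and @Foo always answer "might be the same", and two ordinary definitions in the current module answer "distinct" unless one of them carries the unnamed_addr-style flag.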