How to handle aliases?

My recent change to use more aliases for destructors in clang has
opened a can of worms.

The fundamental issue is that aliasing in object files has fairly
complex semantics. For example, on ELF if a symbol is defined in a
COMDAT in one object file, it must be defined in every object file
that has that COMDAT.

Given that weak symbols get implemented with COMDATs, the actual rule
at the llvm level would have to be something like

* If an alias to an isWeakForLinker symbol is declared in one object
file, it must be defined in all of them.

This is a very restrictive rule. For example, given

struct A {
  virtual ~A() {}
};
struct B : public A {
  virtual ~B() {}
};

It is nice for clang to be able to say "I know @_ZN1BD2Ev and
@_ZN1AD2Ev are the same", but there is no guarantee that other
translation units will see the same classes, and so clang cannot
produce an alias. It would be nice to produce an alias even when we
don't have the body of A.

I think the long term solution is to add comdats to the LLVM IR. With
that it would be clearly visible if a FE is producing two different
comdats in different translation units. It would also allow llvm to
have unnamed_addr alias as meaning just "known to be semantically
equivalent" as long as the backends are sufficiently clever at
producing a stub instead of a real alias when the file format
semantics are the same. For example, when targeting ELF an alias
pointing to another COMDAT would be replaced with a stub, but one in
the same COMDAT would produce another symbol in that COMDAT.

For now I don't think we can do much better than documenting the
restrictions and living with them. In particular, I would like to

* Document that an alias pointing to a declaration is a different
thing. It is itself a declaration. It will not show up in the object
file and exists just so one can access an external symbol in a
different way (avoid a PLT for example).

* Document that if an alias points to an isWeakForLinker symbol, then
it must be present in every TU that has the symbol.

* Require that aliases do not point to mayBeOverridden aliases. Given
that they are implemented by having multiple symbols at the same
address, there is no way to implement other semantics. For example, we
cannot in an object file represent the semantics of

@f = alias weak bitcast (void ()* @g to void (...)*)
@h = alias void (...)* @f
define void @g() {
  ret void
}

@h will end up being the exact same thing as @g, even if there is a
strong symbol named f in another object file.
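
As a rough illustration of the same point from the source side, here is
a minimal C++ sketch using the GNU alias attribute. It assumes GCC or
clang targeting ELF; the helper h_resolves_to_g is made up for this
example. It mirrors the IR above: f is a weak alias to g, and h is an
alias to f. Within one object file all three names are just symbols at
the same address.

```cpp
#include <cassert>

// The aliasee: a strong definition. extern "C" keeps the symbol name
// unmangled so the alias attribute can refer to it as plain "g".
extern "C" void g() {}

// Weak alias to g, mirroring `@f = alias weak ... @g`.
extern "C" void f() __attribute__((weak, alias("g")));

// Alias to the weak alias, mirroring `@h = alias ... @f`.
extern "C" void h() __attribute__((alias("f")));

// Hypothetical helper: reports whether h and g share one address.
bool h_resolves_to_g() {
  return &h == &g;
}
```

In a build like this h_resolves_to_g() should return true. A strong f
in another object file could change what f means at link time, but h
would stay bound to the address of g.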

Last but not least, I will update the clang patch to work with those
restrictions.

Cheers,
Rafael

> The fundamental issue is that aliasing in object files have fairly
> complex semantics. For example, on ELF if a symbol is defined in a
> COMDAT in one object file, it must be defined in every object file
> that has that COMDAT.
>
> Given that weak symbols get implemented with COMDATs, the actual rule
> at the llvm level would have to be something like
Pretty weird... I believe the proper solution is indeed to support
comdat groups at IR level.