Handling of section vs global name conflicts

Hi,

Currently LLVM does not handle well the conflict between a section and a
global _definition_ with the same name. A section defines a local
symbol with the same name (pointing to the start of the section).
Depending on the order of declarations, LLVM either silently overrides
the section symbol with the global, or crashes with

    fatal error: error in backend: symbol 'xxx' is already defined

The latter happens when the conflicting global is emitted after the
section has already been created.

See https://llvm.org/bugs/show_bug.cgi?id=26941 for more details and motivation.
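
For concreteness, a module along the following lines is the kind of input
under discussion. This is only a sketch, not the reproducer from the bug
report: the names @a and @foo and the section name "foo" are made up for
illustration, and whether one sees the silent override or the backend error
depends on the target and on the emission order described above.

```llvm
; Hypothetical names, chosen only for illustration.
; @a is placed in an explicitly named section "foo"; as described above,
; the section defines a local symbol with the section's name.
@a = global i32 1, section "foo"

; A global definition whose name matches the section name.
@foo = global i32 2
```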

I think it would make sense to disallow global definitions with the
same name as a section. Unfortunately, sections are not values, which
makes them hard to track (ValueSymbolTable could rename values on
conflict, but there is no direct way of knowing whether a section name
is currently in use without rescanning the module). Changing the IR
representation of sections just for this sounds like overkill.

Another option is to allow the conflict, and make the global always
override the section symbol. This is easy to do in the integrated
assembler, but it appears that GAS simply does not work this way.

The correct fix is probably to recognize that section symbols are local
and just allow them to have any name.

Cheers,
Rafael

So, we emit both a global symbol for the global variable, and a local
symbol for the section, with the same name?
This would need to be fixed both in the integrated assembler and in GAS.
Is it even allowed? What would the symbol references bind to, when
both symbols are defined?

> So, we emit both a global symbol for the global variable, and a local
> symbol for the section, with the same name?

We could. That is what I was expecting gas to do, but I see that both
gas and mc produce errors.

Handling this without MC is really hard. The name (in IR at least) can
conflict with implicit sections:

@".data" = global i32 42

One somewhat ugly option is to just produce a cleaner error when MC
finds the problem. With gcc one gets an error from gas.

> This would need to be fixed both in the integrated assembler and in GAS.
> Is it even allowed? What would the symbol references bind to, when
> both symbols are defined?

It is allowed; you can have as many symbols with the same name as you want.

As for the assembly file, I would say that the section symbol should
have a lower priority. It is always possible to refer to a section by
putting a .Lfoo at the start of the section and referring to that.

Cheers,
Rafael

This sounds reasonable. We need better diagnostics though. And we also
need to avoid creating such broken modules in the IR linker: imagine
module A with the section, and module B with an _internal_ global of
the same name. One way to do that is to track explicit section names
in IRLinker and force rename conflicting non-externally-visible
globals. Does it make sense?
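
A sketch of the linker scenario, with made-up names. The module contents are
hypothetical, and the renamed form @foo.1 is only an assumption about what a
forced rename might produce; it illustrates the proposal and is not a
description of what IRLinker does today.

```llvm
; --- module A (hypothetical) ---
@a = global i32 1, section "foo"

; --- module B (hypothetical) ---
@foo = internal global i32 2

; --- linked result under the proposed scheme (assumed) ---
; The internal global is renamed so that it no longer collides with
; the explicit section name "foo" coming from module A:
;   @a     = global i32 1, section "foo"
;   @foo.1 = internal global i32 2
```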

I don't think that is sufficient given the implicit section problem.

If you really want to avoid this, the best option seems to be to give
the .s file the semantics llc thinks it has: the name refers to the
symbol produced by the GV, and the section gets an STT_SECTION symbol
on the side that doesn't conflict with anything.

Cheers,
Rafael

Don't we have a similar case for STT_FILE already?

Joerg

We do, yes.
In summary, my suggestion is to do something similar for STT_SECTION.

Cheers,
Rafael