How do I get ABI information to a subclass of MCELFObjectTargetWriter::GetRelocType?

Anson_MacDonald · December 15, 2015, 10:57pm

I am implementing an ABI that is defined but not yet implemented in LLVM. Its ELF object format differs from that of an existing ABI in two ways: it implements a subset of the existing ABI with a different encoding, and it sets the e_ident EI_CLASS field to a different value. I am trying to use MCTargetOptions::getABIName to set a boolean in a modified subclass of MCELFObjectTargetWriter that indicates which relocation encoding to use. As far as I can determine from reading the source and judicious use of a debugger, there isn’t a simple path from the command line, and the setting of ABIName in MCTargetOptions, to the point where an instance of an MCELFObjectTargetWriter subclass is created.

I looked at the approaches taken by both Mips and X86 for implementing ILP32 and neither seems applicable. For x86 x32, there is the combination of IsELF64 == false and e_machine == EM_X86_64, but that doesn’t seem applicable, as the ELF e_machine field is the same for the existing and the new ABI. For Mips N32, code and state in MCELFObjectTargetWriter seem to take care of mapping the relocation values, and the ELF e_flags bit EF_MIPS_ABI_ON32 is set.

I’m trying to implement the AArch64 ILP32 ELF ABI. Ideally, I’d like to be able to create a modified version of AArch64ELFObjectWriter so that its GetRelocType method can choose which relocation encoding to use based upon what was specified on the command line. Should I make up a new OSABI enum value? Do some kind of manipulation of the Triple environment field to set it based upon the value of “-mabi=”?

ARM64 ELF Reference with ILP32 information:
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0056c/IHI0056C_beta_aaelf64.pdf
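
For illustration, the selection described above might look roughly like the sketch below. It uses stand-in types rather than the real LLVM classes: SketchAArch64ELFWriter, FixupKind, and the "ilp32" ABI name string are assumptions made for this example. The relocation numbers and EI_CLASS values are the ones given in the AArch64 ELF specification.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Values from the AArch64 ELF specification (aaelf64).
constexpr std::uint8_t ELFCLASS32 = 1;
constexpr std::uint8_t ELFCLASS64 = 2;
constexpr unsigned R_AARCH64_NONE = 0;
constexpr unsigned R_AARCH64_P32_ABS32 = 1; // ILP32 encoding
constexpr unsigned R_AARCH64_ABS64 = 257;   // LP64 encoding
constexpr unsigned R_AARCH64_ABS32 = 258;   // LP64 encoding

// Hypothetical fixup kinds; real LLVM code would inspect an MCFixup.
enum class FixupKind { Data4, Data8 };

// Hypothetical stand-in for a modified AArch64ELFObjectWriter.
class SketchAArch64ELFWriter {
  bool IsILP32;

public:
  // Assumption: the ABI name arrives as a string such as "ilp32".
  explicit SketchAArch64ELFWriter(const std::string &ABIName)
      : IsILP32(ABIName == "ilp32") {}

  // The same bit that picks the relocation encoding also picks EI_CLASS.
  std::uint8_t getEIClass() const { return IsILP32 ? ELFCLASS32 : ELFCLASS64; }

  // Returns R_AARCH64_NONE for combinations this sketch does not handle.
  unsigned getRelocType(FixupKind Kind) const {
    if (IsILP32)
      return Kind == FixupKind::Data4 ? R_AARCH64_P32_ABS32 : R_AARCH64_NONE;
    return Kind == FixupKind::Data8 ? R_AARCH64_ABS64 : R_AARCH64_ABS32;
  }
};
```

The open question in this post is only how the ABI name reaches the constructor; once it is there, the choice itself is a simple branch.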

Daniel_Sanders · December 17, 2015, 2:30pm

Hi Anson,

I’ve been working on similar problems in MIPS. We have several problems with the same root cause but the most relevant is that our N32 ABI implementation behaves too much like N64. We get lots of important N32 details wrong with one of the biggest being that we get the wrong EI_CLASS because we derive it from the triple and not the ABI (which is currently unavailable to the relevant object).

I have three patches that make a start on a general solution for this kind of problem (http://reviews.llvm.org/D13858, http://reviews.llvm.org/D13860, and http://reviews.llvm.org/D13863). The overall intent is that we create an MCTargetMachine that describes the desired target (taking into account the default ABI for the triple and any options that change it) and use it as a factory for the MC layer objects. This way we can pass relevant detail down to the MC objects without having to have all targets agree on what information should be provided to each object. This mechanism can then be extended to other target-specific detail as needed.

This mechanism also provides the groundwork to solve the Triple ambiguity problem (see [LLVMdev] The Trouble with Triples) that most targets have to some degree but ARM and MIPS particularly suffer from. This problem isn’t limited to the MC layer; it also causes problems with CodeGen and compatibility with GCC (differences in default option values, etc.).

My work in this area has been in review since July and there have been no commits yet, so I’ve recently been considering adding MCTargetOptions to some of the createMC*() functions as a stop-gap measure to get some of the bugs fixed sooner. I’ll still need to fix the triple ambiguity problem properly to avoid releasing multiple single-target clang toolchains (which I’m very keen to avoid doing but I don’t have much choice as things stand) but it at least lets me improve matters.

By the way, you’ll find that some paths through clang use the default constructor of MCTargetOptions and therefore neglect to set MCTargetOptions::ABIName. I was planning to fix this once I had the backend side of things working.

> Should I make up a new OSABI enum value? Do some kind of manipulation of the Triple environment field to set it based upon the value of “-mabi=”?

Both of those approaches would work and are similar to Debian’s concept of Multiarch Tuples.

My original TargetTuple solution was somewhat similar in principle but unfortunately was not accepted. In the TargetTuple solution, I was trying to introduce a boundary between the world of GNU Triples and the world of LLVM Target Descriptions. At the moment llvm::Triple is responsible for interpreting GNU Triples and being a target description within LLVM. So in the TargetTuple solution, llvm::Triple parsed the triple and was then used to initialize a more detailed, unambiguous, and authoritative target description in llvm::TargetTuple. Command line arguments then modified the TargetTuple after which it was passed to the backend instead of llvm::Triple.
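
A rough sketch of that two-step flow follows, assuming nothing about the actual patches: SketchTargetTuple, fromTriple, and applyABIOption are hypothetical names, and the default-ABI mapping is an assumption made only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical target description; not the llvm::TargetTuple from the patches.
struct SketchTargetTuple {
  std::string Arch;
  std::string ABI;
};

// Step 1: the triple is parsed once and used to initialize the description.
SketchTargetTuple fromTriple(const std::string &Triple) {
  SketchTargetTuple T;
  // Everything before the first '-' is treated as the architecture.
  T.Arch = Triple.substr(0, Triple.find('-'));
  // Assumed default for this sketch only: 64-bit MIPS triples imply n64.
  T.ABI = (T.Arch == "mips64" || T.Arch == "mips64el") ? "n64" : "default";
  return T;
}

// Step 2: command-line arguments modify the description, which then goes to
// the backend in place of the triple.
void applyABIOption(SketchTargetTuple &T, const std::string &MAbi) {
  if (!MAbi.empty())
    T.ABI = MAbi;
}
```

After both steps the description, not the triple, is the authoritative answer to "which ABI?", so an object that needs EI_CLASS would no longer have to guess it from the triple.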

It will be interesting to see what answers you get here. Personally, I was avoiding inventing values in the llvm::Triple enums because MIPS needs to convey information that is only implied by the triple (and therefore needed new member variables) and/or differs between Linux distributions, and also because I thought that separating the GNU Triple parser and the resulting target description was a good thing to do. However, if there’s some agreement that this is the right thing to do then I can rethink my plan and find some way to encode what I need in one of these fields.

Anson_MacDonald · December 17, 2015, 5:06pm

Daniel: Thanks for your detailed response. I had seen the discussion from earlier this year, but when I read it, I didn't expect it would be so difficult to get just one bit of information where I wanted it. Thanks for the heads up about clang not necessarily setting ABIName. I have at least enough of that working already that I can generate the appropriate assembly source.

After doing a little more investigation, I decided to take an approach that seems simpler than yours, as I'm only trying to solve my own problem. It relies on having things lower in the MC layer be able to query MCTargetOptions. This is my plan:

Make a path from the callers of Target::createAsmBackend to get MCTargetOptions to the MCELFObjectTargetWriter subclass or some method in the creation chain:

<client, e.g. llvm-mc>
  -> Target::createAsmBackend(..., MCTargetOptions)
    -> (*MCAsmBackendCtorFn)(..., MCTargetOptions)
      -> <MCAsmBackend subclass constructor wanting options>(..., MCTargetOptions)
         adds MCTargetOptions to the MCAsmBackend subclass state or the bits needed
<MCAsmBackend subclass wanting options>::createObjectWriter(...)
  -> create<foo>ObjectWriter(..., added information)
    -> <foo>ObjectWriter::<foo>ObjectWriter(..., added information)
       sets added state based on constructor args, in my case the ABI, IsILP32
<foo>ObjectWriter::GetRelocType(...)
  use state to guide which relocations are generated

I don't know if the object lifetime of MCTargetOptions allows a reference to be kept around, so the information extraction in the MCAsmBackend subclass constructor may be required.
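
The chain above can be sketched with stand-in types as follows. None of these are the real LLVM classes or signatures: SketchTargetOptions, SketchAsmBackend, SketchObjectWriter, createSketchAsmBackend, and the "ilp32" string are all hypothetical. The backend extracts the one bit it needs in its constructor, so nothing depends on how long the options object lives.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for MCTargetOptions; only the ABI name matters here.
struct SketchTargetOptions {
  std::string ABIName;
};

// Stand-in for a <foo>ObjectWriter that needs to know the ABI.
class SketchObjectWriter {
  bool IsILP32;

public:
  explicit SketchObjectWriter(bool ILP32) : IsILP32(ILP32) {}
  bool isILP32() const { return IsILP32; }
};

// Stand-in for an MCAsmBackend subclass wanting options.
class SketchAsmBackend {
  bool IsILP32; // extracted up front; no reference to the options is kept

public:
  explicit SketchAsmBackend(const SketchTargetOptions &Options)
      : IsILP32(Options.ABIName == "ilp32") {}

  // Mirrors createObjectWriter(...) passing the added information along.
  std::unique_ptr<SketchObjectWriter> createObjectWriter() const {
    return std::make_unique<SketchObjectWriter>(IsILP32);
  }
};

// Stand-in for Target::createAsmBackend(..., MCTargetOptions).
std::unique_ptr<SketchAsmBackend>
createSketchAsmBackend(const SketchTargetOptions &Options) {
  return std::make_unique<SketchAsmBackend>(Options);
}
```

A client such as llvm-mc would be the caller of createSketchAsmBackend in this picture; the object writer it eventually obtains still reports the right ABI even if the options object has already gone out of scope.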

Anson

Daniel_Sanders · December 18, 2015, 11:22am

That sounds like a good plan for your problem and I should be able to use it to fix a couple of the details in N32 such as private label prefixes.

> Daniel: Thanks for your detailed response. I had seen the discussion from
> earlier this year, but when I read it, I didn't expect it would be so difficult to
> get just one bit of information where I wanted it. Thanks for the heads up
> about clang not necessarily setting ABIName. I have at least enough of that
> working already that I can generate the appropriate assembly source.

Glad I could help. I've been surprised by the difficulty of getting information in the right place too (and getting accurate information).

> I don't know if the object lifetime of MCTargetOptions allows a reference to
> be kept around, so the information extraction in the MCAsmBackend
> subclass constructor may be required.

It looks like MCTargetOptions do live long enough in LLVM's tools but I think that's a coincidence rather than the intent. It's probably best to take a copy in the MCAsmBackend.

echristo · December 18, 2015, 6:35pm

> That sounds like a good plan for your problem and I should be able to use it to fix a couple of the details in N32 such as private label prefixes.

This is what I was suggesting to you originally.

> > Daniel: Thanks for your detailed response. I had seen the discussion from
> > earlier this year, but when I read it, I didn’t expect it would be so difficult to
> > get just one bit of information where I wanted it. Thanks for the heads up
> > about clang not necessarily setting ABIName. I have at least enough of that
> > working already that I can generate the appropriate assembly source.
>
> Glad I could help. I’ve been surprised by the difficulty of getting information in the right place too (and getting accurate information).
>
> > I don’t know if the object lifetime of MCTargetOptions allows a reference to
> > be kept around, so the information extraction in the MCAsmBackend
> > subclass constructor may be required.
>
> It looks like MCTargetOptions do live long enough in LLVM’s tools but I think that’s a coincidence rather than the intent. It’s probably best to take a copy in the MCAsmBackend.

No, the intent is that things like TargetOptions and MCTargetOptions exist for the life of the program.

-eric

Daniel_Sanders · December 19, 2015, 11:03am

> > That sounds like a good plan for your problem and I should be able to use it to fix a couple of the details in N32 such as private label prefixes.
>
> This is what I was suggesting to you originally.

Yes, and while I agree that it fixes this specific problem, it also fails to address the most important problem that the Trouble with Triples thread was created to resolve (triple ambiguity). From my perspective, this is just plumbing minutiae and was an example to illustrate the real problem that unfortunately turned into a bit of a distraction from it. Additionally, solving that triple ambiguity problem changes the best way to solve the plumbing problem and so I've been reluctant to take the roundabout route because of the unnecessary churn it causes in the public C++ and C APIs.

The reason I've been so insistent on TargetTuple/MCTargetMachine-like solutions is that they give me the tools I need to fix the triple ambiguity problem in a way that doesn't break the principle of LLVM supporting all targets in all builds. They have some other nice benefits too but the motivation is solving triple ambiguity.

From our discussions on-list, off-list, and in-person, I believe we're close to agreeing on the MCTargetMachine patches. Do you share this opinion? As far as I know, the only remaining issue is that you want TargetMachine to be a subclass of MCTargetMachine whereas I don't want to introduce the inheritance diamond this would cause* and prefer a has-a relationship. If possible, I'd like to go ahead with the has-a approach and consider converting to is-a later. It shouldn't be too difficult to switch to is-a if it turns out to be a workable solution and allows me to make progress on the triple ambiguity problem. The snag with this is that it would cause some churn for users of createTargetMachine() in the event that we did switch to is-a. Do you think we're close to a (possibly conditional) LGTM on the three initial patches?

*This part of the discussion happened off-list so for the benefit of the list: The root of my concern with is-a is that we have <Target>MCTargetMachine is-a MCTargetMachine, and <Target>TargetMachine is-a TargetMachine. If we have <Target>TargetMachine is-a <Target>MCTargetMachine then we also need TargetMachine is-a MCTargetMachine because of some existing code. These four relationships form an inheritance diamond.
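
To picture the two options, here is a small sketch with hypothetical stand-in classes (a made-up target "Foo"; none of these are the real LLVM types). The is-a version is written with virtual inheritance, which is one way C++ makes a diamond well-formed; that choice is an assumption of this sketch, not something settled in the discussion.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the proposed MCTargetMachine.
struct SketchMCTargetMachine {
  std::string ABIName;
  virtual ~SketchMCTargetMachine() = default;
};

// <Target>MCTargetMachine is-a MCTargetMachine.
struct SketchFooMCTargetMachine : virtual SketchMCTargetMachine {};

// TargetMachine is-a MCTargetMachine.
struct SketchTargetMachine : virtual SketchMCTargetMachine {};

// <Target>TargetMachine is-a TargetMachine and is-a <Target>MCTargetMachine.
// Together with the two relationships above this closes the diamond; without
// the 'virtual' keywords there would be two SketchMCTargetMachine subobjects
// and an unqualified use of ABIName would be ambiguous.
struct SketchFooTargetMachine : SketchTargetMachine, SketchFooMCTargetMachine {};

// The has-a alternative: no shared base class, just a member.
struct SketchTargetMachineHasA {
  SketchMCTargetMachine MCTM;
};
```

With has-a, callers reach the MC-level description through the member (MCTM here) rather than through the TargetMachine itself.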

> > Daniel: Thanks for your detailed response. I had seen the discussion from
> > earlier this year, but when I read it, I didn't expect it would be so difficult to
> > get just one bit of information where I wanted it. Thanks for the heads up
> > about clang not necessarily setting ABIName. I have at least enough of that
> > working already that I can generate the appropriate assembly source.
>
> Glad I could help. I've been surprised by the difficulty of getting information in the right place too (and getting accurate information).
>
> > I don't know if the object lifetime of MCTargetOptions allows a reference to
> > be kept around, so the information extraction in the MCAsmBackend
> > subclass constructor may be required.
>
> It looks like MCTargetOptions do live long enough in LLVM's tools but I think that's a coincidence rather than the intent. It's probably best to take a copy in the MCAsmBackend.

> No, the intent is that things like TargetOptions and MCTargetOptions exist for the life of the program.
>
> -eric

I'll take your word for the intent but I ought to mention that the code does not appear to require that the API user preserve the MCTargetOptions object it gives to the create*() methods.

MCTargetAsmParser and TargetOptions both take a copy of the object and store the copy as a member and AsmPrinter::EmitInlineAsm() takes a temporary copy of it. All other uses are constructors and factories that take a reference argument and then forget about it.

Can you point me at documentation explaining this requirement or any code that keeps a long-term reference to the API users' MCTargetOptions? I might have missed something.
