[MC] [llvm-mc] Getting target specific information to <target>ELFObjectWriter

Older targets like Mips had/have assemblers and ABIs that carry a lot of baggage.

The small bit of baggage that is giving me fits is that MipsELFObjectWriter needs to know the relocation model (static, pic, cpic), whether we are using xgot (-mgot), which ABI (old, new), which architecture (32r[123], 64[123]), and which, if any, coprocessor or extension instructions are used (mips16, micromips, etc.).

I shouldn’t have to muck with base classes to handle esoteric target specific issues such as these.

ELFObjectWriter is used for direct object output, whether directly from the codegen or from the llvm-mc assembler.

The grand idea was to use MipsSubtarget because it has almost everything I need, but the big glitch is that it is only used by llc. When llvm-mc is used directly, we only get the base class MCSubtargetInfo.

AsmPrinter has access to the codegen variation (Subtarget) and thus doesn’t have this issue.

Is the answer for every createMCSubtargetInfo() call to mimic TargetMachine and create a Subtarget? This would give a target a common class object to work from. The mechanism for filling the information doesn’t need to be the same, but the subclass does.

This seems a bit heavy-handed. I am also looking at Features to see if that could be abused into passing the information. The Features string is stored in the MCSubtargetInfo base class.

I believe a clean solution to this could lead to much simpler parameter lists for creating other MC class objects as well, but currently I just want my e_header hurt to go away.

Feedback please

Jack

So the mips assembler has command line options like -fPIC?

Cheers,
Rafael

Hi Rafael,

There are a lot of flags. Here are the ones you ask about:

-KPIC, -call_shared   generate SVR4 position independent code
-call_nonpic          generate non-PIC code that can operate with DSOs
-mvxworks-pic         generate VxWorks position independent code
-non_shared           do not generate code that can operate with DSOs
-xgot                 assume a 32 bit GOT

Just to make things fun, the SGI notion of cpic (call pic) fits gnu's -call_nonpic.

Remember, the issue is not whether the direct or standalone assembler can deal with this in their own code. The problem is how to convey it to the <target>ELFObjectWriter in a clean manner, because both use it. Some flags are set by default, some come from the command line, and others arise from the run of play. It is not a difficult issue, except that we require the ELF portion to work from the same sketch book for the 2 assemblers.

I am experimenting with putting a std::set into the MCSubtargetInfo base class that is for target use, to see if I can end my ordeal ;) It is probably only useful for the older targets which have a lot of baggage to support. If that works I can convert it to work with Features, but that is a bit of overkill at this point.

Cheers,

Jack

Awesome. As a thought exercise I would suggest forgetting for the
moment that llvm even has a codegen module. Just pick one of the above
options and an example .s where it changes the output and then make
llvm-mc do the same.

Without knowing what the options do it is hard to guess what the best
change would be, but a first try would be extending
MCELFObjectTargetWriter.

Can you post an example of a .s where one of the above options makes a
difference? I.e., can you write a test that should pass once the
option is implemented in llvm-mc? Having a concrete example should
make it easier to discuss.

Cheers,
Rafael

Here are some examples of the GNU assembler reacting to the same input file with different command-line options.

These are using the GCC assembler on hello.c
// abi o32, arch mips32r2, relocation model pic+cpic
mips-linux-gnu-as -mips32r2 -EL -KPIC -o hello_gas.o hello_gas.s
e_flags 0x70001007 EF_MIPS_NOREORDER EF_MIPS_PIC EF_MIPS_CPIC E_MIPS_ABI_O32 EF_MIPS_ARCH_32R2

// abi o32, arch mips32r2, relocation model cpic
mips-linux-gnu-as -mips32r2 -EL -call_nonpic -o hello_gas.o hello_gas.s
e_flags 0x70001005 EF_MIPS_NOREORDER EF_MIPS_CPIC E_MIPS_ABI_O32 EF_MIPS_ARCH_32R2

// abi o32, arch mips32r2, relocation model non-shared (not pic or cpic)
mips-linux-gnu-as -mips32r2 -EL -non_shared -o hello_gas.o hello_gas.s
e_flags 0x70001001 EF_MIPS_NOREORDER E_MIPS_ABI_O32 EF_MIPS_ARCH_32R2

// abi n32, arch mips3, relocation model non-shared
mips-linux-gnu-as -mips3 -EL -non_shared -n32 -o hello_gas.o hello_gas.s
e_flags 0x20000021 EF_MIPS_NOREORDER EF_MIPS_ABI2 EF_MIPS_ARCH_3

I dumped these with my own dumper, but you can do it with "readelf -h".
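
As a cross-check on the e_flags values listed above, here is a minimal C++ sketch that decodes them. The constant values are the usual MIPS ELF ones. The function name describeMipsEFlags is made up for illustration, and the decoder only knows the handful of bits that appear in these examples.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// MIPS e_flags bits that appear in the examples above. This is a partial
// list, not the full set defined for MIPS ELF.
enum : uint32_t {
  EF_MIPS_NOREORDER = 0x00000001,
  EF_MIPS_PIC       = 0x00000002,
  EF_MIPS_CPIC      = 0x00000004,
  EF_MIPS_ABI2      = 0x00000020,
  EF_MIPS_ABI       = 0x0000f000, // mask for the ABI field
  E_MIPS_ABI_O32    = 0x00001000,
  EF_MIPS_ARCH      = 0xf0000000, // mask for the architecture field
  EF_MIPS_ARCH_3    = 0x20000000,
  EF_MIPS_ARCH_32R2 = 0x70000000,
};

// Hypothetical helper: turn an e_flags word into the names used above.
std::string describeMipsEFlags(uint32_t Flags) {
  std::string Out;
  auto Add = [&Out](const char *Name) {
    if (!Out.empty())
      Out += ' ';
    Out += Name;
  };
  if (Flags & EF_MIPS_NOREORDER) Add("EF_MIPS_NOREORDER");
  if (Flags & EF_MIPS_PIC)       Add("EF_MIPS_PIC");
  if (Flags & EF_MIPS_CPIC)      Add("EF_MIPS_CPIC");
  if (Flags & EF_MIPS_ABI2)      Add("EF_MIPS_ABI2");
  if ((Flags & EF_MIPS_ABI) == E_MIPS_ABI_O32) Add("E_MIPS_ABI_O32");
  switch (Flags & EF_MIPS_ARCH) {
  case EF_MIPS_ARCH_3:    Add("EF_MIPS_ARCH_3");    break;
  case EF_MIPS_ARCH_32R2: Add("EF_MIPS_ARCH_32R2"); break;
  default:                Add("EF_MIPS_ARCH_<other>"); break;
  }
  return Out;
}
```

Calling describeMipsEFlags(0x70001007) reproduces the flag list of the first example. The only difference between the -KPIC and -call_nonpic outputs is the EF_MIPS_PIC bit.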

The issue goes beyond the command line though. The real issue is that <target>ELFObjectWriter is used by both the integrated assembler and the standalone one, but the 2 gather the information that ends up in the ELF header in different ways. The standalone assembler, for example, also takes it from assembler directives such as ".option pic0", which forces the non-shared relocation model.

The direction I am going in is to add a new data member to MCSubtargetInfo that is a std::set. This set of booleans is target specific and is used as a bulletin board. This allows me to update my MipsSubtargetInfo object whenever it, or a derived reference to it, is available. I have a reference to SubtargetInfo in <target>ELFObjectWriter in my current patch.

I am open to better suggestions as long as it solves the basic issue and allows me to mark up ELF header information with fungible target specific information without affecting other targets going forward.

Cheers,

Jack

I don't see a definition of a MipsSubtargetInfo class in the tree. Do you mean MipsSubtarget as defined in MipsSubtarget.h?

-Jim

Jim,

You are correct: MipsSubtarget.

For llvm-mc we have a straight MCSubtargetInfo object. For llc we get a MipsSubtarget object which derives from MipsGenSubtargetInfo which derives from TargetSubtargetInfo which derives from MCSubtargetInfo.

The patch I hope to send out for review will do this:

Add a new data member to the MCSubtargetInfo base class. It will be a set of integers that targets may or may not use to post flags that otherwise cannot be stored equally by both the llc and llvm-mc use of MCSubtargetInfo.

An MCSubtargetInfo reference will be passed to the create<target>AsmBackend() method, which passes it on to createObjectWriter().

For llc, most of the flags added to the MCSubtargetInfo object would be set during the construction of MipsSubtarget. For llvm-mc, most of the flags would be set during the <target>AsmParser construction. I say most because some flags may be set during the run of play, such as through assembler directives.

All flag types would be target specific. I define my enumerated list in MipsMCTargetDesc.h.

After this change, any target specific ELF header changes should not affect any files above the llvm/lib/Target/<target> level with the possible exception of llvm/include/llvm/Support/ELF.h.
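
A minimal sketch of the bulletin-board idea described above, using stand-in types rather than the real LLVM classes. Every name here (SubtargetInfoSketch, MipsFlagSketch, getEFlagsSketch) is hypothetical and is not taken from the attached patch. The numeric e_flags values are the ones from the gas examples earlier in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Stand-in for the MCSubtargetInfo base class; NOT the real LLVM class.
class SubtargetInfoSketch {
  std::set<int> TargetFlags; // the proposed target-use "bulletin board"

public:
  void postFlag(int Tag) { TargetFlags.insert(Tag); }
  void clearFlag(int Tag) { TargetFlags.erase(Tag); }
  bool hasFlag(int Tag) const { return TargetFlags.count(Tag) != 0; }
};

// Target-private tags. The base class only ever sees plain integers.
enum MipsFlagSketch { MF_Pic, MF_CPic, MF_AbiO32, MF_Arch32R2 };

// What a target's getEFlags() could do with the board (assumed logic).
uint32_t getEFlagsSketch(const SubtargetInfoSketch &STI) {
  uint32_t Flags = 0x00000001; // EF_MIPS_NOREORDER, as in the gas output
  if (STI.hasFlag(MF_Pic))      Flags |= 0x00000002; // EF_MIPS_PIC
  if (STI.hasFlag(MF_CPic))     Flags |= 0x00000004; // EF_MIPS_CPIC
  if (STI.hasFlag(MF_AbiO32))   Flags |= 0x00001000; // E_MIPS_ABI_O32
  if (STI.hasFlag(MF_Arch32R2)) Flags |= 0x70000000; // EF_MIPS_ARCH_32R2
  return Flags;
}
```

Whoever holds the object can post or clear a tag: the subtarget constructor for llc, the assembler parser for llvm-mc, or a directive handler mid-run. The writer reads the tags back the same way in every case.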

I'll send out the patch later today. I am not wedded to it, but need a solution.

Jack

Attached are the promised patches for the below proposed change.

Cheers,

Jack

elf_header.patch (28 KB)

elf_header_clang.patch (1.59 KB)

Just a quick question from an initial review: isn't the int->bool
mapping of flags a bit limiting? Flags can have actual values and not
only be there or not be there. Wouldn't a more generic mapping
(string->string?) be more universally useful? Or am I missing
something obvious here...

Eli

Eli,

This is the kind of feedback I want. I believe I have to add to the base class, so it should be generally useful. I can see string being better for the value. I am still enamoured with an enumeration for the tag though: int->string. How would that be a limitation?

How about the rest of the patch?

I appreciate the feedback,

Jack

I guess that's fine, as long as you don't just limit it to binary "has
/ hasn't flag".
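
A minimal sketch of the int->string variant under discussion, so that a tag can carry a value rather than only be present or absent. The class and enumerator names are hypothetical and do not come from the patch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical value-carrying board: integer tag -> string value.
class TargetPropertyBoard {
  std::map<int, std::string> Values;

public:
  void set(int Tag, std::string Value) { Values[Tag] = std::move(Value); }
  bool has(int Tag) const { return Values.count(Tag) != 0; }

  // Returns the stored value, or Default when the tag was never posted.
  std::string get(int Tag, const std::string &Default = "") const {
    auto It = Values.find(Tag);
    return It == Values.end() ? Default : It->second;
  }
};

// Target-private tags (illustrative only).
enum MipsPropertySketch { MP_RelocModel, MP_Abi, MP_Arch };
```

With this shape, a single MP_RelocModel tag can hold "pic", "cpic" or "static" instead of needing one boolean per model.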

> How about the rest of the patch?

There's one thing about it I'm not sure I understand. You are
essentially passing commands to the assembler via "target"
information. But how does this make sense? I realize that the flags
themselves (their kinds and possible values) are properties of the
target, but their passing to the assembler is not. In other words, I
would expect the assembler driver to propagate flags down to the ELF
writer in some manner which is not through the target object. The
target object is supposed to provide information about the target,
which does not depend on the particular invocation of the assembler
and the flags passed to it.

I hope the above is coherent; if not, feel free to demand another attempt.

Eli

Eli,

Yes, SubtargetInfo is more of a container of convenience since it is available to all the assemblers. Working with the current framework it seemed the least disruptive.

I'll describe the problem again.

The Mips ABI, for better or worse, uses the ELF header's e_flags extensively. The most pressing issue is the need to post the relocation model, such as PIC, CPIC or non-shared. The object method that allows me to update the e_flags at the target level, <target>ELFObjectWriter::getEFlags(), had no access to any target specific information.

MipsELFObjectWriter is created during MipsAsmBackend construction, which in turn happens during an invocation of createMipsAsmBackend().

create<target>AsmBackend is called by:
  the codegen (integrated assembler)
  llvm-mc (standalone assembler)
  clang (cc1as_main.cpp)

My solution for giving <target>ELFObjectWriter access to target specific data was to pass it a reference to SubtargetInfo through the <target>AsmBackend construction.

For the integrated assembler this works well because what is really passed is a derived class of <target>Subtarget. I can add stuff to MipsSubtarget without affecting any other target. Here is the inheritance:
  SubtargetInfo
  TargetSubtargetInfo
  <target>GenSubtargetInfo
  <target>Subtarget

<target>Subtarget is created through TargetMachine and is codegen centric.

For the standalone assembler and clang this isn't the case. A straight SubtargetInfo object is created.
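
The difference between the two cases can be sketched with stand-in types. The names are hypothetical, and dynamic_cast is used purely for illustration (LLVM is normally built without RTTI). The point is only that code holding a base-class reference cannot count on the derived, target-specific state being there.

```cpp
#include <cassert>

// Plays the role of the base SubtargetInfo class.
struct BaseInfoSketch {
  virtual ~BaseInfoSketch() = default;
};

// Plays the role of <target>Subtarget, which only the codegen path creates.
struct MipsInfoSketch : BaseInfoSketch {
  bool IsPic = true;
};

// True only when the object behind the reference really is the derived
// class, i.e. on the codegen path in this sketch.
bool canSeeMipsState(const BaseInfoSketch &STI) {
  return dynamic_cast<const MipsInfoSketch *>(&STI) != nullptr;
}
```

Passing a MipsInfoSketch returns true and passing a plain BaseInfoSketch returns false, which mirrors the llc versus llvm-mc split described above.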

From here I added a single data member to the SubtargetInfo class that could be used as a message board, so when one is in a method like MipsELFObjectWriter::getEFlags() one could access the information in a common manner.

I made the message board a set with the tag being an integer so each target which chooses to use it could create their own enumerations for the stored info. My initial thought was to make it a set of ints because I just wanted to know if things were on or off, but maybe a set of pointers to templated unions would be more flexible although I do believe that this is overkill.

Cheers,

Jack

Hi Jack,

I understand your motivation here, as I mentioned earlier. It's just
that I don't like the current piping we have in LLVM to pass this
information around. I think that a lot of information is piggybacking
on the Target/Subtarget abstractions, just because these objects
already get pushed everywhere. I wouldn't expect you to turn the whole
infrastructure on its head just for the small changes you need
committed, so I have no objections to your patch. Not being the code
owner I can't formally approve it, of course.

Just a final suggestion: from a cursory look at the llvm-mc tool, it
passes relocation model flags to the streamers by means of a
MCObjectFileInfo object that gets added to MCContext. Could you by any
chance use this for your needs?
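
A rough sketch of that alternative, in which invocation state travels through a context object instead of the target object. These are stand-in types, not the real MCObjectFileInfo or MCContext, and the mapping from relocation model to e_flags bits is inferred from the gas examples earlier in the thread.

```cpp
#include <cassert>
#include <cstdint>

enum class RelocModelSketch { Static, PIC, CPIC };

// Stand-in for a per-invocation object-file info record.
struct ObjectFileInfoSketch {
  RelocModelSketch RM = RelocModelSketch::Static;
};

// Stand-in for the context that carries the record around.
struct ContextSketch {
  const ObjectFileInfoSketch *OFI = nullptr;
};

// Relocation-model bits a writer could derive from the context alone.
uint32_t relocBitsFromContext(const ContextSketch &Ctx) {
  if (!Ctx.OFI)
    return 0;
  switch (Ctx.OFI->RM) {
  case RelocModelSketch::PIC:
    return 0x2 | 0x4; // EF_MIPS_PIC | EF_MIPS_CPIC, as -KPIC produced
  case RelocModelSketch::CPIC:
    return 0x4; // EF_MIPS_CPIC only, as -call_nonpic produced
  case RelocModelSketch::Static:
    return 0; // neither bit, as -non_shared produced
  }
  return 0;
}
```

This keeps the target object free of per-invocation state, at the cost of the writer needing access to the context.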

Eli