New TargetSpec 'llvmnote'

Hi All,

There has recently been a discussion on the LLDB list about how to deal with targets, and our current mishmash of llvm::Triple and the various subclasses of TargetSubtarget leaves a lot to be desired. GNU target triples are really important as inputs to the compiler (users want to specify them), but they aren't detailed enough for internal clients.

Anyway, in short, I think that we should unify the variety of code we have for dealing with this stuff into a new TargetSpec class. I don't have any short-term plan to implement this, but I wrote up some of my thoughts here:
http://nondot.org/sabre/LLVMNotes/TargetSpec.txt

Remember that this isn't intended to be something users deal with, it's just an internal implementation detail of the compiler, debugger, nm implementation, etc.

-Chris
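A short illustration of the ambiguity being described, using the llvm::Triple API as it exists in later LLVM releases (the header path and the exact normalized output vary by version; this is a sketch for context, not part of the proposal):

```cpp
#include "llvm/TargetParser/Triple.h" // "llvm/ADT/Triple.h" in older trees
#include <iostream>

int main() {
  // normalize() inserts the missing vendor component, turning the common
  // three-part spelling into the four-part "x86_64-unknown-linux-gnu":
  // the "linux triples are actually quadruples" problem.
  std::cout << llvm::Triple::normalize("x86_64-linux-gnu") << "\n";

  // Subtarget detail ("v7") is mangled into the architecture field, one
  // of the complaints that motivates TargetSpec.
  llvm::Triple T("armv7-apple-darwin10");
  std::cout << T.getArchName().str() << "\n"; // prints "armv7"
  return 0;
}
```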

[...]

> Remember that this isn't intended to be something users deal with, it's just an internal implementation detail of the compiler, debugger, nm implementation, etc.

Can I put in a plea to have as much of LLVM as possible *not* require
knowledge of a single, specific architecture to work?

I have various things I would like to do that work on abstract machines,
where I don't have a specific target or CPU in mind, but just want to
work at the bitcode level. Right now the only way I know of doing this
is to hardcode the datalayout into a new target and rebuild the whole
shooting match, LLVM and clang combined. I very much do not want to do this.

What would be really nice is to be able to specify a custom datalayout
on the command line and have as many tools as possible still work,
particularly clang --- trying to generate code with non-standard
datalayouts is kinda hard right now.

Bitcode currently does not carry enough option information to handle
LTO. For example, if you use -O1 for a particular translation unit but
-O4 for the rest of them, that information isn't saved and provided to
LTO when the actual optimization is happening. Similarly, some options
like soft-float/hard-float aren't preserved. We should consider these
issues while solving this.

deep
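(The thread predates it, but later LLVM releases ended up preserving some of these per-translation-unit choices as per-function IR attributes so they survive LTO merging. A minimal sketch of that mechanism, using the real "use-soft-float" attribute string:)

```cpp
#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"

// Record the soft-float choice on every function in a module. Because the
// attribute travels with each function, a merged LTO module can still tell
// which translation units were built with soft-float.
void markSoftFloat(llvm::Module &M) {
  for (llvm::Function &F : M)
    F.addFnAttr("use-soft-float", "true");
}
```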

This request is completely orthogonal to the proposal. If you generate target independent LLVM IR, you don't have to put a triple into the IR. This isn't going to change.

-Chris

[...]

> This request is completely orthogonal to the proposal. If you generate target independent LLVM IR, you don't have to put a triple into the IR. This isn't going to change.

Unfortunately clang doesn't appear to be aware of this. It's forcing me
to specify a triple (or at least, I haven't discovered a way of
generating target-independent IR with it yet). If I want to, say,
generate code where ints are 64 bits but have a 32 bit alignment, as far
as I know I have to go create a custom target and rebuild everything.
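(For what it's worth, the "64-bit ints with 32-bit alignment" layout can at least be expressed as a datalayout string; a sketch using the later llvm::DataLayout API, which was still called TargetData at the time of this thread:)

```cpp
#include "llvm/IR/DataLayout.h"
#include <cassert>

void example() {
  // "e"         little-endian
  // "i64:32:64" i64 has 32-bit ABI alignment, 64-bit preferred alignment
  llvm::DataLayout DL("e-i64:32:64");
  assert(DL.isLittleEndian());
}
```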

If this is purely a clang issue which doesn't extend to the rest of LLVM
that's fine and I'll bring it up with them, but nevertheless I feel that
hardcoding this kind of information into the target is a bit restrictive.

For example: the proposal mentions storing information as enums. This
means that if I want a target with a feature that TargetSpec doesn't
know about, my choices are to either pick 'unknown' or else add it to
the enum table and then rebuild everything.

Given that the actual values of the enums are arbitrary and hidden, they
should just be an implementation detail. An alternative implementation
would be to keep a map of feature tokens and dynamically assign new
values as necessary. This means that it would be possible for TargetSpec
to parse "fnord.le.linux.elf.with-baz" even though TargetSpec knows
nothing about the 'fnord' architecture. This would then allow custom
passes which *do* know what a fnord is to be able to reason about the
feature information.
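(A hypothetical sketch of that dynamic-token idea; FeatureTable and its methods are invented here for illustration, not part of the proposal:)

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Interns feature/arch names on first use, so a spec string such as
// "fnord.le.linux.elf.with-baz" can round-trip through tools that know
// nothing about the "fnord" architecture.
class FeatureTable {
  std::unordered_map<std::string, unsigned> Ids;
  std::vector<std::string> Names;

public:
  // Return the existing id for Name, or assign the next free one.
  unsigned intern(const std::string &Name) {
    auto [It, Inserted] = Ids.try_emplace(Name, (unsigned)Names.size());
    if (Inserted)
      Names.push_back(Name);
    return It->second;
  }
  const std::string &name(unsigned Id) const { return Names[Id]; }
};
```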

Most of these are motivations for refactoring and code cleanup, but not
really for inventing a new target mini-language to replace triples.

The main problems with triples IMHO which motivate this are:

  - The vendor field is vague and non-orthogonal.
  - Triples don't represent subtarget attributes, except in the way that
    subtarget attributes are sometimes mangled into the architecture field
    in confusing ways.

At an initial read, the targetspec proposal’s solutions to these
problems seem reasonable.

It's a little surprising to have a dedicated "Byte Order" field. One
possible reason for it is that mips.le.* is marginally nicer than
mipsel.*, but that's not obviously worth burdening everyone else
for. Another possible reason is to allow otherwise
architecture-independent strings to encode an endianness. However,
that's not a concept that LLVM currently has. And without more
targetdata parts, it's not obvious how useful it is by itself.

On the other hand, if "Byte Order" makes sense to include, should
other parts of targetdata be included? Pointer size seems the next
most desirable -- endianness and pointer size would be sufficient for
many ELF tools, for example. However, the other parts of
targetdata could conceivably be useful too.

The "OS" field seems like it should be renamed to "ABI", since in the
description you discuss actual OSes that support multiple ABIs.

In the "Feature Delta" field, using "+" to add features but using
a character other than "-" to remove them is unfortunate. How about
just prohibiting "-" in CPU names? Or for another idea, how about
prefixing negative features with "no-", as in "core2+sse41+no-cmov"?

Dan

> Unfortunately clang doesn't appear to be aware of this. It's forcing me
> to specify a triple (or at least, I haven't discovered a way of
> generating target-independent IR with it yet). If I want to, say,
> generate code where ints are 64 bits but have a 32 bit alignment, as far
> as I know I have to go create a custom target and rebuild everything.

You cannot generate platform-independent IR out of C.
See the LLVM FAQ (http://llvm.org/docs/FAQ.html) for more info.
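(A tiny example of why: the C family forces the frontend to commit to target facts while producing IR, so the IR cannot stay neutral:)

```cpp
#include <cstdio>

int main() {
  // sizeof(long) is folded to a constant by the frontend: 4 on some
  // targets, 8 on others, so the emitted IR already encodes the choice.
  std::printf("%zu\n", sizeof(long));

  // Reading the first byte of an int is endianness-dependent: prints 1 on
  // little-endian targets and 0 on big-endian ones (assuming 32-bit int).
  union { int i; char c; } u = {1};
  std::printf("%d\n", u.c);
  return 0;
}
```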

>> This leads to a number of problems in LLVM:
>> - we have a bunch of duplication
>> - we have confusion about what a triple is (normalized or not)
>> - no good way to tell if a triple is normalized
>> - no good, centralized way to reason about which triples are allowed and valid
>> - the MC assembler has to link in the entire X86 backend to get subtarget info
>> - we don't have a good way to implement things like .code32 in the MC assembler
>> - LLDB replicates a lot of this code and heuristics
>> - we don't have good interfaces to inquire about the host
>> - we do std::string manipulation in llvm::Triple
>> - linux triples are actually quadruples!
>> - darwin tools that take -arch have to map them onto something internally.

> Most of these are motivations for refactoring and code cleanup, but not
> really for inventing a new target mini-language to replace triples.
>
> The main problems with triples IMHO which motivate this are:
>
>   - The vendor field is vague and non-orthogonal.
>   - Triples don't represent subtarget attributes, except in the way that
>     subtarget attributes are sometimes mangled into the architecture field
>     in confusing ways.
>
> At an initial read, the targetspec proposal's solutions to these
> problems seem reasonable.
>
> It's a little surprising to have a dedicated "Byte Order" field. One
> possible reason for it is that mips.le.* is marginally nicer than
> mipsel.*, but that's not obviously worth burdening everyone else
> for. Another possible reason is to allow otherwise
> architecture-independent strings to encode an endianness. However,
> that's not a concept that LLVM currently has. And without more
> targetdata parts, it's not obvious how useful it is by itself.

In LLDB we currently have an "ArchSpec" class that llvm::TargetSpec
could eventually replace. Currently, one of the main applications for
having a "byte order" bit in LLDB is to allow sensible construction of
default specifications: for example ARM is almost always little endian,
but there are board configurations where this is not the case. I think
with sensible default values most people will not find the extra flag a
burden.

Having a byte order bit just helps model bi-endian architectures that
much more accurately IMHO. For example, it would help when implementing
support for debugging boot code that forces the processor to switch
modes (PowerPC for example).
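(A hypothetical sketch of that default-plus-override construction; the enum and struct here are invented for illustration, loosely modeled on what LLDB's ArchSpec does:)

```cpp
enum class Arch { arm, armeb, ppc, x86 };
enum class ByteOrder { Little, Big };

// The arch implies a sensible default byte order; a board-specific
// configuration (e.g. a big-endian ARM board) can override it afterwards.
struct Spec {
  Arch A;
  ByteOrder BO;

  explicit Spec(Arch A) : A(A), BO(defaultByteOrder(A)) {}

  static ByteOrder defaultByteOrder(Arch A) {
    switch (A) {
    case Arch::arm: // almost always little-endian in practice
    case Arch::x86:
      return ByteOrder::Little;
    case Arch::armeb:
    case Arch::ppc:
      return ByteOrder::Big;
    }
    return ByteOrder::Little;
  }

  void setByteOrder(ByteOrder NewBO) { BO = NewBO; }
};
```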

> On the other hand, if "Byte Order" makes sense to include, should
> other parts of targetdata be included? Pointer size seems the next
> most desirable -- endianness and pointer size would be sufficient for
> many ELF tools, for example. However, the other parts of
> targetdata could conceivably be useful too.

Possibly useful again from an LLDB perspective. I could imagine
debugging x86_64 operating system code and needing a way to communicate
transitions between 64-bit mode and 32-bit compatibility mode seamlessly.
However, I must stress this is *possibly* useful -- I do not have a firm
conclusion to offer here. Perhaps this is something that we could
support on an as-needed basis.

That's true, but the same is also true for a huge variety of other codegen-level flags. I don't think we want to encode every possible detail at this level. Specific things can be solved in different ways: for example, -ffast-math is best solved by adding a flag onto individual fp ops. Some things (like mixed versions of -mpreferred-stack-boundary) are worth just punting on, IMO.

In any case, I'm not interested in trying to tackle the long tail of weird codegen options + LTO at this point.

-Chris

> Most of these are motivations for refactoring and code cleanup, but not
> really for inventing a new target mini-language to replace triples.

That's all I'm proposing. I'm not suggesting that the "mini language" be exposed to users; it's just a data structure serialized for internal-to-LLVM clients. The string form would be persisted in .ll and .bc files, that's all.

> It's a little surprising to have a dedicated "Byte Order" field. One
> possible reason for it is that mips.le.* is marginally nicer than
> mipsel.*, but that's not obviously worth burdening everyone else
> for. Another possible reason is to allow otherwise
> architecture-independent strings to encode an endianness. However,
> that's not a concept that LLVM currently has. And without more
> targetdata parts, it's not obvious how useful it is by itself.

It is useful for doing simple queries about the target, and these are things that can be derived from .o files.
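(Concretely, both facts sit in the first six bytes of any ELF object file; a sketch of the query:)

```cpp
#include <cstdio>

int main(int argc, char **argv) {
  if (argc < 2)
    return 1;
  std::FILE *F = std::fopen(argv[1], "rb");
  if (!F)
    return 1;
  unsigned char Ident[6] = {0};
  std::fread(Ident, 1, sizeof(Ident), F);
  std::fclose(F);
  // e_ident[4] (EI_CLASS): 1 = 32-bit, 2 = 64-bit.
  // e_ident[5] (EI_DATA):  1 = little-endian, 2 = big-endian.
  std::printf("%u-bit, %s-endian\n", Ident[4] == 2 ? 64u : 32u,
              Ident[5] == 2 ? "big" : "little");
  return 0;
}
```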

> On the other hand, if "Byte Order" makes sense to include, should
> other parts of targetdata be included? Pointer size seems the next
> most desirable -- endianness and pointer size would be sufficient for
> many ELF tools, for example. However, the other parts of
> targetdata could conceivably be useful too.

I could be convinced about this. The other approach would be to formalize this as part of the arch spec and treat mips and mips-le as two different arches, and have a predicate that generates the bit on demand.
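(A sketch of that predicate-on-demand alternative; the enumerators are invented here for illustration:)

```cpp
enum class Arch { mips, mipsel, ppc, ppcle, x86 };

// Endianness derived from the arch itself: mips and mipsel are distinct
// arches, so no separate byte-order field is needed.
constexpr bool isLittleEndian(Arch A) {
  switch (A) {
  case Arch::mipsel:
  case Arch::ppcle:
  case Arch::x86:
    return true;
  case Arch::mips:
  case Arch::ppc:
    return false;
  }
  return false;
}
```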

> The "OS" field seems like it should be renamed to "ABI", since in the
> description you discuss actual OSes that support multiple ABIs.

It's really a cross product of OSes and ABIs. For example, darwin10 vs darwin9 is not an ABI difference, it is an OS difference. I consider linux-eabi to be different from linux-someotherabi because the entire OS has to be built that way. *shrug*

> In the "Feature Delta" field, using "+" to add features but using
> a character other than "-" to remove them is unfortunate. How about
> just prohibiting "-" in CPU names? Or for another idea, how about
> prefixing negative features with "no-", as in "core2+sse41+no-cmov"?

Good idea! I changed it to use commas and "no", giving "core2,sse41,nocmov".

-Chris
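(A hypothetical sketch of parsing that revised "cpu,feat,nofeat" syntax; the names here are invented for illustration:)

```cpp
#include <map>
#include <sstream>
#include <string>

struct FeatureDelta {
  std::string CPU;
  std::map<std::string, bool> Features; // name -> enabled
};

// Parse e.g. "core2,sse41,nocmov": the first token names the CPU; a "no"
// prefix on later tokens disables a feature. (Assumes no real feature
// name itself starts with "no".)
FeatureDelta parseFeatureDelta(const std::string &Spec) {
  FeatureDelta D;
  std::istringstream In(Spec);
  std::string Tok;
  bool First = true;
  while (std::getline(In, Tok, ',')) {
    if (First) {
      D.CPU = Tok;
      First = false;
    } else if (Tok.rfind("no", 0) == 0) {
      D.Features[Tok.substr(2)] = false;
    } else {
      D.Features[Tok] = true;
    }
  }
  return D;
}
```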

I think that this can be reliably determined from the arch (through a predicate). x86-64 will always be 64-bit, x86 will always be 32-bit. Doing a "32-bit ABI in 64-bit mode" needs to be a new arch anyway, so that sort of thing isn't an issue IMO. To Dan's point, this argues for forcing a 1-1 mapping between arch and endianness, which would allow making endianness be a predicate instead of being an encoded part of the data structure.

The *only* downside I see to that is that we couldn't form a TargetSpec that *just* contains an endianness, at least without introducing an "unknown-64bit" and "unknown-32bit" archspec, which seems silly.

-Chris

>>
>> On the other hand, if "Byte Order" makes sense to include, should
>> other parts of targetdata be included? Pointer size seems the next
>> most desirable -- endianness and pointer size would be sufficient for
>> many ELF tools, for example. However, the other parts of
>> targetdata could conceivably be useful too.
>
> Possibly useful again from an LLDB perspective. I could imagine
> debugging x86_64 operating system code and needing a way to communicate
> transitions between 64-bit mode and 32-bit compatibility mode seamlessly.
> However, I must stress this is *possibly* useful -- I do not have a firm
> conclusion to offer here. Perhaps this is something that we could
> support on an as-needed basis.

> I think that this can be reliably determined from the arch (through a
> predicate). x86-64 will always be 64-bit, x86 will always be 32-bit.
> Doing a "32-bit ABI in 64-bit mode" needs to be a new arch anyway, so
> that sort of thing isn't an issue IMO.

Ya. You are right. The use case I was thinking of would probably be
better addressed using mechanisms completely unrelated to TargetSpec.

> To Dan's point, this argues for forcing a 1-1 mapping between arch and
> endianness, which would allow making endianness be a predicate instead
> of being an encoded part of the data structure.

> The *only* downside I see to that is that we couldn't form a
> TargetSpec that *just* contains an endianness, at least without
> introducing an "unknown-64bit" and "unknown-32bit" archspec, which
> seems silly.

Thinking about this a bit more, from an API point of view I agree.
If we encode endianness as a fixed property of an arch, then provided we
have methods like getLittleEndian("ppcbe") => "ppcle", an "endian
bit" is largely irrelevant -- the functionality is certainly equivalent
and I think just as easy to use.

Also, we will need something like that anyway to reason about GNU-style
triples.

However, one downside I can see is that we would effectively double the
number of architectures (for the bi-endian case) by having a 1-1
mapping. The tables needed to model all CPU type, subtype, and ABI combos
would be quite large even with the extra level of indirection an
"endianness bit" gives us.

So from an implementation point of view it seems to me like having an
endian field would help here. Implementing a "setByteOrder" method
might read like "is this arch bi-endian? If so flip the bit", as
opposed to implementing (and maintaining) the tables needed to get
specifically from the "armv5l" entry to "armv5b", etc, etc. And if that
turns out to be true then embedding the endianness in a TargetSpec's string
representation makes good sense (to me :).
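(A hypothetical sketch of the byte-order-field design described above; with the bit stored in the spec, setByteOrder only has to check bi-endianness and flip, rather than consult armv5l -> armv5b style tables:)

```cpp
enum class ByteOrder { Little, Big };

struct SpecWithBit {
  unsigned ArchIndex; // index into the cpu/subtype/abi tables
  ByteOrder BO;
  bool IsBiEndian;    // a property of the arch entry

  // "Is this arch bi-endian? If so flip the bit."
  bool setByteOrder(ByteOrder NewBO) {
    if (NewBO != BO && !IsBiEndian)
      return false; // this arch only supports one byte order
    BO = NewBO;
    return true;
  }
};
```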

Perhaps I can find some time to get a rough code sketch together. Might
be useful to experiment with a few different approaches.

Thanks!