Hi Daniel,
(from the context, you might have meant ‘tuple’ where you’ve written ‘triple’. I’m answering based on the assumption you meant ‘triple’)
I did mean what I wrote.
The GNU triple is already used as a way of encoding a large amount of the target data in a string but unfortunately, while this data is passed throughout LLVM, it isn’t reliable because GNU triples are ambiguous and inconsistent. For example, in GCC toolchains mips-linux-gnu probably means a MIPS target on GNU/Linux, but anything beyond that (ISA revision, default ABI, multilib layout, etc.) is up to the person who built the toolchain and may change over time. Another example is that Debian’s definition for i386-linux-gnu has been i486 and i586 at various points in time.
Sorta…
The proposed TargetTuple is a direct replacement for the GNU triple and is intended to resolve this ambiguity and move away from a string-based implementation (we need to keep a string serialization though, see below). Essentially, I’m trying to push the ambiguity out of the internals and give the distributor control of how the ambiguity is resolved for their environment. Once that is done, we’ll be able to rely on the TargetTuple for information about the target such as ABIs, architecture revisions, endianness, etc.
This is pretty vague.
I agree that we should open up the API to specify the appropriate data and that is something that TargetTuple will acquire during steps 4 and 7 of the plan (mostly step 7 where compiler/tool options begin mutating the target tuple). I don’t agree with keeping the GNU triple around, though, for two main reasons. The first is that most people believe that GNU triples accurately describe the target and there will be a strong temptation to inappropriately base logic on them. The second is that the meaning of the triple varies between toolchain builds and over time and there is a significant potential for bugs where different parts of the toolchain use different meanings for the same GNU triple (due to rebuilding or switching toolchains, or moving objects from system to system). We ought to resolve the ambiguity once and then stick to that interpretation.
The string serialization I mentioned above is useful for LLVM-IR as part of a direct replacement for the ‘target triple’ statement. We could split this statement up into smaller pieces but the migration to target tuples is already difficult so I think it would be best to do a direct replacement first and redesign the IR statements later if we want to. The serialization is also useful for command line options on internal tools such as llc to give us precise control over our tests that the GNU triple can’t deliver. This will be particularly important when distributors can apply their own disambiguations to GNU triples. The serialization may also be useful as part of a C API but I haven’t given the C API much thought beyond preserving the current API.
My first impression of using this serialization is that it’s something I’m against. Keep in mind that being able to parse the string can’t invoke a target backend to handle the rest of the parsing. It’d need to be as generic as a DataLayout if you want to do this sort of thing and I’m entirely uncertain this is possible for the goals you (and I) have in mind here.
Hopefully, that helps clear up your concerns. Let me know if there’s anything that still seems strange.
Not really. I don’t see much of a sketch on what you have in mind for your “TargetTuple” here other than “it’ll be a bunch of things together”.
Let me be clear: I do agree with you that the Triple by itself is insufficient for what we want long term in the backends; however, we won’t be able to get rid of it completely. It’s too ingrained into how cross compilation is done as a base. It is, however, possible to design an API that includes the Triple and the relevant information to augment it sufficiently. My vision for this is an API that has a base part that is going to be generic across all targets (think the current arguments to the TargetMachine constructor), and additional target specific information that can be passed in via user customization (i.e. command line options etc).
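One possible shape for such an API, as a minimal sketch only. Every name here (GenericTargetInfo, TargetDescription, setTargetOption, getTargetOption) is hypothetical and does not exist in LLVM, and the CPU, feature and ABI strings are illustrative values. The point is just the split between a generic part shared by all targets and an opaque, target-specific part that only the owning backend interprets:

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical names throughout; none of these types exist in LLVM.

// Generic part: information every target needs, loosely modeled on the
// kind of arguments a TargetMachine is constructed with.
struct GenericTargetInfo {
  std::string Triple;   // the triple is kept, but only as one input among several
  std::string CPU;      // illustrative value: "mips32r2"
  std::string Features; // illustrative value: "+fp64"
};

// Generic part plus free-form, target-specific settings (for example an
// ABI name) that came from user customization such as command line options.
class TargetDescription {
public:
  explicit TargetDescription(GenericTargetInfo Info)
      : Generic(std::move(Info)) {}

  // Records or overwrites a target-specific setting.
  void setTargetOption(const std::string &Key, const std::string &Value) {
    Specific[Key] = Value;
  }

  // Returns Default when the user did not supply the option.
  std::string getTargetOption(const std::string &Key,
                              const std::string &Default) const {
    auto It = Specific.find(Key);
    return It == Specific.end() ? Default : It->second;
  }

  const GenericTargetInfo &generic() const { return Generic; }

private:
  GenericTargetInfo Generic;
  std::map<std::string, std::string> Specific;
};
```

With something like this, a MIPS backend could ask getTargetOption("abi", "o32") rather than inferring the ABI from the triple string, and the default would only apply when nothing more specific was passed in.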
My suggestion on a route forward here is that we should look at the particular
API and areas of the backend that you’re having an issue with and figure out
how to best communicate the data you’d like to the appropriate area. I realize
this probably seems a little vague and handwavy, but I don’t know what areas
you’ve been having problems with lately. I’ll absolutely help with this effort if
you need assistance or guidance in any way.
The MIPS specific problems are broad and varied. Some of the bigger ones are:
- Building clang on a 32-bit Debian and a 64-bit MIPS processor produces a compiler that cannot target the native system. The release packages work around this by ‘cross-compiling’ from the host triple to the target triple which are different strings (mips-linux-gnu vs mips64-linux-gnu) but have the same meaning.
- It’s not possible to produce a clang that can generate code for both 32-bit and 64-bit MIPS without one of them needing a -target option to change the GNU triple. This is because we based the logic on the triple and lack anything else to use.
I blame the mips backend for this one. We can do -m32/-m64 just fine for x86 as an example. Some backends have this problem, others don’t.
- Various details (ELF headers, label prefixes, exception personality, JIT target, etc.) depend on the ABI and OS Distribution rather than just 32-bit vs 64-bit
Sure?
- It’s not possible to implement clang in a way that can support all of mips-linux-gnu’s possible meanings. mips-mti-linux-gnu and mips-img-linux-gnu have the same problem to a lesser degree.
I’m really not sure what any of these things are bringing up. You haven’t actually said what communication problem you’re trying to solve between the user and the compiler here. How about we start this from another perspective? Can you give some examples of what you’d like to do to communicate the information you think you need to various parts of the backend and how you’d like to communicate it?
I promise I’m not trying to be (on purpose at least) particularly dense, but I just don’t have enough information to work with here. I agree that we probably have an API problem, some of which I solved for the mips backend at one point using MCOptions (which I don’t really like as a general solution), but a more general solution that’ll work and be cleaner is definitely a direction I’d like us to go.
-eric