Target-specific info available to Clang (and others)

So,

In order to fix bug PR20787 (common parsing infrastructure for FPU
options between LLVM and tools), I need to create a parser in LLVM
that has all targets' information (types of FPUs, CPUs, archs, etc.)
but that is also visible to external tools.

Until now, all that information has been hidden from view: it is
generated by TableGen and left visible only to the target files.
That makes a lot of sense, but I need some of it exposed.

Including generated files from within LLVM works OK, but Clang can't see
them. If I enable Clang to see them (by adding the include dir to the
path), there will be nothing to stop people from including more than
they should.

The information I want to expose is the set of enums that identify
features, mainly VFP/NEON/Crypto etc. These are in
ARMGenSubtargetInfo.inc under GET_SUBTARGETINFO_ENUM.

One way I could think of, but it's controversial, is to generate two
files: ARMGenSubtargetInfo.inc and ARMGenSubtargetInfoPublic.inc, the
former in the current directory, the latter in a shared include path
for all tools. That way, not only LLVM, but also Clang, llc, lld etc
would be able to use the same parsing mechanism and understand the
generated data (enums, structures) in the exact same way.

Attached are two pseudo-patches that illustrate the problem. The Clang
patch shows how much more powerful such infrastructure could become.
The patch doesn't compile on Clang because Clang cannot see
ARMGenSubtargetInfo.inc.

References:

http://llvm.org/bugs/show_bug.cgi?id=20787
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-August/038699.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140811/231052.html

cheers,
--renato

fpuparse-clang.patch (4.2 KB)

fpuparse-llvm.patch (12.2 KB)

> One way I could think of, but it's controversial, is to generate two
> files: ARMGenSubtargetInfo.inc and ARMGenSubtargetInfoPublic.inc, the
> former in the current directory, the latter in a shared include path
> for all tools. That way, not only LLVM, but also Clang, llc, lld etc
> would be able to use the same parsing mechanism and understand the
> generated data (enums, structures) in the exact same way.

I think this should move a bit further. One feature that we have now
and people seem to find desirable is that clang can produce .ll files
for target X even if target X is not compiled.

In general the problem we have then is that there is a tiny bit of
information about a given target that we want to build even when the
target is not enabled. This means we should probably have something
like

lib/Target/ARM: only built if the ARM target is enabled.
lib/TargetInfo/ARM: always built. Includes only the info that clang wants
to use: parse fpu name, construct the datalayout.
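
As a rough illustration of that split (a sketch only: `TargetInfoEntry`,
`findTargetInfo`, `canEmitIR` and `canEmitObject` are hypothetical names,
the layout strings are placeholders rather than real data layouts, and the
`BackendBuilt` values are made up), the always-built table could answer
front-end queries while object emission also checks that the back-end was
compiled in:

```cpp
#include <array>
#include <string_view>

// Hypothetical always-built description: only what a front end needs.
struct TargetInfoEntry {
  std::string_view Name;
  std::string_view DataLayout; // placeholder text, not a real layout string
  bool BackendBuilt;           // would come from the build configuration
};

// Illustrative table; a real one would be generated, not hand-written.
inline constexpr std::array<TargetInfoEntry, 2> Targets = {{
    {"arm", "<arm-layout-placeholder>", true},
    {"mips", "<mips-layout-placeholder>", false},
}};

// The lookup lives in the always-built library, so it works for every
// described target. Returns nullptr for unknown names.
constexpr const TargetInfoEntry *findTargetInfo(std::string_view Name) {
  for (const TargetInfoEntry &T : Targets)
    if (T.Name == Name)
      return &T;
  return nullptr;
}

// Emitting IR only needs the description.
constexpr bool canEmitIR(std::string_view Name) {
  return findTargetInfo(Name) != nullptr;
}

// Emitting object code also needs the back-end itself.
constexpr bool canEmitObject(std::string_view Name) {
  const TargetInfoEntry *T = findTargetInfo(Name);
  return T != nullptr && T->BackendBuilt;
}
```

With this table, `canEmitIR("mips")` holds even though
`canEmitObject("mips")` does not, which is the behaviour people want to
keep.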

Cheers,
Rafael

Yes! That's the final plan!

So, TableGen would generate two sets of files, one public and another
private. The former only containing high-level target descriptions,
cpu names, feature combinations, etc and the latter containing
everything else, from instructions to pipeline descriptions.

The public stuff should be available somewhere other projects can
reach it without accidentally getting access to the other local files.
Projects should not include the generated files directly, but they'll
include headers that include them, so the directory should be in the
projects' paths, too. We may also need to put these new classes into a
library of their own, no?

cheers,
--renato

Yes, an fpu-name to fpu-enum function would live in such a library. In
the above example TargetInfo would not be just a directory
organization; there would be a lib/libLLVMTargetInfo.a in the build
directory.
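
A minimal sketch of what such an fpu-name to fpu-enum function might look
like. The `FPUKind` enum, the `FPUNames` table and `parseFPU` are
hypothetical; in the proposal the enum values would come from the
TableGen-generated public file, and only a handful of `-mfpu=` spellings
are listed here:

```cpp
#include <array>
#include <string_view>

// Hypothetical enum, written by hand for illustration only.
enum class FPUKind { Invalid, VFPv3, VFPv4, Neon, CryptoNeonFPArmv8 };

struct FPUName {
  std::string_view Name;
  FPUKind Kind;
};

// Illustrative subset of the names accepted by -mfpu= and .fpu.
inline constexpr std::array<FPUName, 4> FPUNames = {{
    {"vfpv3", FPUKind::VFPv3},
    {"vfpv4", FPUKind::VFPv4},
    {"neon", FPUKind::Neon},
    {"crypto-neon-fp-armv8", FPUKind::CryptoNeonFPArmv8},
}};

// One parser that both the driver and the assembler could call.
// Unknown names map to FPUKind::Invalid so the caller can diagnose them.
constexpr FPUKind parseFPU(std::string_view Name) {
  for (const FPUName &F : FPUNames)
    if (F.Name == Name)
      return F.Kind;
  return FPUKind::Invalid;
}
```

Because every tool would link the same function, a name accepted on the
command line would be accepted in an assembly directive too, and vice
versa.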

Cheers,
Rafael

We also need a way to know which intrinsics are available for the current target. Currently clang accepts all of them and you get a “cannot select” error from the backend if the intrinsic is not available. We now have the mechanism to provide a better diagnostic from the backend, but clang will do a better job of getting the source location right... except that clang currently has no way to know details of the targets.

Yes, that would be a reasonable thing to put in the library.

You know, thinking about named registers, Clang accepts any valid
register name for any target, so if you specify "eax" on ARM, clang
will happily let it through, only to fail in the back-end. So maybe even
register names would be a valid thing to have available... I'm
wondering what we shouldn't put in this public library...
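
To make the register case concrete, here is a sketch of the check a front
end could run if per-target register names were public. The `Arch` enum,
the tables and `isValidRegisterName` are hypothetical, and the register
lists are tiny illustrative subsets:

```cpp
#include <cstddef>
#include <string_view>

enum class Arch { ARM, X86 };

// Tiny illustrative register lists, far from complete.
inline constexpr std::string_view ARMRegs[] = {"r0", "r1", "sp", "lr", "pc"};
inline constexpr std::string_view X86Regs[] = {"eax", "ebx", "esp", "ebp"};

// Linear search over one target's register table.
template <std::size_t N>
constexpr bool containsName(const std::string_view (&Regs)[N],
                            std::string_view Name) {
  for (std::string_view R : Regs)
    if (R == Name)
      return true;
  return false;
}

// Accept a register name only if it belongs to the selected target.
constexpr bool isValidRegisterName(Arch A, std::string_view Name) {
  switch (A) {
  case Arch::ARM:
    return containsName(ARMRegs, Name);
  case Arch::X86:
    return containsName(X86Regs, Name);
  }
  return false;
}
```

With a check like this, "eax" on ARM could be rejected in the front end,
with a proper source location, instead of failing later in the back-end.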

How big would this library end up? Size would only be a problem on
small devices, when running native (like Android Renderscript or ARM
JIT), since you don't *really* need all that bloat from all the other
targets. If that's a real problem, wouldn't the value of Clang being
able to produce IR on all archs be overshadowed by it?

I mean, we have three issues:

1. It's nice to have Clang generate IR for all archs on any arch
2. We need target info to do that, currently Clang duplicates everything
3. We don't want to bloat native binaries in restricted environments

I'm beginning to think that the "nice" feature 1 is not worth the two
big problems 2 and 3. If we tie Clang builds to back-end builds and
force it not to have support for other arches (because the info isn't
available if you don't build its back-end), then all that info can
still be public and not completely bloat the final libraries.

Is there any stronger reason why Clang must be able to generate IR for
any arch on any arch?

cheers,
--renato

Last time the idea of just using lib/Target was raised, Alp mentioned
having a use case for the current ability.

Cheers,
Rafael

To clarify: Are you asking if there's a use case for clang being able to generate IR for architectures that it can't generate native code for? The only cases I can think of for this are special targets (SPIR, PNaCl), where there isn't a corresponding back end.

If we can reduce the size of a build of clang that isn't a cross compiler, without reducing the utility of a build of clang that is a cross compiler, then this sounds like a good plan. We generally have two uses for clang:

- On big machines, we want a few symlinks to clang for different targets

- On small machines, we want as small a clang as possible, with only the features that are absolutely essential as a C compiler

While we're discussing target-specific things though, there is one corner case that's come up a few times and is worth considering:

A few people want to target a lowest-common-denominator architecture for most of the code and have some functions use intrinsics or inline assembly relying on features for something more specific (e.g. x86-64 base, SSE 4.2 in some functions) and then decide whether to use those specific functions at run time.

We actually have a more general case of this, where we'd like to be able to embed some inline MIPS assembly in x86 and ARM binaries that are going to be used for controlling a MIPS-compatible coprocessor (or talking to it over JTAG).

David

> To clarify: Are you asking if there's a use case for clang being able to generate IR for architectures that it can't generate native code for? The only cases I can think of for this are special targets (SPIR, PNaCl), where there isn't a corresponding back end.

If the only cases are the ones that don't have back-ends, then the
solution is trivial: create empty back-ends for them that contain
*just* the target information, and everything should behave identically.

> If we can reduce the size of a build of clang that isn't a cross compiler, without reducing the utility of a build of clang that is a cross compiler, then this sounds like a good plan.

That's the idea. I don't think that producing ARM IR on a native MIPS
compiler (or vice-versa) is a useful feature.

> A few people want to target a lowest-common-denominator architecture for most of the code and have some functions use intrinsics or inline assembly relying on features for something more specific (e.g. x86-64 base, SSE 4.2 in some functions) and then decide whether to use those specific functions at run time.

That's where it all started... :)

The problems I'm trying to tackle with this are:

1. http://llvm.org/PR20757: .fpu/.arch/.cpu asm directives don't
change the behaviour in assembly files as they should. The problem is
deeper and spawned a long discussion here and on the GCC list.

2. http://llvm.org/PR20787: assembler directives and command line
parsing have the same arguments and need a common parser.

These problems were the seed of this target info library, which would
free all tools from repeating both the parsing and the identification
of any target-specific knowledge or behaviour.

Once we have a common description, we can then fix the parser by
adding a TargetParser to the TargetInfo set of classes, and fix the
asm directives by changing the local behaviour of the architecture but
not the global (which will be in TargetInfo).
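
One way to picture the local/global distinction, as a sketch with
hypothetical names (`GlobalTargetInfo`, `LocalAsmState`,
`applyFPUDirective`) rather than the actual TargetParser design: the
global description is read-only, and a `.fpu` directive only updates a
per-file copy that starts from it.

```cpp
#include <string_view>

// Hypothetical, deliberately tiny set of FPU kinds.
enum class FPUKind { Invalid, VFPv3, Neon };

// Shared parser, used for both the command line and asm directives.
constexpr FPUKind parseFPUName(std::string_view Name) {
  if (Name == "vfpv3")
    return FPUKind::VFPv3;
  if (Name == "neon")
    return FPUKind::Neon;
  return FPUKind::Invalid;
}

// Global description: fixed once, e.g. from the command line.
struct GlobalTargetInfo {
  FPUKind DefaultFPU;
};

// Local state for one assembly file; directives change only this.
struct LocalAsmState {
  FPUKind ActiveFPU;
};

// Local state starts as a copy of the global default.
constexpr LocalAsmState initialState(const GlobalTargetInfo &Global) {
  return LocalAsmState{Global.DefaultFPU};
}

// Apply a ".fpu <name>" directive. Unknown names leave the state
// unchanged in this sketch; a real assembler would emit a diagnostic.
constexpr LocalAsmState applyFPUDirective(LocalAsmState State,
                                          std::string_view Name) {
  const FPUKind Parsed = parseFPUName(Name);
  if (Parsed != FPUKind::Invalid)
    State.ActiveFPU = Parsed;
  return State;
}
```

Since `applyFPUDirective` takes and returns the local state by value, the
global default cannot be modified by a directive, which is the property
described above.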

cheers,
--renato

I think this is OK for PNaCl.

What David is suggesting is interesting though: what if a target wanted to
have ARM, x86 and MIPS assembly all assembled in a single executable at the
end (not PNaCl's use case)? It seems like you could just create separate .o
files and merge them later, no?

To have any target's assembly/object, you'd have to have its back-end,
so it still would work in a target-specific way. I see no problems
with this.

cheers,
--renato