RFC: Adding __INTEGRATED_ASSEMBLER__ macro

This seems to be a slightly contentious suggestion, so it seems fitting to bring it up on cfe-dev.

Given that -fintegrated-as and -fno-integrated-as are available to the user, and that the integrated assembler can be a bit stringent (as compared to GAS at least, which, for example, supports deprecated syntax), it might be nice to permit the user to detect that an integrated assembler is in use.

The flag is intentionally generic so that if another compiler were to implement an integrated assembler, the same flag could be used. The variant could be detected the same way the compiler is detected (e.g. defined(__clang__) && defined(__INTEGRATED_ASSEMBLER__) would indicate the LLVM IAS).
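
A minimal sketch of the check described above, assuming the proposed macro existed. __INTEGRATED_ASSEMBLER__ is hypothetical (it is the macro under discussion, and nothing in this thread says any compiler defines it); USING_LLVM_IAS and assembler_kind() are names invented here for illustration only.

```c
/* Hypothetical: __INTEGRATED_ASSEMBLER__ is the macro proposed in this RFC.
   __clang__ is the real macro clang defines to identify itself. */
#if defined(__clang__) && defined(__INTEGRATED_ASSEMBLER__)
#  define USING_LLVM_IAS 1   /* clang, with its integrated assembler */
#else
#  define USING_LLVM_IAS 0   /* external assembler, or simply unknown */
#endif

/* Illustrative helper: names the assembler this file believes it targets. */
static const char *assembler_kind(void)
{
    return USING_LLVM_IAS ? "LLVM integrated assembler"
                          : "external or unknown assembler";
}
```

Since the macro is not defined today, the second branch is taken with any current compiler.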

This would also allow checking the version of the assembler, which is currently not possible. Doing so would enable using newer features of the assembler over time without breaking compatibility with slightly older toolchains.

As mentioned on IRC, I am opposed to the idea. There are a few reasons why:

* The integrated assembler is just another assembler. I wouldn't like
for us to provide a compile-time macro for it but not for the version
of gas, for example.
* The only cases we should be incompatible are when some feature is
completely missing in one of the assemblers.

Do you have a specific use case in mind? I.e., an example source that
would use __INTEGRATED_ASSEMBLER__? Is that a case that is not
considered a bug (including missing features) in MC?

Cheers,
Rafael

As I said over IRC, I'm also completely against it.

First of all, __INTEGRATED_ASSEMBLER__ would mean any integrated
assembler, not only LLVM's. Even though it was created here, other
tools will compile the same code, and that could mess with developers'
heads. If anything, it should be called __LLVM_IAS__ (as Joerg
suggested).

Now, on to the semantics of such a flag...

This flag would separate IAS vs. !IAS, which in itself is a pretty bad
separation of things. Even though it *can* be used to mean "I'm using
a modern assembler", it would actually be primarily used to write code
that is only relevant to (or supported by) our integrated assembler.
This would create a whole class of ifdefs that would only work with
our assembler. Moreover, assuming that our assembler is "modern"
implies that nothing better than LLVM will appear in the next decade
or so, which, in my view, is a pretty limited view of the world. I
can't see how that is *NOT* going to be yet another magic block of ASM
that only one tool supports.

On a higher level, there's the quality issue. People should test for
*behaviour* and *standards* not *tools* or *versions*. So, if my code
only works on ARM UAL syntax, I should ifdef UAL, not ifdef
MY_OWN_ASM_VERSION_7.34+. ARM is historically polluted with such
flags, and they've now created the ACLE (ARM C Language Extensions),
which moves from architecture version to feature support macros and
extensions, which means it doesn't really matter what tool you're
using, if that tool supports feature A, you can use it.

There is legacy code that is complicated to compile without some form
of backward compatibility, I know that, but I'd rather do like Iain
said and have an assembler with multiple personalities (that can support
multiple syntaxes separately) than start adding magic macros. To me,
it seems like the wrong hack to be fixing the wrong problem.

As I said before, the worse the hack, the longer it lingers.

cheers,
--renato

First, I would assume this would be better spelled as:

__has_feature(integrated_assembler)

But I agree with others that "integrated assembler" isn't a feature which
should be observable in source code.

First, I would assume this would be better spelled as:

__has_feature(integrated_assembler)

Sure, I have no issue with this.

But I agree with others that "integrated assembler" isn't a feature which
should be observable in source code.

On a higher level, there's the quality issue. People should test for
*behaviour* and *standards* not *tools* or *versions*. So, if my code
only works on ARM UAL syntax, I should ifdef UAL, not ifdef
MY_OWN_ASM_VERSION_7.34+. ARM is historically polluted with such
flags, and they've now created the ACLE (ARM C Language Extensions),
which moves from architecture version to feature support macros and
extensions, which means it doesn't really matter what tool you're
using, if that tool supports feature A, you can use it.

Very much. If we have specific assembler features, we should expose them
through __has_feature, but they should be source code visible features
rather than "my code compiles faster with fewer temporary files" features.

Unfortunately, it's not that simple. The IAS is not a perfect drop-in
replacement. As a concrete example, on ARM, the IAS does not support
pre-UAL syntax (which the Linux kernel uses in some cases). This is more
a philosophical limitation than a technical one, AFAIK.

Having the ability to detect what assembler is being targeted is useful. I
might be overlooking something, but I don't see why this would be any more
dangerous than exposing the size of long or long long via the preprocessor.
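
For reference, a small sketch of what "exposing the size of long long via the preprocessor" looks like in practice, using only the standard <limits.h> macros. LONG_LONG_BITS and long_long_bits() are names made up for this illustration.

```c
#include <limits.h>

/* ULLONG_MAX can be evaluated in #if, so the width of long long is
   already visible to the preprocessor on every conforming compiler. */
#if ULLONG_MAX == 0xFFFFFFFFFFFFFFFFULL
#  define LONG_LONG_BITS 64
#else
#  define LONG_LONG_BITS 0   /* some other width; not distinguished here */
#endif

/* Illustrative helper: returns the detected width, or 0 if not 64 bits. */
static int long_long_bits(void)
{
    return LONG_LONG_BITS;
}
```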

But you've just said: "the IAS does not support pre-UAL syntax". I think
this precisely answers the question. Add
"__has_feature(some_spelling_of_what_UAL_stands_for)" which says
specifically that the UAL syntax is supported. And/or, __has_extension(...)
for the name of the pre-UAL syntax which could hypothetically be supported
as an extension, but isn't in Clang. And/or have the UAL-syntax specify a
name of a preprocessor macro that all conforming compilers that support
this syntax are required to define.

Again, here we have a concrete behavioral feature that we can and should
support testing for. This isn't about whether the assembler is integrated
or not, it is about whether the assembler supports a particular syntax on a
particular platform.
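
A rough sketch of what such a feature test could look like. The feature name arm_unified_asm_syntax is hypothetical and stands in for "some_spelling_of_what_UAL_stands_for"; no compiler is claimed to report it. The mnemonic pair is only an illustration of the syntax difference: pre-UAL puts the condition before the size suffix (ldreqb), UAL puts it after (ldrbeq). LDRB_EQ and ldrb_eq_mnemonic() are invented names.

```c
/* Usual fallback so the test also compiles where __has_feature
   is not provided by the compiler. */
#ifndef __has_feature
#  define __has_feature(x) 0
#endif

/* Hypothetical feature name; the point is that it tests for a syntax,
   not for a particular assembler. */
#if __has_feature(arm_unified_asm_syntax)
#  define LDRB_EQ "ldrbeq"   /* UAL: condition after the size suffix */
#else
#  define LDRB_EQ "ldreqb"   /* pre-UAL: condition before the size suffix */
#endif

/* Illustrative helper: the spelling this translation unit would emit. */
static const char *ldrb_eq_mnemonic(void)
{
    return LDRB_EQ;
}
```

The same source would then work with any tool that reports the feature, integrated or not.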

Because the size of long long doesn't change between versions. If I have code that requires long long to be 128 bits now, it won't suddenly work later on a platform where it's 64 bits just because I upgrade my compiler.

The test you want is not 'is the assembler the LLVM integrated assembler', it's 'does the assembler support pre-UAL syntax'. If LLVM 3.9 implements support for pre-UAL syntax, then your test would be wrong and you've added a dependency on an external assembler for no reason.

The problem with exposing these as preprocessor macros at all is that, while it's easy for clang to know about the features of the LLVM integrated assembler, it has no knowledge at all of the behaviour of whatever tool it finds under the name "as", other than that it accepts gas-compatible arguments. The only way for it to find out would be to interrogate the assembler on startup.

If you depend on a particular feature like this, your best bet is to encode the logic in your build configuration system: try to compile a simple file containing pre-UAL syntax assembly and see if it works. You'll need to encode the fall-back logic in your build system anyway.
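
A sketch of how the source side of that could look, assuming the build system has already run the probe. HAVE_AS_PRE_UAL is a hypothetical macro that the build system would pass (for example as -DHAVE_AS_PRE_UAL=1) only when the test file with pre-UAL assembly assembled successfully; ADD_EQ_S and add_eq_s_mnemonic() are likewise invented names. The example instruction is a conditional, flag-setting add, spelled addeqs in pre-UAL syntax and addseq in UAL.

```c
/* Hypothetical: set by the build system after its probe, never by the
   compiler, so it reflects the assembler that will actually be run. */
#if defined(HAVE_AS_PRE_UAL) && HAVE_AS_PRE_UAL
#  define ADD_EQ_S "addeqs"   /* pre-UAL spelling, accepted per the probe */
#else
#  define ADD_EQ_S "addseq"   /* UAL spelling, used as the fall-back */
#endif

/* Illustrative helper: the spelling selected for this build. */
static const char *add_eq_s_mnemonic(void)
{
    return ADD_EQ_S;
}
```

Because the decision comes from an actual assembly attempt, it stays correct if a later assembler release starts (or stops) accepting the old syntax.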

David

I agree this is possibly the best way out. Though, we're talking about
the kernel, and it might be slightly harder than usual to get this
accepted. But that's not an excuse to use clear anti-patterns in a
toolchain.

If you're feeling adventurous, you might propose on the binutils list
for some UAL-specific macros to be set by both compilers (and there
may be enough closed-source compiler engineers listening), so the
*feature* checking macro will be widespread.

cheers,
--renato

But you've just said: "the IAS does not support pre-UAL syntax". I think
this precisely answers the question. Add
"__has_feature(some_spelling_of_what_UAL_stands_for)" which says
specifically that the UAL syntax is supported. And/or, __has_extension(...)
for the name of the pre-UAL syntax which could hypothetically be supported
as an extension, but isn't in Clang. And/or have the UAL-syntax specify a
name of a preprocessor macro that all conforming compilers that support this
syntax are required to define.

Again, here we have a concrete behavioral feature that we can and should
support testing for. This isn't about whether the assembler is integrated or
not, it is about whether the assembler supports a particular syntax on a
particular platform.

While exposing this would be conceptually OK, how would we implement
it? Take pre_ual_syntax for example. With "-c -integrated-as" we know
the answer is currently no since we don't implement it. But what about
"-no-integrated-as -S"? We have no idea where that assembly is going.
It might be sent to gas (has it) or back to us (doesn't have it) or
the ARM assembler (I don't think it has it).

Cheers,
Rafael

This is a very good point. Such macros *only* make sense if you're
actually using the assembler to assemble files, not just validate a
piece of assembler that will be passed, as is, to another assembler,
which might not recognize the syntax that you do.

If the decision is in the compiler, then the compiler either knows
*exactly* what the assembler can or cannot deal with (we don't), or it
just doesn't make sense.

cheers,
--renato