tl;dr Some inline assembly constructs, particularly certain .if directives, are not supported by integrated assemblers.
I plan to resolve this issue beginning with clang -c and subsequently addressing clang -S.
Since the clang -S fix is difficult, certain .if constructs might work with clang -c but not with clang -S for an extended period of time.
C/C++ =(front end)=> LLVM IR =(middle end)=> LLVM IR (optimized) =(instruction selector)=> MachineInstr =(AsmPrinter)=> MCInst =(assembler)=> relocatable object file
In the AsmPrinter pass, AsmPrinter::emitInlineAsm handles inline assembly.
There are 3 code paths:
- -fno-integrated-as (except XCOFF for AIX): emit raw assembly to MCAsmStreamer without parsing/checking
- -S -fintegrated-as (also -c --save-temps -fintegrated-as): parse, check, and emit assembly to MCAsmStreamer
- -c -fintegrated-as: parse, check, and emit object code to MCObjectStreamer
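The three code paths above can be sketched as a small decision function. This is only an illustrative model, not LLVM code: the names Options, InlineAsmPath, and selectInlineAsmPath are hypothetical, and the XCOFF exception for AIX is not modeled.

```cpp
#include <string>

// Hypothetical model of the driver options that matter for inline assembly.
struct Options {
  bool integratedAs;  // true for -fintegrated-as, false for -fno-integrated-as
  bool emitAssembly;  // true for -S (or -c --save-temps), false for plain -c
};

// Hypothetical description of what happens to an inline assembly string.
struct InlineAsmPath {
  bool parsed;           // is the string parsed and checked?
  std::string streamer;  // which streamer receives the result
};

InlineAsmPath selectInlineAsmPath(const Options &o) {
  if (!o.integratedAs)
    return {false, "MCAsmStreamer"};   // raw text, no parsing or checking
  if (o.emitAssembly)
    return {true, "MCAsmStreamer"};    // parsed, then printed as assembly
  return {true, "MCObjectStreamer"};   // parsed, then encoded as object code
}
```

In this model, only the first path skips the parser, which is why constructs that the parser cannot yet handle still work with -fno-integrated-as.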
In theory, their behaviors should be identical. However, quality of implementation can cause differences.
The most noticeable differences involve inline assembly, which is used extensively in the Linux kernel.
(Personally, I haven’t found any other project that stress tests the LLVM integrated assembler to such a great extent.)
In the past few years, many code constructs that only worked with -fno-integrated-as have been made to work with -fintegrated-as as well.
However, a known issue that bothers several users and the Linux kernel involves constant expression folding during inline assembly parsing (llvm/llvm-project issue #62520, ".if constant expression folding works for .s but not for inline assembly due to UseAssemblerInfoForParsing").
GCC with gas accepts the following inline assembly, while our integrated assembler rejects it:
% cat b.cc
asm(R"(
.pushsection .text,"ax"
.globl _start; _start: ret
.if . -_start == 1
ret
.endif
.popsection
)");
% gcc -S b.cc && gcc -c b.cc # succeeded
% clang -S -fno-integrated-as b.cc # succeeded
% clang -c b.cc # failed now. will succeed with https://github.com/llvm/llvm-project/pull/91082
% clang -S b.cc # failed. requires improvement to MCAsmStreamer
<inline asm>:4:5: error: expected absolute expression
4 | .if . -_start == 1
| ^
1 error generated.
I propose "[MC] Remove UseAssemblerInfoForParsing" (llvm/llvm-project pull request #91082) to fix clang -c.
The error for clang -S will be handled in the future.
I’ve been involved in improving integrated assembler support over the past few years, including last year’s successful assembler revamp for RISC-V linker relaxation.
Further enhancements to MCAsmStreamer are planned, including support for printing .if directives and the following changes:
- Move MCObjectStreamer::Assembler to MCStreamer::Assembler so that MCAsmStreamer has access to the assembler.
- In the Res->evaluateAsAbsolute call in AsmParser::parseExpression, pass MCAssembler.
However, these items are not of the highest priority in my spare time.
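To illustrate why the parser needs access to the assembler to fold an expression such as . - _start, here is a minimal toy model. It is an assumption-laden sketch, not LLVM's implementation: ToySymbol, ToyAssembler, and foldDifference are hypothetical names, and the same-fragment rule is a simplification of the real layout logic.

```cpp
#include <cstdint>

// Toy model: a symbol is a fragment index plus an offset in that fragment.
struct ToySymbol {
  int fragment;
  std::int64_t offset;
};

// Toy stand-in for the assembler's knowledge of symbol placement.
struct ToyAssembler {};

// Try to fold `a - b` into a constant. Returns true and sets `result` on
// success; returns false when the difference must stay symbolic.
bool foldDifference(const ToySymbol &a, const ToySymbol &b,
                    const ToyAssembler *assembler, std::int64_t &result) {
  if (assembler == nullptr)
    return false;  // no assembler available: cannot resolve the symbols
  if (a.fragment != b.fragment)
    return false;  // simplification: only fold within a single fragment
  result = a.offset - b.offset;
  return true;
}
```

Mapped onto b.cc, _start would be ToySymbol{0, 0} and the current location after the one-byte ret would be ToySymbol{0, 1}. With a non-null assembler the difference folds to 1 and the .if condition holds; with a null assembler the fold fails, which corresponds to the "expected absolute expression" diagnostic.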
Alternative
Enhance MCAsmStreamer first, to a level that allows seamless support for both clang -c and clang -S.
This approach would ensure consistent user experience and avoid potential confusion arising from differing outputs between the two commands.
I propose prioritizing improvements to MCObjectStreamer first given its relatively higher importance (than MCAsmStreamer), the impending need by the Linux kernel, and the simplification #91082 will bring (the patch will simplify the code in a few other places).
Discrepancies between assembler and object code:
In general, we aim for consistency between compiled source code and assembled intermediate assembly code.
This applies to both object file and diagnostics.
When discrepancies arise, we strive to identify and address them.
However, it’s important to acknowledge that some minor differences might exist.