Hello Brian (and everyone who is reading this),
allow me to be very verbose (-vvvv) so that you, and others, can understand what I did, why I did it, and what I found:
## Motivation
The main objective was to provide an "Apple Companion Guide" to the book "Programming with 64-Bit ARM Assembly Language" by Stephen Smith. The book uses the Raspberry Pi 4 running Linux, and therefore the GNU toolchain and GNU syntax. I wanted to provide both information on how Apple aarch64 devices differ and modified, runnable source code. You can see everything I did here: https://github.com/below/HelloSilicon
## The Problem
I ran into an issue when it came to an example using inline-assembly in C. Here is the original code, which works on ARM64 Linux, simply by invoking gcc on the file without any flags: https://github.com/below/HelloSilicon/blob/4c4b2911c43644adfd3b78aee093857444f28472/Chapter%209/uppertst4.c
When compiling the same exact file with clang on an M1 Mac (again, no flags), there is a trivial warning that we will ignore, and the following errors:

    uppertst4.c:22:4: error: conditional branch requires assembler-local label. 'cont' is external.
    "BGT cont\n"
    ^
    <inline asm>:4:1: note: instantiated into assembly here
    BGT cont
    ^
    uppertst4.c:24:4: error: conditional branch requires assembler-local label. 'cont' is external.
    "BLT cont\n"
    ^
    <inline asm>:6:1: note: instantiated into assembly here
    BLT cont
    ^
    uppertst4.c:28:4: error: conditional branch requires assembler-local label. 'loop' is external.
    "B.NE loop\n"
    ^
    <inline asm>:10:1: note: instantiated into assembly here
    B.NE loop
    ^
## The Provisional Solution
I was able to solve the issue by replacing the `cont` label with a numeric value, 2, and the branches with `BGT 2f` and `BLT 2f`. Curiously, after this change, `loop` could stay like it was.
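For readers who want to see the shape of that change, here is a minimal sketch. It is not the file from the repository: the function name `to_upper_numeric`, the named operands, the decimal immediates, and the plain C fallback for non-AArch64 compilers are all assumptions made for illustration. Unlike the provisional fix described above, it uses numeric labels for both `loop` (`1:`) and `cont` (`2:`).

```c
#include <stddef.h>

/* Hypothetical helper: copies src to dst, upper-casing ASCII a-z.
   Returns the number of bytes written, including the terminating NUL. */
size_t to_upper_numeric(const char *src, char *dst)
{
#if defined(__aarch64__)
    char *start = dst;
    unsigned int ch; /* scratch register, accessed as a W register via %w */

    __asm__ volatile(
        "1:  ldrb %w[ch], [%[src]], #1\n"   /* load byte, post-increment src */
        "    cmp  %w[ch], #122\n"           /* 'z' */
        "    b.gt 2f\n"                     /* forward to the next label 2 */
        "    cmp  %w[ch], #97\n"            /* 'a' */
        "    b.lt 2f\n"
        "    sub  %w[ch], %w[ch], #32\n"    /* 'a' - 'A' */
        "2:  strb %w[ch], [%[dst]], #1\n"   /* store byte, post-increment dst */
        "    cmp  %w[ch], #0\n"
        "    b.ne 1b\n"                     /* backward to the previous label 1 */
        : [ch] "=&r"(ch), [src] "+r"(src), [dst] "+r"(dst)
        :
        : "cc", "memory");

    return (size_t)(dst - start);
#else
    /* Portable fallback so the sketch also builds on other architectures. */
    const char *p = src;
    char *q = dst;
    char c;
    do {
        c = *p++;
        if (c >= 'a' && c <= 'z')
            c = (char)(c - ('a' - 'A'));
        *q++ = c;
    } while (c != 0);
    return (size_t)(q - dst);
#endif
}
```

Calling `to_upper_numeric("This is a test.", buf)` with a buffer of at least 16 bytes should leave `THIS IS A TEST.` in `buf`. Numeric labels may be defined more than once, and `2f` / `1b` always resolve to the nearest definition forward or backward, so they never end up as named symbols.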
## Finding The Real Problem
Other than the trivial warnings, godbolt is absolutely happy with the file: https://godbolt.org/z/6s8bs4rW9
Next, I let clang create assembly output from the original, GNU-syntax source file using `clang -S uppertst4.c` (on the M1 Mac). This produces no errors either, but running the clang assembler `as` on the result does:

    % as uppertst4.s
    uppertst4.s:33:2: error: conditional branch requires assembler-local label. 'cont' is external.
    b.gt cont
    ^
    uppertst4.s:35:2: error: conditional branch requires assembler-local label. 'cont' is external.
    b.lt cont
    ^
    uppertst4.s:40:2: error: conditional branch requires assembler-local label. 'loop' is external.
    b.ne loop
    ^
As I had a [working, standalone assembly file](https://github.com/below/HelloSilicon/blob/main/Chapter%2005/upper.s) which uses both `cont` and `loop`, I took a "divide and conquer" approach: I deleted and added lines in the two files until I knew precisely what caused the issue.
It turned out to be the very last line: clang ends its output with the assembler directive `.subsections_via_symbols`.
LLVM emits it whenever it outputs assembly for a Mach-O binary, as the source code tells us:

    if (TT.isOSBinFormatMachO()) {
      // Funny Darwin hack: This flag tells the linker that no global symbols
      // contain code that falls through to other global symbols (e.g. the obvious
      // implementation of multiple entry points). If this doesn't occur, the
      // linker can safely perform dead code stripping. Since LLVM never
      // generates code that does this, it is always safe to set.
      OutStreamer->emitAssemblerFlag(MCAF_SubsectionsViaSymbols);
    }

https://github.com/llvm/llvm-project/blob/89b57061f7b769e9ea9bf6ed686e284f3e55affe/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp#L568
The part about "Since LLVM never generates code that does this" of course leaves out code that LLVM did not generate, such as inline assembly.
## The Real Solution
Because the C front end created its own symbolic forward label, I learned that to remedy my issue, I needed to prefix `cont` with `L` wherever it is used. Here is the final file for ARM64 on Mac: https://github.com/below/HelloSilicon/blob/main/Chapter%2009/uppertst4.c
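As a sketch of what the `L` prefix looks like in practice (again not the repository file): on Darwin, the assembler treats labels starting with `L` as assembler-local, which is what the error message asks for. The function name `to_upper_llabel`, the label names, the decimal immediates, and the plain C fallback are assumptions for illustration. The `%=` suffix is an extra that the original file does not use; GCC and clang expand it to a number unique to each asm statement, so the named labels should not collide if the compiler duplicates the statement, for example through inlining.

```c
#include <stddef.h>

/* Hypothetical helper: copies src to dst, upper-casing ASCII a-z.
   Returns the number of bytes written, including the terminating NUL. */
size_t to_upper_llabel(const char *src, char *dst)
{
#if defined(__aarch64__)
    char *start = dst;
    unsigned int ch; /* scratch register, accessed as a W register via %w */

    __asm__ volatile(
        "Lloop%=: ldrb %w[ch], [%[src]], #1\n"  /* L prefix: assembler-local on Darwin */
        "    cmp  %w[ch], #122\n"               /* 'z' */
        "    b.gt Lcont%=\n"
        "    cmp  %w[ch], #97\n"                /* 'a' */
        "    b.lt Lcont%=\n"
        "    sub  %w[ch], %w[ch], #32\n"        /* 'a' - 'A' */
        "Lcont%=: strb %w[ch], [%[dst]], #1\n"
        "    cmp  %w[ch], #0\n"
        "    b.ne Lloop%=\n"
        : [ch] "=&r"(ch), [src] "+r"(src), [dst] "+r"(dst)
        :
        : "cc", "memory");

    return (size_t)(dst - start);
#else
    /* Portable fallback so the sketch also builds on other architectures. */
    const char *p = src;
    char *q = dst;
    char c;
    do {
        c = *p++;
        if (c >= 'a' && c <= 'z')
            c = (char)(c - ('a' - 'A'));
        *q++ = c;
    } while (c != 0);
    return (size_t)(q - dst);
#endif
}
```

Note that the `L` prefix is a Mach-O convention. On ELF targets such as Linux, the corresponding local-label prefix is `.L`, so a label like `Lcont` is an ordinary symbol there.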
## What I still don’t know
I have not taken the time to fully understand what `.subsections_via_symbols` does, and I am still learning about these things, so three questions remain:
1) What precisely was the violation in the inline code? Are there "global symbols which contain code that falls through to other global symbols"?
2) Why is the `loop` label apparently not a problem? Just to be sure, I changed its name to something not beginning with `l`, but that did not cause any issue either.
3) Is there a substantial issue in my standalone assembly code that I should know about?
Thank you for bearing with me. If you have any questions or input, please let me know!
Alex