Creation of inline assembly labels across assembly blocks


We want to use clang to compile Intel-syntax inline assembly on 64-bit Windows.

Unfortunately, clang generates errors when code in one __asm block refers to a label defined in another __asm block, for example:

int main() {
    __asm {
        l0:
    }
    __asm {
        jmp l0;
    }
}

Running

clang++ main.cpp

generates the following error:

main.cpp:5:5: error: assembler label 'L__MSASMLABEL_.1__l0' can not be undefined
    __asm {
<inline asm>:2:2: note: instantiated into assembly here
        jmp L__MSASMLABEL_.1__l0
1 error generated.

To investigate, we emitted the LLVM IR by running

clang++ -S -emit-llvm main.cpp

which produces the following intermediate representation:

; ModuleID = 'main.cpp'
source_filename = "main.cpp"
target datalayout = "e-m:x-p:32:32-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc19.36.32535"

; Function Attrs: mustprogress noinline norecurse nounwind optnone
define dso_local noundef i32 @main() #0 {
  %1 = alloca i32, align 4
  store i32 0, ptr %1, align 4
  %2 = call i32 asm sideeffect inteldialect "L__MSASMLABEL_.${:uid}__l0:", "={eax},~{dirflag},~{fpsr},~{flags}"() #1, !srcloc !4
  store i32 %2, ptr %1, align 4
  %3 = call i32 asm sideeffect inteldialect "jmp L__MSASMLABEL_.${:uid}__l0", "={eax},~{dirflag},~{fpsr},~{flags}"() #1, !srcloc !5
  ret i32 %3
}

attributes #0 = { mustprogress noinline norecurse nounwind optnone "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #1 = { nounwind }

!llvm.module.flags = !{!0, !1, !2}
!llvm.ident = !{!3}

!0 = !{i32 1, !"NumRegisterParameters", i32 0}
!1 = !{i32 1, !"wchar_size", i32 2}
!2 = !{i32 7, !"frame-pointer", i32 2}
!3 = !{!"clang version 15.0.1"}
!4 = !{i64 20}
!5 = !{i64 53}

When we change both labels in this IR from L__MSASMLABEL_.${:uid}__l0 to L0, the error disappears and the build succeeds.

How can we convince clang to use a single global L0 symbol instead of generating assembly-block-specific labels with the uid parameter?

In general, LLVM IR doesn’t allow jumping between inline asm blocks; even if you got the label names to match, it would be undefined behavior. (Optimizations need to correctly model control flow to generate correct code.)

For the exact construct you show, we could teach clang to merge adjacent __asm statements, but I’m not sure if that’s representative of your actual code.
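
To illustrate the point about control flow, here is a minimal sketch that keeps both the label and the jump inside one asm statement, so the compiler never sees a branch between blocks. It assumes an x86 target and GNU-style extended asm in AT&T syntax (not the MS-style __asm blocks from the question), and the function name count_down is hypothetical. On other targets it falls back to plain C++.

```cpp
// Hypothetical helper: counts n down to zero and returns the number of
// loop iterations. Assumption: x86 target, GNU-style extended asm.
unsigned count_down(unsigned n) {
    unsigned iterations = 0;
#if defined(__x86_64__) || defined(__i386__)
    asm volatile(
        "testl %[n], %[n]\n\t"  // skip the loop entirely when n == 0
        "jz 2f\n"
        "1:\n\t"                // numeric local label, private to this statement
        "incl %[it]\n\t"
        "decl %[n]\n\t"
        "jnz 1b\n"              // backward jump stays inside the same statement
        "2:\n"
        : [it] "+r"(iterations), [n] "+r"(n)
        :
        : "cc");
#else
    // Portable fallback with the same behavior.
    while (n != 0) {
        ++iterations;
        --n;
    }
#endif
    return iterations;
}
```

Numeric local labels such as 1: and 2: can be reused in other asm statements without clashing, which avoids the need for unique label names.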

Does the label need to be defined in the asm? Perhaps you could change the above example to:

int main() {
l0: // the label is now an ordinary C++ label, visible to the compiler
    asm(""); // 1st half of the previous __asm statement
    asm(""); // 2nd half of the previous __asm statement
    asm goto ("jmp %l0"::::l0);
}

Otherwise, I think this has been discussed elsewhere; the asm contexts don’t share anything, so label definitions are isolated to each asm statement.
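
A slightly fuller sketch of the asm goto approach may help. It assumes an x86 target and a compiler that supports GNU-style asm goto (recent clang or GCC); the function name is_zero and the label name zero are hypothetical. The jump target is a C++ label listed in the asm statement, so the compiler knows about the possible branch.

```cpp
// Hypothetical helper: returns true when value is zero by branching from
// inline asm to a C++ label. Assumption: x86 target with asm goto support.
bool is_zero(int value) {
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__clang__) || defined(__GNUC__))
    asm goto(
        "testl %0, %0\n\t"
        "jz %l[zero]"       // %l[zero] expands to the C++ label below
        :                   // no outputs
        : "r"(value)
        : "cc"
        : zero);            // every label the asm may jump to is listed here
    return false;           // fall-through path: value was non-zero
zero:
    return true;
#else
    // Portable fallback with the same behavior.
    return value == 0;
#endif
}
```

Because the label is declared in the final operand list, the optimizer can model both the fall-through and the jump, which is what makes this form well defined.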

The root cause for our wish to define labels across assembly blocks is the lack of debugging information for inline assembly blocks in clang.

In contrast to the Intel compiler, clang does not make it possible to step through the individual instructions of an inline assembly block at the source level.

Therefore, we tried to define each line as a separate inline assembly block, but this caused the aforementioned problems with labels.

Is it possible to generate debug information after each instruction such that a debugger like WinDbg can step through the code one instruction at a time?

I’d expect most debuggers to have a mode where they can step one instruction at a time, although the lack of debugging information for inline asm may mean it wouldn’t be able to show you source at the same time. I’m not familiar with WinDbg however.

Thank you very much for this hint!

Using the Disassembly view for debugging instead of the Source code view makes it possible to debug the generated inline assembly step by step.