Fatal error when using SjLj EH on X86_64

Hi Folks,

I’m experimenting with exception handling in LLVM for X86_64 on Linux, and I would like to know whether the SjLj approach is still in use anywhere, because code generation appears to fail when handling it.

I have found a fatal error, which I reduced to the following example:

1 - I created a small C++ program with exception handling (sample.cpp):

 void foo(){
     throw 1;
 }
 void bar(){
     try    {
         foo();
     } catch (int e){
     }
 }

2 - Then I generated the corresponding IR with:
$ clang -c sample.cpp -emit-llvm -o sample.ll

3 - Next, I tried to compile that IR with llc (LLVM version 15.0.0git):
$ llc --exception-model=sjlj sample.ll

And got the following fatal error:

The register $rdx needs to be live in %bb.8, but is missing from the live-in list.
LLVM ERROR: Invalid global physical register
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: ../llvm_build/bin/llc --exception-model=sjlj sample.ll --print-after-all
1.	Running pass 'Function Pass Manager' on module 'sample.ll'.
2.	Running pass 'Greedy Register Allocator' on function '@_Z3barv'
 #0 0x000000000451e27a llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (../llvm_build/bin/llc+0x451e27a)
 #1 0x000000000451e43b PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
 #2 0x000000000451c4d6 llvm::sys::RunSignalHandlers() (../llvm_build/bin/llc+0x451c4d6)
 #3 0x000000000451f695 SignalHandler(int) Signals.cpp:0:0
 #4 0x00007f1f9dd70750 __restore_rt (/lib64/libc.so.6+0x42750)
 #5 0x00007f1f9ddbd88c __pthread_kill_implementation (/lib64/libc.so.6+0x8f88c)
 #6 0x00007f1f9dd706a6 gsignal (/lib64/libc.so.6+0x426a6)
 #7 0x00007f1f9dd5a7d3 abort (/lib64/libc.so.6+0x2c7d3)
 #8 0x000000000444001a (../llvm_build/bin/llc+0x444001a)
 #9 0x000000000443fe92 (../llvm_build/bin/llc+0x443fe92)
#10 0x000000000302408f llvm::LiveRangeCalc::findReachingDefs(llvm::LiveRange&, llvm::MachineBasicBlock&, llvm::SlotIndex, unsigned int, llvm::ArrayRef<llvm::SlotIndex>) (../llvm_build/bin/llc+0x302408f)

The problem appears when findReachingDefs processes RDX, which is listed as a live-in of the following MachineBasicBlock:

352B	bb.2.lpad:
	; predecessors: %bb.8
	  successors: %bb.3
	  liveins: $rax, $rdx
368B	  EH_LABEL <mcsymbol .Ltmp2>
384B	  dead %4:gr64 = COPY killed $rdx
400B	  dead %3:gr64 = COPY killed $rax
416B	  %11:gr32 = MOV32rm %stack.0.fn_context, 1, $noreg, 12, $noreg :: (volatile load (s32) from %ir.exception_gep)
432B	  undef %10.sub_32bit:gr64_with_sub_8bit = MOV32rr %11:gr32
464B	  %7:gr32 = MOV32rm %stack.0.fn_context, 1, $noreg, 16, $noreg :: (volatile load (s32) from %ir.exn_selector_gep)
480B	  MOV64mr %stack.1.exn.slot, 1, $noreg, 0, $noreg, %10:gr64_with_sub_8bit :: (store (s64) into %ir.exn.slot)
496B	  MOV32mr %stack.2.ehselector.slot, 1, $noreg, 0, $noreg, %7:gr32 :: (store (s32) into %ir.ehselector.slot)

The other blocks of interest are:

928B	bb.7 (landing-pad):
	; predecessors: %bb.0
	  successors: %bb.9(0x40000000), %bb.8(0x40000000); %bb.9(50.00%), %bb.8(50.00%)

944B	  NOOP <regmask>
960B	  undef %24.sub_32bit:gr64_nosp = MOV32rm %stack.0.fn_context, 1, $noreg, 8, $noreg :: (load (s704) from %stack.0.fn_context + 8, align 8)
976B	  CMP32ri %24.sub_32bit:gr64_nosp, 1, implicit-def $eflags
992B	  JCC_1 %bb.9, 3, implicit killed $eflags

1008B	bb.8:
	; predecessors: %bb.7
	  successors: %bb.2(0x80000000); %bb.2(100.00%)

1024B	  %23:gr64 = LEA64r $rip, 1, $noreg, %jump-table.0, $noreg
1056B	  JMP64m %23:gr64, 8, %24:gr64_nosp, 0, $noreg

The first MachineBasicBlock (bb.2.lpad) is a former landing pad that reads the ExceptionSelectorRegister (X86::RDX). This register is not assigned by the register allocator; it is fixed in advance by the ABI. The fatal error occurs because RDX does not appear in the live-in list of bb.8, the predecessor of bb.2.lpad.

This error does not appear with DWARF EH because bb.2.lpad is a real landing pad there, so findReachingDefs is not run for RDX in this block. With SjLj EH, bb.2.lpad becomes a former landing pad, because X86TargetLowering::EmitSjLjDispatchBlock replaces all landing pads with a jump-table-based dispatcher (bb.7), which becomes the real landing pad. When the landing pads are replaced, the live-in lists of bb.2.lpad's predecessors are not updated.
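
To make the failing check concrete, here is a toy model of it in plain C++. It is only a sketch of the behaviour described above, not LLVM code: the Block struct, firstBlockMissingLiveIn and makeExample are hypothetical names invented for illustration, and the model assumes that no block defines the register with an instruction. A live-in on a real landing pad is treated as defined on entry; for any other block, the backward walk requires the register in each visited block's live-in list.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a machine basic block (hypothetical, not an LLVM type).
struct Block {
    std::string name;
    bool isEHPad;                   // true if the block is a real landing pad
    std::set<std::string> liveIns;  // physical registers listed as live-in
    std::vector<int> preds;         // indices of predecessor blocks
};

// Walks backwards from the block that reads `reg` without defining it.
// Every visited block must list `reg` as live-in. A live-in on an EH pad
// (or on a block without predecessors) counts as defined on entry, so the
// walk stops there. Returns the first offending block name, or "" if none.
// Simplification: no block in this model defines `reg` with an instruction.
std::string firstBlockMissingLiveIn(const std::vector<Block>& cfg,
                                    int useBlock, const std::string& reg) {
    std::vector<int> worklist{useBlock};
    std::set<int> seen{useBlock};
    for (std::size_t i = 0; i < worklist.size(); ++i) {
        const Block& b = cfg[worklist[i]];
        if (b.liveIns.count(reg) == 0)
            return b.name;  // register must be live-in here, but is not
        if (b.isEHPad || b.preds.empty())
            continue;       // value is available on entry to this block
        for (int p : b.preds)
            if (seen.insert(p).second)
                worklist.push_back(p);
    }
    return "";
}

// Builds the three blocks from the dump: bb.7 -> bb.8 -> bb.2.lpad.
// The predecessor of bb.7 is left out because the walk never goes past it.
std::vector<Block> makeExample(bool lpadIsEHPad) {
    return {
        {"bb.7", true, {}, {}},                           // index 0: dispatcher
        {"bb.8", false, {}, {0}},                         // index 1: jump table
        {"bb.2.lpad", lpadIsEHPad, {"rax", "rdx"}, {1}},  // index 2
    };
}
```

Calling firstBlockMissingLiveIn(makeExample(false), 2, "rdx") models the SjLj case and reports bb.8, the same block named in the error message. With makeExample(true), which models the DWARF case where bb.2.lpad is still a landing pad, it reports nothing.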

So my question remains: is this exception model still working on X86, or did I simply forget something?

I would appreciate any help or comments.

Use the -fsjlj-exceptions option for the Clang driver; the IR varies between SJLJ and DWARF-based exceptions.

Thanks for the clue. If I use -fsjlj-exceptions in Clang with assembly or object output, it works fine.

The interesting point is that, when generating IR from clang, -fsjlj-exceptions affects only the personality function (v0 or sj0). Compiling the generated IR with llc --exception-model=sjlj fails again (fatal error). I remain intrigued by this point.

Hm, it's probably worth filing an issue about that, then.