Crash passing a const char* constant to an external function

Hi all,
I have another crash when calling back into a C function from some LLVM JIT’d code.
It fails with EXC_BAD_ACCESS while trying to read from address 0 (initial_string_val is NULL).
This is on a Mac M1 with LLVM 15.0.7.

The C function being called:

StringRep* allocateString(const char* initial_string_val)
{
	const size_t string_len = strlen(initial_string_val);
	StringRep* string_val = allocateStringWithLen(string_len);
	std::memcpy((char*)string_val + sizeof(StringRep), initial_string_val, string_len);
	return string_val;
}

The disassembly of the C code:

winter`Winter::allocateString:
    0x1000b2b18 <+0>:  sub    sp, sp, #0x30
    0x1000b2b1c <+4>:  stp    x29, x30, [sp, #0x20]
    0x1000b2b20 <+8>:  add    x29, sp, #0x20
    0x1000b2b24 <+12>: stur   x0, [x29, #-0x8]
    0x1000b2b28 <+16>: ldur   x0, [x29, #-0x8]
    0x1000b2b2c <+20>: bl     0x1018e9d44               ; symbol stub for: strlen
->  0x1000b2b30 <+24>: str    x0, [sp, #0x10]
    0x1000b2b34 <+28>: ldr    x0, [sp, #0x10]
    0x1000b2b38 <+32>: bl     0x1000b2ac8               ; Winter::allocateStringWithLen at CompiledValue.cpp:107
    0x1000b2b3c <+36>: str    x0, [sp, #0x8]
    0x1000b2b40 <+40>: ldr    x8, [sp, #0x8]
    0x1000b2b44 <+44>: ldur   x1, [x29, #-0x8]
    0x1000b2b48 <+48>: ldr    x2, [sp, #0x10]
    0x1000b2b4c <+52>: add    x0, x8, #0x18
    0x1000b2b50 <+56>: bl     0x1018e9900               ; symbol stub for: memcpy
    0x1000b2b54 <+60>: ldr    x0, [sp, #0x8]
    0x1000b2b58 <+64>: ldp    x29, x30, [sp, #0x20]
    0x1000b2b5c <+68>: add    sp, sp, #0x30
    0x1000b2b60 <+72>: ret   

The module IR of the code making the call to allocateString:

; ModuleID = 'WinterModule'
source_filename = "WinterModule"
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"

%string = type { i64, i64, i64, [0 x i8] }

@0 = private unnamed_addr constant [6 x i8] c"hello\00", align 1

; Function Attrs: nounwind
declare ptr @allocateString_opaque_(ptr) local_unnamed_addr #0

; Function Attrs: nofree nounwind readonly
define ptr @main_string_(ptr nocapture readnone %s) local_unnamed_addr #1 {
entry:
  %str = tail call ptr @allocateString_opaque_(ptr nonnull @0)
  store i64 1, ptr %str, align 8, !string_literal_set_intial_ref_count_to_1 !0
  %0 = getelementptr inbounds %string, ptr %str, i64 0, i32 1
  store i64 5, ptr %0, align 8, !string_literal_set_intial_length_to_5 !1
  %1 = getelementptr inbounds %string, ptr %str, i64 0, i32 2
  store i64 1, ptr %1, align 8, !string_literal_set_intial_flags_to_1 !2
  ret ptr %str
}

attributes #0 = { nounwind "warn-stack-size"="0" }
attributes #1 = { nofree nounwind readonly "warn-stack-size"="0" }

!0 = !{!"string literal set intial ref count to 1"}
!1 = !{!"string literal set intial length to 5"}
!2 = !{!"string literal set intial flags to 1"}

The module assembly:

	.text
	.file	"WinterModule"
	.globl	main_string_
	.p2align	2
	.type	main_string_,@function
main_string_:
	str	x30, [sp, #-16]!
	movz	x0, #.L__unnamed_1
	movk	x0, #.L__unnamed_1
	movk	x0, #.L__unnamed_1
	movk	x0, #.L__unnamed_1
	bl	allocateString_opaque_
	mov	w8, #1
	mov	w9, #5
	stp	x8, x9, [x0]
	str	x8, [x0, #16]
	ldr	x30, [sp], #16
	ret
.Lfunc_end0:
	.size	main_string_, .Lfunc_end0-main_string_

	.type	.L__unnamed_1,@object
	.section	.rodata.str1.1,"aMS",@progbits,1
.L__unnamed_1:
	.asciz	"hello"
	.size	.L__unnamed_1, 6

	.section	".note.GNU-stack","",@progbits

I’m new to the ARM ISA, but why is

movk	x0, #.L__unnamed_1

executed three times in a row? Isn’t this redundant?

EDIT: Either the LLVM assembly output is incorrect (e.g. not showing shifts), or this seems like a bug in the loading of immediate memory addresses into the x0 register.
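To illustrate what the sequence should do if the shifts are simply missing from the printout, here is a minimal C++ model of MOVZ/MOVK as I understand them: MOVZ writes one 16-bit chunk and zeroes the rest of the register, and each MOVK overwrites one further 16-bit chunk while keeping the others. The function names and the example address are hypothetical, purely for illustration.

```cpp
#include <cstdint>

// Model of MOVZ: place a 16-bit immediate at the given shift, zero all other bits.
inline uint64_t movz(uint16_t imm16, unsigned shift)
{
	return static_cast<uint64_t>(imm16) << shift;
}

// Model of MOVK: replace only the 16 bits at the given shift, keep the rest of the register.
inline uint64_t movk(uint64_t reg, uint16_t imm16, unsigned shift)
{
	const uint64_t mask = static_cast<uint64_t>(0xFFFF) << shift;
	return (reg & ~mask) | (static_cast<uint64_t>(imm16) << shift);
}

// Rebuild a 64-bit address from four 16-bit chunks, the way a
// movz + 3x movk sequence with shifts 0, 16, 32 and 48 would.
inline uint64_t materialiseAddress(uint64_t addr)
{
	uint64_t x0 = movz(static_cast<uint16_t>(addr), 0);
	x0 = movk(x0, static_cast<uint16_t>(addr >> 16), 16);
	x0 = movk(x0, static_cast<uint16_t>(addr >> 32), 32);
	x0 = movk(x0, static_cast<uint16_t>(addr >> 48), 48);
	return x0;
}
```

Under this reading the three movk instructions are not redundant: each one fills in a different quarter of the pointer. It also means that if all four immediates are left at zero, x0 ends up as a NULL pointer.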

Some weirdness is going on. Stepping through the code in the Xcode debugger (showing disassembly), my JIT’d code looks like this:

->  0x1031dc000: str    x30, [sp, #-0x10]!
    0x1031dc004: mov    x0, #0x0
    0x1031dc008: movk   x0, #0x0, lsl #16
    0x1031dc00c: movk   x0, #0x0, lsl #32
    0x1031dc010: movk   x0, #0x0, lsl #48
    0x1031dc014: bl     0x1031dc030
    0x1031dc018: mov    w8, #0x1
    0x1031dc01c: mov    w9, #0xa

This is different from what the LLVM assembly output shows. It also makes the reason for the NULL pointer quite clear: x0 is just zeroed out.

Any idea what is going on here?

Ah, never mind, I think I solved it. The issue was caused by

this->triple.append("-elf"); // MCJIT requires the -elf suffix currently, see (link to a Google Groups thread)

(some old LLVM workaround code)
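In hindsight the module assembly above already hints at this: it contains ELF directives (@function, @progbits, .note.GNU-stack) even though the target is a Mac. As far as I can tell, the "-elf" suffix was an old workaround for MCJIT on Windows, so a rough sketch of how the workaround could be restricted looks like the following. jitTripleFor is a hypothetical helper of mine, not an LLVM API, and the Windows-only condition is an assumption on my part.

```cpp
#include <string>

// Hypothetical helper (not part of LLVM): return the triple to hand to the JIT.
// Assumption: the "-elf" suffix is only wanted when targeting Windows.
inline std::string jitTripleFor(const std::string& host_triple)
{
	// Match "windows" or "win32" rather than just "win", since "darwin" contains "win".
	const bool is_windows =
		host_triple.find("windows") != std::string::npos ||
		host_triple.find("win32") != std::string::npos;

	// Avoid appending the suffix twice.
	const bool already_elf = host_triple.size() >= 4 &&
		host_triple.compare(host_triple.size() - 4, 4, "-elf") == 0;

	return (is_windows && !already_elf) ? host_triple + "-elf" : host_triple;
}
```

With something like this, a Mac triple is passed through unchanged, so the JIT keeps the platform's native object format instead of being forced to ELF.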