Questions on String initialization and usage

Today I was experimenting with how to call printf and implement basic C-style null-terminated strings.

The first problem I ran into is how to initialize the string type (an array of i8). I tried an empty string, which gave me an error about the types being wrong. I then tried null, which passes verification, but I get an error when assembling with llvm-as. I found that odd because LLVMWriteBitcodeToFile DID work, so I’m curious whether the versions differ (both should be LLVM 16). So what is the correct way to do this in LLVM 16?

Secondly, when I used LLVMWriteBitcodeToFile and built the program, printf did work, but there was an extra “0” at the end of the printed string. I’m curious about this because the null terminator \00 was added via the LLVM constant string function. Do I need to use a GEP to get the first index, or can I simply pass the pointer like below? Maybe the null initializer caused it, for all I know.

Thanks guys.

@s = global [255 x i8] null
@i = global i32 0
@"$result" = global i8 0

declare i32 @printf(ptr %0, ...)

define i8 @main() {
entry:
  store [9 x i8] c"hello %d\00", ptr @s, align 1
  store i32 100, ptr @i, align 4
  %i = load i32, ptr @i, align 4
  %0 = call i32 (ptr, ...) @printf(ptr @s, i32 %i)
  %1 = load i8, ptr @"$result", align 1
  ret i8 %1
}

Compiling the IR:

llvm-as main.ll -o main.bc
llvm-as: main.ll:4:24: error: null must be a pointer type
@s = global [255 x i8] null
^

It needs to be initialized with 255 bytes since that’s how big the array is. The basic zero initializer would be spelled out in LLVM as @s = global [255 x i8] [i8 0, i8 0, <... 255 elements in total>, i8 0]. Fortunately this is common enough that LLVM has a special syntax, and you can write @s = global [255 x i8] zeroinitializer.
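To make the equivalence concrete, here is a small sketch using a 4-element array instead of 255 so the explicit form fits on one line. The names @a, @b and @msg are made up for illustration. The last line shows a third option that may suit the string case: initializing the global directly from a string constant, which only type-checks when the constant's length matches the array length exactly.

```llvm
; Explicit form: one element per array slot.
@a = global [4 x i8] [i8 0, i8 0, i8 0, i8 0]

; Shorthand with the same meaning.
@b = global [4 x i8] zeroinitializer

; A string constant as initializer. The array type must match its length:
; 8 characters plus the \00 terminator = 9 bytes.
@msg = global [9 x i8] c"hello %d\00"
```

With a fixed 255-byte buffer the string-constant form would need padding out to 255 bytes, so zeroinitializer plus a store at runtime, as in the original program, is probably the simpler route there.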

From the C API, I believe the function LLVMConstNull(<arraytype>) handles this case (as opposed to the very similarly named LLVMConstPointerNull which really is only for pointers and won’t work).

I’m surprised the verifier passed; I’d have expected it to catch the null too. Beyond that, there would probably be an assertion failure when you first tried to set the initializer if you were using an LLVM build with assertions enabled (always a good idea if you’re developing against the API).

As for the extra “0”, I’m not sure what’s going on here. When I compile the IR you pasted in (fixed to use zeroinitializer) it prints hello 100 with no newline at the end. Is your environment maybe also printing the whole program’s exit status (0), which ends up as hello 1000 because there isn’t a newline after the real output?

No need for a GEP, that part of the program is fine.
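For reference, a minimal sketch of the program with both points applied might look like the following. It assumes LLVM 16 with opaque pointers. The trailing \0A newline and the i32 return type for main are my own changes for the sketch, not something the original program had; the newline just keeps anything the environment prints afterwards (such as an exit status) from running into the output.

```llvm
; 255-byte buffer, zero-filled at program start.
@s = global [255 x i8] zeroinitializer
@i = global i32 0

declare i32 @printf(ptr, ...)

define i32 @main() {
entry:
  ; Copy the 10-byte format string (8 characters, newline, \00) into the
  ; start of the buffer. The remaining bytes stay zero.
  store [10 x i8] c"hello %d\0A\00", ptr @s, align 1
  store i32 100, ptr @i, align 4
  %val = load i32, ptr @i, align 4
  ; @s is passed directly as the pointer to the first byte; no GEP is needed.
  %r = call i32 (ptr, ...) @printf(ptr @s, i32 %val)
  ; Assumption for this sketch: return a conventional exit status of 0.
  ret i32 0
}
```

Assembled with llvm-as and linked against libc, this should print hello 100 followed by a newline.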

The common reference for such questions is:
https://mapping-high-level-constructs-to-llvm-ir.readthedocs.io/en/latest/


Oh! I was indeed using LLVMConstPointerNull instead of LLVMConstNull! I wonder if those assertions were not set and that’s why it worked.

As for the extra “0”, excellent guess: I was indeed printing the return code because I didn’t have printf yet.

Thanks for your help, I appreciate it.