[OpenMP][IR] OMP: Error #132: Thread identifier invalid

I’m trying to generate parallel loops with different scheduling policies. My generated LLVM IR looks like this:

; ModuleID = 'GalaxyJIT'
source_filename = "GalaxyJIT"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@omp_ident = private constant { i32, i32, i32, i32, ptr } { i32 0, i32 514, i32 0, i32 22, ptr null }

define i32 @main() {
entry:
  br label %preloop

preloop:                                          ; preds = %entry
  %i = alloca i32, align 4
  store i32 0, ptr %i, align 4
  %i1 = alloca i32, align 4
  store i32 0, ptr %i1, align 4
  br label %cond

cond:                                             ; preds = %update, %preloop
  %i2 = load i32, ptr %i1, align 4
  %loopcond = icmp slt i32 %i2, 10
  br i1 %loopcond, label %body, label %endloop

body:                                             ; preds = %cond
  call void @__kmpc_for_static_init_4(ptr @omp_ident, i32 0, i32 33, ptr %i1, i32 10, ptr null, ptr null, i32 1, i32 1)
  %updatedLoopVar = add i32 %i2, 1
  store i32 %updatedLoopVar, ptr %i1, align 4
  br label %update

update:                                           ; preds = %body
  br label %cond

endloop:                                          ; preds = %cond
  ret i32 0
}

declare void @__kmpc_for_static_init_4(ptr, i32, i32, ptr, i32, ptr, ptr, i32, i32)

Then I link the generated object file with:

clang -fopenmp -L/usr/lib/llvm-19/lib/ output.o -o executable

And when I run it…

OMP: Error #132: Thread identifier invalid.
Assertion failure at kmp_runtime.cpp(6993): __kmp_registration_flag != 0.
OMP: Error #13: Assertion failure at kmp_runtime.cpp(6993).
OMP: Hint Please submit a bug report with this message, compile and run commands used, and machine configuration info including native compiler and operating system versions. Faster response will be obtained by including all program sources. For information on submitting this issue, please see https://github.com/llvm/llvm-project/issues/.

If more information is needed, such as what my for-loop generation code looks like, let me know and I will send it.

By the way, for more context, the input code is this one:

def main() -> int:
  for parallel static (int i := 0; i <= 10; ++i) -> 4:
    // nothing
  end;

  return 0;
end;

where 4 is the number of threads used.

I’m fairly certain you need to be within a forked parallel region to make use of the worksharing runtime calls. The assertion is likely triggering because some global control variable wasn’t set up properly, as would normally happen via __kmpc_fork_call or something similar.

Sure. I tested putting real code inside the for loop, and it does indeed call __kmpc_fork_call. The odd thing is that before I began writing the OpenMP code generation, I based it on the LLVM IR emitted from a C file using #pragma omp parallel for schedule(static) num_threads(4), and I can’t see any differences between my code and the emitted one. Can you point me to the correct way of defining the global control variables, or something like that?

The code snippet you have is the callback that would be passed to the fork call; see Compiler Explorer.

I’m still running into trouble with this:

Assertion failure at kmp_sched.cpp(133): plastiter && plower && pupper && pstride.
OMP: Error #13: Assertion failure at kmp_sched.cpp(133).

Do you know of any implementation of this using the LLVM API? My code has grown huge from the many attempts I’ve made.
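
This is not the C++ API, but as a reference for what that API has to produce, here is a minimal sketch of the IR shape. It assumes the usual libomp entry points (__kmpc_global_thread_num, __kmpc_push_num_threads, __kmpc_fork_call, __kmpc_for_static_init_4, __kmpc_for_static_fini) and schedule type 34 (static, unchunked). The function name @main.omp_outlined, the value names, and the location string are illustrative choices, not something the runtime requires. The worksharing call sits inside the outlined function that the fork call runs on every thread, it receives the thread id loaded from that function's first argument, and all four pointer arguments are non-null allocas, which is what the assertion in kmp_sched.cpp checks.

```llvm
; Sketch only: @main.omp_outlined, @.loc.str and the value names are illustrative.
target triple = "x86_64-unknown-linux-gnu"

%struct.ident_t = type { i32, i32, i32, i32, ptr }

; Location string in the ";file;function;line;column;;" form (22 characters plus NUL).
@.loc.str = private unnamed_addr constant [23 x i8] c";unknown;unknown;0;0;;\00", align 1
@omp_ident = private unnamed_addr constant %struct.ident_t { i32 0, i32 2, i32 0, i32 22, ptr @.loc.str }, align 8

define i32 @main() {
entry:
  ; Request 4 threads for the next parallel region, then fork it.
  %gtid = call i32 @__kmpc_global_thread_num(ptr @omp_ident)
  call void @__kmpc_push_num_threads(ptr @omp_ident, i32 %gtid, i32 4)
  ; argc = 0: no shared variables are forwarded to the outlined function.
  call void (ptr, i32, ptr, ...) @__kmpc_fork_call(ptr @omp_ident, i32 0, ptr @main.omp_outlined)
  ret i32 0
}

; Executed by every thread of the team.
define internal void @main.omp_outlined(ptr noalias %gtid.addr, ptr noalias %btid.addr) {
entry:
  %is.last = alloca i32, align 4
  %lb = alloca i32, align 4
  %ub = alloca i32, align 4
  %stride = alloca i32, align 4
  store i32 0, ptr %is.last, align 4
  store i32 0, ptr %lb, align 4      ; first iteration of the whole loop
  store i32 10, ptr %ub, align 4     ; last iteration, inclusive (i <= 10)
  store i32 1, ptr %stride, align 4
  %gtid = load i32, ptr %gtid.addr, align 4
  ; On return, %lb and %ub hold this thread's own sub-range.
  call void @__kmpc_for_static_init_4(ptr @omp_ident, i32 %gtid, i32 34, ptr %is.last, ptr %lb, ptr %ub, ptr %stride, i32 1, i32 1)
  %lb.val = load i32, ptr %lb, align 4
  %ub.raw = load i32, ptr %ub, align 4
  ; Clamp to the global upper bound.
  %ub.over = icmp sgt i32 %ub.raw, 10
  %ub.val = select i1 %ub.over, i32 10, i32 %ub.raw
  br label %cond

cond:
  %i = phi i32 [ %lb.val, %entry ], [ %i.next, %body ]
  %more = icmp sle i32 %i, %ub.val
  br i1 %more, label %body, label %done

body:
  ; loop body goes here
  %i.next = add nsw i32 %i, 1
  br label %cond

done:
  call void @__kmpc_for_static_fini(ptr @omp_ident, i32 %gtid)
  ret void
}

declare i32 @__kmpc_global_thread_num(ptr)
declare void @__kmpc_push_num_threads(ptr, i32, i32)
declare void @__kmpc_fork_call(ptr, i32, ptr, ...)
declare void @__kmpc_for_static_init_4(ptr, i32, i32, ptr, ptr, ptr, ptr, i32, i32)
declare void @__kmpc_for_static_fini(ptr, i32)
```

Note that the fifth parameter of __kmpc_for_static_init_4 is a pointer to the lower bound rather than an i32 value, and that the upper bound passed in is inclusive. The resulting object file should link the same way as before, with clang -fopenmp.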

Finally I managed to make it work. I mean… “work”, because I think I implemented it incorrectly: parallel for statements are now slower than classic for statements.

That’s expected, depending on the size of the loop. Creating threads is expensive, and most of the runtime is initialized on the first call, AFAIK.

It looks as if you could do with some more information about the interface between the compiler and the OpenMP runtime… It’s possible that the book I wrote with Michael Klemm (High Performance Parallel Runtimes, Design and Implementation) may be useful, along with the “Little OpenMP runtime” [LOMP] source code that accompanies it. It should help you to understand an OpenMP runtime, while being a lot smaller than the LLVM runtime…