Segmentation Fault in .omp_outlined

Dear LLVM superusers

I am working on an application with OpenMP target offloading to multiple GPUs. The code works as expected in NVC 22.5, GCC 13, and, to some extent, Clang 16. My code works fine in Clang on a machine with four AMD Instinct MI250X GPUs, but on a compute node with four NVIDIA Tesla V100 GPUs, I get SIGSEGV in .omp_outlined… xxx (). I am stuck with debugging here as I do not know what that could mean. Does it mean that some internal OpenMP function segfaults?

Does anybody have a suggestion for what I could try to do to fix my application?

Unfortunately, I cannot share the code I am working on as I do not own the code. :frowning:

I have Clang installed from commit 710a834c4c822c5c444fc9715785d23959f5c645 on both of the machines.

Best regards, Anton

The crash appears to be in the function generated for a parallel region. It's quite challenging to figure out what is wrong without seeing any code. Did it show any CUDA errors? Could you please share the last few lines of output showing the segmentation fault?
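To illustrate what "generated functions for parallel region" refers to, here is a minimal sketch. The function `sum_array` is hypothetical and not taken from the original application. When built with Clang and `-fopenmp`, the body of the parallel region is moved into a separate compiler-generated function whose symbol name contains `omp_outlined`, so a fault in the user's own loop body is reported in that frame rather than in the enclosing function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example (not from the thread): sum an array inside an
 * OpenMP parallel region. */
double sum_array(const double *a, size_t n) {
    /* Precondition: a non-empty array needs a valid pointer. */
    assert(a != NULL || n == 0);

    double total = 0.0;

    /* With -fopenmp, the compiler outlines the loop below into its own
     * generated function. A bad pointer or index used here would fault
     * inside that outlined function, which is why the debugger shows a
     * frame with "omp_outlined" in its name. */
    #pragma omp parallel for reduction(+ : total)
    for (size_t i = 0; i < n; ++i) {
        total += a[i];
    }
    return total;
}
```

So a SIGSEGV in an outlined function usually points at the user code inside the region, not at the OpenMP runtime itself.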

Hi Shilei

Thanks for the quick reply!

I don’t get any CUDA errors. The output from gdb looks like this:

[Image: screenshot of the gdb output]

Best regards,
Anton

And if you backtrace, what does it show?
It should be able to show you where that parallel region function was called from, which would tell you much more about what is going on.

Thanks for the suggestion! The backtrace looks like this, but unfortunately I cannot see in which part of my program the error occurs.

Can you compile your code with the “-g” flag? That should give you line numbers and function information that a debugger can use to show you where the SEGV happened.
(For now, just add the -g flag, no need to remove optimisation flags…)

FWIW, it looks as if it is in an OpenMP task.

Thank you so much! I had already compiled the code with -g, but adding -gdwarf showed me what went wrong. It was indeed some stupid mistake with a macro that did not expand as I thought it would. Now I have no idea why the code worked in all the other compilers. :sweat_smile:
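The actual macro is not shown in the thread, so the following is only a hypothetical sketch of one common way a macro can expand differently than intended. The names `N`, `IDX_BAD`, `IDX_GOOD`, `prev_row_bad`, `prev_row_good`, and `load_checked` are all invented for illustration. An unparenthesised index macro can silently produce an out-of-range index, which is undefined behaviour and may appear to work with one compiler or memory layout and crash with another.

```c
#include <assert.h>

#define N 4 /* hypothetical row length */

/* Arguments are pasted in textually, so operator precedence applies
 * to the expanded text, not to the arguments as written. */
#define IDX_BAD(i, j) i * N + j

/* Parenthesising each argument and the whole expression avoids that. */
#define IDX_GOOD(i, j) ((i) * N + (j))

/* Flat index of element (r - 1, c) in a row-major array. */
int prev_row_bad(int r, int c) {
    return IDX_BAD(r - 1, c);  /* expands to: r - 1 * N + c */
}

int prev_row_good(int r, int c) {
    return IDX_GOOD(r - 1, c); /* expands to: ((r - 1) * N + (c)) */
}

/* Bounds-checked load: turns a bad index into an immediate, clear
 * assertion failure instead of undefined behaviour. */
double load_checked(const double *a, int len, int idx) {
    assert(idx >= 0 && idx < len);
    return a[idx];
}
```

For `r = 1, c = 0` the unparenthesised version yields `1 - 4 + 0 = -3`, a negative index, while the parenthesised version yields `0`. Reading `a[-3]` might hit mapped memory on one system and fault on another, which is one possible reason a bug like this stays hidden under some compilers.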

No problem.

I once fixed many bugs while failing to actually run the modified, recompiled code!
