#pragma omp target teams distribute parallel for reduction(+:var) <clauses>
we need to copy the value at least from the device so you can use it on the host, right?
That said, we could probably avoid the host-to-device (h2d) transfer.
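To make the transfers concrete, here is a minimal sketch of the combined construct with the mapping of the reduction variable written out. The function name `offload_sum` and the `data`/`n` parameters are made up for illustration. As far as I know, a scalar in a reduction clause on a combined target construct is treated as if it were `map(tofrom:)`, so the explicit clause below should only spell out what already happens implicitly.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: sum n doubles on the device and return the result on
// the host.
double offload_sum(const double *data, std::size_t n) {
  assert(data != nullptr);
  double var = 0.0;
  // map(tofrom : var) copies the initial value to the device (h2d) and the
  // reduced value back to the host (d2h) around this single kernel.
#pragma omp target teams distribute parallel for reduction(+ : var) \
    map(to : data[0:n]) map(tofrom : var)
  for (std::size_t i = 0; i < n; ++i)
    var += data[i];
  return var;
}
```

The d2h copy is what makes `var` usable on the host after the loop; the h2d copy only carries the initial value into the reduction.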
If you want the data to stay on the device, both before and after the kernel, you need to write it differently, e.g.:
#pragma omp target teams
{
  double var = ...;
  #pragma omp distribute parallel for reduction(+:var)
  ...
}
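For comparison, here is a different way to keep the reduced value on the device. It is not the team-scoped variable from the snippet above, but an enclosing `target data` region, so treat it as a sketch under assumptions. The helper name `normalize_on_device` and its parameters are hypothetical. It relies on the standard presence semantics of `map`: a variable that is already mapped is not transferred again by an inner construct.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: divide every element of data by the sum of all
// elements, keeping the sum on the device between the two kernels.
void normalize_on_device(double *data, std::size_t n) {
  assert(data != nullptr && n > 0);
  double var = 0.0;
  // map(to : var) creates the device copy once (initialized to 0.0) and does
  // not copy it back when the region ends.
#pragma omp target data map(tofrom : data[0:n]) map(to : var)
  {
    // var and data are already present on the device, so these map clauses
    // should not trigger any transfers.
#pragma omp target teams distribute parallel for reduction(+ : var) \
    map(to : data[0:n]) map(tofrom : var)
    for (std::size_t i = 0; i < n; ++i)
      var += data[i];

    // The second kernel reads the device copy of var. The explicit map keeps
    // the scalar from being treated as firstprivate, which would pass the
    // host value instead.
#pragma omp target teams distribute parallel for \
    map(tofrom : data[0:n]) map(to : var)
    for (std::size_t i = 0; i < n; ++i)
      data[i] /= var;
  }
}
```

Whether this avoids the transfers in practice is something the runtime's info output should confirm for your particular build.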
You probably also want to enable optimization remarks (-Rpass=openmp-opt -Rpass-analysis=openmp-opt -Rpass-missed=openmp-opt; see "OpenMP Optimization Remarks" in the LLVM/OpenMP 18.0.0git documentation) to check that the code is properly optimized.
If any remark is a “bad” one, for example OMP112 or OMP130, you should investigate.
Also, setting the environment variable LIBOMPTARGET_INFO=16 at run time will tell you how each kernel is executed, which should hopefully be (Generic-)SPMD mode.
Btw, these are the remarks. I found one OMP112, but I don’t know where it comes from, since it is in a library function.
remark: <unknown>:0:0: Found thread data sharing on the GPU. Expect degraded performance due to data globalization. [OMP112]
remark: ./OpenMP_offload/initialise_chunk_kernel_xx_ompoffload_kernel.cpp:101:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_xx_ompoffload_kernel.cpp:101:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_yy_ompoffload_kernel.cpp:101:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_yy_ompoffload_kernel.cpp:101:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_zz_ompoffload_kernel.cpp:101:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_zz_ompoffload_kernel.cpp:101:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_x_ompoffload_kernel.cpp:97:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_x_ompoffload_kernel.cpp:97:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_y_ompoffload_kernel.cpp:97:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_y_ompoffload_kernel.cpp:97:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_z_ompoffload_kernel.cpp:97:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_z_ompoffload_kernel.cpp:97:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_cellx_ompoffload_kernel.cpp:97:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_cellx_ompoffload_kernel.cpp:97:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_celly_ompoffload_kernel.cpp:97:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_celly_ompoffload_kernel.cpp:97:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_cellz_ompoffload_kernel.cpp:97:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_cellz_ompoffload_kernel.cpp:97:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_volume_ompoffload_kernel.cpp:122:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/initialise_chunk_kernel_volume_ompoffload_kernel.cpp:122:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/generate_chunk_kernel_ompoffload_kernel.cpp:147:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/generate_chunk_kernel_ompoffload_kernel.cpp:147:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/ideal_gas_kernel_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/ideal_gas_kernel_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_b2_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_b2_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_b1_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_b1_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_t2_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_t2_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_t1_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_t1_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_l2_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_l2_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_l1_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_l1_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_r2_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_r2_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_r1_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_r1_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_ba2_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_ba2_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_ba1_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_ba1_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_fr2_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_fr2_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_fr1_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel1_fr1_ompoffload_kernel.cpp:143:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_4_bot_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_4_bot_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_2_bot_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_2_bot_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_4_top_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_4_top_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_2_top_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_2_top_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_minus_4_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_minus_4_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_minus_2_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_minus_2_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_minus_4_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_minus_4_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_minus_2_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_minus_2_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_4_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_4_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_2_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_2_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_4_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_4_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_2_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_xvel_plus_2_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_minus_4_bot_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_minus_4_bot_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_minus_2_bot_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_minus_2_bot_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_minus_4_top_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_minus_4_top_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_minus_2_top_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_minus_2_top_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_4_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_4_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_2_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_2_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_4_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_4_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_2_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_2_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_4_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_4_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_2_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_2_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_4_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_4_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_2_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_yvel_plus_2_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_4_bot_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_4_bot_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_2_bot_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_2_bot_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_4_top_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_4_top_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_2_top_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_2_top_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_4_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_4_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_2_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_2_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_4_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_4_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_2_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_plus_2_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_minus_4_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_minus_4_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_minus_2_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_minus_2_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_minus_4_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_minus_4_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_minus_2_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel2_zvel_minus_2_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_4_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_4_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_2_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_2_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_4_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_4_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_2_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_2_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_minus_4_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_minus_4_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_minus_2_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_minus_2_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_minus_4_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_minus_4_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_minus_2_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_minus_2_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_4_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_4_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_2_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_2_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_4_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_4_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_2_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel3_plus_2_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_minus_4_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_minus_4_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_minus_2_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_minus_2_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_minus_4_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_minus_4_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_minus_2_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_minus_2_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_4_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_4_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_2_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_2_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_4_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_4_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_2_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_2_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_4_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_4_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_2_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_2_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_4_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_4_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_2_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel4_plus_2_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_4_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_4_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_2_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_2_a_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_4_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_4_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_2_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_2_b_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_4_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_4_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_2_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_2_left_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_4_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_4_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_2_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_plus_2_right_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_minus_4_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_minus_4_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_minus_2_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_minus_2_back_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_minus_4_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_minus_4_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_minus_2_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/update_halo_kernel5_minus_2_front_ompoffload_kernel.cpp:112:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/field_summary_kernel_ompoffload_kernel.cpp:176:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/field_summary_kernel_ompoffload_kernel.cpp:176:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/viscosity_kernel_ompoffload_kernel.cpp:153:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/viscosity_kernel_ompoffload_kernel.cpp:153:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/calc_dt_kernel_ompoffload_kernel.cpp:166:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/calc_dt_kernel_ompoffload_kernel.cpp:166:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/calc_dt_kernel_min_ompoffload_kernel.cpp:102:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/calc_dt_kernel_min_ompoffload_kernel.cpp:102:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/calc_dt_kernel_get_ompoffload_kernel.cpp:133:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/calc_dt_kernel_get_ompoffload_kernel.cpp:133:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/calc_dt_kernel_print_ompoffload_kernel.cpp:166:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/calc_dt_kernel_print_ompoffload_kernel.cpp:166:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/PdV_kernel_predict_ompoffload_kernel.cpp:166:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/PdV_kernel_predict_ompoffload_kernel.cpp:166:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/PdV_kernel_nopredict_ompoffload_kernel.cpp:185:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/PdV_kernel_nopredict_ompoffload_kernel.cpp:185:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/revert_kernel_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/revert_kernel_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/accelerate_kernel_ompoffload_kernel.cpp:166:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/accelerate_kernel_ompoffload_kernel.cpp:166:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/flux_calc_kernelx_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/flux_calc_kernelx_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/flux_calc_kernely_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/flux_calc_kernely_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/flux_calc_kernelz_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/flux_calc_kernelz_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel1_xdir_ompoffload_kernel.cpp:116:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel1_xdir_ompoffload_kernel.cpp:116:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel2_xdir_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel2_xdir_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel3_xdir_ompoffload_kernel.cpp:128:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel3_xdir_ompoffload_kernel.cpp:128:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel4_xdir_ompoffload_kernel.cpp:147:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel4_xdir_ompoffload_kernel.cpp:147:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel1_ydir_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel1_ydir_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel2_ydir_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel2_ydir_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel3_ydir_ompoffload_kernel.cpp:128:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel3_ydir_ompoffload_kernel.cpp:128:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel4_ydir_ompoffload_kernel.cpp:147:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel4_ydir_ompoffload_kernel.cpp:147:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel1_zdir_ompoffload_kernel.cpp:116:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel1_zdir_ompoffload_kernel.cpp:116:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel2_zdir_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel2_zdir_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel3_zdir_ompoffload_kernel.cpp:128:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel3_zdir_ompoffload_kernel.cpp:128:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel4_zdir_ompoffload_kernel.cpp:147:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_cell_kernel4_zdir_ompoffload_kernel.cpp:147:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_x1_ompoffload_kernel.cpp:116:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_x1_ompoffload_kernel.cpp:116:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_z1_ompoffload_kernel.cpp:116:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_z1_ompoffload_kernel.cpp:116:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_x2_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_x2_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_y2_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_y2_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_x3_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_x3_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_z3_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_z3_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_mass_flux_x_ompoffload_kernel.cpp:91:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_mass_flux_x_ompoffload_kernel.cpp:91:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_post_pre_advec_x_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_post_pre_advec_x_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel1_x_nonvector_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel1_x_nonvector_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel2_x_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel2_x_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_mass_flux_y_ompoffload_kernel.cpp:91:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_mass_flux_y_ompoffload_kernel.cpp:91:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_post_pre_advec_y_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_post_pre_advec_y_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel1_y_nonvector_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel1_y_nonvector_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel2_y_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel2_y_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_mass_flux_z_ompoffload_kernel.cpp:91:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_mass_flux_z_ompoffload_kernel.cpp:91:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_post_pre_advec_z_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel_post_pre_advec_z_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel1_z_nonvector_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel1_z_nonvector_ompoffload_kernel.cpp:110:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel2_z_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/advec_mom_kernel2_z_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/reset_field_kernel1_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/reset_field_kernel1_ompoffload_kernel.cpp:103:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/reset_field_kernel2_ompoffload_kernel.cpp:116:3: Moving memory allocation from the heap to the stack.
remark: ./OpenMP_offload/reset_field_kernel2_ompoffload_kernel.cpp:116:3: Moving memory allocation from the heap to the stack.
remark: src/ompoffload/ops_ompoffload_common.cpp:125:3: Moving memory allocation from the heap to the stack.
remark: src/ompoffload/ops_ompoffload_common.cpp:125:3: Moving memory allocation from the heap to the stack.
remark: src/ompoffload/ops_ompoffload_rt_support_kernels.cpp:68:1: Moving memory allocation from the heap to the stack.
remark: src/ompoffload/ops_ompoffload_rt_support_kernels.cpp:68:1: Moving memory allocation from the heap to the stack.
remark: src/ompoffload/ops_ompoffload_rt_support_kernels.cpp:117:1: Moving memory allocation from the heap to the stack.
remark: src/ompoffload/ops_ompoffload_rt_support_kernels.cpp:117:1: Moving memory allocation from the heap to the stack.
That said, are you really “wasting time in this”? It says 68%, but it also says 265 ms. This function should not be called often; I am surprised to see it called 3 times, potentially once per device? Are you sure this is a bottleneck in your application?
That might be just fine. You can try compiling with the assumption flags on, which often helps: -fopenmp-assume-no-thread-state -fopenmp-assume-no-nested-parallelism
It was an issue with SPMD. Looking more into it, I realised that clang does not support omp target loop yet (not perfectly, at least). With NVHPC, a significant performance bump was observed.
So, omp target loop is not OpenMP at all. omp target teams loop can be used as a shorthand for the omp target teams distribute parallel for simd form.
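To make the two spellings concrete, here is a minimal sketch of the same reduction written both ways. The function names sum_distribute and sum_loop are made up for illustration and are not from the application discussed here. Whether a given compiler lowers the loop form to the same code as the explicit form is implementation-dependent, so treat this as a starting point for comparing remarks and LIBOMPTARGET_INFO output, not as a statement about what clang or NVHPC generate.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernels for illustration only.

// Prescriptive form: iterations are explicitly distributed over teams
// and then over the threads of each team.
double sum_distribute(const double *a, std::size_t n) {
  assert(a != nullptr || n == 0);
  double sum = 0.0;
#pragma omp target teams distribute parallel for reduction(+ : sum) \
    map(to : a[0:n])
  for (std::size_t i = 0; i < n; ++i)
    sum += a[i];
  return sum;
}

// Descriptive form (OpenMP 5.0 and later): the loop construct leaves the
// mapping of iterations to teams and threads to the compiler.
double sum_loop(const double *a, std::size_t n) {
  assert(a != nullptr || n == 0);
  double sum = 0.0;
#pragma omp target teams loop reduction(+ : sum) map(to : a[0:n])
  for (std::size_t i = 0; i < n; ++i)
    sum += a[i];
  return sum;
}
```

Both functions copy the reduction result back to the host on return, as in the first form discussed at the top of the thread. Built without OpenMP support, the pragmas are ignored and both run as plain serial loops, which is handy for checking correctness before looking at offload performance.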
That said, did LIBOMPTARGET_INFO=16 tell you whether the kernels were executed in Generic mode or in (Generic-)SPMD mode?