Hello,
I enabled "-polly-dump-before -polly-dump-after" to obtain the IR before
and after Polly. Have attached kernel_gemm-before.ll (kg_before.ll) and
kernel_gemm-after.ll (kg_after.ll)
Perfect.
observations,
- Both kg_before.ll and kg_after.ll have,
- The only DICompileUnit is at !2
- !llvm.dbg.cu=!{!2}
- I don't think Julia is generating incorrect debug info.
Neither do I.
- For opt kernel_gemm-before.ll $FLAG -o output.ll
- All the following FLAGs except "-O3 -polly -polly-target=gpu"
didn't have a problem with the Module,
- -O3 -polly -polly-target=cpu
- -O3
- The only time verifyModule is called within Polly is at
kernerlFinalize.
- This could mean that Polly's generating the wrong debug info,
which is uncovered only when GPU code's generated.
- I'm not sure if verifyModule is called indirectly when
-polly-target=cpu
- Or, the debug information is corrupted when sending or/and
receiving information to/from PPCG.
I think we are just generating incorrect code when extracting the GPU
module.
- For opt output.ll -O3 -polly -polly-target=gpu -o output.output.ll,
where output.ll is generated by
opt kg_before.ll -O3 -polly -polly-target=cpu,
- verifyModule is invoked thrice, but issues the same error
- Possibly 3 SCoPs detected?
- This increases the possibility that Polly isn't handling the
debug info properly
- For opt kernel_gemm-after.ll $ANY_FLAG -o output.ll
- Cannot invoke an intrinsic other than donothing, patchpoint,
statepoint, coro_resume or coro_destroy
- store void (metadata, i64, metadata, metadata)*
@llvm.dbg.value,
void (metadata, i64, metadata, metadata)** %polly_launch_0_param_7
- Invalid user of intrinsic instruction!
- store void (metadata, i64, metadata, metadata)*
@llvm.dbg.value,
void (metadata, i64, metadata, metadata)** %polly_launch_0_param_7
- ./llvm_build/bin/opt: kernel_gemm-after.ll: error: input module is broken!
- and output.ll isn't produced.
The following patch highlights the first issue:
--- a/lib/CodeGen/PPCGCodeGeneration.cpp
+++ b/lib/CodeGen/PPCGCodeGeneration.cpp
@@ -1163,6 +1163,8 @@ GPUNodeBuilder::createLaunchParameters(ppcg_kernel *Kernel, Function *F,
}
for (auto Val : SubtreeValues) {
+ errs() << "Referenced Parameter\n";
+ Val->dump();
Instruction *Param = new AllocaInst(
Val->getType(), Launch + "_param_" + std::to_string(Index),
EntryBlock->getTerminator());
@@ -1254,8 +1256,8 @@ void GPUNodeBuilder::createKernel(__isl_take isl_ast_node *KernelStmt) {
Value *GridDimX, *GridDimY;
std::tie(GridDimX, GridDimY) = getGridSizes(Kernel);
- createCallLaunchKernel(GPUKernel, GridDimX, GridDimY, BlockDimX, BlockDimY,
- BlockDimZ, Parameters);
+ // createCallLaunchKernel(GPUKernel, GridDimX, GridDimY, BlockDimX, BlockDimY,
+ // BlockDimZ, Parameters);
createCallFreeKernel(GPUKernel);
opt -polly-codegen-ppcg kernel_gemm-before.ll
opt kernel_gemm-before.ll -polly-codegen-ppcg -polly-acc-dump-kernel-ir
WARNING: You're attempting to print out a bitcode file.
This is inadvisable as it may cause display problems. If
you REALLY want to taste LLVM bitcode first-hand, you
can force output with the `-f' option.
Referenced Parameter
; Function Attrs: nounwind readnone
declare void @llvm.dbg.value(metadata, i64, metadata, metadata) #2
Referenced Parameter
i64 %1
Referenced Parameter
i64 %0
Cannot invoke an intrinsic other than donothing, patchpoint, statepoint,
coro_resume or coro_destroy
store void (metadata, i64, metadata, metadata)* @llvm.dbg.value, void
(metadata, i64, metadata, metadata)** %polly_launch_0_param_7
LLVM ERROR: Broken function found, compilation aborted!
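We try to pass the @llvm.dbg.value intrinsic itself to the kernel as a
launch parameter (it is the first "Referenced Parameter" above and ends
up stored into %polly_launch_0_param_7). This does not work, which is
why the verifier rejects the store. We should not list @llvm.dbg.value
in the SubtreeValues that are passed to the kernel.
In fact, we should ignore all debug intrinsics and not code-generate
them at all.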
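To make the idea concrete, here is a minimal standalone sketch of the
filtering step. It is not Polly's actual code: ReferencedValue,
isDebugIntrinsic and dropDebugIntrinsics are hypothetical names, and each
value is modelled only by its name and whether it is a function. In Polly
the check would presumably use LLVM's own intrinsic queries rather than a
string prefix.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a value referenced from the kernel subtree.
// This is a toy model, not llvm::Value.
struct ReferencedValue {
  std::string Name; // e.g. "llvm.dbg.value", "%0", "%1"
  bool IsFunction;  // true for function declarations such as intrinsics
};

// Returns true for declarations in the "llvm.dbg." namespace
// (assumption: a name-prefix test is good enough for this sketch).
bool isDebugIntrinsic(const ReferencedValue &V) {
  const std::string Prefix = "llvm.dbg.";
  return V.IsFunction && V.Name.compare(0, Prefix.size(), Prefix) == 0;
}

// Drops debug intrinsics so that no launch parameter is created for them.
// The relative order of the remaining values is preserved.
std::vector<ReferencedValue>
dropDebugIntrinsics(std::vector<ReferencedValue> Values) {
  Values.erase(
      std::remove_if(Values.begin(), Values.end(), isDebugIntrinsic),
      Values.end());
  return Values;
}
```

Applied to the three values dumped above (@llvm.dbg.value, %1, %0), this
would leave only %1 and %0 to be turned into launch parameters.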
Best,
Tobias