CloneFunctionInto produces invalid debug info

Hi!

We are currently working on a science project and implemented a FunctionPass that clones a function (more precisely a constructor of a struct/class) and adds a parameter.

First, we create a new function with a new function type, which includes the newly added parameter:

Function *NF = Function::Create(NewFTy, F.getLinkage(), F.getName() + "Cloned", F.getParent());

and after setting up the ValueToValueMapTy, we use the CloneFunctionInto method to clone the function body

CloneFunctionInto(NF, &F, Map, true, Returns, "Cloned");

The code seems to work as intended, but when we try to emit debug symbols (clang -g flag) the pass fails with the following message:

"All DICompileUnits must be listed in llvm.dbg.cu"

Nevertheless, we can dump the Module and therefore can print out the annotated IR.

This is what the function to be cloned looks like:

; Function Attrs: noinline nounwind uwtable
define linkonce_odr void @_ZN12MyFunnyClassC2Ev(%struct.MyFunnyClass* %this) unnamed_addr #4 comdat align 2 !dbg !46 {
entry:
  %this.addr = alloca %struct.MyFunnyClass*, align 8
  store %struct.MyFunnyClass* %this, %struct.MyFunnyClass** %this.addr, align 8
  call void @llvm.dbg.declare(metadata %struct.MyFunnyClass** %this.addr, metadata !49, metadata !31), !dbg !50
  … rest of function code
}

!46 = distinct !DISubprogram(name: "MyFunnyClass", linkageName: "_ZN12MyFunnyClassC2Ev", scope: !15, file: !1, line: 1, type: !25, isLocal: false, isDefinition: true, scopeLine: 1, flags: DIFlagArtificial | DIFlagPrototyped, isOptimized: false, unit: !0, declaration: !47, variables: !2)

and the cloned function:

; Function Attrs: noinline nounwind uwtable
define linkonce_odr void @_ZN12MyFunnyClassC2EvCloned(%struct.MyFunnyClass* %this, { [6 x i8*] }* %newparam) unnamed_addr #4 align 2 !dbg !73 {
entry:
  %this.addr = alloca %struct.MyFunnyClass*, align 8
  store %struct.MyFunnyClass* %this, %struct.MyFunnyClass** %this.addr, align 8
  call void @llvm.dbg.declare(metadata %struct.MyFunnyClass** %this.addr, metadata !89, metadata !31), !dbg !91
  … rest of function code
}

!73 = distinct !DISubprogram(name: "MyFunnyClass", linkageName: "_ZN12MyFunnyClassC2Ev", scope: !74, file: !1, line: 1, type: !81, isLocal: false, isDefinition: true, scopeLine: 1, flags: DIFlagArtificial | DIFlagPrototyped, isOptimized: false, unit: !87, declaration: !88, variables: !2)

So the cloned function gets annotated with debug symbols as expected. We noticed that the linkageName of the cloned function is the same as the original one’s. Could that cause the error mentioned above? If so, how can we fix that error?

Best regards and thanks in advance,
Matthias

We are seeing the same thing. Cloning fails to add the newly created DICompileUnit to llvm.dbg.cu in the module, which results in a verifier assertion.

Sergei

+Adrian

If you are doing this work based off LLVM trunk, could you send me your patch to reproduce the problem?

– adrian

This all looks very similar to a bug in the cloning code that I fixed recently, so it would indeed be good to know if this is still happening on master.

Yes, it does for us. My tree is a couple of days behind the tip, and I see it there.

Sergei

Can you send me a patch with instructions to reproduce? I can take a look.

-- adrian

The CompileUnit is not supposed to be duplicated on master: https://github.com/llvm-mirror/llvm/blob/e3e43d9d574cf0a829e9a58525372ba0868a3292/lib/Transforms/Utils/CloneFunction.cpp#L129-L141. Is there no subprogram attached to your function, but somehow the CU is referenced anyway?

Sorry, reproducing it takes a pass that was not accepted for upstreaming. It uses CloneFunctionInto with the module-level-changes flag on. The input IR contains strangely formed (but correct) debug info metadata that causes an existing DICompileUnit to be duplicated during cloning, but llvm.dbg.cu is not updated. I worked around it with a quick cleanup pass that detects the situation and simply adds the missing CUs. Something like this:

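auto *CUs = F->getParent()->getNamedMetadata("llvm.dbg.cu");
if (!CUs)
  return;

// Compile units already listed in llvm.dbg.cu.
SmallPtrSet<Metadata *, 2> Listed;
Listed.insert(CUs->op_begin(), CUs->op_end());

// CUVisited: the compile units reached from the cloned function.
for (auto *CU : CUVisited)
  if (!Listed.count(CU)) {
    // The cast's template argument was lost in the original post; MDNode is
    // assumed here because NamedMDNode::addOperand takes an MDNode *.
    if (auto *Op = dyn_cast<MDNode>(CU))
      CUs->addOperand(Op); // <<< add the unlisted CU to llvm.dbg.cu
  }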

Sorry, I realize this is not much help.

Sergei

If you are cloning into the same LLVM module, the CU should not be cloned. If you don't mind sharing your code, I can try to help diagnose why the CU gets cloned. Just send me a patch that applies to trunk, along with instructions.

-- adrian

In your example the instructions in the cloned function have debug locations belonging to a different function, and the function itself is missing a DISubprogram metadata attachment.

(lldb) p OldFunc->dump()

; Function Attrs: nounwind optsize
define internal void @f_1.extracted_region(i32, i32*, %struct.t_c*, %struct.t_d*) #0 {
if.end12.extracted_entry:
  %and14 = and i32 %0, 2, !dbg !89
  %tobool15 = icmp eq i32 %and14, 0, !dbg !89
  br i1 %tobool15, label %exit, label %if.then16, !dbg !185

if.then16: ; preds = %if.end12.extracted_entry
  %4 = load i32, i32* %1, align 4, !dbg !186
  %or18 = or i32 %4, 2, !dbg !186
  store i32 %or18, i32* %1, align 4, !dbg !186
  %pps = getelementptr inbounds %struct.t_c, %struct.t_c* %2, i32 0, i32 4, !dbg !188
  %5 = load i32, i32* %pps, align 8, !dbg !188
  %to20 = getelementptr inbounds %struct.t_d, %struct.t_d* %3, i32 0, i32 2, i32 0, i32 0, !dbg !189
  store i32 %5, i32* %to20, align 4, !dbg !190
  %pp = getelementptr inbounds %struct.t_c, %struct.t_c* %2, i32 0, i32 2, !dbg !191
  %6 = load i8, i8* %pp, align 8, !dbg !191
  %us = getelementptr inbounds %struct.t_d, %struct.t_d* %3, i32 0, i32 2, i32 0, i32 1, !dbg !192
  store i8 %6, i8* %us, align 4, !dbg !193
  br label %exit, !dbg !194

exit: ; preds = %if.then16, %if.end12.extracted_entry
  ret void
}

Apparently the Verifier currently doesn't reject this, but this is not valid. If you want the debug info to survive you should create a new DISubprogram for the .extracted_region function and reparent the debug locations of the instructions into it, or you should strip all debug info from the function and its instructions.
Otherwise (as in the example) CloneFunction will not properly seed the metadata value mapper because the DISubprogram is missing. This then causes a deep copy of the debug locations all the way up to the DICompileUnit to be made.

-- adrian
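
The mechanism described in the message above can be illustrated with a deliberately simplified, stand-alone model. This is not LLVM code: `Node`, `Mapper`, `compileUnitOf`, and `duplicatesCompileUnit` are hypothetical names, and the model assumes every node has exactly one parent scope (location, then subprogram, then compile unit). It only sketches why an identity entry for the subprogram stops the copy at that boundary, while an unseeded map copies the whole chain including the compile unit.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a metadata node; not an LLVM type.
struct Node {
  std::string Kind;  // "location", "subprogram", or "compile_unit"
  const Node *Scope; // parent scope, nullptr for the compile unit
};

// Toy stand-in for the metadata side of a value mapper.
struct Mapper {
  std::map<const Node *, const Node *> Map;
  std::vector<std::unique_ptr<Node>> Owned; // keeps the clones alive

  // Returns the mapped node, cloning N and its scope chain on demand.
  const Node *remap(const Node *N) {
    if (!N)
      return nullptr;
    auto It = Map.find(N);
    if (It != Map.end())
      return It->second; // already mapped, possibly to itself
    const Node *NewScope = remap(N->Scope);
    Owned.push_back(std::unique_ptr<Node>(new Node{N->Kind, NewScope}));
    return Map[N] = Owned.back().get();
  }
};

// Walks up the scope chain to the outermost node.
const Node *compileUnitOf(const Node *N) {
  assert(N && "expected a node");
  while (N->Scope)
    N = N->Scope;
  return N;
}

// Returns true if remapping a location ends up under a new compile unit.
bool duplicatesCompileUnit(bool SeedSubprogram) {
  Node CU{"compile_unit", nullptr};
  Node SP{"subprogram", &CU};
  Node Loc{"location", &SP};

  Mapper M;
  if (SeedSubprogram)
    M.Map[&SP] = &SP; // identity entry: do not clone past this point

  const Node *NewLoc = M.remap(&Loc);
  return compileUnitOf(NewLoc) != &CU;
}
```

In this model, `duplicatesCompileUnit(false)` corresponds to the function without an attached DISubprogram, where nothing seeds the map, and `duplicatesCompileUnit(true)` corresponds to the well-formed case.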

- old Keno
+current Keno

Adrian,

  Thank you for the explanation. The example is produced by yet another pass and I will further debug it there...

Nevertheless, shouldn't the deep copy of the debug locations (once it has created the new DICompileUnit) also update llvm.dbg.cu in this case?

Sergei

I was just going to say: With well-formed debug info it should create a deep copy up until the DISubprogram, but no further. But because the DISubprogram linked to the Function is missing, the special handling of the DISubprogram (which would prohibit cloning the DICompileUnit) is side-stepped.
But then I remembered the discussion we had in http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170306/435395.html and now I think that this might actually be legal IR.

With this in mind, the correct behavior for CloneFunction is to not remap any debug metadata (and just attach the original nodes) if it is cloning into the same Module *and* there is no DISubprogram attached to the function.

-- adrian

I haven't tested this, but something like this should work:

diff --git a/lib/Transforms/Utils/CloneFunction.cpp b/lib/Transforms/Utils/CloneFunction.cpp
index 314c990293c..64c3c5371e1 100644
--- a/lib/Transforms/Utils/CloneFunction.cpp
+++ b/lib/Transforms/Utils/CloneFunction.cpp
@@ -50,7 +50,8 @@ BasicBlock *llvm::CloneBasicBlock(const BasicBlock *BB, ValueToValueMapTy &VMap,
   // Loop over all instructions, and copy them over.
   for (BasicBlock::const_iterator II = BB->begin(), IE = BB->end();
        II != IE; ++II) {

Adrian,

  Yes, indeed, something like this works:

    if (!DIFinder && II->getDebugLoc()) {
      auto &MD = VMap.MD();
      MD[II->getDebugLoc()].reset(II->getDebugLoc());
    }

It fixes the test and produces much cleaner IR. My lit tests are also green. I would love to see this added to llvm::CloneBasicBlock unless someone has specific objections.

Thank you for helping.

Sergei
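
To make the effect of the snippet in the message above more concrete, here is a small stand-alone model. It is not LLVM code: `Node`, `Inst`, `MDMap`, `countClonedNodes`, and `exampleCloneCount` are hypothetical names, and the `PinLocations` flag is an assumed stand-in for the `!DIFinder` condition. The sketch only shows that mapping each debug location to itself before remapping leaves the metadata graph untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Toy metadata node with a single parent scope; not an LLVM type.
struct Node {
  const Node *Scope;
};

// Toy instruction that may carry a debug location.
struct Inst {
  const Node *DebugLoc;
};

// Toy metadata map that clones unmapped nodes on demand.
struct MDMap {
  std::map<const Node *, const Node *> Map;
  std::vector<std::unique_ptr<Node>> Created; // every clone ever made

  const Node *remap(const Node *N) {
    if (!N)
      return nullptr;
    auto It = Map.find(N);
    if (It != Map.end())
      return It->second;
    const Node *NewScope = remap(N->Scope);
    Created.push_back(std::unique_ptr<Node>(new Node{NewScope}));
    return Map[N] = Created.back().get();
  }
};

// Remaps every debug location and reports how many nodes were cloned.
std::size_t countClonedNodes(const std::vector<Inst> &Insts,
                             bool PinLocations) {
  MDMap MD;
  for (const Inst &I : Insts) {
    if (PinLocations && I.DebugLoc)
      MD.Map[I.DebugLoc] = I.DebugLoc; // identity entry for this location
    const Node *NewLoc = MD.remap(I.DebugLoc);
    assert((!PinLocations || NewLoc == I.DebugLoc) &&
           "a pinned location must map to itself");
    (void)NewLoc;
  }
  return MD.Created.size();
}

// Two locations sharing one subprogram and compile unit, plus one
// instruction without a debug location.
std::size_t exampleCloneCount(bool PinLocations) {
  Node CU{nullptr};
  Node SP{&CU};
  Node LocA{&SP};
  Node LocB{&SP};
  std::vector<Inst> Insts = {{&LocA}, {&LocB}, {nullptr}};
  return countClonedNodes(Insts, PinLocations);
}
```

Without pinning, the model clones both locations, the subprogram, and the compile unit; with pinning it clones nothing, which matches the "much cleaner IR" observation only in spirit, since the real value mapper is considerably more involved.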

Hi!

Sorry I’ve taken so long to reply.

We had been using an old version (3.9) of llvm and clang and have now rebased our code onto the newest master. Now we are able to emit the IR, but compiling the IR with llc leads to the following error:

$ …/…/llvm/build/debug/bin/llc test.ll
inlinable function call in a function with debug info must have a !dbg location
call void @_ZN12MyFunnyClassC2EvCloned(%struct.MyFunnyClass* %0)
…/…/llvm/build/debug/bin/llc: test.ll: error: input module is broken!

To reproduce the error, here is a link to our gist: https://gist.github.com/ewokhias/3ec3eb19d5d3a7c1cba5ce58e5040d8c . This is not our actual code, but we've created something similar.
In clang we set metadata for each constructor, and in our FunctionPass (adapted from the skeleton pass at https://github.com/sampsyo/llvm-pass-skeleton/) we create a new function for annotated constructors and use CloneFunctionInto to copy the instructions. In our original code, we add an extra parameter to the copied function, but the error occurs even without this newly added parameter.

Thanks for helping.
BR Matthias
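
For context, the llc message corresponds to an IR verifier rule: a direct call to a function that has debug info, made from a function that also has debug info, must itself carry a !dbg location. The following is a minimal model of that rule, not LLVM code; `Func`, `Call`, `passesDebugLocRule`, and `clonedCtorCallIsValid` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Toy function: only records whether a DISubprogram is attached.
struct Func {
  bool HasSubprogram;
};

// Toy call site. Callee is nullptr for an indirect call.
struct Call {
  const Func *Caller;
  const Func *Callee;
  bool HasDebugLoc;
};

// Returns true if the call satisfies the rule described above.
bool passesDebugLocRule(const Call &C) {
  assert(C.Caller && "a call always lives in some function");
  bool RuleApplies =
      C.Caller->HasSubprogram && C.Callee && C.Callee->HasSubprogram;
  return !RuleApplies || C.HasDebugLoc;
}

// Models the situation in the thread: caller and cloned constructor both
// have debug info, and the call may or may not have a !dbg location.
bool clonedCtorCallIsValid(bool CallHasDebugLoc) {
  Func Caller{true};
  Func ClonedCtor{true};
  return passesDebugLocRule(Call{&Caller, &ClonedCtor, CallHasDebugLoc});
}
```

The call printed in the error has no !dbg attachment, so the likely remedy is to give the newly created call a debug location, for example by copying it from the call it replaces with `setDebugLoc`. Whether that matches what the pass in the gist does is an assumption.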

Adrian,

  Would you be willing to apply this patch to master? I do not have enough background to foresee all possible side effects, but I believe it is generally the right thing to do in this case.

Sergei