Handling of DebugLocs during CSE of SelectionDAG nodes.

I've been investigating a case with the XCore target (which doesn't use
FastISel) where the DWARF line number emitted at -O0 results in the xgdb
visiting source lines in an unexpected order. I've tracked down the
problem to the handling of DebugLocs in the selection DAG, in the getNode
method shown bellow.

It first tries to find if a similar node already exists in order to use that one
otherwise it creates a new one. The problem of this approach is that it will
wrongly use the line number debug info from the first node it will find. I
can not see any reasonable way of dealing with that, but only dropping the
line number info from that node in order to keep the debugger behaving
correctly. Is that the correct approach? Are there any other suggestions?

02367 /// getNode - Gets or creates the specified node.
02368 ///
02369 SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL, EVT VT) {
02370 FoldingSetNodeID ID;
02371 AddNodeIDNode(ID, Opcode, getVTList(VT), 0, 0);
02372 void *IP = 0;
02373 if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
02374 return SDValue(E, 0);
02375
02376 SDNode *N = new (NodeAllocator) SDNode(Opcode, DL, getVTList(VT));
02377 CSEMap.InsertNode(N, IP);
02378
02379 AllNodes.push_back(N);
02380 #ifndef NDEBUG
02381 VerifySDNode(N);
02382 #endif
02383 return SDValue(N, 0);
02384 }

Kyriakos,
We have this same problem with the AMDIL target. The only way to get around it that we have found, and it doesn't work 100% of the time, is to disable all the combine operations. The problem with doing this in a generic way is disabling DAG.combine breaks the x86 backend and generates really weird code in many cases.

Micah

Only other alternative is to avoid falling back on DAG, at -O0, as much as possible. If … FastISel handles everything then you completely sidestep DAG nodes at -O0.

Compile time at -O0 isn't a high priority for us since, so if we did implement FastISel the only reason would be to improve debug information at -O0. Since this results in more code in the XCore backend that needs to be maintained over time I'd rather explore other options first.

Is there a view on which of the following approaches is better in general when two nodes with debug locations are merged because of CSE (either in the MachineCSE pass or because of the CSEMap in the SelectionDAG):

1) Use the DebugLoc of either of the pair of nodes (chosen arbitrarily).
2) Throw away the line number information on the merged node since it is not possible to preserve the information that operation is associated with multiple lines.

The current approach seems to be 1). Would you consider patches to change the behavior to 2)?

In the cases we have seen at -O0 for each of the pair of nodes to be merged there are also other nodes in the DAG with the same debug locations that won't be merged. Therefore if the line numbers for merged nodes were dropped it would still be possible to set breakpoints on each source line. It would also avoid many cases where the debugger jumps forward and backward when skipping.

Above -O0 I guess there would be fewer cases where the debugger jumps forward or backward, but there may be more cases where there are no instructions associated with a line and so lines will be skipped entirely.

I've been investigating a case with the XCore target (which doesn't use
FastISel) where the DWARF line number emitted at -O0 results in the xgdb
visiting source lines in an unexpected order. I've tracked down the
problem to the handling of DebugLocs in the selection DAG, in the getNode
method shown bellow.

It first tries to find if a similar node already exists in order to use that one
otherwise it creates a new one. The problem of this approach is that it will
wrongly use the line number debug info from the first node it will find. I
can not see any reasonable way of dealing with that, but only dropping the
line number info from that node in order to keep the debugger behaving
correctly. Is that the correct approach? Are there any other suggestions?

Only other alternative is to avoid falling back on DAG, at -O0, as much as possible. If … FastISel handles everything then you completely sidestep DAG nodes at -O0.

-
Devang

Compile time at -O0 isn't a high priority for us since, so if we did implement FastISel the only reason would be to improve debug information at -O0. Since this results in more code in the XCore backend that needs to be maintained over time I'd rather explore other options first.

Is there a view on which of the following approaches is better in general when two nodes with debug locations are merged because of CSE (either in the MachineCSE pass or because of the CSEMap in the SelectionDAG):

1) Use the DebugLoc of either of the pair of nodes (chosen arbitrarily).
2) Throw away the line number information on the merged node since it is not possible to preserve the information that operation is associated with multiple lines.

The current approach seems to be 1). Would you consider patches to change the behavior to 2)?

Yes, if the patches are enabled only at -O0. In other words, throwing away line numbers for merged node to smoothen debugging experience at -O0 is OK, but don't want to lose line number if optimization is enabled.

At -O0, SelectionDAG nodes are scheduled in source order when possible.

When merging identical nodes, you should make sure that the resulting node has the earlier source order, and the debug location should match.

That way, CSE simply looks like the last definition of a value has been removed. Debug location and source order of the first definition should be preserved.

/jakob

hmm… I agree with Jakob. See SelectionDAG::GetOrdering()

Thanks for your replies. That does seem like a better approach if that kind of information is easily available.