Proposing a small change to the SelectionDAG class to support recover from LLVM IR

Hello, llvm-dev.
    I am trying to commit in the following weeks the back end I wrote for our research lab's Connex SIMD processor, a promising vector processor which we hope to make it visible (please see if you have the time these links related to the Connex processor:, ).

     But, since our back end has the exotic feature of handling symbolic immediate assembler operands to achieve vector-length portability by using JIT assembling I had to make a small change to the SelectionDAG class which I now would like to commit on Github.

     So I come to ask you what do you think about adding this useful small feature in the SelectionDAG class (a DenseMap<const Value*, SDValue> *crtNodeMapPtr datastructure and a bit of book-keeping for it).

 From \(my previous attempt to commit the Connex backend\):
 &gt;      As written at
 &gt;         There is one thing a bit more special with this back end, namely our back
 &gt; end handles symbolic immediate operands \(C/C\+\+ expressions written as strings in      INLINEASM MachineInstrs\)\. This means the back end can output vector assembly code like:
 &gt;         VLOAD Reg0, N \* 10 \+ 5 // where N is a variable in the original C program,

       which means that to assign to register Reg0 the value N*10 + 5. This instruction is later JIT assembled (at runtime) in order to have a constant immediate operand equal to
N*10 + 5
> Therefore, in order to support recovering from a SelectionDAG node to LLVM IR (and then all the way back to the original source C/C++ code), something possible with the current LLVM distribution, which is implemented in our Connex LLVM compiler project, we require adding a simple data structure in the LLVM source file:
> include/llvm/CodeGen/SelectionDAG.h (and helper methods in the related
> SelectionDAG.cpp file)
> that maps an SDValue to the LLVM IR Value object it was used to translate from called:
> DenseMap<const Value*, SDValue> *crtNodeMapPtr .

     Roman Lebedev asked me about this new feature that I propose, at [llvm-dev] [GSoC] Supporting Efficiently the Shift-vector Instructions of the Connex Vector Processor
     > I'm not sure how that is supposed to work. What if IR does not originate from C?
     > How do you verify that the "C string" is in the form that will be understood
     > by whatever will handle it later on? Is there a clang part of the patch?
    My answer to his question is: the IR originates from C and there is no clang patch, since there is no need to alter clang. To recover a C/C++ expression from SelectionDAG node to LLVM IR (and then all the way back to the original source C/C++ code) we only need to add this crtNodeMapPtr data structure. The recovery is performed by walking recursively on the LLVM IR code starting from the immediate operand of the assembler instruction until the end: the moment we visit an LLVM IR add instruction we simply put in the resulting C/C++ string we create a "+" operator.

    This recovery from SelectionDAG node to LLVM IR and then to C/C++ is working very well in my LLVM build. It is well tested on a few tenths of small benchmarks.

    Please let me know if you have any questions related to this proposed small change to the SelectionDAG class. You can find some more explanation on how we recover to C/C++ in my PhD thesis: see Section 4.1.8 ("Symbolic Scalar Immediate Operands") at .

   Thank you,

Hi Alex,

I’m not sure I understand how this map works. A single Value * may have a StructType which is initially represented by a MERGE_VALUES node. But that node will be removed in DAGCombine by connecting each operand of the MERGE_VALUES to the users of each of the results. At that point there is no single SDValue that represents the Value *. How does the map model this?

Hi, Craig,
     The added map to the SelectionDAG class was meant to help my own back end.
     The crtNodeMapPtr map is not meant as it is to model the case with several SDValues that represent a Value* (the example with StructType you provide). If we would be interested to care about this case then maybe we should use C++'s std::unordered_multimap instead of a DenseMap - would you be interested in having that changed like this:
     std::unordered_multimap<const Value*, SDValue> *crtNodeMapPtr ?

     I have to add that I do not care about such complex cases as the one from your example: I have only 2 assembler instructions that check for crtNodeMapPtr (just for information, one Connex vector instruction is called VLOAD and the other REPEAT).

     Please note that I also changed besides the source file lib/CodeGen/SelectionDAG/SelectionDAG.cpp, the following source files a bit (please see

   Thank you very much,

For your purposes, would it be possible to have a target specific IR pass before SelectionDAG that finds the information you care about and creates the inline assembly instruction that you need? Or maybe some other target specific intrinsic that captures the information in a form your backend could use?

Craig, sorry for the late reply.

 Quite frankly even if I would create another pass before the SelectionDAG, I still believe I would end up adding to the SelectionDAG class a map between Value\* and SDValues\. Anyhow I need to &quot;intercept&quot; the cases when a Value\* is basically translated into a SDValue: this basically happens in all the methods SelectionDAGBuilder::visit\*\(\) from SelectionDAGBuilder\.cpp \(and also some update in the DAGCombiner::Run\(\) method from DAGCombiner\.cpp\)\. Since I really have to change the SelectionDAGBuilder class I decided to declare the map in the SelectionDAG class \(source file SelectionDAG\.h\)\.
 Adding another pass before the SelectionDAG pass does not seem to allow me to do something similar to inserting elements in the map from inside the methods SelectionDAGBuilder::visit\*\(\)\.

     I personally prefer keeping to my current implementation and ask the LLVM community to accept my changes (the number of changes is around 150 changes - the number is not actually a big problem) in SelectionDAGBuilder, DAGCombiner and SelectionDAG classes.

   Thank you very much,


     I personally prefer keeping to my current implementation and ask the LLVM community
to accept my changes (the number of changes is around 150 changes - the number is not
actually a big problem) in SelectionDAGBuilder, DAGCombiner and SelectionDAG classes.

This is not how the review process works, you cannot just throw out
the comments and the # of LOCs does not matter. Your changes affect
all the targets and impose both memory- and compile-time increase (as
you need to track the values in this map), however, only a single out
of the tree target would benefit from it. It seems that you could
solve this particular problem in other ways as Craig suggested – there
are multiple ways how you could capture the necessary information
without pessimizing other targets. Roman's concerns were not answered
as well – you cannot rely that IR comes from a single source (e.g.
generated by clang).

Also I do not see why you can't solve this problem on SDAG level – it
seems that everything is much more simply there and you'd just need to
carefully pattern match the operands of your inline asm. After all,
many targets have complex addressing models allowing strides, offsets,
etc. and it seems here the situation is pretty much the same.