Update PHINode after extracting code

Hi all,
I have problem with ExtractCodeRegion (CodeExtractor.cpp).
My original program is as follows.
bb:

%tmp.15 = load %struct.MYSQL_ROWS** %3, align 4

bb1:

%tmp.1 = load %struct.MYSQL_ROWS** %6, align 4

bb4: ; preds = %bb1, %bb, %entry
%tmp.0 = phi %struct.MYSQL_ROWS* [ null, %entry ], [ %tmp.15, %bb ], [ %tmp.1, %bb1 ]

%tmp.0 is the PHINode whose value is from entry, bb and bb1.
After extracting bb and bb1 into new function, the program becomes

codeRepl: ; preds = %entry
call void @mysql_data_seek_bb(%struct.MYSQL_DATA* %1, i64 %row, %struct.MYSQL_ROWS** %tmp.15.loc, %struct.MYSQL_ROWS** %tmp.1.loc)
%tmp.15.reload = load %struct.MYSQL_ROWS** %tmp.15.loc
%tmp.1.reload = load %struct.MYSQL_ROWS** %tmp.1.loc
br label %bb4

bb4: ; preds = %codeRepl, %entry
%tmp.0 = phi %struct.MYSQL_ROWS* [ null, %entry ], [ %tmp.15.reload, %codeRepl ], [ %tmp.1.reload, %codeRepl ]

bb4 now has only 2 predecessors since bb and bb1 are replaced by codeRepl.
The PHINode in bb4, on the other hand, still has 3 incoming values and that make the assertion failed.
I do want to extract this code region into function. Does anyone have a solution for this?
Thanks a lot.
Vu

I guess I didn’t have a clear question.

Suppose we have BB1 and BB2 both point to BB3.
BB1 has variable x. BB2 also as variable x.
BB3 will have PHINode for x with 2 value from BB1 and BB2.
BB1 BB2
\ /
BB3

Now if BB1 and BB2 is extracted into a function
(using ExtractCodeRegion), they will be replaced by
a basic block called codeRepl (which has a call to the extracted
function).
codeRepl

BB3

The problem is that the PHINode in BB3 still has 2 values x,
both from codeRepl. That’s wrong because BB3 has only 1 successors,
which is codeRepl. In fact, there is NO way to update PhiNode,
because all the x are from ONE node, codeRepl.
So the function ExtractCodeRegion doesn’t work in this example.

My solution for this is before extracting BB1 and BB2, I split the PHINode
in BB3 into another basic block
BB1 BB2
\ /
PhiBB

BB
And extract BB1, BB2 and PhiBB.
If you have any ideas or if you think this will not work, please let me know.
Thanks.
Vu

Hi Vu,

I believe this is the right approach. If I am right and you work on the RegionInfo pass I believe what you want is only to extract so called simpleRegions that have just a single entry and exit edge.

Have a look at the email from Andreas Simbuerger two weeks ago:

[llvm-commits] [PATCH] Add SeSeRegionInfo transform pass

This should contain the functions needed to transform a refined region into a simple region. Either we get this patch committed and you could just use this pass or we add the corresponding functions to RegionInfo and you translate just the regions that you finally will extract.

Cheers
Tobi

Hi Tobias,
If the PHI node at exit block of region R has multiple inputs from the R,
I split the exit block into 2.
The first block contains all PHI nodes whose input are all from the region.
The second is the new exit node.

All branches (outside R) to the old exit now point the new exit.
All regions whose exit is the old exit are also updated with the new exit node.

It works like a charm.
Thanks.
Vu

Great. Did you check with Andreas patch. I believe you also may need to update the entry nodes of all regions that started at the old node to the new node.

Are you going to contribute your region extractor? It would be pretty useful for me.

Cheers
Tobi

Oh, yes.
My algorithm extracts regions in a top-down manner.
I modified your RegionPass so that it only visits 2nd level regions.
Those regions are extracted into functions and the process
continues for the new functions.
It’s kind of messy since I’ve just played with LLVM.
I’ll modify the algorithm so it can extract regions bottom-up.
Vu

Cool. It would be great to integrate this in LLVM. I would be glad to review some patches.

By the way. For what are you using this?

Tobi

I’ll do this for my research project.