Stupid '-load-vn -licm' question (LLVM 1.6)

Hello! I'm compiling code which uses pointers as iterators. For some reason--probably a silly misunderstanding of the docs--I can't eliminate duplicate pointer loads. I'll probably figure this out eventually, but if somebody else sees the answer instantly, I certainly won't complain. :slight_smile:

Here are the optimizers I'm running:

   opt -f -simplifycfg -dce -instcombine -anders-aa -load-vn -licm -o regex-opt.bc regex.bc

The duplicate loads appear at the top of the %regex6 and %regex2 blocks below. I've tried various alias analysis implementations either alone or in combination.

Any thoughts? I'm sure that this will prove embarrassingly obvious. :slight_smile:

Cheers,
Eric

; ModuleID = 'regex-opt.bc'

implementation ; Functions:

bool %matches(ubyte* %begin, ubyte* %end) {
entry:
         %scan_end = getelementptr ubyte* %end, int -1 ; <ubyte*> [#uses=1]
         br label %loop_test

regex6: ; preds = %loop_test
         %c8 = load ubyte* %iter ; <ubyte> [#uses=1]
         %matches9 = seteq ubyte %c8, 97 ; <bool> [#uses=1]
         br bool %matches9, label %ret_true, label %regex2

regex2: ; preds = %regex6
         %c = load ubyte* %iter ; <ubyte> [#uses=1]
         %matches = seteq ubyte %c, 98 ; <bool> [#uses=1]
         br bool %matches, label %ret_true, label %loop_step

loop_step: ; preds = %regex2
         %next = getelementptr ubyte* %iter, int 1 ; <ubyte*> [#uses=1]
         br label %loop_test

loop_test: ; preds = %loop_step, %entry
         %iter = phi ubyte* [ %begin, %entry ], [ %next, %loop_step ] ; <ubyte*> [#uses=4]
         %done = setgt ubyte* %iter, %scan_end ; <bool> [#uses=1]
         br bool %done, label %ret_false, label %regex6

ret_true: ; preds = %regex2, %regex6
         ret bool true

ret_false: ; preds = %loop_test
         ret bool false
}

Hello! I'm compiling code which uses pointers as iterators. For some reason--probably a silly misunderstanding of the docs--I can't eliminate duplicate pointer loads. I'll probably figure this out eventually, but if somebody else sees the answer instantly, I certainly won't complain. :slight_smile:

There are no stupid questions. Some are just easier to answer than others :slight_smile:

Here are the optimizers I'm running:

opt -f -simplifycfg -dce -instcombine -anders-aa -load-vn -licm -o regex-opt.bc regex.bc

The duplicate loads appear at the top of the %regex6 and %regex2 blocks below. I've tried various alias analysis implementations either alone or in combination.

LICM doesn't remove common subexpressions, also -load-vn doesn't affect LICM. Try "-licm -load-vn -gcse" instead of "-load-vn -licm"

-Chris

Any thoughts? I'm sure that this will prove embarrassingly obvious. :slight_smile:

Cheers,
Eric

; ModuleID = 'regex-opt.bc'

implementation ; Functions:

bool %matches(ubyte* %begin, ubyte* %end) {
entry:
       %scan_end = getelementptr ubyte* %end, int -1 ; <ubyte*> [#uses=1]
       br label %loop_test

regex6: ; preds = %loop_test
       %c8 = load ubyte* %iter ; <ubyte> [#uses=1]
       %matches9 = seteq ubyte %c8, 97 ; <bool> [#uses=1]
       br bool %matches9, label %ret_true, label %regex2

regex2: ; preds = %regex6
       %c = load ubyte* %iter ; <ubyte> [#uses=1]
       %matches = seteq ubyte %c, 98 ; <bool> [#uses=1]
       br bool %matches, label %ret_true, label %loop_step

loop_step: ; preds = %regex2
       %next = getelementptr ubyte* %iter, int 1 ; <ubyte*> [#uses=1]
       br label %loop_test

loop_test: ; preds = %loop_step, %entry
       %iter = phi ubyte* [ %begin, %entry ], [ %next, %loop_step ] ; <ubyte*> [#uses=4]
       %done = setgt ubyte* %iter, %scan_end ; <bool> [#uses=1]
       br bool %done, label %ret_false, label %regex6

ret_true: ; preds = %regex2, %regex6
       ret bool true

ret_false: ; preds = %loop_test
       ret bool false
}

_______________________________________________
LLVM Developers mailing list
LLVMdev@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-Chris

That works! Thank you. The bytecodes look really good--not only are the loads eliminated, but the tests are actually reduced to a switch statement using '-anders-aa -load-vn -gcse -simplifycfg -instcombine':

regex6: ; preds = %loop_test
         %c8 = load ubyte* %iter ; <ubyte> [#uses=1]
         switch ubyte %c8, label %loop_step [
                  ubyte 97, label %ret_true
                  ubyte 98, label %ret_true
         ]

Unfortunately, this generates really weird code on the LLVM 1.6 PowerPC backend:

LBB_matches_1: ; regex6
         lbz r4, 0(r3)
LBB_matches_2: ; NodeBlock
         rlwinm r5, r4, 0, 24, 31
         cmplwi cr0, r5, 98
         blt cr0, LBB_matches_4 ; LeafBlock
LBB_matches_3: ; LeafBlock1
         rlwinm r4, r4, 0, 24, 31
         cmpwi cr0, r4, 98
         beq cr0, LBB_matches_8 ; ret_true
         b LBB_matches_5 ; NewDefault
LBB_matches_4: ; LeafBlock
         rlwinm r4, r4, 0, 24, 31
         cmpwi cr0, r4, 97
         beq cr0, LBB_matches_8 ; ret_true
LBB_matches_5: ; NewDefault
LBB_matches_6: ; loop_step

I'm particularly confused by the rlwinm instructions that keep turning up in PowerPC output, and the double test against 98. I don't have a problem or anything; I'm just trying to figure out what's going on. :slight_smile:

In any case, many thanks for your help!

Cheers,
Eric

Ah! The backend is running -lowerswitch, which does a binary search of the case labels (even when there's only two of them).

If I run -lowerswitch manually, though, I *do* get some very nice results out of -cee. :slight_smile: It's automatically killing some backtrack nodes in my CFG, which is a big (and unexpected) win.

Overall, I'm extremely impressed with the LLVM optimizer framework--even when it does something weird, it's easy to dive in and read the code. In the past, I've spent months working on compiler backends before feeling this comfortable.

Cheers,
Eric

FWIW, the rlwinm instructions are because the code generator isn't keep track, across basic blocks, that the byte value, stored in a 32-bit register, has its top part already zeroed.

The redundant compares against 98 are actually different compares (lt vs eq) and LLVM isn't CSE'ing them in this case.

FWIW, Nate is actually currently working on replacing the lower switch pass. :slight_smile:

-Chris

I'm particularly confused by the rlwinm instructions that keep turning up in PowerPC output, and the double test against 98. I don't have a problem or anything; I'm just trying to figure out what's going on. :slight_smile:

Ah! The backend is running -lowerswitch, which does a binary search of the case labels (even when there's only two of them).

yup

If I run -lowerswitch manually, though, I *do* get some very nice results out of -cee. :slight_smile: It's automatically killing some backtrack nodes in my CFG, which is a big (and unexpected) win.

Great. Note that -cee is a beta pass... it has some known bugs and isn't planned to be worked on in the immediate future. :frowning:

Overall, I'm extremely impressed with the LLVM optimizer framework--even when it does something weird, it's easy to dive in and read the code. In the past, I've spent months working on compiler backends before feeling this comfortable.

Great! I'm glad to hear that!

-Chris

After a bit of digging, I found one open bug and two failing test cases, all of which appear to be feature requests:

   #217: LLVM needs a generic dominator update mechanism
   nullpointer.ll: "a load or store of a pointer indicates that the pointer is not null."
   looptest.ll: "Note that this is a "feature" test, not a correctness test."

All the other test cases pass, and none of the other unassigned bugs mention CEE. (I checked Bugzilla by hand--since searching for "correlated" doesn't even find bug #217--so I might have missed something.) Google doesn't seem to turn up anything, either.

Does -cee actually produce incorrect code for some inputs? Or is it merely incomplete?

I'm looking for an optimizer pass to tweak (as a learning exercise), and if -cee isn't too hairy, I'd be happy to add support for 'switch' instructions and fix a few minor bugs. :slight_smile:

Cheers,
Eric

Great. Note that -cee is a beta pass... it has some known bugs and isn't planned to be worked on in the immediate future. :frowning:

After a bit of digging, I found one open bug and two failing test cases, all of which appear to be feature requests:

#217: LLVM needs a generic dominator update mechanism
nullpointer.ll: "a load or store of a pointer indicates that the pointer is not null."
looptest.ll: "Note that this is a "feature" test, not a correctness test."

Yup.

All the other test cases pass, and none of the other unassigned bugs mention CEE. (I checked Bugzilla by hand--since searching for "correlated" doesn't even find bug #217--so I might have missed something.) Google doesn't seem to turn up anything, either.

Does -cee actually produce incorrect code for some inputs? Or is it merely incomplete?

As I mentioned on IRC, I don't really remember. I thought it did miscompile some cases, but it has been too long for me to remember. Running it through llvm-test would be a good way to find out :slight_smile:

I'm looking for an optimizer pass to tweak (as a learning exercise), and if -cee isn't too hairy, I'd be happy to add support for 'switch' instructions and fix a few minor bugs. :slight_smile:

Sounds great!

-Chris