Hello,
This patch fixes PR2218. However, I'm not pretty sure that this optimization should be in MemCpyOpt. I think that GVN is good place as well.
Regards
pr2218.patch (6 KB)
Hello,
This patch fixes PR2218. However, I'm not pretty sure that this optimization should be in MemCpyOpt. I think that GVN is good place as well.
Regards
pr2218.patch (6 KB)
Hello,
This patch fixes PR2218.
Very nice. Are you sure this fixes PR2218? The example there doesn't have any loads in it.
However, I'm not pretty sure that this optimization should be in MemCpyOpt. I think that GVN is good place as well.
Yes, you're right. My long term goal is to merge the relevant pieces of memcpyopt into GVN/DSE at some point. To do that, some more major surgery needs to be done to memdep to make it work both backward (to support GVN) and forward (to support DSE). In the meantime, enhancing memcpyopt is fine, so long as there are good testcases (and your patch has them!).
Nit picky stuff about the patch:
+ for(unsigned argI = 0; argI < CS.arg_size(); ++argI) {
+ if(CS.getArgument(argI)->stripPointerCasts() == pointer)
Please put spaces after if/for. You get this right in some places, but wrong in others.
+ Value* pointer = L->getPointerOperand();
Please use "Value *pointer" style throughout. It looks like MemCpyOpt is already really inconsistent about this :-/ but please don't make it worse :).
+ /* Pointer must be a parameter (case 1) */
please use C++ style // comments.
Some other points:
1. I don't see a check that the load isn't volatile, please don't hack volatile loads.
2. The dependency check is relatively expensive, so please reorder it to be checked after the check for alloca:
+ Value* pointer = L->getPointerOperand();
+ MemDepResult dep = MD.getDependency(L);
Hello,
Sorry for my stupid mistakes. I hope that everything is fine now. This patch fixes PR2218. There are no loads in example, however "instcombine" changes memcpy() into store/load.
Regards,
Jakub Staszak
pr2218-2.patch (6.37 KB)
Hi Jakub,
Sorry for the delay, I'm way behind on code review. Generally if you respond quickly, I'll remember enough about the previous email that I can respond quickly too. If it gets out of my brain then it takes me a while to get back to it. I'll try to be better in the future
This patch is looking like a great improvement. Some comments on formatting:
Please pull this out to a helper function:
+ CallSite CS = CallSite::get(C);
Hello,
I fixed my patch as you asked. Sorry for the delay, I'd been working on my SSU patch (http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-August/025347.html)
I hope that everything is fine now.
-Jakub
pr2218-3.patch (7.33 KB)
Hey Jakub,
Thanks for working on this again, one more round
Please merge the three testcases into one file. We added a new FileCheck tool which allows you to check for the exact sequence of instructions expected, which also allows the tests to be merged into one file.
+/// MemCpyOpt::pointerIsParameter - returns true iff pointer is a parameter of
+/// C call instruction.
+bool MemCpyOpt::pointerIsParameter(Value *pointer, CallInst *C, unsigned &argI)
+{
+ CallSite CS = CallSite::get(C);
+ for (argI = 0; argI < CS.arg_size(); ++argI)
Please make this a static function, it doesn't need "this". Also, please do something like this in the for loop:
+ for (argI = 0, CS.arg_size(); argI != argE; ++argI)
pointerIsParameter returning a bool and the argument # byref is somewhat awkward. I think it would be better if it returned an int, which was -1 if not found.
+ MemoryDependenceAnalysis& MD = getAnalysis<MemoryDependenceAnalysis>();
You can sink this link to right before the call to MD.getDependency.
+ // Require pointer to have local dependency.
+ if (!dep.getInst() || dep.isNonLocal())
+ return false;
Random thought: after this patch goes in, it would be interesting to see how many queries get rejected because they are non-local: handling non-local would be pretty easy. Please add a TODO that mentions this.
In any case, the check for dep.isNonLocal() can be removed. Nonlocal queries have a null instruction.
+ } else if (StoreInst *S = dyn_cast<StoreInst>(dep.getInst())) {
...
+ uint64_t ptrSize = AA.getTypeStoreSize(pointer->getType());
...
+ uint64_t memSize = AA.getTypeStoreSize(memPtr->getType());
...
+ if (AA.alias(pointer, ptrSize, memPtr, memSize) != AliasAnalysis::MustAlias)
This is passing in the size of the pointer, not the pointee. However, since you only care about must alias, you don't care about the size of the object, you can just pass in 1 for both sizes.
Please put a comment above the call to AA.alias explaining why you require mustalias here.
+ // Look for a replacement for our pointer. If more than one found, exit.
+ if (!L->hasOneUse())
+ return false;
+ StoreInst *StoreOfLoad = dyn_cast<StoreInst>(L->use_back());
+ if (!StoreOfLoad)
+ return false;
Please do these check before the MemDep query, they are very cheap and will make the optimization run faster (by avoiding queries).
I'm pretty sure that there is still a legality check missing here. Your code will transform this:
%temporary = alloca i8
call void @initialize(i8* noalias sret %temporary)
%tmp = load i8* %temporary
store i8 42, i8* %result
store i8 %tmp, i8* %result
to:
%temporary = alloca i8
call void @initialize(i8* noalias sret %result)
store i8 42, i8* %result
which is a miscompilation. To fix this, I'd do a memdep query of "result" from StoreOfLoad and from L. If they both have the same dependence, then the transform is ok.
+ if (CallInst *C = dyn_cast<CallInst>(dep.getInst())) {
+ CallSite CS = CallSite::get(C);
+ CS.setArgument(argI, Repl);
pointerIsParameter strips bitcasts. Because of that, the type of the call argument may not be the same as the type of "repl", which could cause an assertion failure. We need to either make pointerIsParameter not strip bitcasts, or have this code insert a bitcast when the types don't line up.
-Chris
Please merge the three testcases into one file. We added a new FileCheck tool which allows you to check for the exact sequence of instructions expected, which also allows the tests to be merged into one file.
+/// MemCpyOpt::pointerIsParameter - returns true iff pointer is a parameter of
+/// C call instruction.
+bool MemCpyOpt::pointerIsParameter(Value *pointer, CallInst *C, unsigned &argI)
+{
+ CallSite CS = CallSite::get(C);
+ for (argI = 0; argI < CS.arg_size(); ++argI)Please make this a static function, it doesn't need "this". Also, please do something like this in the for loop:
+ for (argI = 0, CS.arg_size(); argI != argE; ++argI)
pointerIsParameter returning a bool and the argument # byref is somewhat awkward. I think it would be better if it returned an int, which was -1 if not found.
The problem is that CS.arg_size() is "unsigned int". Of course, I can cast int -> unsigned int but it wouldn't look good I think.
+ MemoryDependenceAnalysis& MD = getAnalysis<MemoryDependenceAnalysis>();
You can sink this link to right before the call to MD.getDependency.
+ // Require pointer to have local dependency.
+ if (!dep.getInst() || dep.isNonLocal())
+ return false;Random thought: after this patch goes in, it would be interesting to see how many queries get rejected because they are non-local: handling non-local would be pretty easy. Please add a TODO that mentions this.
In any case, the check for dep.isNonLocal() can be removed. Nonlocal queries have a null instruction.
OK, i'll work on it.
+ } else if (StoreInst *S = dyn_cast<StoreInst>(dep.getInst())) {
...
+ uint64_t ptrSize = AA.getTypeStoreSize(pointer->getType());
...
+ uint64_t memSize = AA.getTypeStoreSize(memPtr->getType());
...
+ if (AA.alias(pointer, ptrSize, memPtr, memSize) != AliasAnalysis::MustAlias)This is passing in the size of the pointer, not the pointee. However, since you only care about must alias, you don't care about the size of the object, you can just pass in 1 for both sizes.
Ahh, last time you said I had to check sizes. You said:
"Knowing that the loaded and stored pointer must alias is important, but you also need to check that the loaded and stored sizes equal each other."
Please put a comment above the call to AA.alias explaining why you require mustalias here.
+ // Look for a replacement for our pointer. If more than one found, exit.
+ if (!L->hasOneUse())
+ return false;
+ StoreInst *StoreOfLoad = dyn_cast<StoreInst>(L->use_back());
+ if (!StoreOfLoad)
+ return false;Please do these check before the MemDep query, they are very cheap and will make the optimization run faster (by avoiding queries).
I'm pretty sure that there is still a legality check missing here. Your code will transform this:
%temporary = alloca i8
call void @initialize(i8* noalias sret %temporary)
%tmp = load i8* %temporary
store i8 42, i8* %result
store i8 %tmp, i8* %resultto:
%temporary = alloca i8
call void @initialize(i8* noalias sret %result)
store i8 42, i8* %resultwhich is a miscompilation. To fix this, I'd do a memdep query of "result" from StoreOfLoad and from L. If they both have the same dependence, then the transform is ok.
See PR2218-dont.ll test.
-Jakub
Please make this a static function, it doesn't need "this". Also, please do something like this in the for loop:
+ for (argI = 0, CS.arg_size(); argI != argE; ++argI)
pointerIsParameter returning a bool and the argument # byref is somewhat awkward. I think it would be better if it returned an int, which was -1 if not found.
The problem is that CS.arg_size() is "unsigned int". Of course, I can cast int -> unsigned int but it wouldn't look good I think.
The typical way to do this is to return it as int, then at the call site do something like:
int V = foo();
if (V == -1) handle error
unsigned RealVal = (unsigned)V;
That way, it is very explicit what is going on.
+ } else if (StoreInst *S = dyn_cast<StoreInst>(dep.getInst())) {
...
+ uint64_t ptrSize = AA.getTypeStoreSize(pointer->getType());
...
+ uint64_t memSize = AA.getTypeStoreSize(memPtr->getType());
...
+ if (AA.alias(pointer, ptrSize, memPtr, memSize) != AliasAnalysis::MustAlias)This is passing in the size of the pointer, not the pointee. However, since you only care about must alias, you don't care about the size of the object, you can just pass in 1 for both sizes.
Ahh, last time you said I had to check sizes. You said:
"Knowing that the loaded and stored pointer must alias is important, but you also need to check that the loaded and stored sizes equal each other."
This is a different issue. Previously you would trigger when the load and store were of a pointer that could have been bitcast. You need to check that the load and store have the same size, but that doesn't impact the alias query.
I'm pretty sure that there is still a legality check missing here. Your code will transform this:
%temporary = alloca i8
call void @initialize(i8* noalias sret %temporary)
%tmp = load i8* %temporary
store i8 42, i8* %result
store i8 %tmp, i8* %resultto:
%temporary = alloca i8
call void @initialize(i8* noalias sret %result)
store i8 42, i8* %resultwhich is a miscompilation. To fix this, I'd do a memdep query of "result" from StoreOfLoad and from L. If they both have the same dependence, then the transform is ok.
See PR2218-dont.ll test.
What do you mean?
-Chris