So,
both GVN’s we have mainly exist to eliminate redundancies, they don’t perform general transforms.
So in your example, it would be “easy” to make NewGVN see through the memcpys
ie
teach it that memcpy(a, b, full size of b and a) means a = b, and propagate it through, and whatever goodness this entails.
It also currently knows how to coerce loads of memset/memcpy to pull the value out of the original piece, even when they are not full size memcpy.
This requires instruction insertion in non-constant cases.
So it only does this for constants ATM (but I have code to extend it to the general case and it works fine, it’s in my patch list).
The infrastructure i’m about to commit tomorrow also makes it know that it can use alternative “fake” instructions to represent values, and make the instructions real later.
So for example, for
if (foo)
a = 5, b = 2
else
b = 5, a = 2
return a + b
it knows that a+b is equivalent to 7
and when faced with:
if (foo)
a = 5, b = 2
else
b = 5, a = 3
return a+b
it knows that a+b is equivalent to phi(7, 8) (which doesn’t exist in the original program), and that it can insert that phi to eliminate c.
You could use that infrastructure to teach it that the memcpy’s are equivalent to stores that it could insert.
I would actually start by teaching it about memcpy in general, and that full size memcpy implies value equivalence (this would require wiring it into performSymbolicCallEvaluation), that store of a memcpy target and memcpy are equivalent when they copy/store the same value (see the FIXME in performSymbolicStoreEvaluation).
If you want to teach it to try to see memcpy’s as stores, that’s possible too.
It’s a bit trickier because NewGVN is ruthless. It terminates only at fixpoint. So you have to make the conditions under which you do things like “see memcpy as store” totally deterministic and ordered, or someone will come up with a testcase where we bounce forever between two the possible values for the same thing.
We have a bunch of verifiers to try to help with this, but as Davide can tell you, it can be a workout sometimes
At this point, in the base, it’s really rare to be able to trigger this behavior (and all cases i’m aware of involve undef).
But if you add symbolic evaluation code, it usually takes a few tries to get it right.