MergeConsecutiveStores for FP types

Hi,

I’m trying to fix MergeConsecutiveStores for FP immediates. Currently it doesn’t really work because of an ordering problem, since this combine is only enabled for the first run of DAGCombiner before types are legal, with a comment that this is to avoid running it on every iteration presumably for compile time reasons.

When storing an FP constant, visitSTORE replaces this with a store of an integer instead. This won’t necessarily have happened for all stores on the chain before MergeConsecutiveStores eventually decides to merge them all. MergeConsecutiveStores then sees stores of different types, and won’t merge them.

If I allow MergeConsecutiveStores to run for every DAG combine phase, it successfully merges all of the stores in my simple test cases in the second iteration. I’m not sure what the best solution to fix this is. Would it be acceptable to just allow MergeConsecutiveStores combine to run another time after the DAG is legalized? Running it again after legalize would also help with the non-constant cases after all FP stores have been promoted to integers (on R600). Alternatively, would it be better to try to have MergeConsecutiveStores do the store type replacement when it sees another store of ConstantFP, or somehow change the worklist order?

Another option I considered was to delay the store type replacement, but this just reveals the code in MergeConsecutiveStores to handle FP constant stores is actually broken (although trivially fixable). I also don’t really want this, since I would like stores of mixed types to be merged.

-Matt

From: "Matt Arsenault" <arsenm2@gmail.com>
To: "llvmdev" <llvmdev@cs.uiuc.edu>
Sent: Saturday, May 23, 2015 2:43:00 PM
Subject: [LLVMdev] MergeConsecutiveStores for FP types

Hi,

I’m trying to fix MergeConsecutiveStores for FP immediates. Currently
it doesn’t really work because of an ordering problem, since this
combine is only enabled for the first run of DAGCombiner before
types are legal, with a comment that this is to avoid running it on
every iteration presumably for compile time reasons.

I recommend taking compile-time measurements with it running in every iteration so that we can see what the actual cost is.

-Hal