[RFC] [MLIR] [Vector] Constant Folding Vector Reduction (Splat-Splat)

Nice example.

So the fold can actually do the computation without rewriting a + … + a as n * a. It can just loop over the elements and accumulate the partial sums one by one. This may take a long time at compile time, so it can be limited to N below some threshold. If this is how we do it, performing the element-by-element reduction at compile time, then the folded result would be the same as the unfolded one.
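To make the difference concrete, here is a rough sketch in Python (not MLIR code). The names `fold_splat_add_reduction` and `MAX_FOLD_ELEMENTS` are made up for illustration, the threshold value is arbitrary, and Python floats are f64, so treat it only as a model of the idea: accumulate sequentially below a threshold, otherwise decline to fold.

```python
from typing import Optional

# Hypothetical cap on how many elements we are willing to reduce at compile time.
MAX_FOLD_ELEMENTS = 1024


def fold_splat_add_reduction(
    a: float, n: int, limit: int = MAX_FOLD_ELEMENTS
) -> Optional[float]:
    """Model of folding an add-reduction over a splat of `n` copies of `a`.

    Returns None (meaning "do not fold") when n exceeds the limit.
    Otherwise accumulates one element at a time instead of computing n * a.
    """
    if n > limit:
        return None
    acc = 0.0
    for _ in range(n):
        acc += a  # one rounding step per element, like a sequential reduction
    return acc


sequential = fold_splat_add_reduction(0.1, 10)
multiplied = 10 * 0.1

# The two strategies disagree in floating point for this input.
print(sequential == multiplied)  # False
```

The sequential loop rounds after every addition, while `n * a` rounds once, which is why the two can differ. The sketch assumes a left-to-right accumulation order; whether that matches what the unfolded reduction does on a given target is a separate question.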

Is there any reason not to do it then?