Status of llvm.experimental.vector.reduce.* intrinsics

I am currently working on a pass that rewrites masked.load and
masked.store intrinsics to (hopefully) improve performance on targets
where these intrinsics are not legal. To check whether the loads and
stores are necessary at all, I take the mask of each masked operation
and want to reduce it to a single value. vector.reduce.or seemed very
handy for the job.

I will take a look at the function you suggested. Maybe I can come up
with something that moves the development of these intrinsics forward.

Cheers,
Michael

Actually, for mask vectors of i1 values, you don’t need to use reductions at all (although for SVE this is what we’ll do). You can instead bitcast the vector value to an i8/i16/whatever and then compare against zero.

Amara

I assume smaller types like <4 x i1> get zero-extended to, e.g., i8?

It may not be related, but there was a talk at EuroLLVM about i1 types in x86 vector expansion, including some pitfalls. I recommend having a look.

Is the aarch64 error an assert/internal one? If so, we may need better error handling…

Bitcasting is only valid between types of the same size, so there is no zero extension: you can bitcast <4 x i1> to i4 and then compare directly, e.g. icmp ne i4 %castval, 0.
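
To illustrate why the compare against zero can stand in for an OR-reduction of the mask, here is a minimal C++ sketch that models the two steps on a plain array of lanes. It is not LLVM API code: `pack_mask` and `any_lane_active` are hypothetical names, and the mapping of lane i to bit i is an assumption of the sketch (the bit order has no effect on a compare against zero).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an LLVM API): models `bitcast <N x i1> %mask to iN`
// by packing each lane into one bit of an integer. Mapping lane i to bit i is
// an assumption of this sketch.
template <std::size_t N>
std::uint64_t pack_mask(const std::array<bool, N>& mask) {
    static_assert(N <= 64, "sketch only handles masks of up to 64 lanes");
    std::uint64_t bits = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (mask[i]) {
            bits |= (std::uint64_t{1} << i);
        }
    }
    return bits;
}

// Models `icmp ne iN %castval, 0`: the packed integer is nonzero exactly
// when at least one lane is set, which is what an OR-reduction would compute.
template <std::size_t N>
bool any_lane_active(const std::array<bool, N>& mask) {
    return pack_mask(mask) != 0;
}
```

If `any_lane_active` is false for a mask, no lane is enabled, so the corresponding masked load or store could be skipped entirely.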

Amara

Thanks, I already found that out the hard way ;) Now it works and looks
nice and shiny.

Michael