I’m currently investigating broadcasts from the constant pool on Sandy Bridge. I see this comment in llvm/lib/Target/X86/X86ISelLowering.cpp:
// Handle the broadcasting a single constant scalar from the constant pool
// into a vector. On Sandybridge it is still better to load a constant vector
// from the constant pool and not to broadcast it from a scalar.
Would anyone be able to explain why it is better to load a vector from the constant pool rather than broadcast a scalar?
I checked Agner Fog’s instruction tables, but the reason wasn’t obvious to me:
vmovaps y, m256:      Uops: 1  Lat: 4  Throughput: 1
vbroadcastsd y, m64:  Uops: 2  Lat: [Not or cannot be measured]  Throughput: 1
Thanks in advance,