[PATCH] vload/vstore: Use casts instead of scalarizing everything in CLC version

This generates bitcode which is indistinguishable from what was
hand-written for int32 types in v[load|store]_impl.ll.

v3: Also remove unused generic/lib/shared/v[load|store]_impl.ll
v2: (Per Matt Arsenault) Fix alignment issues with vector loads and stores

Signed-off-by: Aaron Watry <awatry@gmail.com>
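
For readers without the diff at hand, here is a minimal sketch of the difference between a scalarized load/store and a cast-based one, including the alignment concern noted in v2. It is plain C using the GCC/Clang vector extensions rather than the actual CLC source, and every name in it (int4_elem_aligned, vload4_scalar, vload4_cast, vstore4_cast) is hypothetical and not taken from the patch. It assumes the reduced-alignment typedef is how the alignment issue is avoided; the patch itself may do it differently.

```c
#include <stddef.h>

/* 4 x int vector with its natural 16-byte alignment (GCC/Clang extension). */
typedef int int4 __attribute__((vector_size(16)));

/* Same layout, but only int-aligned and allowed to alias int storage.
 * Hypothetical name; an assumption about how the alignment issue can be
 * avoided, not a copy of what the patch does. */
typedef int int4_elem_aligned
    __attribute__((vector_size(16), aligned(4), may_alias));

/* Scalarized form: one load per element. */
static int4 vload4_scalar(size_t offset, const int *p) {
    int4 v = { p[4 * offset],     p[4 * offset + 1],
               p[4 * offset + 2], p[4 * offset + 3] };
    return v;
}

/* Cast form: a single vector-typed load through a pointer cast.
 * Casting to plain int4 * would promise 16-byte alignment that a
 * vload4 caller is not required to provide. */
static int4 vload4_cast(size_t offset, const int *p) {
    return *(const int4_elem_aligned *)(p + 4 * offset);
}

/* Cast form of the store, using the same reduced-alignment type. */
static void vstore4_cast(int4 v, size_t offset, int *p) {
    *(int4_elem_aligned *)(p + 4 * offset) = v;
}
```

Both load variants return the same values; the cast form simply lets the compiler emit one vector access where the target supports it, while the reduced-alignment typedef keeps that access valid for pointers that are only element-aligned.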

> This generates bitcode which is indistinguishable from what was
> hand-written for int32 types in v[load|store]_impl.ll.

Is this also the case for the nvptx target?

Jeroen