This commit gets us back to pure CLC and fixes offset calculations.
The next commit will re-enable the assembly implementation for R600,
fix bugs related to 64-bit address spaces, and also fix the
incorrect assumption that address space identifiers are the same in
all architectures.
Signed-off-by: Aaron Watry <awatry@gmail.com>
The assembly optimizations were making unsafe assumptions about which address
spaces had which identifiers.
Also fix the 64-bit pointer calculation, which was previously broken on Radeon SI.
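For reference, the offset semantics that vload/vstore must honor can be sketched in plain C. This is only an illustration, not the CLC or assembly code from this series: the `int4_sketch` type and the `*_sketch` function names are hypothetical. Per the OpenCL C specification, vloadn(offset, p) reads n elements starting at p + offset * n, so the offset is counted in whole vectors and the address arithmetic is done at the pointer's native width (size_t here), which is the part that matters for 64-bit address spaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the OpenCL int4 type. */
typedef struct { int32_t s[4]; } int4_sketch;

/* Sketch of vload4: the offset counts whole 4-element vectors,
 * so the element address is p + offset * 4. The arithmetic uses
 * size_t, i.e. the native pointer width of the target. */
static int4_sketch vload4_sketch(size_t offset, const int32_t *p)
{
    int4_sketch v;
    memcpy(v.s, p + offset * 4, sizeof v.s);
    return v;
}

/* Sketch of vstore4: same address calculation, opposite direction. */
static void vstore4_sketch(int4_sketch v, size_t offset, int32_t *p)
{
    memcpy(p + offset * 4, v.s, sizeof v.s);
}
```

With the address computed up front like this, a lower-level implementation only needs the final pointer, which is consistent with dropping the offset from the assembly argument list.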
This version still provides assembly implementations only for int/uint 2/4/8/16
global loads and stores on R600, but the approach can easily be extended to the
private/local/constant address spaces and to other architectures.
v2: 1) Leave v[load|store]_impl.ll in generic/lib
    2) Remove vload_if.ll and vstore_if.ll interfaces
    3) Fix address+offset calculations
    4) Remove offset from assembly arg list
Signed-off-by: Aaron Watry <awatry@gmail.com>
For the series:
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>