Our architecture has 8- and 16-bit stores that perform a 32-bit read-modify-write. This means we need to enforce a latency between two such instructions that may touch the same memory.
Usually there will be an output dependence edge between them, and I can adjust its latency in a DAG mutation; that part is fine.
My concern is that alias analysis may be able to disambiguate the two accesses based on constant offset differences within the same object, in which case I lose that output edge.
I’d like to tell dependence analysis that these stores access 32 bits.
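To make the concern concrete, here is a minimal standalone sketch. It is not LLVM code: `Access`, `overlaps`, and `widenToWord` are hypothetical names, and I am assuming the object base is 32-bit aligned so that offsets can be rounded to word boundaries. It only models the overlap check I would like dependence analysis to perform.

```cpp
#include <cstdint>

// Hypothetical model of one memory access: a byte offset within a single
// object and a size in bytes. This is not an LLVM type.
struct Access {
    std::uint64_t offset;
    std::uint64_t size;
};

// True if the half-open byte ranges [offset, offset + size) intersect.
bool overlaps(const Access &a, const Access &b) {
    return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

// Widen an access to the aligned 32-bit word(s) it touches.
// Assumption: the object base is 4-byte aligned.
Access widenToWord(const Access &a) {
    const std::uint64_t begin = a.offset & ~std::uint64_t{3};
    const std::uint64_t end = (a.offset + a.size + 3) & ~std::uint64_t{3};
    return Access{begin, end - begin};
}
```

With two byte stores at offsets 0 and 1, `overlaps` on the raw accesses reports no overlap, which is the case where the output edge would be dropped. After `widenToWord`, both accesses cover bytes 0 to 3 and the overlap is detected again.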
I have marked the instruction as both a load and a store; ultimately I would also like to specify different load and store bit widths.
My question: What is the best way to widen the access width of loads and stores from the default?