Is shortening a load a bug?

Hi Brian,

When the IR specifies a 32-bit load, can it be changed to a narrower load?
What if the load is from memory (e.g. a peripheral) that only supports 32-bit
access?
Consider the following IR:
----
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:32"
target triple = "thumbv7m-unknown-unknown"
@f = external global i32
define zeroext i8 @bar() nounwind {
L.0:
  %rv.0 = alloca i8
  %0 = load i32* @f
  %1 = trunc i32 %0 to i8
  ret i8 %1
}
----
For the ARM Cortex-M3, this generates:
----
bar:
  movw r0, :lower16:f
  movt r0, :upper16:f
  ldrb r0, [r0]
  bx lr
----
Although we are only interested in the low 8 bits, the load MUST be a 32-bit load.

I believe this is correct. As long as loading an i8 is legal, I do not think we should bother loading the whole 32 bits.

Why do you think this is a bug?

Thanks,
-Quentin

The phrase "As long as loading an i8 is legal" is the whole point. What if it isn't? How do I specify that (short of marking the load as volatile, which is overkill)? As the author of a front end, I want to know what the contract is when I say "load 32 bits". When I say load 32 bits, I mean load 32 bits.

I think it's a bug because there is no good way to avoid it and it breaks device drivers.

brian

This sounds like the *exact* use case for volatile, where the load is
observable in some way other than the result that it is used for. If i8 is
a legal type for the rest of your ISA, then this should be a volatile load.

I agree with Reid.
You should use volatile. Loading an i8 is legal in the ISA.

-Quentin
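
For reference, here is a minimal C sketch of the volatile approach suggested above, as a front end would see it at the source level. The function name read_low_byte and the idea of passing the register as a pointer are assumptions made for illustration; they do not come from the thread. What exactly counts as a volatile access is implementation-defined in C, but in practice compilers keep such a read at the declared width, so the expectation is a 32-bit load followed by a truncation rather than a byte load.

```c
#include <stdint.h>

/* Hypothetical accessor: 'reg' stands in for a memory-mapped peripheral
 * register that only tolerates 32-bit accesses. */
uint8_t read_low_byte(const volatile uint32_t *reg)
{
    /* The volatile qualifier asks the compiler to perform this read as
     * written, on the whole uint32_t object. */
    uint32_t word = *reg;

    /* The truncation happens on the already-loaded value. */
    return (uint8_t)word;
}
```

On real hardware the pointer would be the register's address cast to a volatile-qualified pointer; an ordinary variable can stand in for it when testing on a host.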

This is an "if all you have is a hammer, everything looks like a nail"
approach. "load volatile" causes several things to happen:
1. The load size is not changed.
2. The loaded value is not "cached" in a register, i.e., the load can
   never be elided. Several loads may be issued even though the value
   of the object does not change.
3. The load may not be moved around another "load volatile" of a different
   object.

All I want is the first thing. Things 2 and 3 are orthogonal and just cause
worse code to be generated. Just because C got this wrong is no reason to
perpetuate history.
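
To make the cost of thing 2 concrete, here is a small C sketch. The helper name low_plus_high_byte is hypothetical. Because every volatile read has to be issued, code that needs several fields of the same register usually reads it once into a local and works on that copy. That limits the extra loads, but it is a manual workaround rather than a way to ask for "fixed width only".

```c
#include <stdint.h>

/* Hypothetical helper: adds the lowest and highest bytes of a 32-bit
 * register. Writing (*reg & 0xFF) + (*reg >> 24) would require two
 * volatile loads; taking a snapshot requires only one. */
uint32_t low_plus_high_byte(const volatile uint32_t *reg)
{
    uint32_t snapshot = *reg;   /* single full-width volatile read */

    /* Both fields are extracted from the local copy, which the compiler
     * is free to keep in a register. */
    return (snapshot & 0xFFu) + (snapshot >> 24);
}
```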

Is there a flag that disables this "load narrowing" optimization?

brian