Hi Karl, Roman,
> I was looking into how the global optimization pass fares against
> things like what's reported in
I need to take a closer look, but I would have expected BasicAA to be
able to determine that `do_log` and `R` cannot alias. In the -Os version
(lower right in Compiler Explorer), the write to `R`
clobbers the read from `do_log`, which prevents us from removing the
load/store pair. My reasoning would have been that we know the size of
`do_log` to be less than the size accessed via `R`. What exactly goes
wrong, or whether my logic is flawed, needs to be examined. I would start
by looking at the debug output generated by the code parts touched here:
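
As an aside, the size argument above can be sketched in C. This is not the
original reproducer from the link: only the names `do_log` and `R` come from
the thread, while `set_logging`, `update`, and the concrete types are
assumptions made for illustration. The point is only that a 1-byte object
cannot contain an 8-byte access, so the store through `R` should not be able
to change `do_log`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed shape of the problem: a small global flag and a wider store. */
bool do_log; /* 1-byte object with external linkage */

/* Hypothetical helper so callers can toggle the flag. */
void set_logging(bool on) { do_log = on; }

/* Reads do_log, stores 8 bytes through R, then reads do_log again.
 * If alias analysis proves that *R cannot overlap do_log, the second
 * read can reuse the value of the first one. */
int update(uint64_t *R, uint64_t v) {
    int seen = 0;
    if (do_log)
        seen++;
    *R = v; /* 8-byte store, wider than the whole do_log object */
    if (do_log)
        seen++;
    return seen;
}
```

Whether a given compiler version actually folds the second read here is not
something this sketch claims; other analyses may also apply to these types.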
> Looking at this, I think it would be pretty trivial to optimize that
> down given that there are already threading assumptions made:
> Compiler Explorer
Optimizing more aggressively based on forward progress guarantees will
get us in more trouble than we are already in. I don't have the link
handy but as far as I remember the proposed solution was to have a
forward progress guarantee function attribute. I would recommend we look
into that first before we start more aggressive optimizations which will
cause problems for a lot of (non C/C++) folks.
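
A minimal sketch of the kind of code such assumptions affect, assuming C11
semantics (a loop with a non-constant controlling expression and no side
effects may be assumed to terminate). The function `reaches_one` is a
hypothetical example, not something from the linked snippets.

```c
/* Hypothetical example: the loop has no I/O, no volatile or atomic
 * accesses, and a non-constant condition, so a C11 compiler may assume
 * it terminates and reduce the whole function to "return 1". */
unsigned reaches_one(unsigned n) {
    while (n != 1u)
        n = (n & 1u) ? 3u * n + 1u : n / 2u;
    return 1u;
}
```

For `n == 0` the loop as written never terminates, yet the optimized
function may still return. A language without such a guarantee that is
compiled through the same passes would see its semantics broken, which is
why an opt-in attribute looks like the safer first step.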
> Is this something I can look into?
> Another thing is that currently *all* external calls break this
> optimization, including calls to intrinsics that probably shouldn't:
> Compiler Explorer
I think during load propagation, there is a legality check: "here's a
load, and here's a store. Is there anything in between that may have
clobbered that memory location?".
Right now we only have `__attribute__((pure/const))` but we want to
expose all LLVM-IR attributes to the user soon, which will allow way
more fine-grained control. Intrinsics are a different story again.
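
For reference, a small sketch of what the two existing attributes express,
using the GCC/Clang `__attribute__` syntax. The functions `square`,
`sum_array`, and `observe` and the global `flag` are hypothetical names
chosen for illustration.

```c
/* Hypothetical global standing in for a value loaded before a call. */
int flag = 1;

/* const: the result depends only on the argument values and the function
 * does not read or write memory. */
__attribute__((const)) int square(int x) { return x * x; }

/* pure: the function may read memory but has no side effects, so it
 * cannot clobber flag. */
__attribute__((pure)) int sum_array(const int *a, int n) {
    int s = 0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* With sum_array marked pure, the second read of flag may reuse the
 * first one, since the call in between does not write memory. */
int observe(const int *a, int n) {
    int first = flag;
    int s = sum_array(a, n);
    int second = flag;
    return first + second + s;
}
```

Anything finer than these two, such as "writes only memory the caller
cannot see", is not expressible from C today, which is the gap the exposed
IR attributes would close.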
For calls, there are some attributes that are helpful here:
So in this case, I guess the `@llvm.x86.flags.write` intrinsic could
maybe be annotated with the `readonly` attribute,
thus signalling that it won't clobber that memory location?
While target-specific intrinsics are a bit more complicated, we see the
problem often with generic intrinsics already. We proposed the other day
to change the default semantics of non-target-specific intrinsics
such that you have to opt in for certain effects.
For the above example you want `llvm.x86.flags.write` to be `writeonly` and
`inaccessiblememonly`. Also `nosync`, `willreturn`, ...
 2019 LLVM Developers’ Meeting: W. Moses “Cross-Translation Unit Optimization via Annotated Headers” - YouTube