PROPOSAL: struct-access-path aware TBAA

Maybe I was not clear in my previous mail, or maybe we are talking about
different topics.

My point was that if both memory accesses are direct accesses, then in most
cases the base/offset/size
should be sufficient to figure out whether they alias or not.

It's about 60% of cases, the last time I ran the numbers on GCC.
Base + offset isn't as useful as it used to be given how today's
languages are lowered.

A tricky case would be
subscripted variables, like a.b[i][j] vs a.c[m][n], where a.b and a.c are
enclosed in a common union.
If the mem-dep test fails to figure out whether there is a dependence, then in that case
TBAA might help.

It seems what we are discussing here deviates somewhat from the original
topic :-)

Yes :-)

Based on discussions with John McCall

We currently focus on field accesses of structs, more specifically, on fields that are scalars or structs.

Fundamental rules from C11
--------------------------
An object shall have its stored value accessed only by an lvalue expression that has one of the following types: [footnote: The intent of this list is to specify those circumstances in which an object may or may not be aliased.]
1. a type compatible with the effective type of the object,
2. a qualified version of a type compatible with the effective type of the object,
3. a type that is the signed or unsigned type corresponding to the effective type of the object,
4. a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
5. an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
6. a character type.

Example
-------
struct A {
   int x;
   int y;
};
struct B {
   A a;
   int z;
};
struct C {
   B b1;
   B b2;
   int *p;
};

Type DAG:
int <- A::x <- A
int <- A::y <- A <- B::a <- B <- C::b1 <- C
int <----------------- B::z <- B <- C::b2 <- C
any pointer <--------------------- C::p <- C

The type DAG has two types of TBAA nodes:
1> the existing scalar nodes
2> the struct nodes (this is different from the current tbaa.struct)
A struct node has a unique name plus a list of pairs (field name, field type).
For example, struct node for "C" should look like
!4 = metadata !{"C", "C::b1", metadata !3, "C::b2", metadata !3, "C::p", metadata !2}
where !3 is the struct node for "B", !2 is pointer type.
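
As a rough illustration of the struct-node idea (a unique name plus a list of (field name, field type) pairs), here is a minimal C++ sketch of the type DAG for the example above. The `TypeNode` type and the `includesType` helper are hypothetical names used only for this sketch; they are not existing LLVM or clang API.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory model of a TBAA type node: scalar nodes have no
// fields, struct nodes carry (field name, field type) pairs.
struct TypeNode {
    std::string name;
    std::vector<std::pair<std::string, const TypeNode *>> fields;
};

// True if `outer` is `inner` itself or reaches it through its fields.
bool includesType(const TypeNode &outer, const TypeNode &inner) {
    if (&outer == &inner) return true;
    for (const auto &field : outer.fields)
        if (includesType(*field.second, inner)) return true;
    return false;
}

// Builds the DAG for struct A, B and C from the example and checks a few
// containment facts.
void checkTypeDag() {
    TypeNode intTy{"int", {}};
    TypeNode ptrTy{"any pointer", {}};
    TypeNode a{"A", {{"A::x", &intTy}, {"A::y", &intTy}}};
    TypeNode b{"B", {{"B::a", &a}, {"B::z", &intTy}}};
    TypeNode c{"C", {{"C::b1", &b}, {"C::b2", &b}, {"C::p", &ptrTy}}};

    assert(c.fields.size() == 3);  // mirrors the three pairs in node !4
    assert(includesType(c, a));    // C reaches A through B
    assert(!includesType(a, b));   // A does not contain B
}
```

Both `C::b1` and `C::b2` point at the same `B` node, which is why the structure is a DAG rather than a tree.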

Given a field access
struct B *bp = ...;
bp->a.x = 5;
we annotate it as B::a.x.

In the case of multiple structures containing substructures, how are
you differentiating?

I.e., given

struct A {
    struct B b;
};
struct C {
    struct B b;
};

How do you know the above struct B *bp =...; is B::b from C and not B::b from A?

(I agree you can know in the case of direct aggregates, but I argue
you have no way to know in the case of pointer arguments without
interprocedural analysis)
It gets worse the more levels you have.

I.e., if you add
struct B {
    struct E e;
};

and have struct E *e = ...
how do you know it's the B::e contained in struct C, or the B::e
contained in struct A?

Again, I agree you can do both scalar and direct aggregates, but not
aggregates and scalars through pointers.

I don't have an immediate plan for pointer analysis. For the given example, we will treat field accesses from bp (pointer to struct B) as B::x.x; bp can be from either C or A.

Implementing the Hierarchy
--------------------------
We can attach metadata to both scalar accesses and aggregate accesses. Let's call them scalar tags and aggregate tags.
Each tag can be a sequence of nodes in the type DAG.
!C::b2.a.x := [ "tbaa.path", !C, "C::b2", !B, "B::a", !A, "A::x", !int ]

This can get quite deep quite quickly.
Are there actually cases where knowing the outermost aggregate type +
{byte offset, size} of field access does not give you the exact same
disambiguation capability?

The answer is no. We should be able to deduce the access path from {byte offset, size}.
However, I don't know an easy way to check alias(x,y) given {byte offset, size} for x and y.

Well, that part is easy, it's just overlap, once you know they are
both contained in the same outermost type.
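
A minimal sketch of that overlap test, assuming each access is tagged with its outermost struct type plus a byte offset and size. The `AccessTag` type, the `Outer` enum and the `mayOverlap` helper are hypothetical names for illustration, not existing LLVM API; the structs are the A/B/C example from the proposal.

```cpp
#include <cassert>
#include <cstddef>

struct A { int x; int y; };
struct B { A a; int z; };
struct C { B b1; B b2; int *p; };

// Hypothetical tag: outermost type plus the byte range inside that type.
enum class Outer { A, B, C };
struct AccessTag {
    Outer outer;
    std::size_t offset;
    std::size_t size;
};

// Accesses rooted in the same outermost type overlap iff their byte ranges
// intersect. For different outermost types this sketch gives up and
// conservatively answers "may overlap".
bool mayOverlap(const AccessTag &l, const AccessTag &r) {
    if (l.outer != r.outer) return true;
    return l.offset < r.offset + r.size && r.offset < l.offset + l.size;
}

AccessTag tagC_b1_a() {
    return {Outer::C, offsetof(C, b1) + offsetof(B, a), sizeof(A)};
}
AccessTag tagC_b1_a_x() {
    return {Outer::C, offsetof(C, b1) + offsetof(B, a) + offsetof(A, x), sizeof(int)};
}
AccessTag tagC_b2_a_x() {
    return {Outer::C, offsetof(C, b2) + offsetof(B, a) + offsetof(A, x), sizeof(int)};
}

void checkOverlapExamples() {
    assert(mayOverlap(tagC_b1_a(), tagC_b1_a_x()));   // x lies inside b1.a
    assert(!mayOverlap(tagC_b1_a(), tagC_b2_a_x()));  // b2 starts after b1
}
```

The open question in the thread is the first branch: what to answer when the two tags do not share an outermost type.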

How do you figure out the outermost common type of two accesses 'x' and 'y'?

How are you annotating each access with a path as you go right now?
It should be the same thing to just annotate it then.

I thought your suggestion was to replace
!C::b2.a.x := [ "tbaa.path", !C, "C::b2", !B, "B::a", !A, "A::x", !int ]
with
!C::b2.a.x := [ "tbaa.offset", !C, offset within C, size ]

!B::a := [ "tbaa.path", !B, "B::a", !A ]
with
!B::a := [ "tbaa.offset", !B, offset within B, size]

Yes, it is.

Then we already lost the access path at IR level.

But you are *generating* the above metadata at the clang level, no?
That is where I meant you should annotate them as you go.

I may not get what you are saying.
But if we don't keep it in the LLVM IR, then the information is not available at the IR level.

I agree, but it's not being regenerated from the LLVM IR, it's being
generated *into* the LLVM IR by clang or something.
I'm saying you can generate the offset, size info there as well.

I thought your suggestion was to replace tbaa.path because it "can get quite deep quite quickly",

Yes. Your access path based mechanism is going to basically try to
find common prefixes or suffixes, which is expensive.

but here you are saying "as well".
So, for clarification: you are suggesting adding {offset, size} into the tbaa.path metadata, right :-)

No, I'm suggesting you replace it.

I am going to add a few examples here:
Given
struct A {
   int x;
   int y;
};
struct B {
   A a;
   int z;
};
struct C {
   B b1;
   B b2;
   int *p;
};
struct D {
   C c;
};

with the proposed struct-access-path aware TBAA, we will say "C::b1.a" will alias with "B::a.x", "C::b1.a" will alias with "B",
"C::b1.a" does not alias with "D::c.b2.a.x".

The proposal is about the format of metadata in IR and how to implement alias queries in IR:

Yes, and when I suggested replacing it, you said my replacement would
be difficult to generate.
So what I started to ask is:
What is generating this metadata?
Clang?
Something else?
What is the actual algorithm you plan on using to generate it?

I'm trying to understand how you plan on actually implementing the
metadata generation in order to give you a suggestion of how you would
generate differently structured metadata that, while conveying the
same information, would be able to be queried faster.

Actually in the interest of getting something done, why don't you just
implement it as you suggest, and if it turns out slow on large
programs, we can work out a better representation then.
This would be easier if we were both in the same room with a
whiteboard, but absent that, ...

It's not like it would be incredibly difficult to change the
representation later.

> Given
> struct A {
> int x;
> int y;
> };
> struct B {
> A a;
> int z;
> };
> struct C {
> B b1;
> B b2;
> };
> struct D {
> C c;
> };
>
> with struct-access-path aware TBAA, C::b1.a.x does not alias with D::c.b2.a.x.
> without it, the 2 scalar accesses can alias since both have int type.

I browsed the 2012 standard for a while and I didn't see anything that would make this illegal:

char *p = malloc(enough_bytes);
intptr_t x = reinterpret_cast<intptr_t>(p);
x += offsetof(C, b2);
D &vd = *reinterpret_cast<D*>(p);
C &vc = *reinterpret_cast<C*>(x);
vd.c.b2.a.x = 1; // ..accessing the same
int t = vc.b1.a.x; // ..storage

I don't think that the path through the type structure is really sufficient.

-Krzysztof

I'm not 100% sure of your point here, but I think you are likely arguing that TBAA is not sufficiently safe.
That is true for this case. However, in this case, TBAA is not supposed to kick in to disambiguate these
two accesses, because the base/offset/size rule is supposed to give a definite answer as to whether these two accesses
*definitely* alias or *definitely* do not alias.

Note that, to make base/offset/size more effective, the analyzer needs to track the base along the UD chains
as far as possible. In this case, the base for both memory accesses is "p", instead of &vd and &vc, respectively.
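
To make the base/offset/size argument concrete, the sketch below computes the byte offset of each access in the example relative to the common base "p". The function names are hypothetical and exist only for this illustration; it assumes the standard-layout structs from the quoted example (the version without the `int *p` member).

```cpp
#include <cassert>
#include <cstddef>

struct A { int x; int y; };
struct B { A a; int z; };
struct C { B b1; B b2; };
struct D { C c; };

// Offset of vd.c.b2.a.x from p: vd starts at p itself.
std::size_t offsetViaD() {
    return offsetof(D, c) + offsetof(C, b2) + offsetof(B, a) + offsetof(A, x);
}

// Offset of vc.b1.a.x from p: vc starts at p + offsetof(C, b2).
std::size_t offsetViaC() {
    return offsetof(C, b2) + offsetof(C, b1) + offsetof(B, a) + offsetof(A, x);
}

void checkSameStorage() {
    // Same base, same offset, same size (int): the accesses must alias,
    // even though their access paths differ.
    assert(offsetViaD() == offsetViaC());
}
```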

If the base pointers are the same, then offset+length for each access is sufficient to determine aliasing. No type information is needed for this. I'm still not sure exactly what this proposal is supposed to address, but yes---given two base pointers (possibly not equal), the structural paths to the accessed objects are not sufficient.

-Krzysztof

This is assuming that there isn't anything in the C++ standard that I missed.

-Krzysztof

Based on my understanding of her design, following is one obtuse motivating example:

And my example shows that they can alias regardless of that. Granted, it's somewhat contrived and I'm not 100% sure if it's legal. Once there are pointers, there is little that can be done without knowing at least something about what they point to.

-Krzysztof

There are simpler examples of this kind for C++, because placement
new can change the dynamic type of the object (I actually haven't
looked to see if they changed this in 2012, but it was definitely
legal in C++98):

#include <new>
struct Foo { long i; };
struct Bar { void *p; };
long foo(int n)
{
  Foo *f = new Foo;
  f->i = 1;
  for (int i=0; i<n; ++i)
    {
      Bar *b = new (f) Bar;
      b->p = 0;
      f = new (f) Foo;
      f->i = i;
    }
  return f->i;
}

Both accesses are to the same memory, both will end up with completely
different access paths, both are legal by TBAA rules, but the access paths
alone will claim no-alias.

(This is taken from http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29286)

yes, this is essentially one of my claims :-)

Someone privately asked me to explain this example, so here goes ...

At the language level, both f and b are not the same "object". The
lifetime of b ends with the f = new (f) Foo.
This is actually irrelevant at the IR level, however.

What this transforms into at the IR level is something like this.
I've elided the fact that it will really start out one level
indirected, and then get promoted to regs, and pretended we have a
load/store at offset instead of getelementptr.

#include <new>
struct Foo { long i; };
struct Bar { void *p; };
long foo(int n)
{
   Foo *f1 = call operator new
   store value 1 into offset 0 of f1

   for (int i=0; i<n; ++i)
     {
       Foo *f4 = phi(f1, f2)
       Bar *b = cast f4 to type Bar *
       store value 0 into offset 0 of b
       Foo *f2 = cast b to type Foo *
       store value i into offset 0 of f2
     }
   Foo *f3 = phi(f1, f2)
   temp = load offset 0 of f3
   return temp;
}

If you annotate the loads/stores with access paths, these paths will
say "no alias" if you ask whether the stores alias.
Thus, you will feel free to reorder the stores.
If you reorder the loop stores, you will return 0 instead of n - 1 (for n > 0).
The language guarantees that the function returns the last value stored, n - 1.
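
A self-contained variant of the example that can be compiled and run to check the expected result (the last value stored in the loop is n - 1 for n > 0). It is only a sketch: the explicit raw allocation sized for the larger of the two types, the final cleanup, and the `checkFoo` helper are additions for illustration and are not part of the original test case.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Foo { long i; };
struct Bar { void *p; };

long foo(int n) {
    // Assumption: reserve raw storage large enough for either type, so the
    // example does not depend on sizeof(long) == sizeof(void *).
    const std::size_t bytes = sizeof(Foo) > sizeof(Bar) ? sizeof(Foo) : sizeof(Bar);
    Foo *f = new (::operator new(bytes)) Foo;
    f->i = 1;
    for (int i = 0; i < n; ++i) {
        Bar *b = new (f) Bar;  // the storage now holds a Bar
        b->p = 0;
        f = new (f) Foo;       // the storage holds a Foo again
        f->i = i;
    }
    long result = f->i;
    ::operator delete(f);      // release the raw storage
    return result;
}

void checkFoo() {
    assert(foo(0) == 1);  // loop never runs, initial store is visible
    assert(foo(2) == 1);  // last iteration stores i == 1
    assert(foo(5) == 4);  // in general n - 1 for n > 0
}
```

A compiler that reorders the two stores inside the loop based on a "no alias" answer would make the last two checks fail.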

The cause of this is simple: C++ provides a way to legally and
dynamically change the TBAA type of a pointer. Without evaluating the
placement new's, you can't know what that new type is. You also need
to know what the pointer passed to placement new pointed to, in order to know what now legally
shares memory, despite TBAA, and even if the objects in that memory do
not share the same lifetime.

At the IR level, it's all just memory, loads and stores, and if you
are asked "do these pointers alias", the answer is "yes"; but if you
annotate them with TBAA paths as described, the paths will
cause you to say "no".

FWIW: GCC does three things now in this case:
1. These are transformed into "change dynamic type expressions".
2. For the higher level IR, it unions the type's TBAA info and the
location's points-to sets when it sees a "change dynamic type
expression".
3. For the lower level, where this won't work, we assume the pointer
can alias anything.

I personally believe that in the context of type-based AA, correctness is a subjective term:-).

If the user smells something fishy, it is up to the user to disable such an optimization; there is no other
way around it. TBAA is just trying to find a sweet spot between precision & safeness.
Unfortunately, in the context of TBAA, precision & safeness usually come at each other's expense...

It would be nice if we could union the dynamic types of each element in the points-to set. However,
we certainly can live without that if the program being compiled is well typed. I don't think it is
an indispensable tool for this kind of optimization.

Due to a nasty license issue, I'm not allowed to peruse and discuss gcc internals.
I'm just using the gcc 4.6.3 binary shipped with Ubuntu to compile the following snippet. I'm wondering
why the p->x is promoted. In the light of "dynamic type",
shouldn't p and q's points-to sets be "everything whose address is taken"?

I personally believe that in the context of type-based AA, correctness is a
subjective term:-).

If the user smells something fishy, it is up to the user to disable such an optimization;
there is no other
way around it. TBAA is just trying to find a sweet spot between precision &
safeness.

Sorry, but we have to agree to disagree.

Users should expect that normal, reasonable, standards compliant code
doesn't break because the compiler decided to be aggressive.
Where the standard is ambiguous, that's one thing, but if certain
cases require us to disable TBAA for functions automatically, that is
the way it should be.

I actually used to be much more aggressive about this, probably at
your level, before the users came crashing down on me :-)
You can almost always find other ways to disambiguate.

In a lot of cases, we talked with standards folks, asked what they
intended, and did that. We filed DR's for the rest.

Unfortunately, in the context of TBAA, precision & safeness usually come at
each other's expense...

But in most cases, IMHO, you have to choose safeness.

It would be nice if we could union the dynamic types of each element in the
points-to set. However,
we certainly can live without that if the program being compiled is well typed.

The program I gave was well typed :-)

I don't think it is an indispensable tool for this kind of optimization.

Due to a nasty license issue, I'm not allowed to peruse and discuss gcc
internals.

FWIW: I have the right to give you the versions I wrote, under
whatever license you like.
Being a lawyer, I can guarantee I know what I'm giving you is safe :-)

I'm just using the gcc 4.6.3 binary shipped with Ubuntu to compile the following
snippet. I'm wondering
why the p->x is promoted. In the light of "dynamic type",
shouldn't p and q's points-to sets be "everything whose address is taken"?

In the below example, GCC assumes p and q point to anything because
they are incoming arguments.

Yes. Your access path based mechanism is going to basically try to
find common prefixes or suffixes, which is expensive.

Yes, handling alias queries based on access paths is similar to finding common prefixes/suffixes.
As given in the proposal, rule(x,y), where 'x' and 'y' are access paths, works as follows:
Check the first element of 'y'; if it is not in 'x', return rule_case1(x,y).
Compare the next element of 'y' with the next element of 'x'; if they are not the same, return false.
When we reach the end of either 'x' or 'y', return true.
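
The rule above could be sketched roughly as follows, with an access path modelled as a list of node names. This is one reading of the rule, not the proposal's actual implementation: `rule_case1` is not spelled out in this thread, so the sketch assumes it means "retry with the roles of 'x' and 'y' swapped", and it assumes "no alias" when neither root type appears in the other path. `Path`, `matchFrom` and `aliasPath` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Path = std::vector<std::string>;  // e.g. {"C", "C::b1", "B", "B::a", "A"}

// Compare `longer` (starting at `start`) with `shorter` element by element.
static bool matchFrom(const Path &longer, std::size_t start, const Path &shorter) {
    for (std::size_t i = start, j = 0; i < longer.size() && j < shorter.size(); ++i, ++j)
        if (longer[i] != shorter[j]) return false;
    return true;  // reached the end of either path
}

bool aliasPath(const Path &x, const Path &y) {
    if (x.empty() || y.empty()) return true;  // no information: be conservative
    auto inX = std::find(x.begin(), x.end(), y.front());
    if (inX != x.end())
        return matchFrom(x, static_cast<std::size_t>(inX - x.begin()), y);
    // Assumed meaning of rule_case1: try again with the roles swapped.
    auto inY = std::find(y.begin(), y.end(), x.front());
    if (inY != y.end())
        return matchFrom(y, static_cast<std::size_t>(inY - y.begin()), x);
    return false;  // assumption: unrelated root types do not alias
}

void checkAliasPath() {
    const Path c_b1_a = {"C", "C::b1", "B", "B::a", "A"};
    const Path b_a_x = {"B", "B::a", "A", "A::x", "int"};
    const Path b = {"B"};
    const Path d_c_b2_a_x = {"D", "D::c", "C", "C::b2", "B", "B::a", "A", "A::x", "int"};
    assert(aliasPath(c_b1_a, b_a_x));
    assert(aliasPath(c_b1_a, b));
    assert(!aliasPath(c_b1_a, d_c_b2_a_x));
}
```

Each query is linear in the path length after the initial search, which is the cost concern raised about deep paths.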

Yes, and when I suggested replacing it, you said my replacement would
be difficult to generate.

That is the point of misunderstanding :]
Generating offset+size is not hard.
I was asking about how to answer alias(x,y) if we only have offset+size+outermost type.

Let's call the alias rule with access path: alias_path(x,y)
and call the alias rule with offset+size: alias_offset(x,y)
In order to get the same answers from alias_path and alias_offset, I thought we would have to regenerate the access path at IR level from the offset+size+outermost type information.
The regeneration is not easy.

So what I started to ask is:
What is generating this metadata?
Clang?
Something else?
What is the actual algorithm you plan on using to generate it?

I'm trying to understand how you plan on actually implementing the
metadata generation in order to give you a suggestion of how you would
generate differently structured metadata that, while conveying the
same information, would be able to be queried faster.

"C::b1.a" will be annotated as {!C, offset to C, size of A}
"B::a.x" will be annotated as {!B, offset to B, size of int}
"B" will be annotated as {!B, 0, size of B}
"D::c.b2.a.x" will be annotated as {!D, offset to D, size of int}

alias_path("C::b1.a", "B::a.x") = true
alias_path("C::b1.a", "B") = true
alias_path("C::b1.a", "D::c.b2.a.x") = false
What about
alias_offset("C::b1.a", "B::a.x")
alias_offset("C::b1.a", "B")
alias_offset("C::b1.a", "D::c.b2.a.x")

Thanks,
Manman

The program I gave was well typed :-)

Hi, Daniel:
    Thank you for sharing your insight. I didn't realize it is well-typed -- I'm basically a big nut of any std.
I'd admit std/spec is one of the most boring materials on this planet :-).

    So, if I understand correctly, your point is:
        if a std permits a type-cast (even one which is in bad taste :-), TBAA has to respect that std.

   If that is strictly true, TBAA has to rely on points-to analysis. However, that would virtually disable
TBAA, as most points-to sets have an "unknown" element.

    Going back to my previous mail,

In the below example, GCC assumes p and q point to anything because
they are incoming arguments.

------------------------------
typedef struct {
     int x;
}T1;

typedef struct {
     int y;
}T2;

int foo(T1 *p, T2 *q) {
     p->x = 1;
     q->y = 4;
     return p->x;
}
--------------------------

Yes, gcc should assume p and q point to anything; however, the result contradicts that assumption --
it promotes the p->x expression.

   If I fabricate a caller by stealing some code from your previous example, see below.
I think this code & your previous example (about placement new) are covered by the same part of the std. I'm wondering
if gcc can give a correct result.

    void foo_caller() {
        T1 t1;
        T1 *pt1 = &t1;
        T2 *pt2 = new (pt1) T2;
        foo(pt1, pt2);
     }

The "context" is so scary, I eliminate all of them...

One dumb question: if the compiler does not even know the base address of the memory access,
why would "offset + size" help?

Based on my understanding of her design, following is one obtuse
motivating example:

--------------------------
class A;
class B;

int foo(A* p, B* q) {

   p->a_int_field = 2;
   q->another_int_field = 3;
   return p->a_int_field; // !!!!!
}
----------------------------------

the statement marked "!!!!!" can be optimized into "return 2" if the optimizer can prove
type-A does not include type-B,
and type-B does not include type-A either.

And my example shows that they can alias regardless of that. Granted, it's somewhat contrived and I'm not 100% sure if it's legal. Once there are pointers, there is little that can be done without knowing at least something about what they point to.

Your example should return "alias" with llvm's alias analysis framework.
llvm's alias analyses are chained together, and basic alias analysis should return "alias" for your example.
We call TBAA only when basicAA says may-alias.

We let basicAA try its best to do pointer chasing (there is no interprocedural analysis).
For Shuxin's example, if the input parameters can alias, TBAA may still return noalias.
This issue exists with the current TBAA, where we only annotate scalar accesses with the scalar types.

-Manman

------------------------------
typedef struct {
    int x;
}T1;

typedef struct {
    int y;
}T2;

int foo(T1 *p, T2 *q) {
    p->x = 1;
    q->y = 4;
    return p->x;
}
--------------------------

Yes, gcc should assume p and q point to anything; however, the result contradicts that assumption --
it promotes the p->x expression.

Assuming the above is C11 code, I think the relevant section in the C spec is the following:

This is a paragraph from a C11 draft ("N1570 Committee Draft — April 12, 2011"). Assuming my interpretation of it is correct, it seems to imply that a store to an lvalue can change its subsequent effective type. This would preclude any purely type-based TBAA solution, and would, in general, require taking access/points-to information into account.

---
6.5 Expressions

6: "The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access."
---

This is just before paragraph 6.5 Expressions 7 that is quoted in the current TBAA proposal.

"If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that <<do not modify>> the stored value."

I read this as "a store will set the effective type for any subsequent read access" on the same object. So, in the above example, assuming
that p and q point to the same object, the effective type is changed from the first to the second line. This means that IF p and q pointed to the same object, the read access to "p->x" using the old effective type would be undefined. Hence, we may assume that p and q don't point to the same
object.

Yes, C is quite different than C++ here.

GCC will feel free to move these particular stores around, even though
it believes they point anywhere, but won't in my placement new C++
case, because they *must* point to the same memory.