RFC: Generate plain !tbaa tags in place of !tbaa.struct ones

Hello,

The motivation behind this proposal is to make it possible to
propagate TBAA information through scalarizing transformations,
such as SROA, that rewrite accesses to aggregates, e.g., memcpy()
calls, into accesses to scalars, that is, load and store
instructions.

Currently, we decorate instructions that initialize and copy
aggregates with !tbaa.struct tags that generally cannot be
transformed to !tbaa tags. For this reason every time we
scalarize an aggregate, we leave resulting loads and stores
undecorated, meaning optimization of such instructions cannot
benefit from TBAA.

Furthermore, our analysis indicates that the only place where
!tbaa.struct tags may potentially impact code generation is
simplification of memcpy() and memmove() calls, see
SimplifyMemTransfer() in InstCombineCalls.cpp. Ironically, what
the code that makes that sole use of such tags is trying to do is
to construct a !tbaa tag from the information encoded in the
given !tbaa.struct tag. Note that it can only do that if the
!tbaa.struct tag describes a structure with a single member of a
scalar type.

Here's how we propose to resolve the issue in terms of specific
steps:

1. Extend the TBAA facilities in clang to support aggregate types
as final access types. This patch:

\[CodeGen\] Propagate may\-alias'ness of lvalues with TBAA info
https://reviews.llvm.org/D39008

implements this for the needs of fixing issues with
propagation of TBAA information, which in turn is necessary to
support TBAA for unions\. So once this patch is committed, this
step is done\.

2. Generate !tbaa tags in addition to !tbaa.struct tags for
aggregate accesses.

3. Fix the TBAA analysis to support aggregate accesses as
explained in this proposal:

RFC: Resolving TBAA issues
http://lists.llvm.org/pipermail/llvm-dev/2017-August/116652.html

4. Switch the code that simplifies memcpy() and memmove() calls
to !tbaa tags.

5. Remove the support for !tbaa.struct tags.

I guess we would want to remove all code which only purpose is
to generate such tags, but I'm not sure what we should do with
the MD\_tbaa\_struct enumerator\. Some possible options:
a\) Leave it as is, just don't use it\.
b\) Rename to something like MD\_unused\.
c\) Remove, but do not change the values of other MD\_\*
   enumerators\.
d\) Remove and adjust values of other MD\_\* enumerators
   respectively\.
Or, maybe we want some multi\-stage plan here?

Further steps are supposed to include things like fixing SROA to
propagate TBAA information.

Any feedback is highly appreciated.

Thanks,

Hello,

The motivation behind this proposal is to make it possible to
propagate TBAA information through scalarizing transformations,
such as SROA, that rewrite accesses to aggregates, e.g., memcpy()
calls, into accesses to scalars, that is, load and store
instructions.

Currently, we decorate instructions that initialize and copy
aggregates with !tbaa.struct tags that generally cannot be
transformed to !tbaa tags. For this reason every time we
scalarize an aggregate, we leave resulting loads and stores
undecorated, meaning optimization of such instructions cannot
benefit from TBAA.

I understand that we'd like to improve TBAA in a number of different ways, and that probably involves replacing !tbaa.struct with an enhanced !tbaa representation, but I don't understand the point of this proposal. As you state, there are lots of places that could use !tbaa.struct information, such as SROA, but don't. Is the current !tbaa.struct missing some information that is essential for that purpose?

  -Hal

In short, the problem with !tbaa.struct is that in most cases it cannot be converted to !tbaa. For transformations like SROA this means they cannot propagate !tbaa.struct tags for aggregate-accessing instructions like memcpy() calls to the resulting loads and stores.

To clarify further, what this paper proposes is to use !tbaa for all kinds of accesses, including aggregate ones, so we don't need to bother trying to convert them when an aggregate access becomes a series of scalar accesses or vice versa. As I said, in most cases such conversions are not possible anyway, because !tbaa.struct tags do not refer to the type of the aggregate itself and only enumerate its members.

As of today, there is no adequate representation that could be used to describe accesses to individual slices of scalarized aggregates. But at very least we could decorate accesses to slices with tags of their aggregates. That requires aggregate accesses to be described the same way we describe scalar accesses.

To provide some background on importance of this matter, SROA and passing flattened parameters/returning values in clang together produce 75% of all undecorated loads and 90% if all undecorated stores on the LLVM code base under -O1.

To clarify further, what this paper proposes is to use !tbaa for all kinds of accesses, including aggregate ones, so we don't need to bother trying to convert them when an aggregate access becomes a series of scalar accesses or vice versa. As I said, in most cases such conversions are not possible anyway, because !tbaa.struct tags do not refer to the type of the aggregate itself and only enumerate its members.

That makes sense, but I don't believe you actually said *that*. I agree, the fact that the tbaa.struct metadata refers to the scalar types, and not the type itself, will prohibit the conversion in general.

I agree that we should fix this. Given that we're planning to make changes, perhaps major ones, to the TBAA representation, we should account for this requirement as part of that larger discussion. I see no need for a separate thread on this particular topic in the context of those larger changes.

That having been said, if we desire to fix this before those larger changes to the TBAA representation (to handle aggregates, unions, etc.), I think we should do that. In this case, however, we can/should make a small enhancement that addresses this representational deficiency. We can, for example, generate tbaa.struct metadata where the tbaa struct type is the first in the list. This is easy to distinguish from the existing format because in the existing format the first operand is an integer. This will provide the necessary information to allow SROA to create struct access tags for the scalarized accesses.

Thanks again,
Hal

To clarify further, what this paper proposes is to use !tbaa for all kinds of accesses, including aggregate ones, so we don't need to bother trying to convert them when an aggregate access becomes a series of scalar accesses or vice versa. As I said, in most cases such conversions are not possible anyway, because !tbaa.struct tags do not refer to the type of the aggregate itself and only enumerate its members.

That makes sense, but I don't believe you actually said *that*. I agree, the fact that the tbaa.struct metadata refers to the scalar types, and not the type itself, will prohibit the conversion in general.

I agree that we should fix this. Given that we're planning to make changes, perhaps major ones, to the TBAA representation, we should account for this requirement as part of that larger discussion. I see no need for a separate thread on this particular topic in the context of those larger changes.

Sure, we can discuss all the related things here. I don't mind if we should change the subject line to something more appropriate.

That having been said, if we desire to fix this before those larger changes to the TBAA representation (to handle aggregates, unions, etc.), I think we should do that. In this case, however, we can/should make a small enhancement that addresses this representational deficiency. We can, for example, generate tbaa.struct metadata where the tbaa struct type is the first in the list. This is easy to distinguish from the existing format because in the existing format the first operand is an integer. This will provide the necessary information to allow SROA to create struct access tags for the scalarized accesses.

Generating a !tbaa tag in addition to the !tbaa.struct would probably be even simpler. Unfortunately, that alone wouldn't be enough as !tbaa tags in place of/as part of/next to !tbaa.struct tags only make sense if we can represent accesses to aggregates, so we need this to be done first. Adding support for aggregate accesses, in turn, requires us to be able to determine common types for aggregates plus some other things that imply changes to the metadata format.

If we are ready to discuss specific changes to the format, then I would propose the following:

* All types, including aggregate ones, to refer to their parent types. Otherwise, we couldn't determine the common type for a couple of aggregate final access types. Since all type nodes shall have this field, it would be natural to make it first in the list while also making it possible to detect the new format.

* Encode access sizes. This is essential for a better support for accesses to union members. This patch makes the first step in this direction:

https://reviews.llvm.org/D39455

but suffers from being imprecise WRT what specific part of the union object is accessed. Given we know the offset range (the base offset together with the access size), we can determine if two accesses to members of the same union type actually overlap.

* Encode type sizes. This, provided in addition to access sizes, is what we could use to support member arrays. As of now, we describe all accesses to array elements, including elements of member arrays, as if they were standalone objects, because there is no way to represent an access to a range of array elements or a specific array element. This means all TBAA information about the base lvalue gets lost.

Once we can encode the information about the size of the element type and the size of the access, we can determine the range of potentially accessed elements. This is how it is supposed work in terms of TBAA metadata:

struct S {
int i[7];
};

; The good old int. 4 bytes in size. Mutable.
!3 = !{!1, i64 4, i64 0, !"int"}

; A structure containing an array of seven ints.
; We know the element type is int because we explicitly refer to the node of that type.
; We know there are 7 of them because the size of the field is 28 and the size of the type is 4.
!5 = !{!1, i64 28, i64 0, !"_ZTS1S", !3, i64 0, i64 28}

; An access to all or any of the array elements in S.
!7 = !{!5, !3, i64 0, i64 28}

; An access to the element with index 2.
!9 = !{!5, !3, i64 8, i64 4}

The same principle can be applied to describing accesses to slices of scalarized aggregates.

* We would need to generalize the concept of type identifiers. Strings containing mangled names are not enough because tagged types in C do not have mangled names and there are local types in C++ with no linkage and there are tagged types that have no names. As a result, in some cases like this one:

https://bugs.llvm.org/show_bug.cgi?id=8572

we count different types to be the same type.

If we can allow type identifiers to be metadata nodes of unspecified content, then we could generate as detailed identifiers as we need to support the rules of the input language. For example, C's structures and unions are subject to the type compatibility rules that respect things like widths of bit fields and the order of members. And for local types in C++ we could produce 'distinct' metadata nodes.

Thanks,

To clarify further, what this paper proposes is to use !tbaa for all kinds of accesses, including aggregate ones, so we don't need to bother trying to convert them when an aggregate access becomes a series of scalar accesses or vice versa. As I said, in most cases such conversions are not possible anyway, because !tbaa.struct tags do not refer to the type of the aggregate itself and only enumerate its members.

That makes sense, but I don't believe you actually said *that*. I agree, the fact that the tbaa.struct metadata refers to the scalar types, and not the type itself, will prohibit the conversion in general.

I agree that we should fix this. Given that we're planning to make changes, perhaps major ones, to the TBAA representation, we should account for this requirement as part of that larger discussion. I see no need for a separate thread on this particular topic in the context of those larger changes.

Sure, we can discuss all the related things here. I don't mind if we should change the subject line to something more appropriate.

That having been said, if we desire to fix this before those larger changes to the TBAA representation (to handle aggregates, unions, etc.), I think we should do that. In this case, however, we can/should make a small enhancement that addresses this representational deficiency. We can, for example, generate tbaa.struct metadata where the tbaa struct type is the first in the list. This is easy to distinguish from the existing format because in the existing format the first operand is an integer. This will provide the necessary information to allow SROA to create struct access tags for the scalarized accesses.

Generating a !tbaa tag in addition to the !tbaa.struct would probably be even simpler. Unfortunately, that alone wouldn't be enough as !tbaa tags in place of/as part of/next to !tbaa.struct tags only make sense if we can represent accesses to aggregates, so we need this to be done first. Adding support for aggregate accesses, in turn, requires us to be able to determine common types for aggregates plus some other things that imply changes to the metadata format.

If we are ready to discuss specific changes to the format, then I would propose the following:

* All types, including aggregate ones, to refer to their parent types. Otherwise, we couldn't determine the common type for a couple of aggregate final access types. Since all type nodes shall have this field, it would be natural to make it first in the list while also making it possible to detect the new format.

* Encode access sizes. This is essential for a better support for accesses to union members. This patch makes the first step in this direction:

https://reviews.llvm.org/D39455

but suffers from being imprecise WRT what specific part of the union object is accessed. Given we know the offset range (the base offset together with the access size), we can determine if two accesses to members of the same union type actually overlap.

* Encode type sizes. This, provided in addition to access sizes, is what we could use to support member arrays. As of now, we describe all accesses to array elements, including elements of member arrays, as if they were standalone objects, because there is no way to represent an access to a range of array elements or a specific array element. This means all TBAA information about the base lvalue gets lost.

Once we can encode the information about the size of the element type and the size of the access, we can determine the range of potentially accessed elements. This is how it is supposed work in terms of TBAA metadata:

struct S {
  int i[7];
};

; The good old int. 4 bytes in size. Mutable.
!3 = !{!1, i64 4, i64 0, !"int"}

; A structure containing an array of seven ints.
; We know the element type is int because we explicitly refer to the node of that type.
; We know there are 7 of them because the size of the field is 28 and the size of the type is 4.
!5 = !{!1, i64 28, i64 0, !"_ZTS1S", !3, i64 0, i64 28}

; An access to all or any of the array elements in S.
!7 = !{!5, !3, i64 0, i64 28}

; An access to the element with index 2.
!9 = !{!5, !3, i64 8, i64 4}

The same principle can be applied to describing accesses to slices of scalarized aggregates.

* We would need to generalize the concept of type identifiers. Strings containing mangled names are not enough because tagged types in C do not have mangled names and there are local types in C++ with no linkage and there are tagged types that have no names. As a result, in some cases like this one:

https://bugs.llvm.org/show_bug.cgi?id=8572

we count different types to be the same type.

If we can allow type identifiers to be metadata nodes of unspecified content, then we could generate as detailed identifiers as we need to support the rules of the input language. For example, C's structures and unions are subject to the type compatibility rules that respect things like widths of bit fields and the order of members. And for local types in C++ we could produce 'distinct' metadata nodes.

Ivan, I think that all of this sounds reasonably good. I know that you've posted some patches:

https://reviews.llvm.org/D39956
https://reviews.llvm.org/D40438
https://reviews.llvm.org/D39557

Please, however, first post a patch with updated to the LangRef spelling out all of the details of the format and associated semantics. Once we converge on that, we can take care of the implementation.

Thanks again,
Hal

Thanks Hal.

Here's the LangRef patch for the new format:

https://reviews.llvm.org/D40975