Structure Types and ABI sizes

Hi all,

We're hitting some walls here when generating the correct structure
layout for specific C++ ABI requirements, and I was wondering how much
StructLayout could help.

For instance, the ABI has some complicated rules on the size of
derived classes
(http://www.codesourcery.com/public/cxx-abi/abi.html#class-types) and
LLVM struct types cannot reflect them in full.

Example:

// CHECK: %struct.I = type { i32, i8 }
struct I {
  int a;
  char b;
};

// CHECK: %struct.J = type { [8 x i8], i8, [3 x i8] }
struct J : I {
  char c;
};

What happens here is that "c" is placed in the base's tail padding and
there are three bytes of padding because of the alignment. The main
problem with this is that, by changing the member (that should be a
structure) to an array, the alignment is lost. As LLVM types don't
have explicit alignment in themselves, it's impossible to recover that
information later and we need to make sure that every single use of
that field gets the correct alignment.

Furthermore, I wonder if that wouldn't impact some optimizations that
take types into account (as Chris has just replied in the vector
discussion)... Not sure...

So, I'm not proposing to have alignment in types nor to make LLVM
struct types conform to a specific ABI of a specific language, I'm
just saying that there should be a cleaner way... Very much like the
union type and bitfields, structure size and alignment problems can be
very hairy. Simplifying the IR and leaving all decisions to the
back-end can be a daunting task, but leaving the front-end to decide
on sizes and alignment is maybe not the best alternative.

StructLayout already knows a few things about structures (like
calculating the offset based on the type's alignment) but it's
ignorant regarding specific language decisions and ABIs. We could
attach some information regarding the language that is being compiled
so the back-end could make some informed choices on how to deal with
structures/unions/bitfields and have fewer hacks in the front-end.
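
To make the alignment loss concrete, here is a minimal sketch of the natural layout rule described above: each field goes at the next offset that satisfies its own alignment, and the struct takes the largest field alignment. This is a model only, not LLVM's actual StructLayout code. The names Field, Layout, align_to, natural_layout and offset_of are hypothetical, and the numbers assume a 4-byte, 4-byte-aligned i32.

```cpp
#include <cstddef>

// Hypothetical model of one struct element: size and ABI alignment in bytes.
struct Field {
    std::size_t size;
    std::size_t align;
};

// Result of laying out a struct: total size and overall alignment.
struct Layout {
    std::size_t size;
    std::size_t align;
};

// Round offset up to the next multiple of align.
constexpr std::size_t align_to(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Natural (non-packed) layout: the struct alignment is the largest field
// alignment and the size is rounded up to that alignment.
template <std::size_t N>
constexpr Layout natural_layout(const Field (&fields)[N]) {
    std::size_t size = 0;
    std::size_t align = 1;
    for (std::size_t i = 0; i < N; ++i) {
        size = align_to(size, fields[i].align) + fields[i].size;
        if (fields[i].align > align) align = fields[i].align;
    }
    return Layout{align_to(size, align), align};
}

// Offset of the field at position index (index is assumed to be < N).
template <std::size_t N>
constexpr std::size_t offset_of(const Field (&fields)[N], std::size_t index) {
    std::size_t offset = 0;
    for (std::size_t i = 0; i < N; ++i) {
        offset = align_to(offset, fields[i].align);
        if (i == index) break;
        offset += fields[i].size;
    }
    return offset;
}

// %struct.I = type { i32, i8 }
constexpr Field kStructI[] = {{4, 4}, {1, 1}};
// %struct.J = type { [8 x i8], i8, [3 x i8] }; i8 arrays are 1-byte aligned.
constexpr Field kStructJ[] = {{8, 1}, {1, 1}, {3, 1}};

constexpr Layout kI = natural_layout(kStructI);
constexpr Layout kJ = natural_layout(kStructJ);

static_assert(kI.size == 8 && kI.align == 4, "I keeps its 4-byte alignment");
static_assert(kJ.size == 12 && kJ.align == 1, "J has the right size, but align 1");
static_assert(offset_of(kStructJ, 1) == 8, "c lands right after the base bytes");
```

In this model the byte offsets and the total size of J come out right, but its alignment drops from 4 to 1, which is the information the front-end then has to restate on every use.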

I understand that cross-compilation between languages would break that
assumption, unless the IR has some kind of flags on it stating the
lang/abi used... but I know very few people like adding information to
the IR... :confused:

Any pointers on how to solve this issue in a better way other than
bloating the front-end?

Renato Golin <renato.golin@arm.com> writes:

Example:

// CHECK: %struct.I = type { i32, i8 }
struct I {
  int a;
  char b;
};

// CHECK: %struct.J = type { [8 x i8], i8, [3 x i8] }
struct J : I {
  char c;
};

What happens here is that "c" is placed in the base's tail padding and
there are three bytes of padding because of the alignment. The main
problem with this is that, by changing the member (that should be a
structure) to an array, the alignment is lost. As LLVM types don't

I don't completely understand what you're saying here. Are you saying
you want to have a J member of I represented in LLVM, so that struct J
becomes:

{ int32, int8, { int8 } }

Do I understand you correctly?

So, I'm not proposing to have alignment in types nor to make LLVM
struct types conform to a specific ABI of a specific language, I'm
just saying that there should be a cleaner way... Very much like the
union type and bitfields, structure size and alignment problems can be
very hairy. Simplifying the IR and leaving all decisions to the
back-end can be a daunting task, but leaving the front-end to decide
on sizes and alignment is maybe not the best alternative.

I think it's really the only alternative. LLVM is a low-level IR.
Expressing things like inheritance and layout rules for such types is
beyond the scope of the language. There are too many variations among
high-level languages and target ABIs. For things like that the frontend
needs to insert its own padding bytes.

There are ways to do that without losing too much information. For
example, we render the above without using arrays at all:

%I = type { i32, i8, i16 }
%J = type { %I, i8, i16 }

There are some places where LLVM can do a better job, I think.
StructLayout should "just work" in more cases. But the kind of
generality you're talking about just isn't going to work very well in a
low-level IR. Nor should it. It's not what the IR is designed to do.

                            -Dave

{ int32, int8, { int8 } }

Do I understand you correctly?

Hi David,

I'm actually looking for answers, not requesting features... :wink:

That structure would actually solve the problem for this specific
case, but not for the general case. There are far too many exceptions
for it to be worth making a special case for every one of them.

There are ways to do that without losing too much information. For
example, we render the above without using arrays at all:

%I = type { i32, i8, i16 }
%J = type { %I, i8, i16 }

Not if you follow the Itanium C++ ABI.

In your example it works because { i8, i16 } pads nicely to 4 bytes,
so there is no tail padding. If there is tail padding, the size of the
Base class is different from the size of the Base class inside the
Derived one.

So, in my example "B : public A { char }":

%A = type { i32, i8 }
%B = type { %A, i8 }

A has 8 bytes, as it should, but inside B it has only 5, so B's first
field offset is 5, not 8. This is why we have to do:

%B = { [5 x i8], i8, [3 x i8] }

Adding the 3 bytes at the end is NOT the problem, but revoking the
type (and its natural alignment) from %A is.
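
The difference between the two cases can be sketched with a small model of where a derived class places its first new member. This is only an illustration under stated assumptions: ClassInfo, DerivedLayout and derive are hypothetical names, and the base numbers (size 8, 5 bytes of data, alignment 4) are taken from the example above. The flag reuse_tail_padding stands for the ABI decision of whether the base's tail padding may be overlaid.

```cpp
#include <cstddef>

// Round offset up to the next multiple of align.
constexpr std::size_t align_to(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Hypothetical summary of a base class.
struct ClassInfo {
    std::size_t size;       // size as a complete object, tail padding included
    std::size_t data_size;  // size without tail padding
    std::size_t align;      // alignment in bytes
};

// Where the new member ends up and how big the derived class becomes.
struct DerivedLayout {
    std::size_t member_offset;
    std::size_t size;
};

// Place one new member after the base subobject. If the tail padding may be
// reused, the member starts at the base's data size, otherwise at its full size.
constexpr DerivedLayout derive(ClassInfo base, bool reuse_tail_padding,
                               std::size_t member_size,
                               std::size_t member_align) {
    std::size_t base_end = reuse_tail_padding ? base.data_size : base.size;
    std::size_t offset = align_to(base_end, member_align);
    std::size_t align = base.align > member_align ? base.align : member_align;
    return DerivedLayout{offset, align_to(offset + member_size, align)};
}

// Base { i32, i8 }: 8 bytes as a complete object, 5 bytes of data, align 4.
constexpr ClassInfo kBase{8, 5, 4};

// Tail padding kept: the char member goes after the full 8 bytes.
constexpr DerivedLayout kKeepPadding = derive(kBase, false, 1, 1);
// Tail padding reused: the char member goes at offset 5.
constexpr DerivedLayout kReusePadding = derive(kBase, true, 1, 1);

static_assert(kKeepPadding.member_offset == 8 && kKeepPadding.size == 12, "");
static_assert(kReusePadding.member_offset == 5 && kReusePadding.size == 8, "");
```

In this model the reused-padding case gives a derived class of 8 bytes in total, so the base occupies a different number of bytes as a subobject (5) than as a complete object (8), and a single LLVM struct type cannot describe both.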

There are some places where LLVM can do a better job, I think.
StructLayout should "just work" in more cases. But the kind of
generality you're talking about just isn't going to work very well in a
low-level IR. Nor should it. It's not what the IR is designed to do.

My idea was that StructLayout could have more (optional) sources of
information, to do a better job at figuring out sizes and offsets. We
even thought about creating a Pass that would transform natural
structures, unions and bitfields into the horrible mess they become
when lowered, but that's avoiding the problem, not solving it.

The IR was designed for type safety and we have far too many hacks in
the type system that all C++ front-ends have to do. Maybe the original
design wasn't followed so closely, or we need a new design document
that clearly states what the goals are, because the way it is, it's
not clear, and it's definitely not good for C++.

Renato Golin <renato.golin@arm.com> writes:

There are ways to do that without losing too much information. For
example, we render the above without using arrays at all:

%I = type { i32, i8, i16 }
%J = type { %I, i8, i16 }

Not if you follow the Itanium C++ ABI.

In your example it works because { i8, i16 } pads nicely to 4 bytes,

That's why we use { i8, i16 }. It's not by accident. We do adhere to
the Itanium ABI.

so there is no tail padding. If there is tail padding, the size of the
Base class is different from the size of the Base class inside the
Derived one.

Yes, that's true for non-POD types. In that case the base class really
has two different representations which need to be two different types
in LLVM. This is really ugly stuff. I fixed a whole slew of bugs
around just this issue last year. :slight_smile:

In these cases you may have to resort to arrays or at least a bunch of
consecutive i8s. I'm sure we do so though I would have to verify that.

So, in my example "B : public A { char }":

%A = type { i32, i8 }
%B = type { %A, i8 }

A has 8 bytes, as it should, but inside B it has only 5, so B's first
field offset is 5, not 8. This is why we have to do:

%B = { [5 x i8], i8, [3 x i8] }

Wait, that's not what you showed before:

// CHECK: %struct.J = type { [8 x i8], i8, [3 x i8] }
struct J : I {
  char c;
};

%B = { [5 x i8], i8, [3 x i8] } is not correct for the Itanium ABI
because tail padding cannot be overlaid in "POD for the purposes of
layout" types (secs. 1.1 and 2.2). You had it right the first time. :slight_smile:

Adding the 3 bytes at the end is NOT the problem, but revoking the
type (and its natural alignment) from %A is.

What do you mean by "revoking?" Do you mean inferring the type of %A
within %B given %B's layout? Why do you need to get the alignment
information anyway? The byte offsets are fixed by the ABI so in the
end, bits is bits and addresses is addresses. Ugly casts may be
necessary but nothing too drastic that will seriously prevent
optimization.

My idea was that StructLayout could have more (optional) sources of
information, to do a better job at figuring out sizes and offsets. We
even thought about creating a Pass that would transform natural
structures, unions and bitfields into the horrible mess they become
when lowered, but that's avoiding the problem, not solving it.

I'm still not exactly sure what problem you're trying to solve. Is it a
correctness issue in your code generator?

That said, I have thought along similar lines to make frontends easier
to construct. I imagined metadata on struct types to indicate layout
requirements but the current metadata system is not appropriate since it
does not consider metadata to be semantically important for correctness.

But even with that solution, the frontend would still need to add the
metadata to struct types. There's really no way around the frontend
needing to understand the ABI at some level. It has to convey the
language semantics to LLVM, which is by design language-agnostic.

The IR was designed for type safety and we have far too many hacks in
the type system that all C++ front-ends have to do. Maybe the original
design wasn't followed so closely, or we need a new design document
that clearly states what the goals are, because the way it is, it's
not clear, and it's definitely not good for C++.

I'm not sure the IR was designed for type safety. The original
designers can speak to that. But any language that has things like
inttoptr and ptrtoint is inherently not type-safe. The typing helps
certain classes of analysis and transformation but in the case of C++
inheritance there's not a whole lot that applies. You need a
higher-level IR to take care of that stuff.

                               -Dave

That's why we use { i8, i16 }. It's not by accident. We do adhere to
the Itanium ABI.

Oh, I see. But the padding is not a problem, it could be [3 x i8] or {
i8, i16 }, it doesn't matter, since it's never going to be used.

But the [8 x i8] that was the original Base type is, and that's cast
away to a plain array (with alignment 1).

Yes, that's true for non-POD types. In that case the base class really
has two different representations which need to be two different types
in LLVM. This is really ugly stuff. I fixed a whole slew of bugs
around just this issue last year. :slight_smile:

As you can see, it's my turn now... :wink:

%B = { [5 x i8], i8, [3 x i8] }

Wait, that's not what you showed before:

My bad, different case that one...

What do you mean by "revoking?" Do you mean inferring the type of %A
within %B given %B's layout? Why do you need to get the alignment
information anyway? The byte offsets are fixed by the ABI so in the
end, bits is bits and addresses is addresses. Ugly casts may be
necessary but nothing too drastic that will seriously prevent
optimization.

Ok, that was my first real question. I'm not too focused on
optimizations, so I don't know how much that would actually stop them
from happening.

What I see is that variables get cast away to arrays, passed to
routines and cast back (maybe to some different type) to do some
arithmetic operation. If LLVM can understand that, that's fine.

I'm still not exactly sure what problem you're trying to solve. Is it a
correctness issue in your code generator?

No. Our front-end follows the standard and the ABI to the letter, the
problem starts when I have to match all ABI decisions to LLVM types. I
could convert every single structure into arrays and rely only on our
front-end, but that would make the IR very hard to debug.

That said, I have thought along similar lines to make frontends easier
to construct. I imagined metadata on struct types to indicate layout
requirements but the current metadata system is not appropriate since it
does not consider metadata to be semantically important for correctness.

That's a valid point. The OpenCL guys were also discussing metadata
and I think it's fair to consider it a first class citizen.

But I also understand perfectly well why it hasn't been, so far. Because
metadata is SO generic, if you consider it first-class, people will
start abusing it for little personal modifications and the IR will
stop being standard and diverge.

So maybe, sticking things to the IR without metadata is still the best
course for extra information (like build attributes, target
sub-features, ABIs), but that's a completely different discussion.

But even with that solution, the frontend would still need to add the
metadata to struct types. There's really no way around the frontend
needing to understand the ABI at some level. It has to convey the
language semantics to LLVM, which is by design language-agnostic.

Oh, I totally agree. Some knowledge is best left for the front-end,
but there are some things that could be passed onto StructLayout (such
as POD/nonPOD) with a little bit of effort and make the IR much easier
to understand and debug.

I'm not sure the IR was designed for type safety. The original
designers can speak to that. But any language that has things like
inttoptr and ptrtoint is inherently not type-safe. The typing helps
certain classes of analysis and transformation but in the case of C++
inheritance there's not a whole lot that applies. You need a
higher-level IR to take care of that stuff.

Fair point. And that's where the idea of having the passes came from.
Maybe having a high-level IR with language/ABI/sub-target information
that gets converted to the current low-level IR only at the last moment
is the best course of action...

But that unleashes a completely different beast that I'm not willing
to handle right now... :wink:

cheers,
--renato

Renato Golin <renato.golin@arm.com> writes:

That's why we use { i8, i16 }. It's not by accident. We do adhere to
the Itanium ABI.

Oh, I see. But the padding is not a problem, it could be [3 x i8] or {
i8, i16 }, it doesn't matter, since it's never going to be used.

Right. We do it for aesthetics. :slight_smile:

But the [8 x i8] that was the original Base type is, and that's cast
away to a plain array (with alignment 1).

Yep. But again, I don't think you're losing much.

Yes, that's true for non-POD types. In that case the base class really
has two different representations which need to be two different types
in LLVM. This is really ugly stuff. I fixed a whole slew of bugs
around just this issue last year. :slight_smile:

As you can see, it's my turn now... :wink:

Good luck! :-/

%B = { [5 x i8], i8, [3 x i8] }

Wait, that's not what you showed before:

My bad, different case that one...

Whew! Thought I might have another nasty bug. :slight_smile:

What I see is that variables get cast away to arrays, passed to
routines and cast back (maybe to some different type) to do some
arithmetic operation. If LLVM can understand that, that's fine.

It'll understand it but make sure it's casting to the correct types and
using the correct offsets given the base types. We actually generate a
bunch of ugly ptrtoint + arithmetic + inttoptr + GEP stuff. I wrote an
instcombine pass to fold that down to a single GEP where possible. I
don't know if that's valid in general (due to inbounds and other GEP
semantics) but it is for the cases we use it because we "know" where it
came from.

LLVM understands GEP better than ptrtoint/inttoptr, which is why we make
the transform. Whether you can do this is probably dependent on how you
lower things.

I'm still not exactly sure what problem you're trying to solve. Is it a
correctness issue in your code generator?

No. Our front-end follows the standard and the ABI to the letter, the
problem starts when I have to match all ABI decisions to LLVM types. I
could convert every single structure into arrays and rely only on our
front-end, but that would make the IR very hard to debug.

That's true and is the primary reason we try to maintain as much of the
original structure as possible (i.e. not using arrays). We used to emit
more array stuff but I actually put a fair amount of effort to improve
this in our frontend. But we always have to include explicit padding at
some point and that means odd-looking members from time to time. I
haven't found it too terribly burdensome.

That said, I have thought along similar lines to make frontends easier
to construct. I imagined metadata on struct types to indicate layout
requirements but the current metadata system is not appropriate since it
does not consider metadata to be semantically important for correctness.

That's a valid point. The OpenCL guys were also discussing metadata
and I think it's fair to consider it a first class citizen.

I would like to see that too but it might be an uphill battle. :frowning:

But I also understand perfectly well why it hasn't been, so far. Because
metadata is SO generic, if you consider it first-class, people will
start abusing it for little personal modifications and the IR will
stop being standard and diverge.

Absolutely. Even so, I would consider debug info to be a semantic
correctness issue. If the compiler can't produce debuggable code, it is
useless. So we at least have some precedent for metadata being
"important." The key with the debug info is that it's documented. It's
its own little language, really.

Oh, I totally agree. Some knowledge is best left for the front-end,
but there are some things that could be passed onto StructLayout (such
as POD/nonPOD) with a little bit of effort and make the IR much easier
to understand and debug.

I completely agree.

Fair point. And that's where the idea of having the passes came from.
Maybe having a high-level IR with language/ABI/sub-target information
that gets converted to the current low-level IR only at the last moment
is the best course of action...

But that unleashes a completely different beast that I'm not willing
to handle right now... :wink:

Oh, come on! Where's your sense of adventure? :slight_smile:

                          -Dave

LLVM understands GEP better than ptrtoint/inttoptr, which is why we make
the transform. Whether you can do this is probably dependent on how you
lower things.

We're using only GEPs from the start, so we might not hit all the problems you did.

I was actually surprised that the change to support those ABI
constraints wasn't too big...

That's a valid point. The OpenCL guys were also discussing metadata
and I think it's fair to consider it a first class citizen.

I would like to see that too but it might be an uphill battle. :frowning:

Maybe even steeper than introducing new fields to the IR...

But that unleashes a completely different beast that I'm not willing
to handle right now... :wink:

Oh, come on! Where's your sense of adventure? :slight_smile:

:smiley:

One thing at a time... It's in my TODO list to prepare a document with
all the hacks one has to do to make C++ be translated to IR.

As far as I know, llvm-gcc and clang had the same problems and solved
them in the same hacky way, so there might be other people who would
agree on some more radical changes. But we need more evidence to start with...
One thing at a time... :wink:

I think the obvious solution here is to use packed structs when the layout of a type as a base class would substantially differ from its layout as a complete object. This loses some alignment information, but frontends need to be aggressive about providing alignment on loads/stores/etc. anyway. I'm actually thinking of changing clang to do this.

John.
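
As a rough sketch of the packed-struct suggestion, the model below lays out the same fields twice: once naturally (the complete-object type) and once packed (the type used when embedding the class as a base). It relies on the fact that an LLVM packed struct has 1-byte alignment and no padding between elements. The names Field, Layout, align_to and layout are hypothetical, and a 4-byte-aligned i32 is assumed.

```cpp
#include <cstddef>

// Hypothetical model of one struct element: size and ABI alignment in bytes.
struct Field {
    std::size_t size;
    std::size_t align;
};

struct Layout {
    std::size_t size;
    std::size_t align;
};

// Round offset up to the next multiple of align.
constexpr std::size_t align_to(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Lay out the fields. In packed mode every field is treated as 1-byte
// aligned, so there is no internal padding and no tail padding.
template <std::size_t N>
constexpr Layout layout(const Field (&fields)[N], bool packed) {
    std::size_t size = 0;
    std::size_t align = 1;
    for (std::size_t i = 0; i < N; ++i) {
        std::size_t field_align = packed ? 1 : fields[i].align;
        size = align_to(size, field_align) + fields[i].size;
        if (field_align > align) align = field_align;
    }
    return Layout{align_to(size, align), align};
}

// Fields of the base class: i32, i8.
constexpr Field kBaseFields[] = {{4, 4}, {1, 1}};

// { i32, i8 }: the base used directly as a complete object.
constexpr Layout kComplete = layout(kBaseFields, false);
// <{ i32, i8 }>: the packed form used when embedded in a derived type.
constexpr Layout kAsBase = layout(kBaseFields, true);

// Derived type modelled as { <{ i32, i8 }>, i8 }.
constexpr Field kDerivedFields[] = {{kAsBase.size, kAsBase.align}, {1, 1}};
constexpr Layout kDerived = layout(kDerivedFields, false);

static_assert(kComplete.size == 8 && kComplete.align == 4, "");
static_assert(kAsBase.size == 5 && kAsBase.align == 1, "");
static_assert(kDerived.size == 6 && kDerived.align == 1, "");
```

The new member lands at offset 5 as wanted, but in this model the derived type ends up 1-byte aligned and 6 bytes long, so the front-end would still have to add trailing padding and state the alignment explicitly wherever the type is used.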

Hi Renato,

Do you remember me?
Your comment in the previous mailing-list thread was so helpful for me :slight_smile:

Nowadays I'm implementing a modified LLVM IR to make struct memory
layout target-independent. The modified IR is later converted to the
original LLVM IR, which can use the general LLVM operations
(optimizations, code generation, etc.).

To implement the modified IR, I am adding alignment information to the
type information.

For example,

C source code

struct kist {
  char a:7;
  int b:20;
  short c:3;
  int d:15;
};
struct kist kang = {1, 2, 3, 4};

int main(void) {
  kang.d = 1;
  return 0;
}

That's actually a good idea... To have the normal structure when you
use the base directly and the packed version to be embedded into
derived types.

I managed to fix all structure size problems I've seen, but if packed
structs work the same way, it'll be much more elegant. Thanks John!