Hi!
I followed the discussion on structure types with the example
struct I {
int a;
char b;
};
struct J : I {
char c;
};
Dave said that this translates to
%I = type { i32, i8, i16 }
%J = type { %I, i8, i16 }
because the frontend has to communicate the ABI to llvm since llvm is language agnostic.
What I really wonder is why it isn't
%I = type { i32, i8 }
%J = type { %I, i16, i8 }
because llvm at least knows alignment rules by
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16...
Therefore llvm has no other choice than assigning %I a size of 8
since an array may consist of %I elements and size of 5 would violate
the alignment of the i32 member.
If the ABI requires that member c has an offset of 8 instead of 5 then
of course a padding behind %I is necessary in %J.
-Jochen
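The size argument above can be checked from the C++ side. This is only a sketch: it assumes a typical target where int is 4 bytes with 4-byte alignment (for example x86-64 Linux), and the helper name round_up is made up for illustration.

```cpp
#include <cstddef>

struct I {
    int a;   // offset 0, 4 bytes
    char b;  // offset 4, 1 byte
};

// Hypothetical helper: round n up to the next multiple of align.
constexpr std::size_t round_up(std::size_t n, std::size_t align) {
    return (n + align - 1) / align * align;
}

// The members occupy 5 bytes, but the struct takes the alignment of int.
static_assert(alignof(I) == alignof(int),
              "struct alignment follows its most-aligned member");
static_assert(sizeof(I) == round_up(sizeof(int) + sizeof(char), alignof(I)),
              "member bytes are rounded up to the struct alignment");

// In an array every element must keep `a` aligned, so the stride is sizeof(I).
static_assert(sizeof(I[4]) == 4 * sizeof(I), "array stride equals sizeof(I)");
```

With a 4-byte int, round_up(5, 4) is 8, which is the size discussed in the thread.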
Jochen Wilhelmy <j.wilhelmy@arcor.de> writes:
struct I {
int a;
char b;
};
struct J : I {
char c;
};
Dave said that this translates to
%I = type { i32, i8, i16 }
%J = type { %I, i8, i16 }
It translates to that in OUR compiler. It's not the only answer.
because the frontend has to communicate the ABI to llvm since llvm is
language agnostic.
Correct.
What I really wonder is why it isn't
%I = type { i32, i8 }
%J = type { %I, i16, i8 }
because llvm at least knows alignment rules by
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16...
Therefore llvm has no other choice than assigning %I a size of 8
since an array may consist of %I elements and size of 5 would violate
the alignment of the i32 member.
I can't quite parse this. %I doesn't get "assigned" a size by anyone.
Do you mean that the size of struct I is eight bytes? Yes, that's true.
If the ABI requires that member c has an offset of 8 instead of 5 then
of course a padding behind %I is necessary in %J.
Yes, the padding is required. I believe %J = type { %I, i16, i8 } would
work just as well as long as %I = type { i32, i8 } as in your example.
Our frontend is far from "perfect" in the sense of aesthetics. :-)
-Dave
What I really wonder is why it isn't
%I = type { i32, i8 }
%J = type { %I, i16, i8 }
because llvm at least knows alignment rules by
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16...
Therefore llvm has no other choice than assigning %I a size of 8
since an array may consist of %I elements and size of 5 would violate
the alignment of the i32 member.
I can't quite parse this. %I doesn't get "assigned" a size by anyone.
Do you mean that the size of struct I is eight bytes? Yes, that's true.
Yes I mean that
%I = type { i32, i8 }
is 8 bytes given the alignment rules (i.e. llvm "assigns" a size of
8 bytes to this struct after parsing it)
Yes, the padding is required. I believe %J = type { %I, i16, i8 } would
work just as well as long as %I = type { i32, i8 } as in your example.
Yes but given the ABI requires the last member to be at offset 5, which may happen
(i.e. no tail padding if I is derived from), then your solution
%I = type { i32, i8, i16 }
is problematic or do you switch struct generation dependent on the ABI?
I ask because I would prefer an "always working" solution
(with no case distinction), but of course I'm not deep enough into the matter.
-Jochen
Jochen Wilhelmy <j.wilhelmy@arcor.de> writes:
Yes, the padding is required. I believe %J = type { %I, i16, i8 } would
work just as well as long as %I = type { i32, i8 } as in your example.
Yes but given the ABI requires the last member to be at offset 5,
which may happen
(i.e. no tail padding if I is derived from), then your solution
No, this is not true for this example. This is getting into extremely
delicate areas of the Itanium C++ ABI.
In this example, %I is a "POD for the purposes of layout" type. Such
types cannot have their tail padding overlapped when they are inherited
from. So %I is eight bytes in all contexts.
If %I is not a "POD for the purposes of layout" type, then its tail
padding MUST be overlapped when inherited from. In this case, we
end up creating two types for %I, %I and %I' and use %I' as the
type when it is inherited from.
Fun, eh? :-/
-Dave
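The difference between the two cases can be observed with sizeof. The sketch below assumes GCC or Clang targeting the Itanium C++ ABI with a 4-byte int (for example x86-64 Linux); other ABIs, such as MSVC's, may give different sizes. The struct names are invented for this example.

```cpp
struct PodBase {            // POD for the purposes of layout
    int a;
    char b;
};
struct PodDerived : PodBase {
    char c;                 // placed after PodBase's tail padding (offset 8)
};

struct NonPodBase {         // the user-provided constructor makes it non-POD
    NonPodBase() : a(0), b(0) {}
    int a;
    char b;
};
struct NonPodDerived : NonPodBase {
    char c;                 // placed inside NonPodBase's tail padding (offset 5)
};

static_assert(sizeof(PodBase) == 8 && sizeof(NonPodBase) == 8,
              "both bases are 8 bytes when used on their own");
static_assert(sizeof(PodDerived) == 12,
              "tail padding of a POD base is not reused");
static_assert(sizeof(NonPodDerived) == 8,
              "tail padding of a non-POD base is reused");
```

The two bases have identical members and identical sizes, yet the derived classes differ by four bytes, which is why a frontend may need a second IR type for the inherited-from case.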
Some other fun examples...
** POD-layout:
struct I { int, char }; // size 8 = { i32, i8 };
struct J : I { char }; // size 12 = { %I, i8 };
struct K : J { char }; // size 12 = { [9 x i8], i8, [2 x i8] };
I is POD-layout, but J is NOT.
** Default C-tor:
struct I { int, char }; in C++ has the default and copy constructors
created automatically, right? So I is POD-layout.
But struct A { int, char, A(){} }; has a user-provided default constructor
that does exactly the same thing, yet A is no longer POD-layout.
So:
struct A { int, char, A(){} }; // size 8 = { i32, i8 };
struct B : A { char }; // size 8 = { [5 x i8], i8, [2 x i8] }
struct C : B { char }; // size 8 = { [6 x i8], i8, i8 }
Of course, as David said, all those types have their "normal sized"
components, so there is B-full (8 bytes) and B-inheritable (6
bytes)...
cheers,
--renato
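The numbers in these examples can be reproduced with a toy layout model. This is a sketch only, not how any compiler implements the ABI: it handles nothing but "append one char to a base", and Layout, align_up, full_size, inherited_size and derive_with_char are all hypothetical names.

```cpp
#include <cstddef>

// Toy model: a class is described by its data size (members only, no tail
// padding) and its alignment.
struct Layout {
    std::size_t dsize;
    std::size_t align;
};

constexpr std::size_t align_up(std::size_t n, std::size_t align) {
    return (n + align - 1) / align * align;
}

// What sizeof would report: the data size rounded up to the alignment.
constexpr std::size_t full_size(Layout l) {
    return align_up(l.dsize, l.align);
}

// Bytes a base contributes when inherited from: its full size if it is POD
// for layout, otherwise only its data size (the tail padding may be reused).
constexpr std::size_t inherited_size(Layout base, bool base_is_pod) {
    return base_is_pod ? full_size(base) : base.dsize;
}

// Derive a class that adds a single char (alignment 1) after the base.
constexpr Layout derive_with_char(Layout base, bool base_is_pod) {
    return Layout{inherited_size(base, base_is_pod) + 1, base.align};
}

// POD chain: I { int, char }, J : I { char }, K : J { char }.
constexpr Layout kI{5, 4};
constexpr Layout kJ = derive_with_char(kI, /*base_is_pod=*/true);   // char at 8
constexpr Layout kK = derive_with_char(kJ, /*base_is_pod=*/false);  // char at 9

// Non-POD chain: A { int, char, A(){} }, B : A { char }, C : B { char }.
constexpr Layout kA{5, 4};
constexpr Layout kB = derive_with_char(kA, /*base_is_pod=*/false);  // char at 5
constexpr Layout kC = derive_with_char(kB, /*base_is_pod=*/false);  // char at 6

static_assert(full_size(kI) == 8 && full_size(kJ) == 12 && full_size(kK) == 12,
              "POD chain: 8, 12, 12");
static_assert(full_size(kA) == 8 && full_size(kB) == 8 && full_size(kC) == 8,
              "non-POD chain: 8, 8, 8");
static_assert(inherited_size(kB, false) == 6 && full_size(kB) == 8,
              "B-inheritable is 6 bytes, B-full is 8");
```

The model makes the "two sizes per type" idea explicit: full_size is the size when the type is used directly, inherited_size is the size it contributes as a base.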
If %I is not a "POD for the purposes of layout" type, then its tail
padding MUST be overlapped when inherited from. In this case, we
end up creating two types for %I, %I and %I' and use %I' as the
type when it is inherited from.
But that is exactly my question: why two types in this case?
if
%I = type { i32, i8 };
then %I has 8 bytes if used directly and when used in %J
%J = type { %I, i8 }
then %I has only 5 bytes. Of course %I' could be
%I' = type { i32, i8, i16 };
or
%I' = type { i32, i8, i8, i16 };
but I don't see the point of this since %I already does the job
or do I miss something?
-Jochen
but I don't see the point of this since %I already does the job
or do I miss something?
If you're saying that:
%I = type { i32, i8 };
has size 5, yes, you're missing the alignment.
According to the standard, the alignment of a structure is the
alignment of its most-aligned member (and some other cases in the ABI,
too).
So, %I has an int (align 4) and a char (align 1), so the final
alignment is 4, so the size is rounded up to 8. LLVM knows that, and
the size of:
%I = type { i32, i8 };
is 8, not 5.
To get size 5 you need the "packed" keyword (or similar attributes) or
transform it to a [5 x i8].
cheers,
--renato
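The same two ways of getting a 5-byte object exist on the C++ side. The sketch below assumes GCC or Clang (the __attribute__((packed)) syntax is a compiler extension, not standard C++) and a 4-byte int; the struct names are invented for this example.

```cpp
struct Natural {                       // counterpart of { i32, i8 }
    int a;
    char b;
};

struct __attribute__((packed)) Packed {  // counterpart of a packed struct
    int a;
    char b;
};

static_assert(sizeof(Natural) == 8 && alignof(Natural) == alignof(int),
              "natural layout: size rounded up to the alignment of int");
static_assert(sizeof(Packed) == 5 && alignof(Packed) == 1,
              "packed layout: no tail padding, alignment 1");
static_assert(sizeof(char[5]) == 5,
              "a plain byte array is the other way to get exactly 5 bytes");
```

Note that the packed variant does not only lose its tail padding, its alignment drops to 1 as well.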
If you're saying that:
%I = type { i32, i8 };
has size 5, yes, you're missing the alignment.
Ah, now I see. But I didn't say that
%I = type { i32, i8 };
has 5 bytes (because it has 8) but I thought that it has
5 bytes when being a member of %J, i.e.
%J = type { %I, i8 }
In this case %I also has 8 bytes right?
I was thinking too much in terms of C++ inheritance.
Then perhaps the tail padding should be specified explicitly ;-)
%I = type { i32, i8 }; // 5 bytes
%I' = type { %I, tailpad}; // 8 bytes
%J = type { %I, i8 } // 6 bytes
-Jochen
That would break C code (and whatever else relies on alignment).
I don't see a way of specifying two structures, but I like the idea of
using a packed structure for inheritance and the "normal" one for
types.
cheers,
--renato
%I = type { i32, i8 }; // 5 bytes
%I' = type { %I, tailpad}; // 8 bytes
%J = type { %I, i8 } // 6 bytes
That would break C code (and whatever else relies on alignment).
why would it break C code? of course a C frontend should generate only tailpadded types.
I don't see a way of specifying two structures, but I like the idea of
using a packed structure for inheritance and the "normal" one for
types.
or something like
%J = type { inherit %I, i8 }
the inherit keyword before %I removes the tailpadding
-Jochen
why would it break C code? of course a C frontend should generate only
tailpadded types.
It's not about the size, but the offset. If you had a char field in
the inherited class:
%I' = type { %I, i8, tailpad};
The offset of that i8 has to be 8, not 5. If all structures are
packed, that would be 5, which is correct for non-POD in C++ but wrong
for everything else.
%J = type { inherit %I, i8 }
the inherit keyword before %I removes the tailpadding
That's what the packed is for.
%Base = type { i32, i8 }; // size = 8
%PODDerived = type { %Base, i8 }; // i8 offset = 8, size 12
%Basep = type <{ i32, i8 }>; // packed, size = 5
%nonPODDerived = type { %Basep, i8 }; // i8 offset = 5, size 8
cheers,
--renato
why would it break C code? of course a C frontend should generate only
tailpadded types.
It's not about the size, but the offset. If you had a char field in
the inherited class:
%I' = type { %I, i8, tailpad};
The offset of that i8 has to be 8, not 5. If all structures are
packed, that would be 5, which is correct for non-POD in C++ but wrong
for everything else.
I know, therefore in this case %I has to be tail-padded. But packing and tail padding are different
things, aren't they? In a packed type {i8, i32} the i32 member has offset 1, while in a non-tail-padded
type it still has offset 4.
%J = type { inherit %I, i8 }
the inherit keyword before %I removes the tailpadding
That's what the packed is for.
I don't think so, because packing removes the alignment constraints of all members.
-Jochen
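The distinction between packing and merely dropping tail padding can be illustrated in C++. This is a sketch assuming GCC or Clang (__attribute__((packed)) is a compiler extension) and a 4-byte int; the struct names are invented for this example, and the C++ behaviour is only an analogy for how LLVM treats packed struct types.

```cpp
#include <cstddef>

struct Plain {
    char c;
    int i;    // offset 4: padded up to the alignment of int
};

struct __attribute__((packed)) PackedAll {
    char c;
    int i;    // offset 1: packing drops the alignment of every member
};

static_assert(offsetof(Plain, i) == 4, "interior padding is kept");
static_assert(offsetof(PackedAll, i) == 1, "interior padding is removed");

// Using a packed struct as the "inheritable" form of a base also removes
// the alignment of the enclosing type.
struct __attribute__((packed)) PackedBase {
    int a;
    char b;
};
struct Holder {
    PackedBase base;  // 5 bytes, alignment 1
    char c;           // offset 5, as wanted for a non-POD base
};

static_assert(offsetof(Holder, c) == 5, "member lands in the former tail padding");
static_assert(alignof(Holder) == 1 && sizeof(Holder) == 6,
              "but the enclosing type is no longer 4-byte aligned or 8 bytes");
```

So under these assumptions the packed form gives the desired member offset, but the frontend would still have to restore the alignment and the final tail padding of the derived type by other means.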