Here's a cute bug in llvm-gcc's struct translation:
struct S242 { char * a;int b[1]; } ;
struct S93 { __attribute__((aligned (8))) void * a; } ;
The second example is padded out to 8 bytes, so both of these look like
{ i8 *, [1 x i32] }
This leads the "struct type factory" StructType::get to think they are the same.
But, the second field is marked as Padding in the second case but not the first,
and CopyAggregate does not copy Padding. When the second type
goes through ConvertType, it is converted to the same llvm Type as the first type,
and the StructTypeConversionInfo info is replaced; later copies of the first type
then think they don't have to copy the padding, producing wrong code.
I'm inclined to remove skipping the Padding in CopyAggregate; that's at best an unimportant optimization, and could result in code that's slower than doing a straightforward rep;movsl or equivalent. Alternatively I can take the Padding bit into account in the StructType::get code somehow. Anyone have a strong opinion?
Hi Dale,
Here's a cute bug in llvm-gcc's struct translation:
struct S242 { char * a;int b[1]; } ;
struct S93 { __attribute__((aligned (8))) void * a; } ;
The second example is padded out to 8 bytes, so both of these look like
{ i8 *, [1 x i32] }
This leads the "struct type factory" StructType::get to think they are
the same.
But, the second field is marked as Padding in the second case but not
the first,
and CopyAggregate does not copy Padding. When the second type
goes through ConvertType, it is converted to the same llvm Type as the
first type,
and the StructTypeConversionInfo info is replaced; later copies of the
first type
then think they don't have to copy the padding, producing wrong code.
there's some mucking around with padding in ConvertUNION presumably
because someone noticed a problem of this kind there.
I'm inclined to remove skipping the Padding in CopyAggregate; that's
at best an unimportant optimization, and could result in code that's
slower than doing a straightforward rep;movsl or equivalent.
The padding info is used when doing an element by element copy instead
of a memcpy (done for small objects in the hope of being faster).
Alternatively I can take the Padding bit into account in the
StructType::get code somehow. Anyone have a strong opinion?
Shouldn't it be a map from the gcc type to the padding info?
That said, you can get rid of the padding info as far as I'm
concerned. However Chris might have a different opinion - I
think he introduced it.
Ciao,
Duncan.
Alternatively I can take the Padding bit into account in the
StructType::get code somehow. Anyone have a strong opinion?
Shouldn't it be a map from the gcc type to the padding info?
That said, you can get rid of the padding info as far as I'm
concerned. However Chris might have a different opinion - I
think he introduced it.
I don't think I introduced it (was it Devang?). However, I agree with duncan: the right fix is to make the StructTypeInfoMap be indexed by a GCC type, not an llvm type. I am not fully familiar with how StructTypeInfoMap works, but I wouldn't be surprised if this caused other subtle miscompilations.
-Chris
Ok! Can you please fix it to index by GCC type? There is a many to one mapping between gcc types and llvm types.
-Chris
This is tricky and probably won't work. Padding info is in llvm struct type and CopyAggregate() is operating on llvm type. There is not any way to map llvm type back to gcc type. Am I missing something ?
EmitAggregateCopy has the gcc type. It would be reasonable to have CopyAggregate walk the GCC type in parallel with the llvm type, at least in simple cases. In more complex cases, it could give up.
-Chris
I don't think so, I have reached the same conclusion. You can pass the gcc type into CopyAggregate, but it's recursive, and there's no way to get the gcc type for the fields. You would have to walk the gcc type in parallel with the llvm type, which at best involves duplicating a lot of code and is quite error prone.
...but giving up in this case is easy enough, ok, I can do that.
Cool, thanks Dale! I think it would be reasonable to give up in nested struct cases, etc. We can always improve it later, and that will get us the obvious case in PR1278.
-Chris