RFC: llvm-convert.cpp Patch

Hi all,

This patch is to fix a problem on PPC64 where an unaligned memcpy is
generated. The testcase is this:

$ cat testcase.c
void Qux() {
  char Bar[11] = {0};
}

What happens is that we produce LLVM code like this:

  call void @llvm.memcpy.i64( i8* %event_list2, i8* getelementptr ([11
x i8]* @C.103.30698, i32 0, i32 0), i64 11, i32 8 )

Notice that it has an 8-byte alignment. However, the Bar variable has
1-byte alignment, because it's being placed into the "redzone". The
generic code for the function is:

$ more testcase.c.t03.generic
Qux ()
{
  static char C.0[11] = {0};
  char Bar[11];

  Bar = C.0;
}

Anyway, it turns out that the gimplifier was generating the correct
alignment, but it was being overridden in assemble_variable():

  /* On some machines, it is good to increase alignment sometimes. */
  if (! DECL_USER_ALIGN (decl))
    {
#ifdef DATA_ALIGNMENT
      align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
#endif
#ifdef CONSTANT_ALIGNMENT
=> if (DECL_INITIAL (decl) != 0 && DECL_INITIAL (decl) != error_mark_node)
        align = CONSTANT_ALIGNMENT (DECL_INITIAL (decl), align);
#endif
    }

By setting the DECL_USER_ALIGN field, we can get the correct alignment.

Comments?

-bw

Index: gimplify.c

$ more testcase.c.t03.generic
Qux ()
{
  static char C.0[11] = {0};
  char Bar[11];

  Bar = C.0;
}

Anyway, it turns out that the gimplifier was generating the correct
alignment, but it was being overridden in assemble_variable():

  /* On some machines, it is good to increase alignment sometimes. */
  if (! DECL_USER_ALIGN (decl))
    {
#ifdef DATA_ALIGNMENT
      align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
#endif
#ifdef CONSTANT_ALIGNMENT
=> if (DECL_INITIAL (decl) != 0 && DECL_INITIAL (decl) != error_mark_node)
        align = CONSTANT_ALIGNMENT (DECL_INITIAL (decl), align);
#endif
    }

Not sure I'm getting it, is this Bar itself or the constructed C.0 that's getting realigned?

I don't get it Bill. llvm should be making memcpy with the right alignment regardless of how Bar is formed. Changing the alignment of bar papers over this problem, but we will still get improper alignment for other cases. Also, DECL_USER_ALIGN isn't appropriate to tweak here... the user isn't playing around with __attribute__((aligned))

-Chris

It's C.0 that's getting realigned.

-bw

Bar isn't the one that's getting realigned with the above code, it's the constructor. The patch only sets the DECL_USER_ALIGN field for the constructor (not Bar) when the constructor's created. My thoughts were that if the constructor was initially given the alignment of the variable (Bar, in this case), then it shouldn't be changed in the assemble_variable() function, unless it's also changing the alignment of Bar (which it's not). I'm actually not sure why it doesn't change Bar's alignment when it's changing Bar's constructor's alignment. That seems like a GCC bug, IMHO, though they probably don't get hit with it or "fix" it later on or something.

My guess is that you want something akin to TreeToLLVM::EmitBuiltinMemCopy where it gets the min of the alignments?

-bw

How about this patch then?

-bw
Index: gcc/llvm-convert.cpp

Hi,

I don't get it Bill. llvm should be making memcpy with the right
alignment regardless of how Bar is formed. Changing the alignment of
bar papers over this problem, but we will still get improper alignment
for other cases. Also, DECL_USER_ALIGN isn't appropriate to tweak
here... the user isn't playing around with __attribute__((aligned))

the fundamental problem is that the DestLoc argument to Emit doesn't
come with alignment (or volatility for that matter). When emitting
an aggregate into DestLoc only the alignment of the source is used,
not that of DestLoc. But that's wrong if DestLoc is less aligned
than the source (either because the source is highly aligned, eg
because the user marked it as being highly aligned, or because the
destination is little aligned as in the testcase). So why not pass
the alignment and volatility in too? Probably Chris doesn't want
to add extra arguments to Emit, so instead we could define a
"DestLoc descriptor" struct:
  struct DLD {
    Value *Ptr;
    unsigned Alignment;
    bool Volatile;
  }
declare one on the stack and pass its address, something like this:
  DLD D = { Ptr, Align, isVolatile };
  Emit(exp, &D);

Ciao,

Duncan.

OK then, it looks like other people have made the points I was going to, but briefly:

Using DECL_USER_ALIGN is wrong; that is for user-specified alignments, and if
you use it for something else, you'll break that.

The alignment on memcpy needs to take both the source & destination into account.
You seem to have gotten this far. Unfortunately that bit of llvm-convert is a maze of
recursive little passages, but your last patch seems to be thinking along the right lines.

Would it be terrible to drop the alignment on memcpy altogether? The source and
destination have alignments and llvm should be able to figure out what the safe
alignment is for memcpy. This is a target-dependent decision anyway; misaligned
loads are a good idea on some targets, not on others.

Hi,

> I don't get it Bill. llvm should be making memcpy with the right
> alignment regardless of how Bar is formed. Changing the alignment of
> bar papers over this problem, but we will still get improper alignment
> for other cases. Also, DECL_USER_ALIGN isn't appropriate to tweak
> here... the user isn't playing around with __attribute__((aligned))

the fundamental problem is that the DestLoc argument to Emit doesn't
come with alignment (or volatility for that matter). When emitting
an aggregate into DestLoc only the alignment of the source is used,
not that of DestLoc. But that's wrong if DestLoc is less aligned
than the source (either because the source is highly aligned, eg
because the user marked it as being highly aligned, or because the
destination is little aligned as in the testcase). So why not pass
the alignment and volatility in too? Probably Chris doesn't want
to add extra arguments to Emit, so instead we could define a
"DestLoc descriptor" struct:
  struct DLD {
    Value *Ptr;
    unsigned Alignment;
    bool Volatile;
  }
declare one on the stack and pass its address, something like this:
  DLD D = { Ptr, Align, isVolatile };
  Emit(exp, &D);

I just did this in a sudden fit of activity. It was quite easy and
looks like it catches a ton of volatility and alignment bugs. Maybe
it will even work and fix the original problem :slight_smile: I will let you
know later.

Ciao,

Duncan.

No, it wouldn't be terrible. This sounds like a good short-term fix.

-Chris

How about this patch then?

How about this one :slight_smile: It passes alignment and volatility around
with DestLoc. It's not finished because I noticed some bugs in
how CopyAggregate and ZeroAggregate handle alignment (problem points
marked with "QQ"). Also, I noticed potential problems with how we
handle call arguments and return results (what if they are strangely
aligned/volatile?), marked with FIXME.

While there, I fixed the following:
- Taught EmitAggregateZero not to do element by element zeroing
if the LLVM type doesn't cover the gcc type. Also, not to do
it if there are too many aggregate elements.
- Added a bunch of missing alignment/volatile stuff on various
loads and stores.
- Fixed up a lot of bogus handling of volatility in the complex
number stuff.

Ciao,

D.

align.diff (42.7 KB)

Whoa, this is awesome Duncan.

One minor nit-pick is that you can use BitCastToType instead of "CastToType(Instruction::BitCast", which is a minor simplification.

Other than that, if it tests out fine, please commit it!

-Chris

Hi,

the fundamental problem is that the DestLoc argument to Emit doesn't
come with alignment (or volatility for that matter). When emitting
an aggregate into DestLoc only the alignment of the source is used,
not that of DestLoc. But that's wrong if DestLoc is less aligned
than the source (either because the source is highly aligned, eg
because the user marked it as being highly aligned, or because the
destination is little aligned as in the testcase).

Exactly! :slight_smile:

So why not pass the alignment and volatility in too?

I'm for that.

Probably Chris doesn't want
to add extra arguments to Emit, so instead we could define a
"DestLoc descriptor" struct:
        struct DLD {
                Value *Ptr;
                unsigned Alignment;
                bool Volatile;
        }
declare one on the stack and pass its address, something like this:
        DLD D = { Ptr, Align, isVolatile };
        Emit(exp, &D);

Emit already takes a parameter that not all of the method calls in
there use. So it should be possible to add another parameter. I
submitted another patch for review that does that -- at least for the
alignment. :slight_smile:

-bw

OK then, it looks like other people have made the points I was going
to, but briefly:

Using DECL_USER_ALIGN is wrong; that is for user-specified
alignments, and if
you use it for something else, you'll break that.

Okay.

The alignment on memcpy needs to take both the source & destination
into account.

I agree. But I was thinking that there was a more fundamental problem,
i.e. that GCC was changing the alignment of the constructor without
taking the destination's alignment into account -- that is, it's not
changing the destination's alignment and/or checking it beforehand.

You seem to have gotten this far. Unfortunately that bit of llvm-
convert is a maze of
recursive little passages,

Tell me about it! :wink:

but your last patch seems to be thinking along the right lines.

:slight_smile:

-bw

> How about this patch then?

How about this one :slight_smile:

Your patch didn't apply cleanly, so I couldn't test it out. :frowning:

-bw

My patch as the short-term fix? :slight_smile:

-bw

Your patch didn't apply cleanly, so I couldn't test it out. :frowning:

It's for 4.2.

D.

Ah! When you commit it, could you back-port it to 4.0?

-bw

No, the compiler has complete control of the alignment of compiler-generated
"constructors" (in gcc's terminology; they have nothing to do with C++
constructors) and normally aligns them restrictively for efficiency. There is no
reason the alignment of the destination should be the same.

If it's a significant amount of work, it might not be worth it...

-Chris