PointerIntPair causing trouble

Hi all,

I’ve located a regression that causes my project to crash. It’s in revision 67979, where PointerIntPair was changed to store the integer in the upper available bits instead of the lowest bits. My project is an experimental JIT compiler on Windows.

So I was wondering if anyone had any clue why the new PointerIntPair implementation might fail. It doesn’t seem very safe to me to store other data into a pointer to begin with, but surely the lower bits must be a safer location than the upper bits?

The actual crash I’m seeing is preceded by an assert in X86InstrInfo::sizeOfImm: “Immediate size not set!”. Tracing backward, I see a MachineInstr with opcode MOV32rm and NumOperands equal to 6 (it was 5 in earlier revisions). I’m not sure whether this actually helps explain the bug, but it’s one of the weird things that happen with revision 67979 and subsequent revisions…

All help appreciated.

Nicolas

Hi Nicolas,

I’ve located a regression that causes my project to crash. It’s in revision 67979, where PointerIntPair was changed to store the integer in the upper available bits instead of the lowest bits. My project is an experimental JIT compiler on Windows.

We’re looking into a similar bug right now. We see the problem elsewhere (early on in optimization, during the first instcombine pass) but it sounds like the same issue. We’ll let you know when we know more.

So I was wondering if anyone had any clue why the new PointerIntPair implementation might fail. It doesn’t seem very safe to me to store other data into a pointer to begin with, but surely the lower bits must be a safer location than the upper bits?

Just to be clear, PointerIntPair doesn’t (AFAIK) store the integer in the “high bits” of the pointer. It just uses the “higher parts” of the “low available bits” of the pointer. So, if a pointer is aligned such that it always has 4 bits clear on the low side, and you only need to store one bit, it’ll be stored in bit 3 (counting from the LSB towards the MSB).
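
For concreteness, here is a tiny sketch of that packing (not the actual PointerIntPair code, just an illustration of “higher parts of the low available bits”, assuming 4 free low bits and a 1-bit integer):

#include <cassert>
#include <stdint.h>

// Hypothetical packing: 4 low bits free, the single integer bit stored in the
// topmost of them (bit 3), leaving bits 0-2 for further nesting.
static const uintptr_t PtrMask  = ~uintptr_t(0xF); // pointer keeps bits 4 and up
static const unsigned  IntShift = 3;               // the int lives in bit 3
static const uintptr_t IntMask  = 0x1;             // one bit of integer

uintptr_t pack(void *Ptr, unsigned Int) {
  uintptr_t P = reinterpret_cast<uintptr_t>(Ptr);
  assert((P & ~PtrMask) == 0 && "pointer not sufficiently aligned");
  assert((Int & ~IntMask) == 0 && "integer too large for the reserved bit");
  return P | (uintptr_t(Int) << IntShift);
}

void *getPointer(uintptr_t V) { return reinterpret_cast<void*>(V & PtrMask); }
unsigned getInt(uintptr_t V)  { return unsigned((V >> IntShift) & IntMask); }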

It may be that the alignment assumptions being made are simply wrong on Windows. So far we’ve been trying to track down whether this is actually the cause or if something’s wrong with the Module coming into InstCombine, but given that you’re seeing a similar problem elsewhere, it’s not unlikely that this is the real cause.

After quickly rereading the code for PointerIntPair this actually seems extremely unlikely, since there are nice asserts in the proper places.

Stefanus

FWIW, the reason it does this is to allow composable PointerIntPairs; for example, PointerIntPair<PointerIntPair<foo*, 1>, 2> can use the low 3 bits as two different bitfields.
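
As a small usage sketch (scaled down to two total bits so it works with a plain int*'s default traits; the inner pair's bit lands in the higher of the two spare low bits and the outer's in the one below it):

#include "llvm/ADT/PointerIntPair.h"
using namespace llvm;

// Nesting relies on the PointerLikeTypeTraits specialization that
// PointerIntPair provides for itself.
typedef PointerIntPair<int*, 1> InnerPair;      // uses one spare bit of int*
typedef PointerIntPair<InnerPair, 1> OuterPair; // uses the next spare bit

void example(int *P) {
  OuterPair Pair(InnerPair(P, 1), 0);
  int *Orig      = Pair.getPointer().getPointer(); // P again
  unsigned Inner = Pair.getPointer().getInt();     // 1
  unsigned Outer = Pair.getInt();                  // 0
  (void)Orig; (void)Inner; (void)Outer;
}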

-Chris

Hi Nicolas,

Looks like Preston and I have found the cause of the problem. The issue is with PointerLikeTypeTraits<T*>::NumLowBitsAvailable. This is set to 3, which basically assumes that unless the traits are specialized for a particular pointer type, objects of that type are allocated with malloc() and aligned to 8 bytes.

While PointerLikeTypeTraits is specialized for Use*, it is not specialized for User*. lib/VMCore/Use.cpp uses a PointerIntPair<User*, 1, Tag>, and things go bad. Users are typically allocated in an array after a bunch of Uses, which are 12 bytes in size, so the User may actually be aligned to only 4 bytes.
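
To make the arithmetic concrete, here is a tiny standalone sketch of the layout problem (the sizes model a 32-bit build and are purely illustrative, not taken from the real Use/User classes):

#include <cassert>
#include <cstddef>

int main() {
  // Each Use is 12 bytes on a 32-bit build, and the User object is placed
  // right after an array of them.
  const std::size_t SizeOfUse = 12;
  const std::size_t Base = 0x1000;                   // 8-byte-aligned start
  const std::size_t UserAddr = Base + 3 * SizeOfUse; // three operands -> offset 36
  assert(UserAddr % 4 == 0);  // always 4-byte aligned...
  assert(UserAddr % 8 != 0);  // ...but not 8-byte aligned here, so only 2 low
                              // bits of the resulting User* are guaranteed clear
  return 0;
}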

The attached patch (which I don’t intend to commit, it’s just a workaround) simply drops the default down to 2 bits, which gets us past our problem on Windows. I would appreciate it if you could verify that this works for you as well.

To actually solve this properly I see two main options:

(1) we could specialize PointerLikeTypeTraits for User*, and leave the default value of NumLowBitsAvailable at 3 (a rough sketch of what that might look like follows below).
(2) we could drop the default NumLowBitsAvailable to 2 (or even use __alignof and similar compiler-specific extensions to determine it), and allow classes that assert that they are only ever allocated through malloc to relax it back up to 3.
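
For (1), the specialization I have in mind would look roughly like this (a sketch only, not a patch; it conservatively promises just 2 free bits for User*):

#include "llvm/User.h"
#include "llvm/Support/PointerLikeTypeTraits.h"

namespace llvm {
// Tell PointerIntPair explicitly how many low bits a User* is guaranteed to
// have free, instead of relying on the malloc-based default.
template <> class PointerLikeTypeTraits<User*> {
public:
  static inline void *getAsVoidPointer(User *P) { return P; }
  static inline User *getFromVoidPointer(void *P) {
    return static_cast<User*>(P);
  }
  // Users that follow a 12-byte Use array may only be 4-byte aligned,
  // so promise just 2 free bits here.
  enum { NumLowBitsAvailable = 2 };
};
} // namespace llvm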

My preference would be for option (2), due to the difficulty of tracking this bug down, and the risk of future similar bugs happening. However, I’d appreciate some feedback from the LLVM developer community on this. Please let me know what you think, and I’ll be happy to prepare a patch.

I’m still wondering why this wasn’t an issue on other platforms. One possibility is that Use is being bumped up to 16 bytes in size, and thus Users never get placed at less than an 8-byte boundary. However, in that case, the whole point of the “use diet” optimization is lost! I’ll investigate and try to find out.

I’m also still not sure why the assertion in PointerIntPair::setPointer() did not get triggered by the User::allocHungOffUses() implementation in Use.cpp. Perhaps the assertion is wrong (it looks reasonable, though) or perhaps there is something else going on I haven’t seen yet. I’ll look into this some more as well.

Let me know if the workaround works for you, and I’d appreciate feedback from anyone (Chris? Gabor?) as to how best to proceed.

Thanks!

Stefanus

pointer_traits_workaround.diff (979 Bytes)

I still don't understand why this is a problem, but I decreased the default to 2 bits. Please verify that this helps,

-Chris

Hi Nicolas,

Looks like Preston and I have found the cause of the problem. The issue is with PointerLikeTypeTraits<T*>::NumLowBitsAvailable. This is set to 3, which basically assumes that unless the traits are specialized for a particular pointer type, objects of that type are allocated with malloc() and aligned to 8 bytes.

While PointerLikeTypeTraits is specialized for Use*, it is not specialized for User*. lib/VMCore/Use.cpp uses a PointerIntPair<User*, 1, Tag>, and things go bad. Users are typically allocated in an array after a bunch of Uses, which are 12 bytes in size, so the User may actually be aligned to only 4 bytes.

The attached patch (which I don’t intend to commit, it’s just a workaround) simply drops the default down to 2 bits, which gets us past our problem on Windows. I would appreciate it if you could verify that this works for you as well.

To actually solve this properly I see two main options:

(1) we could specialize PointerLikeTypeTraits for User*, and leave the default value of NumLowBitsAvailable at 3.
(2) we could drop the default NumLowBitsAvailable to 2 (or even use __alignof and similar compiler-specific extensions to determine it), and allow classes that assert that they are only ever allocated through malloc to relax it back up to 3.

Something like this device should suffice; no need for extensions.

template <typename T>
class alignment_of {
  struct test {
    char a;
    T b;
  };

public:
  enum { value = sizeof(test) - sizeof(T) };
};

// usage: alignment_of<T>::value

TR1’s type_traits header is available on many platforms now, too.

That's already in llvm/Support/AlignOf.h.

-Chris

Hi Stefanus,

I had actually tried that already, but my project kept crashing. However, after giving it a second try I noticed that the crash was happening elsewhere this time. After disabling the GVN optimization pass everything worked fine. Somewhere between revision 67979 and 67996 the bug that was causing GVN to fail was fixed.

So thanks for tracking this down!

I agree that option (2) is a good solution for now. But in my humble opinion, the whole idea of storing other information in pointers is not very robust. One alternative would be to have a global store of all Value objects. They could then be identified by an index instead of a pointer, and that index could be stored in a bitfield smaller than 32 bits, leaving bits available for additional data…
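
A rough sketch of the kind of scheme I mean (purely illustrative; the names and bit widths are made up):

#include <vector>

class Value;  // stand-in; in LLVM this would be llvm::Value

// Hypothetical global table: Values are registered here and handed out as
// indices rather than raw pointers.
static std::vector<Value*> GlobalValueTable;

// A 32-bit handle: 29 bits of index leave 3 bits of extra data, without any
// assumptions about pointer alignment.
struct ValueHandle {
  unsigned Index : 29;
  unsigned Tag   : 3;
};

inline Value *resolve(ValueHandle H) { return GlobalValueTable[H.Index]; }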

Cheers,

Nicolas

I think I've figured out what's going on, and why no assertions are caused by this. It doesn't necessarily have anything to do with User* pointer alignment per se (although I believe that could still cause trouble, though I would expect asserts to catch things in that case).

Here's what I think is happening:

Use::getUser() casts the last element of the Use array to an AugmentedUse, then pulls out the tagged pointer that AugmentedUse adds.

However, in the case where the User is actually located at the end of the Use array, there's no tagged pointer there, just some "random" data corresponding to the first sizeof(User*) bytes of a User instance.

The code seems to assume that the relevant bit from the tagged pointer in AugmentedUse is going to be zero in that case -- if that bit is one, it considers the AugmentedUse reference valid and pulls out the pointer.

Since Value (and therefore User) is polymorphic, there's (probably?) going to be a vtable pointer at the beginning of User. So this code seems to be assuming that the relevant bit of that vtable pointer is going to be zero. I'm guessing that virtual tables from MSVC happen to be less strictly aligned than those from GCC/Linux/OS X.
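
In pseudocode, the logic I'm describing looks roughly like this (a paraphrase of what Use.cpp does, reading the slot just past the last Use directly; the typedef name and the use of unsigned instead of the Tag enum are mine):

#include "llvm/ADT/PointerIntPair.h"
#include "llvm/Use.h"
#include "llvm/User.h"
using namespace llvm;

// The slot one past the last Use holds either a tagged User* (stored when the
// User was allocated separately, tag bit set) or the first word of the User
// itself (typically its vptr), whose low bit is assumed to be clear.
typedef PointerIntPair<User*, 1, unsigned> UserRefSketch;

User *getUserOfSketch(Use *End) {  // End = one past the last Use in the array
  UserRefSketch &Ref = *reinterpret_cast<UserRefSketch*>(End);
  return Ref.getInt() ? Ref.getPointer()              // tag set: real pointer
                      : reinterpret_cast<User*>(End); // tag clear: User starts here
}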

Now I'm wondering what to do about this. If we know that the first sizeof(User*) bytes of User are indeed always a virtual table pointer, and that the vtables are aligned to at least 4 bytes, then the current solution (setting the alignment of User* to 4 bytes) will continue to work. There's the question of whether we really want to rely on this, though.

If we don't want to rely on this, we need some other way to distinguish indirectly referenced Users from Users located at the end of the Use array. Of course we could just always store a User* and set it to null if the User is at the end of the array, but that would increase the size of most Users by sizeof(User*). Maybe we could encode this information somehow into the tag bits of the Uses, but I can't see an easy way to do this.

I'd appreciate your thoughts. In the meantime I will probably commit some additional comments for Use.cpp to make the code a bit easier to understand, as well as putting an assert in place to catch this earlier.

Stefanus

> I still don't understand why this is a problem, but I decreased the
> default to 2 bits. Please verify that this helps,

I think I've figured out what's going on, and why no assertions are caused by this. It doesn't necessarily have anything to do with User* pointer alignment per se (although I believe that could still cause trouble, though I would expect asserts to catch things in that case).

Here's what I think is happening:

Use::getUser() casts the last element of the Use array to an AugmentedUse, then pulls out the tagged pointer that AugmentedUse adds.

However, in the case where the User is actually located at the end of the Use array, there's no tagged pointer there, just some "random" data corresponding to the first sizeof(User*) bytes of a User instance.

Hi Stefanus!

Your analysis is perfectly correct. I was AFK for the last two days, so I couldn't tell you this same story.

The algorithm relies on the fact that the LSBit of the first "pointer" in User is zero. This is normally the case with vptrs, which are "normally" placed there by ABIs. Originally, I just checked that bit verbatim, and later by means of PointerIntPair. The problem arose when PointerIntPair's layout changed. I think we could switch back without a loss.

Btw. the alignment of pointers on a 32-bit machine is 4, so I do not understand how the traits define 3 free bits for them. Maybe sabre can explain the change that caused the breakage.

Is there any acute breakage left?

I assume that there will be problems with layouts on compilers with non-standard ABIs that place the vptr in a different location, or use other means of dispatch.

I am happy to work on this if you know of a serious platform that breaks my assumption.

Cheers,

    Gabor

Hi Gabor,

Your analysis is perfectly correct. I was AFK for the last two days, so I couldn't tell you this same story.

Thanks, glad I was on the right track :).

The algorithm relies on the fact that the LSBit of the first "pointer" in User is zero. This is normally the case with vptrs, which are "normally" placed there by ABIs. Originally, I just checked that bit verbatim, and later by means of PointerIntPair. The problem arose when PointerIntPair's layout changed. I think we could switch back without a loss.

Btw. the alignment of pointers on a 32-bit machine is 4, so I do not understand how the traits define 3 free bits for them. Maybe sabre can explain the change that caused the breakage.

Well, that's actually been that way for a long time. What's changed recently is that the high available bits are actually used now, whereas before, if you only asked for one bit, only the LSB would be used.

As the (now inconsistent) comments in the code explain, the traits class assumed that pointers (not the pointees, the pointers themselves!) got allocated by malloc unless otherwise specialized, and that malloc will return 8-byte-aligned pointers. Obviously this is not true in general; the trait should probably be using AlignOf anyway, since pointers might even be less aligned than that on, say, 16-bit systems. I'm planning to clean this up.
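
For instance, the default trait could derive the free-bit count from the alignment along these lines (a sketch only, assuming Alignment is the compile-time constant provided by llvm/Support/AlignOf.h and that alignments are powers of two):

#include "llvm/Support/AlignOf.h"

// Compile-time log2 of a power-of-two alignment.
template <unsigned N> struct Log2    { enum { value = 1 + Log2<N / 2>::value }; };
template <>           struct Log2<1> { enum { value = 0 }; };

// Hypothetical replacement for the default PointerLikeTypeTraits<T*>: the
// free low bits follow from T's guaranteed alignment, not from any
// assumption about malloc.
template <typename T>
class DefaultPointerTraitsSketch {
public:
  static inline void *getAsVoidPointer(T *P) { return P; }
  static inline T *getFromVoidPointer(void *P) { return static_cast<T*>(P); }
  enum { NumLowBitsAvailable = Log2<llvm::AlignOf<T>::Alignment>::value };
};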

Is there any acute breakage left?

I don't believe so, at least not under Windows.

I assume that there will be problems with layouts on compilers with non-standard ABIs that place the vptr in a different location, or use other means of dispatch.

I am happy to work on this if you know of a serious platform that breaks my assumption.

I don't know of any. Out of curiosity, do you have any reference that guarantees that MSVC will always place a pointer at the start of a User?

As I mentioned I have a patch pending that I'll commit soon that adds a bunch of comments and will document this assumption in the code. I was also wondering if we could get rid of AugmentedUse. Is there any reason we don't just cast the actual end of the list (i.e. the result of getImpliedUser) to a PointerIntPair (or a void* passed to it) directly? Unless there's some specific reason why AugmentedUse exists, it seems to me like it only makes the code more confusing. If you're OK with this change I'll make it along with my comment changes. I wouldn't mind renaming getImpliedUser to getEndOfUses or something like that either...

I also want to add an assert to catch having a non-zero bit in there directly when the User is constructed in the future. I'm just not sure where to put it yet.

Stefanus

I made a toy language a month ago to catch back up to the latest SVN LLVM API, and for some reason, anytime I used a compare operator (<, =, or > are all this toy language has) inside a function definition, JIT-compiling and calling that function always crashed. A prime example is this toy-language code: "(begin (define (fib n) (if (< n 3) 1 (+ (fib (- n 1)) (fib (- n 2))))) (fib (+ 1 3)))"; it crashes even when reduced so that all it does is return the (< n 3) part. I ran out of time to mess with it due to real-life issues, but I tried to debug that one section of code for three days straight and tried every combination I could think of.

When I stepped into the JIT compiler and followed it, it always crashed when it got the Uses of the instruction that used the compare (which was always a zero-extend to a 32-bit integer, since every variable in this toy language is a 32-bit integer, but it also happened if I removed that instruction, changed it to something else, etc.). It went into the function and ended up going through about eight other functions that did not look like they did anything but route to other instructions (testing class type, removing references if necessary, and so on; would std::tr1::type_traits not be *much* better for this stuff? Just including Boost.Type_Traits would make it work on other platforms, with less code in your base to deal with). The final call returned a pointer to what should have been the compare, but it always pointed into some other random (non-allocated) memory.

I cannot check right now (still real-life issues in the way; I could check in two weeks, maybe sooner if I get a free moment while at home), but from what I remember of the pointer value, I would bet real money that the bit pattern got shifted from what the Compare instruction pointer value was supposed to be.

Hence, I bet the issue I was seeing and tried to fix for a couple of days before I ran out of time is this exact issue, and yes, I am using Visual Studio 2005. I have the toy language on my local home SVN; if anyone wants access I can give you the link so you can download and try it out (built against LLVM trunk SVN from about a month or two ago, and Boost 1.40, hence Boost trunk SVN) to confirm that this is the issue. Or, if you fix this issue in LLVM trunk, I can resync and rebuild LLVM and see whether that fixes the problem I have. The toy-language code "(begin (define (tes n) (< n 1)))" reliably causes the bug I have been experiencing (as does the full Fibonacci function above).

Actually, I am *very* curious whether this is the bug. I can try to find out now that I know what to look for (or, if you fix it in SVN, I will first make sure the bug still exists in my build, then update LLVM to the latest trunk and test again; if that fixes it, I will be giving many thanks), but the earliest I might be able to check will be this coming Thursday night.

If it is the same problem (and it very much sounds like it is), it should be fixed in trunk now. Chris reduced the default number of low bits assumed free in a pointer to 2 a few days ago. Sorry if that wasn't clear from the thread. The discussion since then has mostly been about whether we can improve on this, or if the status quo is OK.

Stefanus

Also, it does not only happen with the JIT, but also in some passes. For example, if I add this pass:
m_FunctionPassManager->add(llvm::createPromoteMemoryToRegisterPass());
it happens inside that one as well, so it is not just a JIT occurrence.

Wonderful then, I will give it a try at my next available opportunity (probably this coming Thursday night, unless I decide to shirk a duty or two...).