Explanation of the llvm::PointerIntPair data structure

Hi,
While studying the LLVM/Clang source, I came across this line:

mutable llvm::PointerIntPair<const llvm::MemoryBuffer *, 2> Buffer;

in SourceManager.h, and I am curious about the rationale behind using the PointerIntPair data structure. I read the documentation about bit mangling, and I would be grateful if someone could explain why this technique is used.

Thanks,
que

If a type requires a certain minimum alignment in memory (say 4 bytes),
then the low bits of any pointer to an object of that type are known to
be zero. By analogy, in decimal a number that is always a multiple of 10
has a lowest digit of zero. In binary, a multiple of 2 has a lowest bit
of zero, a multiple of 4 has its lowest two bits zero, and so on.
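
To make the alignment argument concrete, here is a minimal sketch. The
names Aligned4, lowBits, freeLowBits and checkLowBitsAreZero are made up
for illustration and are not part of LLVM. It also assumes a mainstream
platform where converting a pointer to std::uintptr_t yields the plain
address (that conversion is implementation-defined).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical type for illustration; alignas(4) forces 4-byte alignment.
struct alignas(4) Aligned4 {
  int value;
};

// Returns the lowest `bits` bits of the address stored in `p`.
inline std::uintptr_t lowBits(const void *p, unsigned bits) {
  return reinterpret_cast<std::uintptr_t>(p) &
         ((std::uintptr_t{1} << bits) - 1);
}

// Number of low bits that are always zero for a power-of-two alignment,
// i.e. log2(alignment).
constexpr unsigned freeLowBits(std::size_t alignment) {
  return alignment <= 1 ? 0 : 1 + freeLowBits(alignment >> 1);
}

static_assert(freeLowBits(4) == 2, "4-byte alignment frees two bits");
static_assert(freeLowBits(8) == 3, "8-byte alignment frees three bits");

// Checks the claim for one stack-allocated object.
inline void checkLowBitsAreZero() {
  Aligned4 obj{7};
  assert(lowBits(&obj, freeLowBits(alignof(Aligned4))) == 0);
}
```

With 4-byte alignment there are two spare bits, which matches the "2"
in the PointerIntPair declaration from the question.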

Using this, we can store some other integer (if it's small enough) in
those low bits of a pointer and save space. It may seem petty, but the
saving can be substantial. Since the pointer itself probably has 4- or
8-byte alignment, simply putting a char (the smallest addressable
integer type) after the pointer in a struct would double the struct's
size: padding rounds it up to the size of two pointers, not just
pointer size plus one byte. In data structures with many pointers,
this can be quite expensive.
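
A rough sketch of the packing idea follows. TinyPointerIntPair and
PointerPlusChar are hypothetical names, and this is not LLVM's actual
implementation. The accessor names mirror llvm::PointerIntPair
(getPointer, getInt, setPointer, setInt), but the number of integer bits
is fixed at two here to keep it short. The size comparison assumes a
typical platform where std::uintptr_t is the same size as a pointer.

```cpp
#include <cassert>
#include <cstdint>

// Naive layout: the char is followed by padding up to pointer alignment,
// so on typical platforms this struct is as large as two pointers.
struct PointerPlusChar {
  const int *ptr;
  char flags;
};

// Hypothetical minimal pair: a pointer and a 2-bit integer in one word.
template <typename T>
class TinyPointerIntPair {
  static_assert(alignof(T) >= 4, "need two free low bits in the pointer");
  static constexpr std::uintptr_t IntMask = 0x3;
  std::uintptr_t value = 0;

public:
  TinyPointerIntPair(T *p, unsigned i) {
    setPointer(p);
    setInt(i);
  }

  // Mask the integer bits off to recover the original pointer.
  T *getPointer() const { return reinterpret_cast<T *>(value & ~IntMask); }

  // The integer lives in the two lowest bits.
  unsigned getInt() const { return static_cast<unsigned>(value & IntMask); }

  void setPointer(T *p) {
    std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(p);
    assert((raw & IntMask) == 0 && "pointer is not sufficiently aligned");
    value = raw | (value & IntMask);
  }

  void setInt(unsigned i) {
    assert(i <= IntMask && "integer does not fit in two bits");
    value = (value & ~IntMask) | i;
  }
};
```

On a typical 64-bit target, sizeof(PointerPlusChar) is 16 while
sizeof(TinyPointerIntPair<int>) is 8, which is the doubling described
above.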

- David

The rationale is performance: the more data you can fit in the caches, the better. It’s also used as a building block for type-safe unions of 2, 3 and 4 pointers that take up the space of a single pointer.
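
As an illustration of the pointer-union idea, here is a minimal sketch of a two-way union that uses one low bit as the discriminator. TinyPointerUnion and its member names are hypothetical and not LLVM's API. It assumes A and B are distinct types with at least 2-byte alignment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-way pointer union: bit 0 records which type is stored.
template <typename A, typename B>
class TinyPointerUnion {
  static_assert(alignof(A) >= 2 && alignof(B) >= 2,
                "need one free low bit in both pointer types");
  std::uintptr_t value;

public:
  // Tag 0: holds an A*.
  explicit TinyPointerUnion(A *a)
      : value(reinterpret_cast<std::uintptr_t>(a)) {
    assert((value & 1) == 0 && "pointer is not sufficiently aligned");
  }

  // Tag 1: holds a B*.
  explicit TinyPointerUnion(B *b)
      : value(reinterpret_cast<std::uintptr_t>(b)) {
    assert((value & 1) == 0 && "pointer is not sufficiently aligned");
    value |= 1;
  }

  bool holdsFirst() const { return (value & 1) == 0; }

  // Each getter returns nullptr when the other type is stored, so the
  // caller cannot read the pointer back as the wrong type.
  A *getFirst() const {
    return holdsFirst() ? reinterpret_cast<A *>(value) : nullptr;
  }
  B *getSecond() const {
    return holdsFirst()
               ? nullptr
               : reinterpret_cast<B *>(value & ~std::uintptr_t{1});
  }
};
```

The whole object still occupies one pointer-sized word, and the tag bit is what makes the union type safe.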