Introducing an Alignment object in LLVM

Alignment in LLVM is currently represented as an unsigned, and sometimes as a uint32_t, uint64_t or uint16_t.
FWIU the value has the following possible semantics:

  • 0 means alignment is unknown,
  • 1 means no alignment requirement,
  • a power of two means a required alignment in bytes.

Using unsigned throughout the codebase has several disadvantages:

  • comparing alignments may compare different things: what does A < B or std::min(A, B) mean when A or B is allowed to be 0?
  • integer promotion can kick in and turn a bool into an alignment when calling a function (1)
  • masking may lead to an incorrect result when the value is 0 (2)
  • integer promotion leads to code that is not obviously correct in the presence of signed and unsigned values (3)
  • dividing an alignment by 2 can change the associated semantics from unaligned to unknown alignment (4)
  • developers have to be defensive to make sure assumptions hold (5)
  • checking that an offset is aligned is sometimes done backward: Alignment % Offset == 0 instead of Offset % Alignment == 0 (6) (7)
  • MachineConstantPoolEntry::Alignment encodes special information in its topmost bit (8) but AsmPrinter::GetCPISymbol seems to use it directly (9) instead of calling getAlignment() (10)
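
To make a few of the bullets above concrete, here is a minimal sketch of how they show up with plain unsigned values. None of this is LLVM code: the function names are hypothetical and only illustrate the arithmetic.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: rounds Value up to a multiple of Align with the usual
// mask trick. Only meaningful when Align is a non-zero power of two; with
// Align == 0 ("unknown") the mask becomes 0 and the result collapses to 0.
inline uint64_t alignToWithMask(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) & ~(Align - 1);
}

// Halving an alignment of 1 ("no requirement") yields 0 ("unknown").
inline unsigned halveAlignment(unsigned Align) { return Align / 2; }

// std::min on raw values lets 0 ("unknown") win over any real alignment.
inline unsigned weakerAlignment(unsigned A, unsigned B) {
  return std::min(A, B);
}

// Correct direction: the offset must be a multiple of the alignment.
inline bool isOffsetAligned(uint64_t Offset, uint64_t Align) {
  return Offset % Align == 0;
}

// Backward check: asks whether the alignment is a multiple of the offset.
// It is also undefined behavior when Offset == 0.
inline bool isOffsetAlignedBackward(uint64_t Offset, uint64_t Align) {
  return Align % Offset == 0;
}
```

With these definitions, alignToWithMask(5, 0) returns 0 rather than something at least 5, halveAlignment(1) returns 0, and the backward check accepts offset 4 for alignment 8 while rejecting offset 16.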

I have a patch to introduce an alignment object in LLVM.
This patch does not fix the code but replaces the unsigned value with a type, so it’s easier to introduce proper semantics later on.

The patch (11) is too big to be sent to Phabricator: arc diff complains about the server’s post_max_size being too small.

I would like to seek guidance from the community:

  • Is this patch worthwhile?
  • If so, how should it be reviewed? Should it be split?

– Guillaume
PS: If you intend to have a look at it you should start with llvm/include/llvm/Support/Alignment.h

1 -
2 -
3 -
4 -
5 -
6 -
7 -
8 -
9 -
10 -

11 -

Without looking at the patch, I like this idea. Numeric quantities in general (alignment, bits, bytes, etc) are often confusing in the codebase IMO, and what you propose would help alleviate this problem.

Can you fix this incrementally? i.e. add an alignment class which has implicit conversions to the current unsigned convention, and incrementally replace it throughout the codebase. Once everything is fixed, remove the implicit conversions.
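
The incremental approach suggested above could look roughly like the following. This is only a sketch under assumptions: the class below is hypothetical and is not the type from the patch, and legacyGetPadding and pickStricter are made-up stand-ins for unmigrated and migrated code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical transitional wrapper. The conversions are implicit on purpose
// so that migrated and unmigrated code can still call each other; they would
// be removed (or made explicit) once the whole codebase uses the type.
class Alignment {
  uint32_t Value = 0; // 0 keeps the legacy "unknown" meaning for now

public:
  Alignment() = default;

  // Implicit conversion from the current unsigned convention.
  Alignment(uint32_t V) : Value(V) {
    assert((V & (V - 1)) == 0 && "alignment must be 0 or a power of two");
  }

  // Implicit conversion back to the current unsigned convention.
  operator uint32_t() const { return Value; }
};

// Stand-in for code that has not been migrated yet: still takes raw integers.
inline uint32_t legacyGetPadding(uint32_t Offset, uint32_t Align) {
  if (Align <= 1)
    return 0;
  return (Align - Offset % Align) % Align;
}

// Stand-in for migrated code: takes and returns the new type.
inline Alignment pickStricter(Alignment A, Alignment B) {
  return static_cast<uint32_t>(A) > static_cast<uint32_t>(B) ? A : B;
}
```

During the transition, legacyGetPadding(5, Alignment(8)) and pickStricter(4, 16) both compile unchanged. Once the conversions are removed, every remaining raw-integer call site becomes a compile error, which is what makes the leftover uses easy to find.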

Woah this is a good idea.

I’d ask that alignment come in different bit sizes and endiannesses so that we can add an alignment type to ELF types. I would love to review this and add it to llvm-objcopy. We have special functions to handle all of these ‘zero’ cases. Several other bits of code I’ve seen/written have to find maximum alignment and I’d imagine the mistake of not accounting for zero is common.

Where’s the patch? Add me as a reviewer and I’ll look at it today or Monday.
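
As an illustration of the ‘zero’ cases mentioned above, here is a small sketch of computing a maximum alignment over a list of values. The helper names are hypothetical and this is not code from llvm-objcopy. It assumes the ELF convention for sh_addralign, where 0 and 1 both mean no alignment constraint.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: maps the ELF-style "0 means no constraint" onto 1,
// so the result is always safe to use as a divisor.
inline uint64_t normalizeAlign(uint64_t Align) {
  return Align == 0 ? 1 : Align;
}

// Naive maximum over raw values: returns 0 when every input is 0 (or the
// list is empty), and a later "Offset % Max" is then a division by zero.
inline uint64_t naiveMaxAlignment(const std::vector<uint64_t> &Aligns) {
  uint64_t Max = 0;
  for (uint64_t A : Aligns)
    Max = std::max(Max, A);
  return Max;
}

// Zero-aware maximum: never returns less than 1.
inline uint64_t maxAlignment(const std::vector<uint64_t> &Aligns) {
  uint64_t Max = 1;
  for (uint64_t A : Aligns)
    Max = std::max(Max, normalizeAlign(A));
  return Max;
}
```

A dedicated alignment type could perform this normalization once, at construction, instead of at every call site.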

@JF Bastien : Indeed I think incremental fixing is the way to go - not my preferred option but the only feasible one considering the size of the patch.

I’ll start by introducing the Alignment object and its unittests and we can start the discussion from here.

@Jake Ehrlich : Can you point me to source code where endianness matters? I never encountered it when I refactored the code.

Also, I understand why different bit sizes can help, but I’m not convinced it’s worth the additional complexity (conversion, assignment, construction, comparisons between different bit sizes).
uint32_t seems to be a good fit for now. Can we start with this, do the refactoring, and introduce a templated type afterwards, if that works for you?

For the record, the type is now in:
I’ll follow up with a bunch of patches and incrementally change the code base to use the new type.