[RFC] Small Bitfield utilities

TL;DR: Have support in ADT for typed values packing into opaque scalar types

There are places in LLVM where we need to pack typed fields into opaque values.

For instance, the XXXInst classes in llvm/include/llvm/IR/Instructions.h that extract information from Value::SubclassData via getSubclassDataFromInstruction().
The bit twiddling is done manually: this impairs readability and prevent consistent handling of out of range values (e.g. https://github.com/llvm/llvm-project/blob/435b458ad0a4630e6126246a6865748104ccad06/llvm/include/llvm/IR/Instructions.h#L564).
More importantly, the bit pattern is scattered throughout the implementation making it hard to pack additional fields or check for overlapping bits.

I think it would be useful to have first class support for this use case in LLVM so here is a patch to get the discussion started https://reviews.llvm.org/D81580. And here the Instruction code rewritten with it https://reviews.llvm.org/D81662, I’ve added comments to highlight problems in the existing logic.

Feedback would be highly appreciated.

The code is now fairly stable and well tested but the syntax is not yet final.
I would like to get some feedback from the community on alternative syntaxes that have been suggested to me.

First, the designed API defines three functions:

  • get: unpacks the value from storage,

  • set: packs the provided value and stores it in storage,

  • test: efficiently returns whether the field is non zero.

Now, the three syntaxes:

  1. Current API, purely functional
    code: https://reviews.llvm.org/D81580

uint8_t Storage = 0;
using Amount = Bitfield<unsigned, 0, 4>;

setField(Storage, true);
unsigned Value = getField(Storage);
bool Cond = testField(Storage);

  1. A more Object Oriented API
    code: https://reviews.llvm.org/D82058

uint8_t Storage = 0;
using Amount = Bitfield<unsigned, 0, 4>;

bitfield(Storage) = true;
unsigned Value = bitfield(Storage);
bool Cond = bitfield(Storage).isSet();

  1. In between
    code: https://reviews.llvm.org/D82058#inline-754912

uint8_t Storage = 0;
using Amount = Bitfield<unsigned, 0, 4>;

Amount::fieldOf(Storage) = true;
unsigned Value = Amount::fieldOf(Storage);
bool Cond = Amount::isSet(Storage);

Proposal 1 has the inconvenience of polluting the llvm namespace but is also simpler implementation wise.
2 and 3 are more involved and add their own gotchas (cast operator, lifetime considerations) but feel more natural.

I would really appreciate guidance from the community on which syntax is preferred, or any other suggestions (isSet is not a great name)

Thank you!

Hi Guillaume,

Thanks for the RFC. I’m generally +1 on the concept. Making bit field manipulation easier seems like a good overall goal given its prevalence in LLVM.

As for the syntax, I tend to prefer that we don’t pollute the namespace. Have you considered pushing the methods into the Bitfield class? Maybe something like:

uint8_t Storage = 0;
using Amount = Bitfield::Element<unsigned, 0, 4>;

Bitfield::set<Amount>(Storage, true);
unsigned Value = Bitfield::get<Amount>(Storage);
bool Cond = Bitfield::test<Amount>(Storage);

^ Seems good to me because it’s clear what the API is doing, even without knowing what Amount is. WDYT?

– River

Hi River,

I’ve to say I really like your suggestion.
I went ahead and implemented it as https://reviews.llvm.org/D82454

I’ll wait a bit more for other people to chime in but overall it has my preference.

Documentation and code have been polished.
I’ll go ahead and submit the version #3 of the design.

Thank you to everyone who contributed!