Extending C to support non-power-of-two bit-width variables

Hello,

I would like to extend the C language, using clang, to include integer
types whose bit width is not a power of two. For example, I would like
to add the type "int5", which would be a 5-bit integer.

Can you recommend the best course of action for implementing such a
feature?

Thank you,
Nadav Rotem

The first bit is actually figuring out how to represent these types in
C code. I'd suggest you do something similar to the existing mode
attribute for types. (See
http://gcc.gnu.org/onlinedocs/gcc-4.3.0/gcc/Variable-Attributes.html#index-g_t_0040code_007bmode_007d-attribute-2207
for the gcc docs; it shouldn't be too hard to find the implementation
in clang.) This way, you can do something like "typedef int int5
__attribute__((bitsized_integer(5)));" to get arbitrary integers.
(For the implementation of a new attribute, see
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20080526/005881.html.)

To support these new types for the AST and semantic analysis, you have
to add an additional type to the AST (see Type.h and ASTContext for
how types are implemented), and you need to make sure all the existing
code handles this new integer type appropriately for
promotions/compatibility/etc. You probably want to look at the C99
standard's rules for integer types and promotion to get an idea of
what is needed for a correct implementation.

For the codegen bit, you can just map the types to the LLVM
arbitrary-width integer types, which should have the right behavior.
The LLVM codegen support for non-power-of-two integer types still
isn't quite mature; however, the situation is steadily improving, so
hopefully it will be stable in a few months.

That's just a rough outline, but I think I included everything
important; if you have any more specific questions, please ask.

-Eli

Thank you Eli!

You really helped by pointing me to the right places.

You're welcome.

I hoped there
was an easier way to do this. Would such a patch be accepted into
clang? It seems that if I were to implement non-power-of-two variables,
I would have to maintain a fork of clang.

I think it would be accepted; for the attribute itself, it seems
potentially useful, and the attribute implementation itself isn't very
much code. Having the types around would be useful for other things,
like an implementation of the mode attribute that isn't dependent on
the target having appropriate regular integer types.

A couple of small things which I just thought of: one, I'm not sure
whether the LLVM arbitrary-width integers do exactly what you expect
in terms of data layout. The way it works is that the data layout is
the same as that of the next-largest integer type (see
http://llvm.org/docs/LangRef.html#datalayout). The other thing is
something I didn't quite state explicitly: because of the C integer
promotion rules, an int5 will be immediately promoted to int whenever
its value is used.

And a minor side-note: you forgot to CC the list on your reply. Make
sure to click reply-to-all rather than reply when replying to a
mailing list message.

-Eli

I agree. If the patch was done well, I think it would be great to have support for this in clang.

Right. Arbitrary-bit-width values are most useful for scalar
expressions. For example, if you are synthesizing hardware from LLVM
IR, an i13 multiply is cheaper than an i16 multiply.

-Chris

Thank you for the quick response. I agree. I plan to extend Clang for use
in my high-level hardware synthesis system. Since I wrote the
hardware backend for my system, I am free to synthesize hardware with a
reduced size. I have found the arbitrary-width integer support in LLVM to
be very helpful. My project is currently based on the gcc frontend, but I
hope to migrate it to Clang in the near future.

Thank you,
Nadav