Large integers as first-class values

LLVM supports integers up to about 8 million bits. This is a wonderful
feature that I would like to expose in the language I'm designing: if
you were, say, implementing SHA-512, you could write the code in terms
of variables of type [int 512] and have it all work with near-optimal
efficiency, with the size known at compile time and (unlike with an
arbitrary-precision integer class) no heap allocation.
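As a concrete illustration of the idea: this is roughly what C23's
_BitInt extension exposes today, which recent Clang lowers to exactly
these LLVM integer types. A minimal sketch, assuming a compiler with
_BitInt support (the feature postdates this thread; the type name u512
is mine):

    #include <stdio.h>

    /* Fixed-width 512-bit integer: size known at compile time,
       no heap allocation, much like the proposed [int 512]. */
    typedef unsigned _BitInt(512) u512;

    int main(void) {
        u512 a = 1;
        u512 b = a << 500;   /* constant shift: lowers to plain word ops */
        u512 c = a + b;      /* addition: an add-with-carry chain        */
        /* printf has no _BitInt conversion; truncate to the low word. */
        printf("low 64 bits: %llu\n", (unsigned long long)c);
        return 0;
    }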

So my question is: is it okay to go ahead and do this, or are there
any caveats in terms of efficiency or correctness? In particular, I
remember reading something about problems with returning integers
larger than two machine words, but I can't find it again; is there
currently any such problem, or, if there was, has it since been fixed?

In terms of correctness, it should work except for the fact that the
LLVM code generators don't implement more complicated operations on
such integers, like multiplication, division, and variable-width
shifts. The issues with returning large integers are fixed, at least
on x86.

In terms of efficiency, the generated code is likely to be less than
ideal; juggling 512-bit numbers takes a lot of registers, and
everything will be unrolled. This might be okay for a 512-bit number,
but it would be a complete mess for a 2048-bit number.
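To make "unrolled" concrete, here is roughly the add-with-carry chain
that a single 512-bit addition legalizes into, written out in C over
64-bit limbs (a sketch of the shape of the generated code, not actual
backend output; add512 is a made-up name):

    #include <stdint.h>

    void add512(uint64_t r[8], const uint64_t a[8], const uint64_t b[8]) {
        uint64_t carry = 0;
        for (int i = 0; i < 8; i++) {      /* the backend fully unrolls this */
            uint64_t t = a[i] + carry;
            uint64_t c1 = t < carry;       /* carry out of the first add  */
            r[i] = t + b[i];
            carry = c1 | (r[i] < t);       /* carry out of the second add */
        }
    }

Eight limbs per operand have to be live at once, which is where the
register pressure comes from.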

Overall, for arbitrary uses, you're probably better off using a more
conventional bignum library.

-Eli

But not on other platforms?

If I recall correctly, there was some platform-specific work involved,
and I'm not sure it got done on all platforms.

What's the largest integer such that something like 'return ((a * b) /
c) >> d' works correctly on all major platforms?

Twice the size of a pointer, i.e. 64 bits on 32-bit platforms and 128
bits on 64-bit platforms.

-Eli
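To make that limit concrete: on a 64-bit target, GCC and Clang expose
exactly this double-pointer-width type as __int128, so the expression
from the question can be written as (a sketch; the function name is
mine):

    #include <stdint.h>

    uint64_t f(unsigned __int128 a, unsigned __int128 b,
               unsigned __int128 c, unsigned d) {
        /* 128-bit multiply, divide, and variable shift all legalize
           correctly at this width; wider than this, they may not. */
        return (uint64_t)(((a * b) / c) >> d);
    }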

Okay, thanks. Do I understand correctly that this is likely to
continue to be the case, so language support for large integers will
need to be implemented by other means?

Yes; there are no plans to change this.

-Eli

Maybe it would be worth adding "iInf" to LLVM to use all the
pre-existing optimizations, then have passes to lower it to GMP or
other implementations...
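For example, such a pass might turn 'return ((a * b) / c) >> d' at
type iInf into calls against GMP's mpz_* API, roughly like this (the
mpz_* functions are GMP's real interface, but the lowering itself is
hypothetical, not anything LLVM actually does):

    #include <gmp.h>

    /* result must be initialized by the caller, per GMP convention. */
    void expr(mpz_t result, const mpz_t a, const mpz_t b,
              const mpz_t c, unsigned long d) {
        mpz_t tmp;
        mpz_init(tmp);
        mpz_mul(tmp, a, b);               /* tmp = a * b       */
        mpz_tdiv_q(tmp, tmp, c);          /* tmp = (a * b) / c */
        mpz_tdiv_q_2exp(result, tmp, d);  /* result = tmp >> d */
        mpz_clear(tmp);
    }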

I see where you're coming from, but I don't think that would be
useful. There are applications where a dependency on GMP is okay, but
not across the board for a core language feature. I think I'm just
going to have to bite the bullet and go ahead and implement arbitrary
precision integers as part of my standard library.
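Such a library-level bignum would pair a heap-allocated limb array
with the kind of carry-chain arithmetic sketched earlier; a minimal
sketch of the representation, in contrast to the fixed-size,
stack-allocated [int 512] above (all names hypothetical):

    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        size_t    nlimbs;  /* number of 64-bit limbs in use           */
        int       sign;    /* -1, 0, or +1                            */
        uint64_t *limbs;   /* heap-allocated, least-significant first */
    } bigint;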