[RFC] Coding Standards: "prefer `int` for regular arithmetic, use `unsigned` only for bitmask and when you intend to rely on wrapping behavior."

JohnReagan · June 12, 2019, 9:01pm

> vector.size() returns a size_t, which on 64-bit platforms can represent
> values larger than those that can fit into an int64_t. So to turn
> your argument around, since it's theoretically possible to have a vector
> with more items than an int64_t can represent, isn't it already worth it
> to use size_t, which is an unsigned type?

That's not true on my platform. I have 64-bit pointers, so intptr_t is
64 bits, but the largest thing you can allocate is only 32 bits big, so
size_t (and ptrdiff_t) are 32 bits.
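
A minimal C++ sketch of the point above, assuming nothing about the target beyond the standard headers: the width of size_t and the width of a pointer are separate properties, so code should query them rather than assume they match. All helper names here (kMaxObjectSize, sizeIsNarrowerThanPointer, sizeTExceedsInt64, fitsInSizeT, pointerBits) are hypothetical and only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Largest value this implementation's size_t can represent.
constexpr std::size_t kMaxObjectSize = std::numeric_limits<std::size_t>::max();

// True when size_t is narrower than a data pointer, as on the platform
// described above (64-bit pointers, 32-bit size_t).
constexpr bool sizeIsNarrowerThanPointer() {
  return sizeof(std::size_t) < sizeof(void *);
}

// True when size_t can hold values above INT64_MAX. This is the premise
// of the quoted argument; it holds only where size_t is at least 64 bits.
constexpr bool sizeTExceedsInt64() {
  return kMaxObjectSize >
         static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());
}

// Hypothetical helper: does a 64-bit count fit in this platform's size_t?
constexpr bool fitsInSizeT(std::uint64_t N) { return N <= kMaxObjectSize; }

// Hypothetical helper: store a pointer in an integer. uintptr_t (where the
// implementation provides it) is the type intended for this, not size_t.
inline std::uintptr_t pointerBits(const void *P) {
  return reinterpret_cast<std::uintptr_t>(P);
}
```

On a typical LP64 Linux target sizeIsNarrowerThanPointer() would be false and sizeTExceedsInt64() true; on a platform like the one described here the results would be reversed.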

JF_Bastien2 · June 13, 2019, 4:00pm

It runs LLVM?

JohnReagan · June 13, 2019, 4:19pm

Yes. We currently build LLVM 3.4.2 on our OpenVMS Itanium box with an
older EDG/Intel C++03 compiler to create legacy cross-compilers to our
OpenVMS x86 box (well, VirtualBox). We do have a few tweaks to the
relocations to access static data always through the GOT (including
CodeGen's static data). Our linker sees references to code (which might
be in 64-bit space) and creates trampolines in 32-bit space. That lets
any legacy code from the VAX days continue to take the address of a
routine and save it into some INTEGER*4 Fortran variable.

So mostly a small memory model with a few things from medium/large.

I've been unable to build the companion clang 3.4.2 however as its use
of templates seems to push our old compiler over the edge of sanity.

We're working on bootstrapping 8.0.0. Compile 8.0.0 on Linux, move
objects to our OpenVMS Itanium box to use the cross-linker (which can
handle Linux objects as well as our own), move the resulting image to
our OpenVMS x86 box... (and the same thing for libcxx, compiler-rt,
libcxxabi, etc.) and with a wave of a magic wand, we end up with a
native clang. And with a little more waving, our other legacy compilers
(our C, BLISS, Pascal, COBOL, BASIC, and Fortran).

I'm planning on submitting a lightning talk for this fall on "When 3
memory models isn't enough."

JF_Bastien2 · June 13, 2019, 4:25pm

This is a rare occurrence… but you leave me speechless. I don’t even know where to start.

pogo59 · June 14, 2019, 1:40pm

> This is a rare occurrence… but you leave me speechless. I don’t even know
> where to start.

The word-size migration is rare but not unique. The HP/Tandem NonStop
line has moved from 16-bit to 32-bit to 64-bit over the decades, and we
needed to handle mixed-size pointers and "int" weirdness at each stage.
OpenVMS is a special snowflake in terms of its LLVM bootstrapping process,
and the mind just boggles at what John and his team have had to contend
with. But the size part is certainly familiar to me, and actually simpler
than what NonStop had to deal with. (We weren't using LLVM, clearly, but
container size etc. is not a particularly LLVM-specific issue.)
--paulr

Chris_Bieneman2 · June 14, 2019, 4:33pm

I have not chimed in on the general signed/unsigned discussion, but my experience porting software to various embedded systems and game consoles over the years has taught me to always prefer using explicitly sized integer types except where size_t should be used.

While it is generally true that most modern 64-bit architectures agree on the meanings of short, long, unsigned, int, and others, there is no guarantee of that. I would strongly support a coding standard that suggested preferring explicitly sized integer types (e.g. int64_t) over int.
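
A small sketch of that preference, assuming a target that provides the exact-width types from <cstdint> (they are optional in the standard, though nearly universal). The helper names bitWidth and byteOffset are hypothetical. The language only guarantees lower bounds for int and long, whereas int32_t and int64_t have the same width wherever they exist.

```cpp
#include <climits>
#include <cstdint>

// Width in bits of an integer type on the current target.
template <typename T> constexpr unsigned bitWidth() {
  return sizeof(T) * CHAR_BIT;
}

// Exact-width types: the same on every target that provides them.
static_assert(bitWidth<std::int32_t>() == 32, "int32_t is exactly 32 bits");
static_assert(bitWidth<std::int64_t>() == 64, "int64_t is exactly 64 bits");

// Built-in types: only minimum widths are guaranteed.
static_assert(bitWidth<int>() >= 16, "int is at least 16 bits");
static_assert(bitWidth<long>() >= 32, "long is at least 32 bits");

// Hypothetical example: a byte offset that may exceed 4 GiB. Spelling the
// type as int64_t keeps the arithmetic 64-bit even where int and long are
// both 32 bits wide.
constexpr std::int64_t byteOffset(std::int64_t Index, std::int64_t ElemSize) {
  return Index * ElemSize;
}
```

With plain int arguments, the same multiplication would overflow on any target where int is 32 bits.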

-Chris

David_A_Greene · June 14, 2019, 6:32pm

+1.

-David

clattner · June 14, 2019, 10:59pm

Is there any reason not to use size_t or ssize_t for indexes? That is the technically correct type to use after all.
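
For illustration, here is a hedged sketch of how the two index styles differ in practice. Note that ssize_t comes from POSIX rather than ISO C++, so the sketch uses ptrdiff_t as the standard signed counterpart. The function names are hypothetical, and the pointer-plus-length interface is only an assumption made to keep the example small (a vector's data() and size() could be passed in).

```cpp
#include <cstddef>

// Reverse traversal with an unsigned index. The `I-- > 0` form is needed
// because `I >= 0` is always true for size_t and would never terminate.
constexpr long long sumReverseUnsigned(const int *Data, std::size_t N) {
  long long Sum = 0;
  for (std::size_t I = N; I-- > 0;)
    Sum += Data[I];
  return Sum;
}

// The same traversal with a signed index reads like ordinary arithmetic:
// N - 1 is simply -1 for an empty range, so the loop body never runs.
constexpr long long sumReverseSigned(const int *Data, std::ptrdiff_t N) {
  long long Sum = 0;
  for (std::ptrdiff_t I = N - 1; I >= 0; --I)
    Sum += Data[I];
  return Sum;
}

// What `size() - 1` yields for an empty container when computed in size_t:
// the subtraction wraps around to the largest representable value.
constexpr std::size_t predecessorOfZero() { return std::size_t{0} - 1; }
```

Both loops are correct; the difference is that the unsigned version depends on the reader spotting the wrap-around idiom.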

-Chris
