RFC: On non 8-bit bytes and the target for it

This RFC is to ask whether the community is interested in further discussion of iN bytes support. Last time the issue was on the agenda in May and the discussion was triggered by Jesper Antonsson’s patches (see https://lists.llvm.org/pipermail/llvm-dev/2019-May/132080.html).

It seems that, while some downstream projects benefit from non-8-bit byte support, the feature is barely maintainable given the lack of upstream targets that use it. The reason I would like to raise the matter again is that we, the TON Labs team, would like to upstream our backend solution.

The backend generates code for the TON virtual machine (TVM), which is designed to run smart contracts on the TON blockchain (see the original specifications for TVM and TON at https://test.ton.org/tvm.pdf and https://test.ton.org/tblkch.pdf, respectively).

The target has the following key particularities:

  • stack-based virtual machine
  • 257-bit wide integers, signed magnitude representation
  • no floating-point arithmetic support
  • persistent storage
  • no “native” memory; modeling is possible but costly
  • presence of custom types (it is exactly the reason for upstreaming)

Given that the TVM only operates on 257-bit-wide numbers, we changed LLVM downstream to get a 257-bit byte. At the moment, we have a hacky implementation with the new byte size hardcoded. For reference: the change touched approximately 20 files in LLVM and about a dozen in Clang. Later on, we plan to integrate the new byte size with the data layout according to https://archive.fosdem.org/2017/schedule/event/llvm_16_bit/. And if the community decides to move on, we will upstream and maintain it.

We realize that a 257-bit byte is quite unusual, but for smart contracts it is accepted practice to have numbers that are at least 256 bits wide. The leading VM for smart contracts, Ethereum VM, introduced this practice and other blockchain VMs followed. Thus, while TVM might be the first LLVM-based target for blockchain that needs the feature, it is not necessarily the last one. We also found mentions of 12-, 16- and 24-bit-wide bytes in past non-8-bit byte discussions (in reverse chronological order: https://lists.llvm.org/pipermail/llvm-dev/2019-May/132080.html, http://lists.llvm.org/pipermail/llvm-dev/2017-January/109335.html, http://lists.llvm.org/pipermail/llvm-dev/2017-January/108901.html, http://lists.llvm.org/pipermail/llvm-dev/2015-March/083177.html, http://lists.llvm.org/pipermail/llvm-dev/2014-September/076543.html, http://lists.llvm.org/pipermail/llvm-dev/2009-September/026027.html).

Our Toolchain is going to be based only on OSS. It allows using the backend without getting any proprietary software. Also, we hope that implementation for a target similar to TVM would help to generalize some concepts in LLVM and to make the whole framework better suit non-mainstream architectures.

Aside from non-i8 bytes, we would like to bring stack machine support in the Target Independent Code generator. The matter will be discussed at the developers’ meeting, see http://llvm.org/devmtg/2019-10/talk-abstracts.html#bof2.

LLVM and Clang for TVM are available at https://github.com/tonlabs/TON-Compiler. It is currently based on LLVM 7 and can only produce assembly (we have not specified our object file format yet). Moreover, we have introduced custom IR types to model Tuples, Slices, Builders and Cells from the specification. We are going to do an LLVM update and consider using opaque types before starting to upstream.

I’d like to understand what programming model you see programmers using. You don’t need 257 bits per byte if you only offer 257 bit integers. Rather, bytes aren’t really a thing at that point. LLVM kinda handles iN already, and your backend would legalize everything to exactly this type and nothing else, right? Would it be sufficient to expose something like int with Size=257 for your programming environment?

It would also be useful to understand what other changes you’re proposing, especially your mention of Tuples, Slices, Builders, Cells.

To add to what JF says:

Typically, a byte means some combination of:

1. The smallest unit that can be indexed in memory (irrelevant for you, you have no memory).
2. The smallest unit that can be stored in a register in such a way that its representation is opaque to software (i.e. you can't tell the bit order of a byte in a multi-byte word). For you, it's not clear if this is 257 bits or something smaller.
3. The smallest unit that is used to build complex types in software. Since you have no memory, it's not clear that you can build structs or arrays, and therefore this doesn't seem to apply.

From your description of your VM, it doesn't sound as if you can translate from any language with a vaguely C-like abstract machine, so I'm not certain why the size of a byte actually matters to you. LLVM IR has a quite C-like abstract machine, and several of these features seem like they will be problematic for you. There is quite a limited subset of LLVM IR that can be expressed for your VM and it would be helpful if you could enumerate what you expect to be able to support (and why going via LLVM is useful, given that you are unlikely to be able to take advantage of any existing front ends, many optimisations, or most of the target-agnostic code generator).

David

Just to clarify, the VM doesn’t have memory indeed, but we emulate the memory with dictionaries (address → value) which are native to TVM. Thus you can work with arrays and structures in TVM.
However, access to a dictionary is very expensive in terms of gas (the fee you pay for contract execution in the blockchain). We really don’t want to have unaligned memory accesses and things like that. Aside from that, our “ALU” only supports 257-bit operations, and handling overflows of smaller types is an additional expense for a user. So we set sizeof(char) == sizeof(short) == sizeof(int) == sizeof(long) == sizeof(long long) == 1 byte == 257 bits in C. Luckily, the C spec allows it. We do not have a specification requirement to do so, but we found it natural from an implementation and user experience point of view.
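To illustrate why such a layout is permitted, here is a minimal sketch that checks a candidate integer layout against the ISO C minimum widths (char at least 8 bits, short and int at least 16, long at least 32, long long at least 64). The `IntLayout` struct and `satisfiesIsoCMinimums` function are hypothetical names invented for this example and are not part of LLVM or Clang. The sketch also assumes no padding bits, so size ordering is used as a stand-in for the standard's range ordering.

```cpp
#include <cassert>

// Hypothetical description of a C integer layout (not an LLVM/Clang type).
// Sizes are in bytes, where one byte is bitsPerByte bits wide.
struct IntLayout {
    unsigned bitsPerByte;    // the target's CHAR_BIT
    unsigned sizeofChar;     // must be 1 by definition
    unsigned sizeofShort;
    unsigned sizeofInt;
    unsigned sizeofLong;
    unsigned sizeofLongLong;
};

// Returns true if the layout meets the ISO C minimum widths.
// Assumption: no padding bits, so width == size * bitsPerByte.
constexpr bool satisfiesIsoCMinimums(const IntLayout &l) {
    return l.sizeofChar == 1 &&
           l.bitsPerByte >= 8 &&
           l.sizeofShort * l.bitsPerByte >= 16 &&
           l.sizeofInt * l.bitsPerByte >= 16 &&
           l.sizeofLong * l.bitsPerByte >= 32 &&
           l.sizeofLongLong * l.bitsPerByte >= 64 &&
           l.sizeofShort <= l.sizeofInt &&
           l.sizeofInt <= l.sizeofLong &&
           l.sizeofLong <= l.sizeofLongLong;
}

// The layout described in the post: every integer type is one 257-bit byte.
static_assert(satisfiesIsoCMinimums(IntLayout{257, 1, 1, 1, 1, 1}),
              "a single 257-bit byte covers all minimum widths");

// The same sizes with 8-bit bytes would be too narrow for short and wider.
static_assert(!satisfiesIsoCMinimums(IntLayout{8, 1, 1, 1, 1, 1}),
              "an 8-bit short violates the 16-bit minimum");
```

The check says nothing about how much existing C code would actually work with such a layout; it only shows that the minimum-width rules themselves are satisfied.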

Our goal is to allow using general-purpose languages to develop smart contracts, since we believe it was a shortcoming of Ethereum to focus solely on Solidity. That is why we decided to use LLVM. As for the LLVM specification coverage, at the moment we support operations with memory (they are probably not well tested yet, but there is a bunch of tests on arrays) and structures, all integer arithmetic and bitwise operations, control-flow instructions excluding exception handling and indirectbr, comparisons, and extensions and truncations (we do have values smaller than i257 that are stored in persistent memory, where a user pays for data storage; but persistent memory is a different story: it will likely become a different address space in the future, but for now it is only accessible through intrinsics). We also support memcpy and memset in non-persistent memory.

As for Slices, Builders and the rest, we aren’t crazy enough to really propose upstreaming them - they are very specific to our VM. They are an implementation detail at the moment - we did introduce these entities as types, basically because of time pressure on the project. We want to switch to opaque types if it’s possible without losing the correctness of our backend. If it’s impossible, well, we will probably start looking for a way to change the framework so that a target could introduce its own type, but I really hope it won’t be the case.

So the scope of the changes we’d like to introduce:

  1. Getting rid of byte size assumption in LLVM and Clang (adding byte size to data layout, removing magic number 8 (where it means size of byte) from LLVM and Clang, introducing the notion of byte for memcpy and memset). The C spec doesn’t have this constraint, so I’m not sure that LLVM should be more restrictive here.
  2. Adding support for stack machines in the backend (generalizing algorithms for converting register-based instructions to stack-based ones, a generic implementation of scheduling appropriate for a stack machine, and an implementation of stack-aware (i.e. configurable) reassociation). It was discussed during the BoF talk at the recent conference. We are going to summarize the results soon.
  3. The backend itself.

So basically, we believe that (1) is beneficial for Embecosm, Ericsson and other companies that were actively involved in the previous iterations of the non-8-bit byte discussion. (3) fixes the main concern of the community: the testability of these changes. (2) benefits WebAssembly and further stack machines implemented in LLVM.

Hi Dmitriy,

I can confirm that Ericsson remains interested in the byte-size issue.
We would be more than happy to contribute/collaborate on patches,
suggestions and reviews in that area, should your upstreaming effort
win community approval.

Best regards, Jesper

Right. A 257-bit target is a bit crazy, but there are lots of other targets that only have 16-bit or 32-bit addressable memory. I’ve heard various people saying that they all have out-of-tree patches to support non-8-bit-byte targets, but because there is no in-tree target that uses them, it is very difficult to merge these patches upstream.

I for one would love to see some of these patches get upstreamed. If the only problem is one of testing, then maybe we could make a virtual target exist, or maybe we could accept the patches without test cases (so long as they don’t break 8-bit-byte targets, obviously).

-Chris

Thanks, Chris, for supporting the idea to have non-8-bits byte in LLVM.

I want to clarify the scope and then analyze the options we have.

The scope:

  1. BitsPerByte or similar variable should be introduced to data layout; include/CodeGen/ValueTypes.h and some other generic headers also need to be updated and probably become dependent on the data layout.
  2. Magic number 8 should be replaced with BitsPerByte. We found that 8 is used as “size of a byte in bits” in Selection DAG, asm printer, analysis and transformation passes. Some of the passes are currently independent of any target specific information. In downstream, we changed about ten passes before our testing succeeded, but we might have missed some cases due to the incompleteness of our tests.
  3. &255 and other bit manipulations. We didn’t catch many of those with our downstream testing. But again, at the moment, our tests are not sufficiently good for any claims here.
  4. The concept of byte should probably be introduced to Type.h. The assumption that Type::getInt8Ty returns type for a byte is baked into the code generator, builtins (notably memcpy and memset) and more than ten analysis and transformation passes.
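As a rough illustration of items (2) and (3), the sketch below contrasts a hardcoded 8 with a byte width taken from a parameter. `ByteInfo`, `storeSizeInBytes` and `lowByteMask` are hypothetical names made up for this example; they do not correspond to LLVM's actual DataLayout interface. The mask helper is limited to byte widths below 64 bits because it uses a 64-bit integer; a 257-bit byte would need an arbitrary-precision integer type instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a data-layout field named BitsPerByte, as in the
// scope list above. This is not LLVM's DataLayout class.
struct ByteInfo {
    unsigned BitsPerByte;
};

// Number of bytes needed to hold `bits` bits, rounding up.
// With a hardcoded byte width this would be written as (bits + 7) / 8.
constexpr std::uint64_t storeSizeInBytes(ByteInfo bi, std::uint64_t bits) {
    return (bits + bi.BitsPerByte - 1) / bi.BitsPerByte;
}

// Mask selecting the lowest byte of a value, the generalization of "& 255".
// Assumption of this sketch: BitsPerByte < 64, so the shift is well defined.
constexpr std::uint64_t lowByteMask(ByteInfo bi) {
    return (std::uint64_t{1} << bi.BitsPerByte) - 1;
}

static_assert(storeSizeInBytes(ByteInfo{8}, 32) == 4, "i32 is 4 8-bit bytes");
static_assert(storeSizeInBytes(ByteInfo{16}, 32) == 2, "i32 is 2 16-bit bytes");
static_assert(storeSizeInBytes(ByteInfo{257}, 32) == 1, "i32 fits in one 257-bit byte");
static_assert(lowByteMask(ByteInfo{8}) == 255, "the familiar & 255 mask");
```

Code written this way gives the same results as today for 8-bit bytes, which is why replacing the literal is low risk for existing targets.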

It is worth noting that these changes should apply to upcoming patches as well as to existing code, and if we decide to move on, developers should no longer assume that a byte is 8 bits wide, except in target-dependent pieces of code.

The options we have.

  1. Perform 1 - 4 w/o any testing in upstream. It seems a very fragile solution to me. Without any non-8-bit target in upstream, it’s unlikely that contributors will differentiate between getInt8Ty() and getByteTy(). So I guess that after a couple of months, we’ll get a mix of 8s and BitsPerBytes in the code, and no test will catch the regression. The remedy is probably an active downstream contributor who stays on top of trunk and checks new patches against their tests daily.
  2. Test with a dummy target. It might work if we have a group of contributors who are willing to rewrite and upstream some of their downstream tests as well as to design and implement the target itself. The issue here might be in functional tests, so we’d probably need to implement a dummy virtual machine to run them because lit tests are unlikely to catch all issues from paragraphs (2) and (3) of the scope described.
  3. TON Labs can provide its crazy target or some lightweight version of it. From the testing point of view, it works similarly to the second solution, but it doesn’t require any inventions. I could create a separate RFC about the target to find out if the community thinks it’s appropriate.

I'm not great at history, are there any historically iconic targets
that aren't 8-bit but are otherwise sane? I'd prefer to spend the
project's resources supporting something like that than either an
invented target or a speculative crypto-currency oddity.

Cheers.

Tim.

PDP10 is iconic enough?

Joerg

I’d note that GCC removed its last upstream target with a BITS_PER_UNIT != 8 in version 4.3 in 2008 (that was TMS320C3x/C4x), and there have been none added since. AFAIK, they’re in option #1 mode – no testing upstream, but maybe with downstream forks that still use the ability to set it to other values, and besides, a constant is nicer than a magic “8” anyways.

Last time this was discussed, the LLVM project already came to a consensus that it’s reasonable to remove magic "8"s from the code, at least where it arguably helps code clarity – and if that helps downstream forks with weird byte-sizes too, that’s wonderful.

But, it’s not at all clear to me that it’s at all worthwhile to do more than that (e.g. changing core stuff like datalayout, introducing weird and otherwise-irrelevant targets, or trying to figure out how to test the functionality for changing the byte-width without a target).

Is it relevant to any modern compiler though?

I strongly agree with Tim. As I said in previous threads, unless people will have actual testable targets for this type of thing, I think we shouldn’t add maintenance burden. This isn’t really C or C++ anymore because so much code assumes CHAR_BIT == 8, or at a minimum CHAR_BIT % 8 == 0, that we’re supporting a different language. IMO they should use a different language, and C / C++ should only allow CHAR_BIT % 8 == 0 (and only for small values of CHAR_BIT).

+1: we should have a testable target in the first place to motivate the maintenance/support in LLVM. Someone should write a HW simulator for such an “academic” architecture and an LLVM backend for it! :)

I’m missing the link between the LLVM support for non 8-bits platforms and “they should use a different language” than C/C++, can you clarify?

Best,

Strongly agreed here as well.

Philip

C code with 257 bits per byte is nominally C, but it’s realistically incompatible with any existing C code. It’s therefore not the C people use, it’s the C the standard says should exist because historically it might have been a good idea.

Similarly, C with signed magnitude or ones’ complement integers isn’t the same C we all use.

We (Synopsys ASIP Designer team) and our customers tend to disagree: our customers do create plenty of cpu architectures
with non-8-bit characters (and non-8-bit addressable memories). We are able to provide them with a working c/c++ compiler solution.
Maybe some support libraries are not supported out of the box, but for these kinds of architectures that is acceptable.
(Besides that, llvm is also more than just c/c++)

Greetings,

Jeroen Dobbelaere

That’s the kind of use case I’d happily support if we had upstream testing, say through a backend. I’m also happy if we remove magic numbers.

Can you share the values you see for CHAR_BIT?

> (Besides that, llvm is also more than just c/c++)

Agreed, I bring up C and C++ because they were the languages discussed in the previous proposals.

My main concern in this discussion is that we're conflating several concepts of a 'byte':

  - The smallest unit that can be loaded / stored at a time.

  - The smallest unit that can be addressed with a raw pointer in a specific address space.

  - The largest unit whose encoding is opaque to anything above the ISA.

  - The type used to represent `char` in C.

  - The type that has a size that all other types are a multiple of.

In POSIX C (which imposes some extra constraints not found in ISO C), when lowered to LLVM IR, all of these are the same type:

  - Loads and stores of values smaller than i8 or not a multiple of i8 may be widened to a multiple of i8. Bitfield fields that are smaller than i8 must use i8 or wider operations and masking.

  - GEP indexes are not well defined for anything that is not a multiple of i8.

  - There is no defined bit order of i8 (or bit order for larger types, only an assumption that, for example, i32 is 4 i8s in a specific order specified by the data layout).

  - char is lowered to i8.

  - All ABI-visible types have a size that is a multiple of 8 bits.

It's not clear to me whether saying 'a byte is 257 bits' means changing all of these to 257 or only some of them (which?). For example, when compiling C for 16-bit-addressable historic architectures, typically:

  - char is 8 bits.

  - char* and void* are represented as a pointer plus a 1-bit offset (sometimes encoded in the low bit, so the load / store sequence is a right shift by one, a load, and then a mask or a mask and shift depending on the low bit).

  - Other pointer types are 16-bit aligned.
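The char-pointer lowering in the list above can be sketched as follows. This is only an illustration under stated assumptions: memory is modeled as a vector of 16-bit words, the byte selector lives in the low bit of the pointer, and selector 0 picks the low half of the word. The names `WordMemory`, `loadChar` and `storeChar` are invented for the example, and real targets may choose the opposite byte order within a word.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Word-addressed memory: each element is one 16-bit addressable unit.
using WordMemory = std::vector<std::uint16_t>;

// Load an 8-bit char through a "fat" char pointer whose low bit selects
// which half of the 16-bit word is meant.
std::uint8_t loadChar(const WordMemory &mem, std::uint32_t charPtr) {
    std::uint32_t wordAddr = charPtr >> 1;   // right shift by one
    std::uint16_t word = mem[wordAddr];      // word-sized load
    // Assumption: selector 0 = low half (mask), 1 = high half (shift).
    return (charPtr & 1) ? static_cast<std::uint8_t>(word >> 8)
                         : static_cast<std::uint8_t>(word & 0xFF);
}

// Store an 8-bit char: read-modify-write of the containing word.
void storeChar(WordMemory &mem, std::uint32_t charPtr, std::uint8_t value) {
    std::uint32_t wordAddr = charPtr >> 1;
    std::uint16_t word = mem[wordAddr];
    if (charPtr & 1)
        word = static_cast<std::uint16_t>((word & 0x00FF) | (value << 8));
    else
        word = static_cast<std::uint16_t>((word & 0xFF00) | value);
    mem[wordAddr] = word;
}
```

Note that a char store turns into a load plus a store of the whole word, which is one reason targets of this kind often prefer to make char as wide as the addressable unit instead.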

IBM's 36-bit word machines use a broadly similar strategy, though with some important differences, and I would imagine that most Synopsys cores are going to use some variation on this approach.

This probably involves a quite different design to a model with 257-bit registers, but most of the concerns don't exist if you don't have memory that can store byte arrays and so involve very different design decisions.

TL;DR: A proposal for supporting non-8-bit bytes needs to explain what their expected lowerings are and what they mean by a byte.

David

I agree - there are a lot of weird accelerators with LLVM backends, many of them aren’t targeted by C compilers/code. The ones that do have C frontends often use weird dialects or lots of builtins, but they are still useful to support.

I find this thread to be a bit confusing: it seems that people are aware that such chips exist (even today) but some folks are reticent to add generic support for them. While I can see the concern about inventing new backends just for testing, I don’t see an argument against generalizing the core and leaving it untested (in master). If any bugs creep in, then people with downstream targets can fix them in core.

-Chris

Thanks Chris! This is what we would like to see as well!

We have a 16bit byte target downstream and we live pretty much on top-
of-tree since we pull from llvm every day. Every now and then we find
new 8bit byte assumptions in the code that break things for us that we
fix downstream.

If we were allowed, we would be happy to upstream such fixes which
would make life easier both for us (as we would need to maintain fewer
downstream diffs) and (hopefully) for others living downstream with
other non-8bit byte targets.

Now, while we try to fix things in ways that would work for several
different byte sizes, what _we_ actually really test is 16bit bytes, so
I'm sure we fail to generalize things enough for all sizes, but at
least our contributions will make things more general than today.

And I imagine that if other downstream targets use other byte sizes
than us they would also notice when things break and would also pitch
in and generalize it further so that it in the end works for all users.

/Mikael

David, just to clarify a misconception I might have introduced, we do not have linear memory in the sense that all data is stored as a trie. We do support arrays, structures and GEPs, however, as well as all relevant features in C by modeling memory.
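To make the idea of modeled memory more concrete, here is a minimal sketch of an address-to-value dictionary with an access counter standing in for gas. Everything in it is an assumption made for illustration: `DictMemory` and `elementAddress` are invented names, `std::map` stands in for TVM's native dictionaries, 64-bit integers stand in for 257-bit values, and one counter tick per access is not TVM's real gas schedule.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical model of dictionary-backed memory (not TVM's actual design).
class DictMemory {
public:
    // Read the value at addr; cells that were never written read as 0.
    std::int64_t load(std::uint64_t addr) {
        ++accesses_;
        auto it = cells_.find(addr);
        return it == cells_.end() ? 0 : it->second;
    }

    // Write value at addr, creating the cell if needed.
    void store(std::uint64_t addr, std::int64_t value) {
        ++accesses_;
        cells_[addr] = value;
    }

    // Number of dictionary accesses so far, a crude proxy for gas spent.
    std::uint64_t accesses() const { return accesses_; }

private:
    std::map<std::uint64_t, std::int64_t> cells_;
    std::uint64_t accesses_ = 0;
};

// GEP-like address arithmetic: base + index * element size, all in bytes.
constexpr std::uint64_t elementAddress(std::uint64_t base, std::uint64_t index,
                                       std::uint64_t elemSizeInBytes) {
    return base + index * elemSizeInBytes;
}
```

With every integer type occupying exactly one byte, an array element maps to exactly one dictionary cell, so a load or store never has to touch two cells or mask part of one.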

So regarding concepts of byte, all 5 statements you gave are true for our target. Either due to the specification or because of performance (gas consumption) issues. But if there are architectures that need less from the notion of byte, we should try to figure out the common denominator. It’s probably ok to be less restrictive about a byte.