16-bit bytes support

Hi.

I'm working on a backend for the [DCPU16](https://github.com/techcompliant/TC-Specs/blob/master/CPU/DCPU.md), a fictional CPU. The main subtlety is that its bytes are 16 bits wide instead of 8. There is already a [working backend](https://github.com/krasin/llvm-dcpu16), but it modifies a lot of the LLVM source to support 16-bit words. I am trying to update it to the latest LLVM, but it obviously fails, since the new code assumes 1 byte == 8 bits. Any ideas for a robust way to build such a backend?

Have a good day,
Mikaël

Hi Mikaël!

16-bit bytes were a major pain back in the day, and we never fixed all the known failures. In part, it’s because the C standard really wants 8-bit chars.

> 16-bit byte was a major pain back in the day, and we never fixed all
> known failures. In part, it's because C standard really wants 8-bit chars.

So no real solution?

> Btw, why is DCPU16 still a thing? :slight_smile:

https://github.com/techcompliant/. It's a separate team, unrelated to Mojang, that took up the idea. They are in alpha now.

Also because https://github.com/FrOSt-Foundation/cFrOSt :wink:

> > 16-bit byte was a major pain back in the day, and we never fixed all
> > known failures. In part, it's because C standard really wants 8-bit chars.

> So no real solution?

My memory is cloudy, since that fun was in 2012, but if I remember
correctly, the major mistake I made with that backend was the decision to
pack two 8-bit chars into one 16-bit memory word. It made it impossible to
have pointers to odd chars in a string, and complicated everything. The port
might have been cleaner if we had stored one 8-bit char per 16-bit word. In
that case, half of the memory used for strings is wasted, but some things
would have been easier.
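To make the two layouts concrete, here is a minimal C sketch that models DCPU16 memory as an array of 16-bit words. The function names and the choice of storing the even char in the low half of the word are assumptions made for illustration, not what the old backend actually did.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: memory is an array of 16-bit words, and a plain
 * machine address can only name a whole word. */

/* Packed layout: two 8-bit chars per word. The even char lives in the
 * low half (an assumption). Char i sits in word i / 2, so an odd char
 * has no word address of its own. */
uint8_t packed_load(const uint16_t *mem, size_t char_index) {
    uint16_t word = mem[char_index / 2];
    return (char_index % 2 == 0) ? (uint8_t)(word & 0xFFu)
                                 : (uint8_t)(word >> 8);
}

void packed_store(uint16_t *mem, size_t char_index, uint8_t c) {
    uint16_t *word = &mem[char_index / 2];
    if (char_index % 2 == 0)
        *word = (uint16_t)((*word & 0xFF00u) | c);                  /* replace low half */
    else
        *word = (uint16_t)((*word & 0x00FFu) | ((uint16_t)c << 8)); /* replace high half */
}

/* Unpacked layout: one 8-bit char per word. Char i sits in word i, so
 * every char is directly addressable; the high 8 bits are wasted. */
uint8_t unpacked_load(const uint16_t *mem, size_t char_index) {
    return (uint8_t)(mem[char_index] & 0xFFu);
}

void unpacked_store(uint16_t *mem, size_t char_index, uint8_t c) {
    mem[char_index] = c;
}
```

In the packed version every access needs a shift and mask plus a half selector that a plain word address cannot carry. In the unpacked version a char pointer is just a word address, at the cost of doubling string storage.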

Another issue was pointer arithmetic, and there's no good answer to
that: the fixes were intrusive and non-upstreamable, and they would be
the same if this port were done again.

The real solution would be to modify DCPU16 to be friendlier to C
compilers. One way to achieve that is to make the registers 32-bit and
allow addressing memory at 8-bit boundaries. It's okay to keep the amount
of available RAM low, if it adds fun.

> > Btw, why is DCPU16 still a thing? :slight_smile:

> https://github.com/techcompliant/. It's a separate team not related to
> Mojang which took the idea. They are on alpha now.

Oh, yes, I [heard](https://github.com/llvm-dcpu16/llvm-dcpu16/pull/196)
of them half a year ago. Are they rigid about using the pristine DCPU16?
If not, changes like the ones I mentioned above would make delivering a
decent LLVM backend much easier.

> Also because https://github.com/FrOSt-Foundation/cFrOSt :wink:

Oh!

Note that the C specification doesn’t require that the representation of pointers to different types be the same. This was explicitly included to make this kind of system possible, where int* or long* would be a simple pointer and char* and void* would be a pointer plus the low bit stored in another machine word. You can represent this in LLVM by using two different address spaces to represent pointers to i8 and pointers to anything else.
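As a rough illustration of that representation, here is a small C model of a char pointer that is wider than a word pointer. It only simulates the idea on a host machine. It is not LLVM code, and the type and function names are hypothetical; which half of the word holds the even char is also an assumption.

```c
#include <stdint.h>

/* Hypothetical model: an int* or long* is just a 16-bit word address. */
typedef uint16_t word_ptr;

/* A char* or void* carries the word address plus the low bit, kept in a
 * second machine word. */
typedef struct {
    uint16_t word; /* address of the containing 16-bit word */
    uint16_t half; /* 0 = low 8 bits, 1 = high 8 bits (an assumption) */
} char_ptr;

/* Converting a word pointer to a char pointer selects the first char. */
char_ptr char_ptr_from_word(word_ptr p) {
    char_ptr r = { p, 0 };
    return r;
}

/* Pointer arithmetic works on a linear char index. The result is assumed
 * to stay inside the address space, so the index is never negative. */
char_ptr char_ptr_add(char_ptr p, int32_t n) {
    int32_t linear = (int32_t)p.word * 2 + (int32_t)p.half + n;
    char_ptr r = { (uint16_t)(linear / 2), (uint16_t)(linear % 2) };
    return r;
}

/* Loading through a char pointer picks one half of the addressed word. */
uint8_t char_ptr_load(const uint16_t *mem, char_ptr p) {
    uint16_t w = mem[p.word];
    return p.half ? (uint8_t)(w >> 8) : (uint8_t)(w & 0xFFu);
}
```

With this split, word-sized accesses stay as cheap as before, and only code that goes through char or void pointers pays for the extra word and the shift and mask.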

David