I'm working on defining the instructions and implementing the lowering code for a Z80 backend. For now, the backend supports only the native CPU-supported datatypes, which are 8 and 16 bits wide (i.e. no 32 bit long, float, ... yet).

So far, a lot of the simple stuff like immediate loads and return values has been very straightforward, but now I am stuck on ISD::TRUNCATE. For example:

     typedef unsigned char uint8_t;
     uint8_t Func(uint8_t val1) { return val1 + val1; }

Building this with -O0 results in:

     target datalayout = "e-m:o-S8-p:16:8-p1:8:8-i16:8-i32:8-a:8-n8:16"
     target triple = "z80"
     ; Function Attrs: noinline nounwind optnone
     define dso_local zeroext i8 @Func(i8 zeroext %val1) #0 {
       %val1.addr = alloca i8, align 1
       store i8 %val1, i8* %val1.addr, align 1
       %0 = load i8, i8* %val1.addr, align 1
       %conv = zext i8 %0 to i16
       %1 = load i8, i8* %val1.addr, align 1
       %conv1 = zext i8 %1 to i16
       %add = add nsw i16 %conv, %conv1
       %conv2 = trunc i16 %add to i8
       ret i8 %conv2
     }

I looked into the X86 backend, which has a Z80-like register design: the subregs AL (and AH) can be accessed directly from AX, without any specific truncation operation. But, to be honest, I do not really understand from the code where and how the i16 to i8 case is handled.

So returning an 8-bit result would simply require loading the lower 8 bits ("AL" on X86) of the resulting 16-bit value (%add) into the 8-bit return register, as defined by the calling convention.
(Or, to be Z80-specific: the 16-bit add operation will be "ADD HL,DE", and the calling convention defines register "A" as the i8 return value, so the last two IR lines should emit something like "LD A,L / RET".)
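
To make the expected semantics concrete, here is a small C model of that sequence. It is only a sketch: the RegPair type and the helper names are made up for illustration and have nothing to do with the actual backend code. It mirrors Func from above by zero-extending the operand, adding in 16 bits, and then reading the low byte of the pair as the i8 result.

```c
#include <stdint.h>

/* Hypothetical model of a Z80 16-bit register pair: h is the high byte,
   l is the low byte. Names are for illustration only. */
typedef struct {
    uint8_t h;
    uint8_t l;
} RegPair;

/* Build a register pair from a 16-bit value. */
RegPair pair_from_u16(uint16_t v) {
    RegPair r = { (uint8_t)(v >> 8), (uint8_t)(v & 0xFF) };
    return r;
}

/* Read the pair back as a 16-bit value. */
uint16_t pair_to_u16(RegPair r) {
    return (uint16_t)(((uint16_t)r.h << 8) | r.l);
}

/* Models "ADD HL,DE": 16-bit add with wraparound. */
RegPair add_hl_de(RegPair hl, RegPair de) {
    return pair_from_u16((uint16_t)(pair_to_u16(hl) + pair_to_u16(de)));
}

/* Models Func: zext both operands, add as i16, then take the low
   byte of the result, which is what "LD A,L" would do. */
uint8_t func_model(uint8_t val1) {
    RegPair hl = pair_from_u16(val1);   /* zext i8 -> i16 */
    RegPair de = pair_from_u16(val1);   /* zext i8 -> i16 */
    RegPair sum = add_hl_de(hl, de);    /* add i16 */
    return sum.l;                       /* trunc i16 -> i8 == read L */
}
```

In this model the truncation costs nothing beyond the register read, which is the behaviour I would like the backend to produce.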

That said, what is the correct way to implement ISD::TRUNCATE in the backend, making use of the fact that on this CPU truncating i16 to i8 is simply accessing a subreg of an i16 register?

Should this be handled in "LowerOperation" or in "PerformDAGCombine"?
Or could this be done with a target-independent combine?
Would returning true in "isTruncateFree" suffice?
Is any lowering code needed at all?

The X86 backend seems to do both: it calls "setTargetDAGCombine(ISD::TRUNCATE)", but it also registers a lot of MVTs via "setOperationAction(...,Custom)", depending on things like soft-float.
I guess I'm

And second:
In my case, with only i16 and i8 data types, are there other truncation operations that need to be supported? Is there any scenario where i8 to i1 is needed? My first guess was conditional branching, but my tests showed that it works with flags, comparing "not equal" or "not zero", so I assume not.
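
As a side note on the i1 question: a truncation to i1 and a "not zero" test are different operations, so one cannot stand in for the other. A minimal C sketch of the two (the function names are made up for illustration):

```c
#include <stdint.h>

/* What an IR-level "trunc i8 to i1" computes: keep bit 0, drop the rest. */
uint8_t trunc_to_i1(uint8_t v) {
    return (uint8_t)(v & 1u);
}

/* What a "not zero" comparison computes: 1 for any non-zero input. */
uint8_t is_nonzero(uint8_t v) {
    return (uint8_t)(v != 0);
}
```

The two agree only for inputs 0 and 1. So if a trunc to i1 ever does show up, it would presumably need its own handling rather than reusing the compare path.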


The X86 i16->i8 case is handled with these two patterns in X86InstrCompiler.td. The first is for 32-bit mode, where we have to be careful to ensure we are starting from AX/BX/CX/DX, since only those registers have an addressable low byte there. 64-bit mode uses a separate, simpler pattern, since SP/BP/SI/DI gain SPL/BPL/SIL/DIL in 64-bit mode.

     def : Pat<(i8 (trunc GR16:$src)),

     def : Pat<(i8 (trunc GR16:$src)),
               (EXTRACT_SUBREG GR16:$src, sub_8bit)>,

Ah, I see... Clever, no custom code required.

I was hoping for that, but wasn't sure, looking at the X86 code.

Thanks, Craig