Learning GlobalISel by implementing a backend for fun - a couple of problems to start off with

Good day all,

I’m learning the internals of LLVM IR, and GlobalISel in particular, by implementing a backend for a long-obsolete 8-bit processor.

Question about GEP:
If I compile the following C function to LLVM IR:

char foo(char *p, char i) {
	return p[i];
}

I get IR that looks like this (I’ve edited it a bit from the output of another backend):

define dso_local signext i8 @foo(i8* nocapture readonly %0, i8 signext %1) local_unnamed_addr #0 {
  %3 = sext i8 %1 to i16
  %4 = getelementptr inbounds i8, i8* %0, i16 %3
  %5 = load i8, i8* %4, align 1, !tbaa !2
  ret i8 %5
}

That “sext” bothers me. My CPU is perfectly capable of adding an 8-bit quantity to a 16-bit pointer without it, yet I can’t get rid of it. I guess a later Combine pass in GlobalISel could fix it, but I’d rather not have it in the first place if possible. All backends seem to do it by default, so I’m guessing it’s somewhat hard-coded?


(Other questions to follow)

The extension is semantically what’s happening with 16-bit pointers: the 8-bit index has to be widened to the pointer’s index width before it is added. The IR isn’t intended to directly look like any given target’s addressing modes. You can fold the extension away as an optimization when selecting your memory instructions.
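To make the semantic point concrete, here is a minimal C sketch of the address arithmetic, assuming a flat 16-bit address space. The helper names addr_sext and addr_zext are hypothetical and only model the computation; they are not part of LLVM or of any backend.

```c
#include <stdint.h>

/* Hypothetical model: 16-bit base address plus an 8-bit index that is
 * sign-extended to 16 bits first. This is what the sext + GEP pair in
 * the IR above expresses. */
uint16_t addr_sext(uint16_t base, int8_t index) {
    return (uint16_t)(base + (uint16_t)(int16_t)index);
}

/* The same addition, but with the index zero-extended instead.
 * It agrees with addr_sext only when the index is non-negative. */
uint16_t addr_zext(uint16_t base, int8_t index) {
    return (uint16_t)(base + (uint16_t)(uint8_t)index);
}
```

With a base of 0x1000 and an index of -1, addr_sext gives 0x0FFF while addr_zext gives 0x10FF, so the kind of extension is part of the meaning and the IR has to state it. If your CPU’s 8-bit-plus-16-bit add treats the 8-bit operand as signed, that instruction is effectively doing the sext itself, which is why it can be folded during selection.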

Thanks! Slightly surprising, but I can live with that!