[RFC] Proposal to make LLVM-IR endian agnostic

One of the projects I am working on with others is to make LLVM-IR endian agnostic.

So, I am sending out this proposal for feedback to the LLVM community. I’ve attached a pretty version of the proposal in PDF format and pasted an 80-column-safe text version below.

I’m looking forward to comments and feedback.

Thanks,

Micah Villmow

Text of Proposal:

RFC-SPIR.pdf (263 KB)

FWIW here is another way to do it (which is approximately what ClamAV does currently) by introducing just one intrinsic:
declare i1 @llvm.is_bigendian()

The advantage is that you can implement htonl()- and ntohl()-like functionality without using a temporary memory location.
Actually, I think having both the two intrinsics you suggest and the is_bigendian() intrinsic would be optimal:
you can use your two intrinsics for initial codegen, and mem2reg can transform them into code that uses is_bigendian().

For load/store:
<type> %val = load <type>* %ptr
<type> %sval = bswap.i<type> %val
%result = <type> select @llvm.is_bigendian(), %val, %sval

For htonl():
<type> %sval = bswap.i<type> %val
%result = <type> select @llvm.is_bigendian(), %val, %sval

(store is similar, byteswap before the store)

At bytecode JIT time / assembly emission time @llvm.is_bigendian() is a known constant, and constant propagation is
used to throw away the unwanted code path, so it becomes either:

<type> %result = load <type>* %ptr

or

<type> %val = load <type>* %ptr
<type> %result = bswap.i<type> %val
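For readers who prefer C, the select-then-fold idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `to_big_endian32` and `bswap32` are hypothetical names, and the `is_bigendian` parameter stands in for the value that @llvm.is_bigendian() would be known to have at JIT or emission time.

```c
#include <stdint.h>

/* Byte swap of a 32-bit value, the C counterpart of a 32-bit bswap. */
static uint32_t bswap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* htonl()-like conversion written as a select between val and its swapped
   form. is_bigendian is an assumed stand-in for the intrinsic's result;
   once it is a known constant, one arm of the select is dead code. */
static uint32_t to_big_endian32(uint32_t val, int is_bigendian) {
    uint32_t sval = bswap32(val);
    return is_bigendian ? val : sval;
}
```

With the flag fixed to 1 the function reduces to returning `val` unchanged, and with the flag fixed to 0 it reduces to a plain byte swap, which matches the two folded forms listed above.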

Best regards,
--Edwin

Hi Micah,

I’m no core developer, but FWIW here are my thoughts:

In general I think the patch is too OpenCL-oriented, and I have some niggling qualms about other parts. Specifically (comments inline):

Hi Edwin,

> FWIW here is another way to do it (which is approximately what ClamAV does currently) by introducing just one intrinsic:
> declare i1 @llvm.is_bigendian()

Why is an intrinsic needed? It is easy to write a small LLVM IR function
that computes this. For example:

define i1 @is_big_endian() {
   %ip = alloca i16
   store i16 1, i16* %ip
   %cp = bitcast i16* %ip to i8*
   %c = load i8* %cp
   %r = icmp eq i8 %c, 0
   ret i1 %r
}
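The same probe can be written in C, which may make the trick easier to follow. This is a rough sketch rather than a translation of any existing code: it stores a 16-bit 1 and inspects the byte at the lowest address, using `memcpy` in place of the pointer bitcast.

```c
#include <stdint.h>
#include <string.h>

/* C counterpart of the IR function above. */
static int is_big_endian(void) {
    uint16_t probe = 1;          /* store i16 1 into a stack slot */
    unsigned char first;
    memcpy(&first, &probe, 1);   /* read the byte at the lowest address */
    return first == 0;           /* zero there means the high byte comes first */
}
```

A compiler can usually fold this to a constant for a fixed target, just as the optimizer can for the IR version.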

Ciao, Duncan.

Hi Edwin,

> FWIW here is another way to do it (which is approximately what ClamAV does currently) by introducing just one intrinsic:
> declare i1 @llvm.is_bigendian()
>
> Why is an intrinsic needed?

You are right, it's not.

> It is easy to write a small LLVM IR function
> that computes this. For example:
>
> define i1 @is_big_endian() {
>    %ip = alloca i16
>    store i16 1, i16* %ip
>    %cp = bitcast i16* %ip to i8*
>    %c = load i8* %cp
>    %r = icmp eq i8 %c, 0
>    ret i1 %r
> }

Indeed that can be optimized away, but it needs inlining to be run to make the callers completely go away.

Best regards,
--Edwin

From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu]
On Behalf Of Török Edwin
Sent: Monday, October 03, 2011 2:00 PM
To: llvmdev@cs.uiuc.edu
Subject: Re: [LLVMdev] [RFC] Proposal to make LLVM-IR endian agnostic

> One of the projects I am working on with others is to make LLVM-IR
> endian agnostic.
>
> So, I am sending out this proposal for feedback to the LLVM
> community. I've attached a pretty version of the proposal in PDF
> format and pasted an 80-column-safe text version below.
>
> A second smaller set could be:
>
> declare <type> @llvm.portable.load.<type>(<type>* ptr, i32 alignment,
>     i1 host, i1 littleEndian, i1 atomic, i1 volatile,
>     i1 nontemporal, i1 singlethread)
>
> declare void @llvm.portable.store.<type>(<type> data, <type>* ptr,
>     i32 alignment, i1 host, i1 littleEndian, i1 atomic, i1 volatile,
>     i1 nontemporal, i1 singlethread)

FWIW here is another way to do it (which is approximately what ClamAV
does currently) by introducing just one intrinsic:
declare i1 @llvm.is_bigendian()

[Villmow, Micah] I think the big difference in our requirements is that we can have both big-endian (host) and little-endian (device) accesses, or vice versa, to the same pointer. A single global is_bigendian intrinsic would therefore not work for what we are attempting to accomplish.
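
To make the per-access requirement concrete, here is a small C sketch. The helper names `load32` and `store32` are hypothetical, and the `little_endian` flag only loosely mirrors the littleEndian argument of the proposed intrinsics; the other flags (host, atomic, volatile, and so on) are left out.

```c
#include <stdint.h>

/* Hypothetical helper: load 32 bits from p with the byte order chosen per
   access, independent of the byte order of the machine running the code. */
static uint32_t load32(const unsigned char *p, int little_endian) {
    if (little_endian)
        return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical helper: store 32 bits to p in the requested byte order. */
static void store32(unsigned char *p, uint32_t v, int little_endian) {
    for (int i = 0; i < 4; ++i) {
        int shift = little_endian ? 8 * i : 8 * (3 - i);
        p[i] = (unsigned char)(v >> shift);
    }
}
```

Two accesses to the same pointer can then disagree on byte order, which a single program-wide constant cannot express.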

________________________________
From: "Villmow, Micah" <Micah.Villmow@amd.com>
To: "llvmdev@cs.uiuc.edu" <llvmdev@cs.uiuc.edu>
Sent: Monday, October 3, 2011 1:36 PM
Subject: [LLVMdev] [RFC] Proposal to make LLVM-IR endian agnostic

--snip--

Hello Micah,

Without having read deeply into your plan, I'd like to make a few suggestions. Some game systems use mixed-endian data layouts as a form of lockout against homebrew software. While I believe it isn't a terribly effective mechanism, it does leave LLVM unable to be used for such game systems. I think LLVM should allow some sort of swizzle mechanism to support such mixed-endian data layouts. (I think swizzle is the correct term.)
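
As a rough illustration of what a swizzle could mean here, the C sketch below permutes the bytes of a value according to a table. Both `swizzle_bytes` and the idea of a per-target permutation table are assumptions made for this example, not an existing LLVM feature or the layout of any particular system.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical swizzle: out[i] = in[perm[i]] for an n-byte value.
   perm is an assumed per-target table; out and in must not overlap. */
static void swizzle_bytes(unsigned char *out, const unsigned char *in,
                          const unsigned char *perm, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = in[perm[i]];
}
```

For example, the table {1, 0, 3, 2} swaps the bytes within each 16-bit half of a 32-bit value, a layout that is neither plain big-endian nor plain little-endian.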

Also, as a co-developer of Clang's AROS backend, it would be really handy to have an endian-agnostic bitcode format, since our OS covers about 5 different CPU architectures, some of which are big-endian. We were hoping to build an endian-agnostic superset of the ELF loader based on PNaCl's bitcode format.

Thanks for taking this challenge on,

--Samuel Crow

I wonder if this could be handled by specifying that certain address spaces
have one or another endianness, which is not necessarily the same as the
processor endianness.

Your main requirement seems to be that you need access to banks of
memory with different endianness, and that you want the first-stage IR to be
able to run on a processor of either endianness, without change. I would
assume that any given pointer points either to host memory, device
memory, or private memory, and that these pointers never get mixed. This
seems an ideal use of memory spaces.

  Tom

From: Tom Prince [mailto:tom.prince@ualberta.net]
Sent: Tuesday, October 04, 2011 10:59 AM
To: Villmow, Micah; James Molloy; llvmdev@cs.uiuc.edu
Subject: Re: [LLVMdev] [RFC] Proposal to make LLVM-IR endian agnostic

> I wonder if this could be handled by specifying that certain address spaces
> have one or another endianness, which is not necessarily the same as the
> processor endianness.
>
> Your main requirement seems to be that you need access to banks of
> memory with different endianness, and that you want the first-stage IR to be
> able to run on a processor of either endianness, without change. I would
> assume that any given pointer points either to host memory, device
> memory, or private memory, and that these pointers never get mixed. This
> seems an ideal use of memory spaces.

[Villmow, Micah] This was brought up, but it breaks down when you have
memory operations of different endianness on the same pointer. For example,
the program wants a big-endian load from offset 0x100 of pointer 'a'
and a little-endian load from offset 0x100 of pointer 'a' in the global
address space. Since LLVM correctly stores the address space in the type,
this would require multiple types for each pointer, duplicating information
and requiring casts between address spaces (which is illegal). The assumption
that device and host memory are unique is not always valid, as they are
both under the more general 'global' address space. Since the compiler does
not know which part of global memory maps to host and which part to device
unless the programmer makes it explicit, there is a need to specify it in the
instruction.
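
A loose C analogy may help show the problem. In the sketch below, the hypothetical types `be_ptr` and `le_ptr` carry the byte order in the pointer type, roughly the way an endianness-tagged address space would. Reading the same location both ways then needs a conversion between the two types, which is the counterpart of the address-space cast mentioned above.

```c
#include <stdint.h>

/* Hypothetical pointer types that encode the byte order in the type. */
typedef struct { const unsigned char *p; } be_ptr;
typedef struct { const unsigned char *p; } le_ptr;

/* Big-endian 32-bit load at byte offset off. */
static uint32_t load_be(be_ptr a, unsigned off) {
    const unsigned char *q = a.p + off;
    return ((uint32_t)q[0] << 24) | ((uint32_t)q[1] << 16) |
           ((uint32_t)q[2] << 8) | (uint32_t)q[3];
}

/* Little-endian 32-bit load at byte offset off. */
static uint32_t load_le(le_ptr a, unsigned off) {
    const unsigned char *q = a.p + off;
    return (uint32_t)q[0] | ((uint32_t)q[1] << 8) |
           ((uint32_t)q[2] << 16) | ((uint32_t)q[3] << 24);
}

/* The conversion that the type-based scheme forces on the program. */
static le_ptr be_to_le(be_ptr a) {
    le_ptr r = { a.p };
    return r;
}
```

Putting the byte order on the access instead of on the pointer type avoids both the duplicated pointer types and the conversion.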