RFC: Multiple program address spaces

Fixing llvm-dev@llvm.org to llvm-dev@lists.llvm.org

Apologies everyone for my initial email address mistake. Here's the
original RFC email. Thanks to Thomas for pointing out my error and
forwarding it to the correct address.

Hello all,

TL;DR: The current design for implementing reference types in the
WebAssembly backend requires multiple program address spaces. We
propose an implementation of multiple program address spaces in
D91428 [1]; this is a backwards-compatible change.

# Problem

Currently the default program address space and the default data address
space are the same, namely AS0. At the moment, we can change the program
address space with P<n> in the data layout string. This allows Harvard
architectures to separate code and data into different address spaces.
However, only one program address space is allowed.

At Igalia [2], we are interested in implementing support for reference
types [3] on the WebAssembly backend. After discussions with Thomas
Lively and Andy Wingo, the design we have for reference types involves
having funcrefs and externrefs live in a different (non-integral)
address space from normal code/data. Since funcrefs are callable, we
also need to be able to call them. However, as things stand, if we use
`P1` in the data layout, normal function calls will cease to work.

# Design Summary

The reference types implementation introduces two (reference) types:
funcrefs and externrefs.
Funcrefs are references to functions that can be called, while
externrefs are opaque. Because of the way they interact with memory,
they need to live in a separate address space. However, to call funcrefs
that live in a separate address space, that address space needs to be
marked as a program address space. Since we want normal function calls
to keep working as well, AS0 needs to be a program address space too.
This is the problem that this RFC addresses. The data layout string for
WebAssembly would therefore contain ni:1-P0-P1.

# Solution

As mentioned in the TL;DR, the proposed implementation is up in D91428
[1]. The patch is small and backwards compatible: there should be no
visible changes unless you use multiple program address spaces.

With the patch, you should be able to use multiple P<n> specifications
(Px-Py-...) to mark several address spaces as program address spaces.
The first one will be considered the default program address space. So,
the order in which the Ps show up in the data layout string _matters_.

The program address spaces are kept in a small vector without
duplicates. Therefore P0-P1-P0 is the same as P0-P1, and in both cases
P0 is the default program address space.
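
The ordering and de-duplication rules above can be illustrated with a
small standalone sketch. This is not the code from D91428; the helper
name `parseProgramAddressSpaces` is hypothetical, it uses `std::vector`
rather than LLVM's small vector, and falling back to AS0 when no P<n> is
present is an assumption that mirrors the current default.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from D91428): collect the address spaces named
// by P<n> specifications in a '-'-separated data layout string, in order
// of first appearance and without duplicates. The first element is the
// default program address space.
std::vector<unsigned> parseProgramAddressSpaces(const std::string &Layout) {
  std::vector<unsigned> Spaces;
  std::size_t Pos = 0;
  while (Pos <= Layout.size()) {
    std::size_t End = Layout.find('-', Pos);
    if (End == std::string::npos)
      End = Layout.size();
    std::string Spec = Layout.substr(Pos, End - Pos);
    Pos = End + 1;

    // Only specifications of the form P<digits> matter here; lowercase
    // 'p' (pointer layout) and everything else is skipped.
    if (Spec.size() < 2 || Spec.size() > 9 || Spec[0] != 'P')
      continue;
    bool AllDigits = std::all_of(
        Spec.begin() + 1, Spec.end(),
        [](unsigned char C) { return std::isdigit(C) != 0; });
    if (!AllDigits)
      continue;

    unsigned AS = static_cast<unsigned>(std::stoul(Spec.substr(1)));
    // Keep the first occurrence only, so P0-P1-P0 yields {0, 1}.
    if (std::find(Spaces.begin(), Spaces.end(), AS) == Spaces.end())
      Spaces.push_back(AS);
  }
  // Assumption: with no P<n> at all, AS0 is the only program address space.
  if (Spaces.empty())
    Spaces.push_back(0);
  return Spaces;
}
```

Under these assumptions, a layout containing ni:1-P0-P1 yields the
program address spaces {0, 1} with AS0 as the default, whereas P1-P0
yields the same set with AS1 as the default.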

# Refs

[1] https://reviews.llvm.org/D91428
[2] https://www.igalia.com
[3] https://webassembly.github.io/reference-types/core/

Regards,