Building a stable bitcode format for PNaCl - based on LLVM IR

Hello,

[The first paragraph is safe to skip if you already know what PNaCl is.]
The Portable Native Client (PNaCl) project is a toolchain for producing portable bitcode from C and C++ code and running it securely and efficiently on the web via Native Client. For more details, see this presentation from the last Google I/O: https://developers.google.com/events/io/sessions/325679543 and http://www.chromium.org/nativeclient/pnacl/building-and-testing-portable-native-client

PNaCl uses a subset of LLVM IR as its bitcode. Our goal is a single bitcode file that can be “translated” on a target machine to a sandboxed native executable for the target architecture and executed. This presents a number of challenges with architecture independence and backwards compatibility.

This is a document we’ve been using internally to coordinate the effort to simplify LLVM IR to the level where it’s suitable to serve as a portable, backwards compatible bitcode. After a general introduction, it presents concrete steps the PNaCl toolchain performs to simplify LLVM IR, with some discussion of their pros/cons. This is based on a few years of observing changes in LLVM IR and their meaning for PNaCl.

We’ve considered the points made by Dan in his “LLVM IR is a compiler IR” post (http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-October/043719.html) and have also discussed this with him a couple of times since then. We believe that the changes described by the attached document, together with other PNaCl-specific characteristics, make the chosen subset suitable for the task.

Any comments and questions are very welcome. Our goal in publishing this is to facilitate an open discussion, as well as serve as a reference point for other projects interested in using LLVM IR for portable and/or stable use cases. Note that the document is a work-in-progress and some details may change. We intend to publish a more structured reference for PNaCl bitcode at some point in the future.

The document is available as a PDF here:
https://docs.google.com/a/chromium.org/viewer?a=v&pid=sites&srcid=Y2hyb21pdW0ub3JnfGRldnxneDo0OWYwZjVkYWFjOWNjODE1

And in text here:
https://sites.google.com/a/chromium.org/dev/nativeclient/pnacl/stability-of-the-pnacl-bitcode-abi

Eli

Instead of a blacklist, why not a whitelist? Given the size of LangRef, you’re bound to leave something out of your blacklist that needs to be there (also, future additions to LLVM IR will need to be added to the blacklist; are you sure you can catch all of them?). A whitelist seems much less prone to breakage or unexpected behavior.

-- Sean Silva

Hi Sean,

Which blacklist are you referring to? In all places where we specifically
allow or disallow certain constructs (such as specific instructions,
intrinsics, linkage modes and so on) we use a whitelisting strategy.

Eli

What I'm saying is that the approach to defining the format seems to be
basically "the format is LLVM IR, except ...". The "except ..." is
effectively a blacklist. You are starting with LLVM IR and then removing
(i.e. blacklisting) certain aspects.

-- Sean Silva

I just think it's a more useful discussion format for people knowledgeable
about LLVM. Dumping a huge LangRef-like reference manual on people is less
discussion-friendly :) As I've mentioned, in reality (= code), the
approach is whitelisting so we shouldn't miss things that get added in
future LLVMs.
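
To make the difference concrete, here is a minimal sketch of the two strategies. Everything in it is an assumption for illustration: the opcode sets are made up (they are not PNaCl's actual lists), and "futureop" is a hypothetical instruction standing in for something a later LLVM release might add.

```python
# Hypothetical whitelist for illustration; not PNaCl's actual opcode set.
ALLOWED_OPCODES = frozenset(
    {"add", "sub", "load", "store", "br", "ret", "phi", "call"}
)

# Hypothetical blacklist, shown only for contrast.
BANNED_OPCODES = frozenset({"va_arg", "indirectbr"})


def rejected_by_whitelist(opcodes):
    """Return opcodes that are not explicitly allowed, in first-seen order."""
    rejected = []
    for op in opcodes:
        if op not in ALLOWED_OPCODES and op not in rejected:
            rejected.append(op)
    return rejected


def rejected_by_blacklist(opcodes):
    """Return opcodes that are explicitly banned, in first-seen order."""
    rejected = []
    for op in opcodes:
        if op in BANNED_OPCODES and op not in rejected:
            rejected.append(op)
    return rejected


# "futureop" stands in for an instruction added by some later LLVM release.
module = ["load", "add", "futureop", "store", "ret"]
print(rejected_by_whitelist(module))
print(rejected_by_blacklist(module))
```

The whitelist check flags "futureop" because nobody allowed it, while the blacklist check lets it through because nobody thought to ban it.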

Eli

Well, PNaCl is defining a subset of LLVM IR. Language subsets can be
defined by whitelisting or blacklisting features. Defining a subset of a
language doesn't mean we're inherently doing blacklisting in PNaCl.

We are deferring to LLVM's definition of the language semantics for the
features that PNaCl whitelists, and I think that's what you're referring
to. For example, we currently don't define what "phi" instructions mean;
we rely on LLVM's definition. If this became a problem, we could write
down our own definition to make PNaCl's language better defined. For
example, we could specify whether a phi node requires 1 or 2 entries for a
basic block B if there are 2 incoming edges from B. I think the LLVM
Language Reference currently does not specify this.

Cheers,
Mark

Is it possible to use the PNaCl infrastructure (i.e. translation and execution in a sandbox) without Chrome?

I mean something like a standalone VM, as with Java or Mono/C#.

Dmitri

I must not have seen that comment. Cool.

-- Sean Silva

Yes. The NaCl tool 'sel_ldr' will run a program inside a sandbox outside
of the web browser. We do a lot of the testing of PNaCl this way.

Cheers,
Mark

Mark Seaborn wrote:

For example, we could specify whether a phi node requires 1 or 2 entries
for a basic block B if there are 2 incoming edges from B. I think the
LLVM Language Reference currently does not specify this.

Going on a tangent, the answer is that it must have 2 entries, one for each incoming edge, and the verifier checks this. Feel free to add text to the LangRef.
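
For readers who want the rule spelled out, here is a small model of it: one phi entry per incoming edge, so two edges from B mean two entries for B. This is a simplified sketch and not the verifier's code. The function name and the data representation are hypothetical, and the requirement that repeated entries for the same block carry the same value is my understanding of the verifier's behavior rather than something stated in this thread.

```python
from collections import Counter


def phi_is_well_formed(pred_edges, incoming):
    """Check a phi node against the edges entering its basic block.

    pred_edges: predecessor block names, one item per incoming CFG edge
                (a block appears twice if it has two edges into this block).
    incoming:   (value, block) pairs listed on the phi node.
    """
    # One entry per edge: the multisets of block names must match.
    if Counter(pred_edges) != Counter(block for _, block in incoming):
        return False
    # Entries that name the same block must agree on the incoming value.
    value_for = {}
    for value, block in incoming:
        if value_for.setdefault(block, value) != value:
            return False
    return True


# Block B reaches the phi's block through two edges (e.g. both arms of a
# conditional branch target the same successor).
edges = ["B", "B"]
print(phi_is_well_formed(edges, [("%x", "B"), ("%x", "B")]))
print(phi_is_well_formed(edges, [("%x", "B")]))
```

The first call models the accepted form with two entries for B; the second models the single-entry form that the rule rejects.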

Nick

From the provided documentation I understood that the in-memory data structures of a PNaCl program are incompatible with those of the host program because the ABIs differ (e.g. PNaCl pointers are always 32-bit, even when running on an x86_64 platform).
So a PNaCl program can't access any data structures of the host program directly. The only way to communicate is through syscalls, but the document does not specify syscalls in detail.
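
A quick way to see the incompatibility is to compare the byte layout of the same record under both pointer widths. The sketch below is only an illustration: the record (a pointer followed by a 32-bit integer) is hypothetical, little-endian byte order is assumed, and the 4 bytes of trailing padding model a typical x86-64 C layout rather than anything PNaCl specifies.

```python
import struct

# Hypothetical record: struct node { struct node *next; int32_t value; };
SANDBOX_LAYOUT = "<Ii"      # 32-bit pointer + int32, as seen inside the sandbox
HOST_LP64_LAYOUT = "<Qi4x"  # 64-bit pointer + int32 + 4 padding bytes (assumed host layout)


def layout_sizes():
    """Return (sandbox_size, host_size) in bytes for the record above."""
    return struct.calcsize(SANDBOX_LAYOUT), struct.calcsize(HOST_LP64_LAYOUT)


# Bytes written with the sandbox layout only decode correctly with that layout.
raw = struct.pack(SANDBOX_LAYOUT, 0x1000, 42)
next_ptr, value = struct.unpack(SANDBOX_LAYOUT, raw)
print(layout_sizes(), hex(next_ptr), value)
```

Because the two layouts differ in both size and field offsets, a host that wants to read such a record has to decode it explicitly instead of casting a pointer to its own struct type.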

How are syscalls represented in LLVM IR? What kinds of data structures can be passed?

Best,

Dmitri

We should probably clarify this in the final documentation, but the goal of
PNaCl is to be not only portable and fast but also safe for the user, and
this is achieved through NaCl's software fault isolation (SFI).
Specifically for syscalls:
  http://www.chromium.org/nativeclient/reference/anatomy-of-a-sys

In a way the syscalls offered are defined by the embedding sandbox: NaCl
through Chrome and sel_ldr have documented interfaces, and the NaCl SDK
offers POSIX-like interfaces built on top of these.

But this discussion is about a stable bitcode format. Or do you want to restrict the set of syscalls at the LLVM level?
For my project I am interested in a stable, portable bitcode format like the one you propose, but not as part of a browser, and I may need an extended set of syscalls. The documentation does not give an example of how a syscall is represented in LLVM bitcode. Is it just a function call?

Best,

Dmitri

Yes - it's just a function call. The system calls are the same for NaCl and
PNaCl - they are part of the NaCl ABI but not part of the portable bitcode
language.

Eli

Yes, any calls to interact with the world outside the sandbox are just done
via function calls. The PNaCl program defines an entry point, _start(),
which gets passed a data structure containing a pointer to an interface
query function. The user code can call this query function to get function
pointers for further interfaces such as write(), mmap(), thread_create(),
and others. These interfaces are defined in
https://src.chromium.org/viewvc/native_client/trunk/src/native_client/src/untrusted/irt/irt.h?revision=11525.
This interface layer is already used by NaCl, so it's somewhat orthogonal
to PNaCl's subset of LLVM IR.
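
The flow described above (entry point receives a query function, then asks it for further interfaces) can be modeled in a few lines. This is a sketch of the pattern only, not the real interface: the actual definitions are C declarations in irt.h, and the interface name "example-fdio-0.1", the dictionary-based tables, and the helper names below are all hypothetical.

```python
import io


def make_query(out):
    """Build a stand-in for the host's interface query function."""

    def write(data: bytes) -> int:
        # Stand-in for a write()-style call provided by the host.
        out.write(data)
        return len(data)

    # Hypothetical interface name mapped to a table of callables.
    interfaces = {"example-fdio-0.1": {"write": write}}

    def query(name):
        # Unknown interfaces yield None so the program can detect missing support.
        return interfaces.get(name)

    return query


def start(startup_info):
    """Model of the program entry point: all host access goes through query."""
    query = startup_info["query"]
    fdio = query("example-fdio-0.1")
    if fdio is None:
        return 1
    fdio["write"](b"hello from the sandbox\n")
    return 0


buf = io.BytesIO()
exit_code = start({"query": make_query(buf)})
print(exit_code, buf.getvalue())
```

From the bitcode's point of view each of these is an ordinary indirect function call, which is why the interface layer does not need to appear in the bitcode language itself. A different embedder could answer the same query with a different or larger set of interfaces.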

Cheers,
Mark

First, sorry for jumping into an old thread. I had this marked but had
not been able to read it before.

The document talks about ABI stability, but it is a bit unclear how
far this goes. From older discussions, there are/were 3 interesting
cases:

1. Loading a standalone application (pexe) in a new VM/browser.
2. Using a native library compiled with SDK N in an IR application
   compiled with SDK M.
3. Using an IR library compiled with SDK N in an IR application compiled
   with SDK M.

The document discusses 1. Items 2 and 3 are probably harder, but not
mentioned. Is there any support for user provided dynamic libraries?

Cheers,
Rafael

Hi Rafael,

In the first release, there is no support for user-provided dynamic
libraries. The user is expected to compile her whole application statically
into a single .pexe and distribute that. We do plan to add "dynamic"
libraries in future releases, and we tried to construct the ABI spec in a
way that would not interfere with these plans.

Eli