Proposal: Adding aligned instruction bundle support to MC

Hello,

We (the Portable Native Client team) would like to start upstreaming
our LLVM modifications which contain support for Software Fault
Isolation (SFI) as required for sandboxing programs to run under
Native Client. Since the "total patch size" is quite big, we are
splitting the effort to manageable chunks that can be committed,
tested and reviewed separately as independently as feasible.

One of the first things we'd like to start with is support for aligned
instruction bundles in MC (assembler) level. This support exists in
gas since binutils version 2.23
(Bundle directives (Using as)).
Succinctly, the initial proposal is to add the following directives:

.bundle_align_mode <num>
.bundle_lock
.bundle_unlock

With the following semantics:

When aligned instruction bundle mode ("bundling" in short) is enabled
(.bundle_align_mode was encountered with an argument > 0, which is the
power of 2 to which the bundle size is equal), single
instructions and groups of instructions between .bundle_lock and
.bundle_unlock directives cannot cross a bundle boundary.

For example, consider the following:

.bundle_align_mode 4
mov1
mov2
mov3

Assuming that each of the mov instructions is 7 bytes long and mov1 is
aligned to a 16-byte boundary, two bytes of NOP padding will be
inserted between mov2 and mov3 to make sure that mov3 does not cross a
16-byte bundle boundary.

A slightly modified example:

.bundle_align_mode 4
mov1
.bundle_lock
mov2
mov3
.bundle_unlock

Here, since the bundle-locked sequence "mov2 mov3" cannot cross a
bundle boundary, 9 bytes of NOP padding will be inserted between mov1
and mov2.

For information on how this ability is used for software fault
isolation by Native Client, see the following resources:

* [native_client] Contents of /data/site/NaCl_SFI.pdf [PDF link]
* Overview of Native Client for ARM
* Other papers listed at
Research Papers

We want to start with this feature because it is self-contained in MC,
makes LLVM more compatible with gas, level and can be easily tested.
Future upstreaming efforts will be able to build upon this
functionality.

Any comments, ideas and suggestions are welcome.

Eli

Hi Eli,

we aresplitting the effort to manageable chunks that can
be committed, tested and reviewed separately as
independently as feasible

I’d be interested in experimenting with this stuff sooner than later. Could you dump all the patches in a publicly accessible repo? How about Github?

Thanks,
Greg

Hi Greg,
Our repos are public already, hosted on the Chromium infrastructure.
Our builds are based out of the Native Client repository. The nacl git
mirror is at chromium Git repositories - Git at Google
(no gitweb, sorry). That repo has pointers to a bunch of other repos
which make pnacl (LLVM, clang, binutils, and a few others. see the
file pnacl/DEPS for more).
But if you just want to see the code and the diff against upstream,
our LLVM repo is in
https://gerrit.chromium.org/gerrit/gitweb?p=native_client%2Fpnacl-llvm.git;a=shortlog;h=refs%2Fheads%2Fmaster
All of our clang changes are upstream already.

-Derek

Just a note that I'm interested in using this feature for a control-flow integrity implementation that I'm working on as part of my research.

-- John T.

But if you just want to see the code and the diff against upstream…

If others are interested as well, here’s all your changes via Github’s “compare” feature:

https://github.com/garious/llvm/compare/master…chromium

Fascinating stuff guys. What a great contribution.

-Greg

Thanks for the feedback, and any help reviewing future patches will be
most appreciated :slight_smile:
It's also important to state that we plan to significantly clean-up
and refactor our code prior to upstreaming. Since in some areas the
changes are significant, this also means some refactoring of existing
LLVM code. My recent patches and commits in MC are the beginning of
this effort.

Eli

Maybe I’m jumping the gun, but some initial thoughts after browsing the MC layer changes:

  • .bundle_align_start → .bundle_align_mode N ?

  • .bundle_align_end → .bundle_align_mode 0 ?

  • Add unit test showing .bundle_lock/unlock can be nested.

I like the way you modified the existing ARM tests to show how the existing instructions are predicated. Overall, I think the concept of Native Client is well-documented and easy to follow. Has anyone done work to show that assembly in this form can be disassembled to a memory-safe assembly language like TAL?

groups of instructions between .bundle_lock and
.bundle_unlock directives cannot cross a bundle boundary

Can this be relaxed to: A data instruction cannot fall on a bundle boundary. ?

Thanks,
Greg

Maybe I'm jumping the gun, but some initial thoughts after browsing the MC
layer changes:

Greg, it's great to have early comments on this.

* .bundle_align_start -> .bundle_align_mode N ?
* .bundle_align_end -> .bundle_align_mode 0 ?

The initial proposal does not cover .bundle_align{start|end} on
purpose, to keep things simple. We do plan to add them, eventually,
probably as attributes on .bundle_lock.

* Add unit test showing .bundle_lock/unlock can be nested.

Yes.

I like the way you modified the existing ARM tests to show how the existing
instructions are predicated.

Thanks. When upstreaming, I should remember using the new FileCheck
feature I recently added (variable references on the same line that
define them) which should make matching BICs more elegant.

Overall, I think the concept of Native Client
is well-documented and easy to follow. Has anyone done work to show that
assembly in this form can be disassembled to a memory-safe assembly language
like TAL?

groups of instructions between .bundle_lock and
.bundle_unlock directives cannot cross a bundle boundary

Can this be relaxed to: A data instruction cannot fall on a bundle
boundary. ?

I'm not sure what you mean, can you elaborate?

Eli

Hi Eli,

How will these bundles interact with ARM codegen? The constant island pass in particular.

-Jim

How will these bundles interact with ARM codegen? The constant island pass in particular.

Hi Jim,

This is a great question.

From the compiler's point of view, these bundles indeed pose a problem

for the constant island pass. At this point for the NaCl platform we
disable this pass, generating instructions like movt instead (NaCl
requires at least v7 for ARM). However, as a future enhancement we'd
like to follow gcc's path and adjust the pass to coexist with
potential bundling by computing worst-case estimates for the sizes of
bundle-locked instruction sequences.

From the assembler's point of view, assembly code that refers to data

using labels should generate errors in the event that these labels
become too too far to reach, similarly to the current situation in
gas.

Eli

groups of instructions between .bundle_lock and
.bundle_unlock directives cannot cross a bundle boundary

Can this be relaxed to: A data instruction cannot fall on a bundle
boundary. ?

I’m not sure what you mean, can you elaborate?

Nevermind, what I suggested was not strict enough. I see now that the purpose of unlock/lock is to prevent particular instructions from landing on the bundle boundary. For example, around a ‘bic’ and an indirect branch, where the compiler inserted the bic to make the branch safe.

-Greg

Correct. Having a load and the bic protecting it separated by a bundle
boundary is not safe because malicious code can arrange to jump to the
load without the bic being executed and thus access arbitrary memory,
escaping the sandbox. For x86 bundle locking is also important to make
sure jumps into the middle of instructions are not allowed.

Eli

How will these bundles interact with ARM codegen? The constant island pass in particular.

Hi Jim,

This is a great question.
From the compiler's point of view, these bundles indeed pose a problem
for the constant island pass. At this point for the NaCl platform we
disable this pass, generating instructions like movt instead (NaCl
requires at least v7 for ARM). However, as a future enhancement we'd
like to follow gcc's path and adjust the pass to coexist with
potential bundling by computing worst-case estimates for the sizes of
bundle-locked instruction sequences.

That's going to be challenging. Not impossible, and you can probably leverage the existing basic-block alignment code, but very tricky.

I'm curious. Have you done any code size comparisons using movw/movt instead of constant pools? What was the impact?

-Jim

This is a great question.
From the compiler's point of view, these bundles indeed pose a problem
for the constant island pass. At this point for the NaCl platform we
disable this pass, generating instructions like movt instead (NaCl
requires at least v7 for ARM). However, as a future enhancement we'd
like to follow gcc's path and adjust the pass to coexist with
potential bundling by computing worst-case estimates for the sizes of
bundle-locked instruction sequences.

That's going to be challenging. Not impossible, and you can probably leverage the existing basic-block alignment code, but very tricky.

I'm curious. Have you done any code size comparisons using movw/movt instead of constant pools? What was the impact?

Even if we did, it was probably a while ago and they have to be done
again. We'll look into it when upstreaming the relevant parts.

Eli