Intel MPX support (instrumentation pass similar to gcc's Pointer Checker)

Hello,

As far as I know, there is no MPX pass in LLVM (though the x86-64
backend already declares MPX registers and instructions). I wonder if
anyone is currently working on the LLVM pass for MPX instrumentation,
similar to Pointer Checker in gcc. If so, could anyone elaborate on
the status and its accessibility to other researchers, and on whether
any help is needed?

Prof. Santosh Nagarakatte, the author of SoftBound/HardBound/WatchDog
Lite, answered that he is not currently involved in MPX. But he pointed
to the SoftBoundCETS prototype on GitHub (santoshn/softboundcets-34:
SoftBoundCETS for LLVM+Clang version 34). Therefore, I was thinking
about adapting SoftBound to MPX, as a drop-in replacement for gcc's
Pointer Checker. Could anyone comment on this?

First, is MPX hardware available now? I wouldn't mind getting my hands on one.

Second, I think you should have a solid understanding of the different memory safety approaches, namely the tradeoffs between referent-object approaches and approaches that extend the pointer representation (called fat pointer approaches). In short, fat pointers provide stronger security guarantees but introduce compatibility problems with third-party code (even if they don't change the size or representation of the pointer). Referent-object approaches can be made more compatible but have looser memory safety semantics.

I believe the MPX hardware was designed to implement fat pointer approaches, but you can probably do referent approaches or even some hybrid of the two. It is not clear to me what the "best" approach is, and "best" probably depends on what you are trying to accomplish and what assumptions you make about which parts of the system you are willing to recompile with the memory safety checks.
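
To make the distinction concrete, here is a minimal Python model of the two families. It is only a sketch under simplifying assumptions: the addresses, the `FatPointer` and `ObjectTable` names, and the rule that a derived pointer must stay strictly inside its object (C also permits a one-past-the-end pointer) are all illustrative and not taken from SoftBound, SAFECode, or MPX.

```python
from bisect import bisect_right, insort
from dataclasses import dataclass


@dataclass(frozen=True)
class FatPointer:
    """Fat-pointer style: the bounds are metadata attached to each pointer."""
    addr: int
    base: int
    limit: int  # one past the last valid byte

    def add(self, delta):
        # The bounds travel with the pointer through arithmetic.
        return FatPointer(self.addr + delta, self.base, self.limit)

    def can_access(self, size=1):
        return self.base <= self.addr and self.addr + size <= self.limit


class ObjectTable:
    """Referent-object style: the bounds are looked up from the address alone."""

    def __init__(self):
        self._starts = []  # sorted object start addresses
        self._size = {}    # start address -> object size

    def register(self, base, size):
        insort(self._starts, base)
        self._size[base] = size

    def referent(self, addr):
        """Return the start of the object containing addr, or None."""
        i = bisect_right(self._starts, addr) - 1
        if i < 0:
            return None
        base = self._starts[i]
        return base if addr < base + self._size[base] else None

    def checked_add(self, addr, delta):
        """Return addr + delta if it stays inside the referent of addr, else None."""
        base = self.referent(addr)
        if base is None:
            return None
        new = addr + delta
        return new if base <= new < base + self._size[base] else None


# Two adjacent 16-byte objects at hypothetical addresses.
A, B = 0x1000, 0x1010
table = ObjectTable()
table.register(A, 16)
table.register(B, 16)

p = FatPointer(A, A, A + 16)
fat_ok = p.add(20).can_access()    # pointer into A now lands in B
ref_ok = table.checked_add(A, 20)  # arithmetic leaves the referent of A

# A bare address handed back by uninstrumented code carries no per-pointer
# bounds, but its referent can still be found by address.
raw_from_legacy = A + 4
legacy_ref = table.checked_add(raw_from_legacy, 4)

print(fat_ok, ref_ok, hex(legacy_ref))
```

In this model both schemes reject the jump from A into B. The difference shows in the last lines: the object table can recover bounds for an address that passed through uninstrumented code, while a fat-pointer scheme has no metadata for it, which is the compatibility tradeoff described above.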

I recommend reading up on the different memory safety approaches. The Memory Safety Menagerie provides some sources, though I have let it fall a little out of date.

Third, I think the SoftBound and/or SAFECode source bases are a good place to start. SoftBound is probably the better starting point since it likely fits the MPX hardware better, but you might find useful stuff in the SAFECode source base as well.

If you have more specific questions about the project as you go, please feel free to ask. I've done a little work on memory safety (http://llvm.org/pubs/2007-SOSP-SVA.pdf).

Regards,

John Criswell

> First, is MPX hardware available now? I wouldn't mind getting my hands on
> one.

It is available at least in the mobile versions of the recent Intel
Skylake CPUs. I am currently playing with an Alienware 15 R2 with the
following CPU: Intel(R) Core(TM) i7-6820HK. Interestingly, my
preliminary experiments indicate that adding MPX bounds checking via
Pointer Checker in gcc is usually slower than using the software-only
AddressSanitizer.

Thanks for the other pointers!

This corresponds with other results that I have seen. The last time I looked at the output from gcc, it also did not generate pointer updates that were safe in the presence of concurrency (they must be bracketed in transactions if you want the MPX metadata and the pointer updates to be atomic), and doing so is likely to add even more overhead.

I am particularly impressed with Intel for creating a hardware implementation that is both slower than a software-only version and cannot (due to its fail-open policy being embedded in the hardware) be used for security.
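
The two points above (non-atomic updates and fail-open) can be illustrated with a small deterministic model. This is a sketch, not MPX itself: it assumes, as I understand the bounds-load behavior, that a bounds entry records the pointer value it was stored for and that a load with a different pointer value yields "allow everything" bounds. The names `BoundsEntry`, `load_bounds`, and `INIT_BOUNDS`, the addresses, and the exclusive upper bound are illustrative choices.

```python
INIT_BOUNDS = (0, 2**64)  # model of "allow everything" bounds


class BoundsEntry:
    """Out-of-band metadata for one pointer slot."""

    def __init__(self, ptr, lower, upper):
        self.ptr, self.lower, self.upper = ptr, lower, upper


def load_bounds(entry, ptr):
    """Return the stored bounds, or INIT_BOUNDS if the entry was stored for
    a different pointer value (the fail-open case)."""
    if entry.ptr != ptr:
        return INIT_BOUNDS
    return (entry.lower, entry.upper)


def in_bounds(addr, bounds, size=1):
    lower, upper = bounds
    return lower <= addr and addr + size <= upper


# Initial state: the slot points at a 16-byte object.
memory = {"slot": 0x1000}
entry = BoundsEntry(0x1000, 0x1000, 0x1010)

# Writer, step 1: store the new pointer (a 64-byte object at 0x2000).
memory["slot"] = 0x2000

# Reader interleaves here: new pointer, stale metadata.
p = memory["slot"]
racy_bounds = load_bounds(entry, p)
racy_check = in_bounds(p + 4096, racy_bounds)  # far outside the new object

# Writer, step 2: store the matching bounds.
entry = BoundsEntry(0x2000, 0x2000, 0x2040)
safe_check = in_bounds(p + 4096, load_bounds(entry, p))

print(racy_check, safe_check)
```

In the window between the two writer steps, the reader's out-of-bounds access passes the check because the mismatch silently widens the bounds. Once the metadata catches up, the same access is rejected.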

David

I’ve recently played with the GCC implementation of Pointer Checker on real hardware;
my recent impressions are here: https://github.com/google/sanitizers/wiki/AddressSanitizerIntelMemoryProtectionExtensions
(there is also some old pre-hardware content).

In short, I totally agree with what David says above: MPX is a disaster.
(Usual disclaimer: my opinion here is too biased.)

I am glad that LLVM already has support for MPX instructions, but I see no good reason to add an MPX-based checker to LLVM.
Yes, it would allow us to detect intra-object overflows, something that ASan cannot do by default, but it’s not worth the extreme complexity of an MPX-based checker.
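
For readers unfamiliar with the term, an intra-object overflow stays inside one allocation but crosses from one field into the next. The sketch below uses Python's `ctypes` only to obtain a C-like struct layout; the `Account` struct and the `within` helper are hypothetical, and the two checks are a simplified model of allocation-granularity versus field-granularity bounds, not the actual ASan or MPX logic.

```python
import ctypes


class Account(ctypes.Structure):
    # Hypothetical C struct: struct Account { char name[8]; int is_admin; };
    _fields_ = [("name", ctypes.c_char * 8), ("is_admin", ctypes.c_int)]


def within(offset, start, size):
    """True if offset falls inside [start, start + size)."""
    return start <= offset < start + size


obj_size = ctypes.sizeof(Account)
name_off, name_size = Account.name.offset, Account.name.size
admin_off = Account.is_admin.offset

# Offset of name[8], the first byte past the end of the array.
overflow_off = name_off + name_size

# Bounds of the whole allocation: the write still looks legal.
object_level_ok = within(overflow_off, 0, obj_size)
# Bounds narrowed to the field: the write is rejected.
field_level_ok = within(overflow_off, name_off, name_size)
# The byte that gets overwritten belongs to is_admin.
hits_is_admin = overflow_off == admin_off

print(object_level_ok, field_level_ok, hits_is_admin)
```

A checker that only knows where allocations begin and end accepts the write, while one that narrows bounds to `name` rejects it.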

–kcc

I continue playing with Intel MPX and its support in modern compilers.
All experiments were done on the Alienware (Dell) 15 R2, Ubuntu 15.10
(linux 4.2.0), gcc version is 5.2.1, icc version 2016.1.150. The
benchmark suite is PARSEC 3.0, all versions with 1 thread and default
configs.

As I described previously, Pointer Checker in gcc produces very
inefficient code. My experiments show overheads over native of up to
9.5X (on "raytrace"), with common overheads of 3X ("bodytrack",
"fluidanimate", "streamcluster"). At the same time, AddressSanitizer
performs much better: 1.3X on "raytrace", 1.7X on "bodytrack", and so
on.

Recently I played with MPX support in the Intel C/C++ Compiler (icc). This
implementation looks *much* better, with the following example
overheads: 1.2X on "raytrace", 1.25X on "bodytrack", 1.08X on
"streamcluster". So the common overheads are in the range of 15%-25%!

Please note that gcc-mpx and gcc-asan versions were compared against
gcc-native, and icc-mpx version was compared against icc-native.
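
Since the numbers above mix "X" factors and percentages, here is a small sketch of the conversion. The slowdown factors are the ones quoted in this message; the `overhead_percent` helper and the dictionary layout are only illustrative.

```python
def overhead_percent(slowdown):
    """Convert a slowdown factor relative to native (e.g. 1.25X) into
    percent overhead (e.g. 25.0)."""
    return round((slowdown - 1.0) * 100.0, 1)


# Slowdown factors as reported above, each relative to its own compiler's
# native build (gcc-native or icc-native).
reported = {
    "gcc-mpx": {"raytrace": 9.5},
    "gcc-asan": {"raytrace": 1.3, "bodytrack": 1.7},
    "icc-mpx": {"raytrace": 1.2, "bodytrack": 1.25, "streamcluster": 1.08},
}

percent = {
    config: {bench: overhead_percent(x) for bench, x in benches.items()}
    for config, benches in reported.items()
}

for config, benches in percent.items():
    for bench, pct in benches.items():
        print(f"{config:9s} {bench:14s} {pct:6.1f}%")
```

Because each factor is relative to a different baseline, the percentages say how much each compiler slows down its own native build; they do not by themselves say which instrumented binary is faster in absolute terms.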

We would like to compile a small technical report with all our
measurements (performance and memory overhead) and put it online.
We'll do it in the near future, and I will write an update here when
it's done. Please tell me if anyone is interested in any specific
benchmarks (I want to test PARSEC and some case studies: PostgreSQL,
Memcached, SQLite3). Any feedback is welcome.

That's interesting.
Are you sure you are instrumenting both reads and writes with icc?

SPEC2006 is well known, so it could be useful, especially 483.xalancbmk.
Besides, maybe you could take something that is not strictly a benchmark.
E.g., take pdfium_test (pdfium, Git at Google) and feed
several large PDF files to it.

Two comments.

First, it would be interesting to see how SoftBound compares to MPX. To the best of my understanding, MPX is intended to be a hardware implementation that does what SoftBound does.

Second, when you write your report, please keep in mind that ASan and MPX do not really do the same thing. MPX is designed to track bounds information for each pointer, while ASan only tracks where memory objects are located. As a result, their security guarantees are not the same: MPX should be able to catch pointers that jump from one memory object into another, while ASan does not (ASan distributes memory objects far away from each other in the address space so that out-of-bounds accesses are likely to point into unallocated memory). The difference is subtle, but I think it's important.
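
The difference can be seen in a toy model. The sketch below is a simplification: it separates objects with fixed-size poisoned gaps (redzones) rather than reproducing ASan's real shadow memory, and `REDZONE`, `layout`, `location_check`, and `bounds_check` are hypothetical names with hypothetical sizes and addresses.

```python
REDZONE = 16  # hypothetical size of the poisoned gap between objects


def layout(sizes, start=0x1000):
    """Place objects one after another, separated by poisoned redzones.
    Returns the (base, limit) of each object and the set of poisoned bytes."""
    objects, poisoned, addr = [], set(), start
    for size in sizes:
        poisoned.update(range(addr, addr + REDZONE))
        addr += REDZONE
        objects.append((addr, addr + size))
        addr += size
    poisoned.update(range(addr, addr + REDZONE))
    return objects, poisoned


def location_check(addr, poisoned):
    """Location-based check: is the accessed byte addressable at all?"""
    return addr not in poisoned


def bounds_check(addr, bounds):
    """Per-pointer check: is the byte inside the object the pointer came from?"""
    base, limit = bounds
    return base <= addr < limit


objects, poisoned = layout([32, 32])
a, b = objects

# Small overflow: one byte past the end of object a lands in the redzone.
small = a[0] + 32
small_location = location_check(small, poisoned)
small_bounds = bounds_check(small, a)

# Large overflow: a pointer derived from a jumps over the redzone into b.
far = a[0] + 32 + REDZONE + 4
far_location = location_check(far, poisoned)
far_bounds = bounds_check(far, a)

print(small_location, small_bounds, far_location, far_bounds)
```

Both checks reject the small overflow. Only the per-pointer check rejects the large one, because the accessed byte is valid memory that simply belongs to a different object.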

Regards,

John Criswell

>> Recently I played with MPX support on Intel C/C++ Compiler (icc). This
>> implementation looks *much* better, with the following example
>> overheads: 1.2X on "raytrace", 1.25X on "bodytrack", 1.08X on
>> "streamcluster". So the common overheads are in the range of 15%-25%!
> That's interesting.
> Are you sure you are instrumenting both reads and writes with icc?

Yes, here are the exact flags I add to the usual build configuration:
  -xHOST -check-pointers-mpx:rw

Note the "rw", which stands for protecting both read and write accesses.
In the future, I will analyze how different flags affect ASan /
SoftBoundCETS / gcc-mpx / icc-mpx.
I will also use a set of microbenchmarks/benchmarks (e.g., RIPE) to
test the protection provided.

> SPEC2006 is well known, so it could be useful, especially 483.xalancbmk.
> Besides, maybe you could take something that is not strictly a benchmark.
> E.g., take pdfium_test (pdfium, Git at Google) and feed
> several large PDF files to it.

Thanks, I will report the SPEC2006 numbers as well.

Interesting, looking forward to reading your report!

Note that SPEC2006 has several known bugs that trigger under ASan.
https://github.com/google/sanitizers/wiki/AddressSanitizerRunningSpecBenchmarks
has a patch that makes SPEC2006 pass with ASan.
Some of these bugs, and maybe others, may also trigger with an MPX checker.

--kcc

Another note: please also try to document the memory footprint.
One unfortunate feature of MPX is its large metadata storage, which may
in theory consume as much as 4x more RAM than the application itself.
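
A back-of-the-envelope sketch of where a figure like 4x can come from. It assumes a 64-bit layout in which every 8-byte pointer stored in memory gets a 32-byte bounds entry (lower bound, upper bound, pointer value, and a reserved word); treat those sizes, and the `metadata_ratio` helper, as assumptions of this sketch rather than measured values. Real usage also depends on how the tables are allocated.

```python
POINTER_SIZE = 8        # bytes per pointer on x86-64
BOUNDS_ENTRY_SIZE = 32  # assumed bytes of metadata per stored pointer


def metadata_bytes(num_pointer_slots):
    """Metadata needed for the given number of pointers stored in memory."""
    return num_pointer_slots * BOUNDS_ENTRY_SIZE


def metadata_ratio(app_bytes, pointer_fraction):
    """Metadata size relative to application data, when pointer_fraction
    of that data consists of pointers stored in memory."""
    slots = int(app_bytes * pointer_fraction) // POINTER_SIZE
    return metadata_bytes(slots) / app_bytes


worst_case = metadata_ratio(1 << 20, 1.0)    # data made up entirely of pointers
quarter_ptrs = metadata_ratio(1 << 20, 0.25) # a quarter of the data is pointers

print(worst_case, quarter_ptrs)
```

Under these assumptions the worst case (memory consisting only of pointers) gives 4 bytes of metadata per byte of data, and the ratio falls in proportion to how pointer-dense the program's data actually is.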

--kcc

Dmitrii, all,

Please note that GCC 5.3 had a significant update to the MPX code quality. Please use this version as the reference.

Regards,
Sergos

Thank you Sergey and Konstantin for useful suggestions. We are
currently bootstrapping the infrastructure for our experiments. We
would like to make a sufficiently comprehensive report, with not only
the performance/memory overhead numbers, but also discussing and
evaluating security guarantees. I will also examine the available
source code (ASan, gcc-mpx, SoftBound) and will spend some pages on a
discussion of the different approaches (trying to do science, you see
:)).

Btw, I will target only deterministic memory-safety approaches that
require no code changes and protect against spatial errors (I will
probably also include ASan and SoftBoundCETS with temporal-error
protection in the results). The only other technique I know of (besides
Pointer Checker, ASan, and SoftBound) is Baggy Bounds Checking from MSR,
but it seems to be closed-source and Windows-oriented. If anyone can
suggest some other technique that could be evaluated here, please
let me know.

Anyway, before putting the techreport online, I will send the draft to
everyone who took part in this conversation, just to be on the safe
side and correct any bugs/wrong conclusions.

There is also a family of tools originating from Electric Fence
<http://elinux.org/Electric_Fence>;
they are mostly of historical interest due to their huge slowdown and
memory consumption.

Are you looking for bug detection mechanisms, or also for production
hardening techniques?
ASan is a bug detection tool. ASan can
<https://blog.torproject.org/blog/tor-browser-55a4-hardened-released> be
used for hardening, but that's not its primary purpose.
The same is true (IMHO) of Pointer Checker and SoftBound.

Hardening is an entirely different subject, although there is a bit of
intersection; e.g., I know that parts of UBSan
(-fsanitize=signed-integer-overflow) are used for hardening.
In LLVM, also have a look at
http://clang.llvm.org/docs/ControlFlowIntegrity.html and
http://clang.llvm.org/docs/SafeStack.html

> Anyway, before putting the techreport online, I will send the draft to
> everyone who took part in this conversation, just to be on the safe
> side and correct any bugs/wrong conclusions.

I would appreciate this.

--kcc

> There is also a family of tools originated from Electric Fence
> <http://elinux.org/Electric_Fence>,
> they mostly have historical interest due to huge slowdown/memory
> consumption.

I can assure you that they are still widely used for QA :)