Intel MPX support (instrumentation pass similar to gcc's Pointer Checker)

Hello,

even though the study of Intel MPX took much longer than expected, we have finally finished it. Currently, it is published in two formats:

  • as a technical report:
  • as a webpage:

This work evaluates MPX from the perspectives of performance (Phoenix, PARSEC, and SPEC benchmark suites), security (RIPE and bugs found in the benchmarks), and usability (false positives and required changes in applications). Additionally, we’ve analyzed various implementation aspects of Intel MPX and tested it on real-world applications. We would appreciate your feedback.

Regards,
Oleksii Oleksenko

> On Tue, Feb 9, 2016 at 7:22 AM, Dmitrii Kuvaiskii <Dmitrii.Kuvaiskii at tu-dresden.de> wrote:
>
>> Thank you Sergey and Konstantin for useful suggestions. We are
>> currently bootstrapping the infrastructure for our experiments. We
>> would like to make a sufficiently comprehensive report, with not only
>> the performance/memory overhead numbers, but also discussing and
>> evaluating security guarantees. I will also examine the available
>> source codes (ASan, gcc-mpx, SoftBound) and will spend some pages on a
>> discussion of the different approaches (trying to do science, you see
>> :)).
>>
>> Btw, I will target only deterministic memory-safety no-code-changes
>> approaches that protect against spatial errors (I will probably
>> include also ASan and SoftBoundCETS with temporal errors' protection
>> in the results as well). The only technique (except Pointer Checker,
>> ASan, and SoftBound) I know of is Baggy Bounds Checking from MSR, but
>> it seems to be closed-source and Windows-oriented. If anyone can
>> suggest some other technique that could be evaluated here, please
>> inform me.
>
> There is also a family of tools originated from Electric Fence
> <http://elinux.org/Electric_Fence>,
> they mostly have historical interest due to huge slowdown/memory
> consumption.
I can assure you that they are still widely used for QA :)

> Are you looking for bug detection mechanisms, or also for production
> hardening techniques?
> ASan is a bug detection tool. ASan can
> <https://blog.torproject.org/blog/tor-browser-55a4-hardened-released> be
> used for hardening, but that's not its primary purpose.
> Same is true (IMHO) about Pointer Checker and SoftBound.
>
> Hardening is an entirely different subject, although there is a bit of
> intersection,
> e.g. I know that parts of UBSan (-fsanitize=signed-integer-overflow) are
> used for hardening.
> In LLVM, also have a look at clang.llvm.org/docs/ControlFlowIntegrity.html
> and
> http://clang.llvm.org/docs/SafeStack.html
>
>> Anyway, before putting the techreport online, I will send the draft to
>> everyone who took part in this conversation, just to be on the safe
>> side and correct any bugs/wrong conclusions.
>
> I would appreciate this.
>
> --kcc
>
>> On Tue, Feb 9, 2016 at 3:24 PM, Sergey Ostanevich <sergos.gnu at gmail.com> wrote:
>>> Dmitrii, all,
>>>
>>> Please note that GCC 5.3 had a significant update to the MPX code
>>> quality - please use this version as reference.
>>>
>>> Regards,
>>> Sergos
>>>
>>> On Tue, Feb 9, 2016 at 12:49 AM, Kostya Serebryany via llvm-dev
>>> <llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> On Thu, Feb 4, 2016 at 10:40 AM, Kostya Serebryany <kcc at google.com> wrote:
>>>>>
>>>>> On Thu, Feb 4, 2016 at 4:59 AM, Dmitrii Kuvaiskii
>>>>> <Dmitrii.Kuvaiskii at tu-dresden.de> wrote:
>>>>>>
>>>>>>>> Recently I played with MPX support on Intel C/C++ Compiler (icc).
>>>>>>>> This implementation looks *much* better, with the following example
>>>>>>>> overheads: 1.2X on "raytrace", 1.25X on "bodytrack", 1.08X on
>>>>>>>> "streamcluster". So the common overheads are in the range of
>>>>>>>> 15%-25%!
>>>>>>>
>>>>>>> That's interesting.
>>>>>>> Are you sure you are instrumenting both reads and writes with icc?
>>>>>>
>>>>>> Yes, here are the exact flags I add to the usual build configuration:
>>>>>>    -xHOST -check-pointers-mpx:rw
>>>>>
>>>>> Interesting, looking forward to reading your report!
>>>>>
>>>>>> Note "rw" which stands for protecting read and write accesses. In the
>>>>>> future, I will analyze how different flags affect ASan / SoftBoundCETS
>>>>>> / gcc-mpx / icc-mpx.
>>>>>> I will also use a set of microbenchmarks/benchmarks (e.g., RIPE) to
>>>>>> test the protection provided.
>>>>>>
>>>>>>> SPEC2006 is well known so it could be useful. Especially
>>>>>>> 483.xalancbmk
>>>>>>> Besides, maybe you could take something that is not strictly a
>>>>>>> benchmark.
>>>>>>> E.g. take pdfium_test (https://pdfium.googlesource.com/pdfium/) and
>>>>>>> feed several large pdf files to it.
>>>>>>
>>>>>> Thanks, I will report the SPEC2006 numbers as well.
>>>>>
>>>>> Note that SPEC2006 has several known bugs that trigger under asan.
>>>>> https://github.com/google/sanitizers/wiki/AddressSanitizerRunningSpecBenchmarks
>>>>> has a patch that makes SPEC2006 pass with asan.
>>>>> Some of these bugs and maybe others may also trigger with an MPX
>>>>> checker.
>>>>
>>>> Another note: please also try to document the memory footprint.
>>>> One of unfortunate features of MPX is its large metadata storage which
>>>> may in theory consume as much as 4x more RAM than the application itself.
>>>>
>>>> --kcc
>>>>
>>>>> --kcc
>>>>>
>>>>>> --
>>>>>> Yours sincerely,
>>>>>> Dmitrii Kuvaiskii
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>> --
>> Yours sincerely,
>> Dmitrii Kuvaiskii
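
For readers wondering where the "4x" figure in the quoted thread comes from, here is a rough back-of-the-envelope sketch. It assumes the 64-bit MPX layout (32-byte bounds-table entries, one per 8-byte pointer slot, 2^17 entries per table, a 2^28-entry bounds directory); the function names are only illustrative, and real applications will usually stay well below this worst case because tables are allocated on demand.

```python
# Rough model of Intel MPX metadata sizes in 64-bit mode.
# Constants reflect the 64-bit bounds-directory/bounds-table layout;
# the helper names are made up for this sketch.

POINTER_SIZE = 8        # bytes per pointer slot in application memory
BT_ENTRY_SIZE = 32      # bytes per bounds-table entry (lower, upper, pointer value, reserved)
BT_ENTRIES = 1 << 17    # entries in one bounds table
BD_ENTRY_SIZE = 8       # bytes per bounds-directory entry
BD_ENTRIES = 1 << 28    # entries in the bounds directory


def bounds_table_bytes() -> int:
    """Size of one fully allocated bounds table."""
    return BT_ENTRIES * BT_ENTRY_SIZE


def covered_bytes_per_table() -> int:
    """Amount of application memory whose pointer slots map to one table."""
    return BT_ENTRIES * POINTER_SIZE


def worst_case_ratio() -> float:
    """Metadata bytes per application byte if every slot holds a pointer."""
    return bounds_table_bytes() / covered_bytes_per_table()


def bounds_directory_bytes() -> int:
    """Virtual size of the bounds directory (mostly untouched in practice)."""
    return BD_ENTRIES * BD_ENTRY_SIZE


if __name__ == "__main__":
    print("bounds table:", bounds_table_bytes() // 2**20, "MiB")
    print("covers:", covered_bytes_per_table() // 2**20, "MiB of pointer slots")
    print("worst-case ratio:", worst_case_ratio())
    print("bounds directory:", bounds_directory_bytes() // 2**30, "GiB (virtual)")
```

The ratio is a ceiling for pointer-dense memory; the measured footprint depends on how many distinct 1 MiB regions actually hold pointers.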

Hi Oleksenko,

Thanks for the report. I’d like to read through it in more detail, but I would also like to know the following:

  1. Code size data. I care about it because code size is critical for firmware. If possible, please compare the sizes with and without normal LZMA compression.

  2. Benchmark build flags, especially the optimization level. Which optimization levels do these technologies support (e.g., O1, O3, Oz, Os, LTO)?

Sorry for the weird text above; it was meant to be posted as a continuation of an old thread.

> 1. Code size data. I care about it because code size is critical for firmware. If possible, please compare the sizes with and without normal LZMA compression.

Yes, we could measure it.

> 2. Benchmark build flags, especially the optimization level. Which optimization levels do these technologies support (e.g., O1, O3, Oz, Os, LTO)?

It’s described on the “Methodology” page:

Best,
Oleksii
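
On the code-size question above, a comparison of raw and LZMA-compressed sizes could be scripted roughly as follows. This is only a sketch: it measures whole files rather than just the code sections, it uses Python's default `lzma` settings (which may differ from what a firmware toolchain applies), and the file names in the usage example are hypothetical.

```python
import lzma
from pathlib import Path


def sizes(data: bytes) -> tuple[int, int]:
    """Return (raw size, LZMA-compressed size) in bytes for a blob."""
    return len(data), len(lzma.compress(data))


def code_size_report(paths):
    """Map each path to its (raw, compressed) size pair."""
    return {str(p): sizes(Path(p).read_bytes()) for p in paths}


if __name__ == "__main__":
    # Hypothetical builds of the same program; replace with real binaries.
    candidates = [Path("app.native"), Path("app.mpx"), Path("app.asan")]
    existing = [p for p in candidates if p.exists()]
    for name, (raw, packed) in code_size_report(existing).items():
        print(f"{name}: raw={raw} B, lzma={packed} B")
```

Comparing the two columns across builds shows whether the instrumentation overhead survives compression or mostly compresses away.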

I LOVE this paper!

Of the comments I sent you earlier, only two were not addressed:

  • I’ve heard that MPX is incompatible with ILP32 (64-bit registers, 32-bit pointers), but I don’t know the details.

  • Did you investigate the possibility of false positives in cases where MPX-instrumented and non-instrumented code are mixed? Here are the examples from 3+ years ago:
https://github.com/google/sanitizers/wiki/AddressSanitizerIntelMemoryProtectionExtensions#false-positive-with-un-instrumented-code
They might have been fixed already, but the general problem may remain.

Or did I miss these in the final paper?

Anyway, these are two additional subjects I’d like to see covered, and their absence doesn’t undermine the paper’s usefulness.
Thanks for the amazing work!

–kcc

Thank you!

> • I’ve heard that MPX is incompatible with ILP32 (64-bit registers, 32-bit pointers), but I don’t know the details.

Maybe there are some corner cases that I’m not aware of, but in most cases it does support 32-bit mode. At some point, we even tried running a 32-bit version of RIPE, and GCC-MPX successfully detected 262 out of 417 attacks (the other 155 were probably missed because of suboptimal default compilation flags).

> • Did you investigate the possibility of false positives in cases where MPX-instrumented and non-instrumented code are mixed? Here are the examples from 3+ years ago:
> https://github.com/google/sanitizers/wiki/AddressSanitizerIntelMemoryProtectionExtensions#false-positive-with-un-instrumented-code
> They might have been fixed already, but the general problem may remain.

It is buried deep in the text, but we do mention this issue. It’s at the end of “3.1. Hardware - Instruction set”: “[…] the pointer created/altered in legacy code is considered ‘boundless’: this allows for interoperability but also creates holes in Intel MPX defense”, plus a footnote on the same page.

Yet, you are right: (1) we’ve mentioned only false negatives, not false positives, and (2) we should make it more explicit because it’s an important issue that appears in real applications (e.g., x264 from PARSEC).

We will add these notes to the updated version of the report.

Best,
Oleksii
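
To make the "boundless" behaviour quoted above more concrete, here is a toy model of how bounds stored in memory get invalidated when uninstrumented code rewrites a pointer. It is a simplified sketch, not real MPX code: the `BoundsTable` class, the addresses, and the helper names are invented for illustration. It relies only on the documented behaviour that a bounds load compares the pointer value saved with the bounds against the pointer actually loaded and falls back to unlimited bounds on a mismatch. It covers the false-negative side only and does not try to reproduce the false positives from the linked wiki page.

```python
# Toy model of MPX bounds store/load for pointers kept in memory.
# All names and addresses are illustrative assumptions.

INIT = (0, 2**64 - 1)  # "boundless": every address passes the check


class BoundsTable:
    """Maps the address of a pointer slot to (pointer value, bounds)."""

    def __init__(self):
        self._entries = {}

    def store(self, slot, ptr, bounds):
        """Models a bounds store done by instrumented code."""
        self._entries[slot] = (ptr, bounds)

    def load(self, slot, ptr):
        """Models a bounds load: a stale or missing entry yields INIT."""
        saved = self._entries.get(slot)
        if saved is None or saved[0] != ptr:
            return INIT
        return saved[1]


def in_bounds(addr, bounds):
    lower, upper = bounds
    return lower <= addr <= upper


if __name__ == "__main__":
    table = BoundsTable()
    slot = 0x1000

    # Instrumented code stores a pointer to a 16-byte buffer with its bounds.
    table.store(slot, 0x5000, (0x5000, 0x500F))
    print(in_bounds(0x5040, table.load(slot, 0x5000)))  # overflow is caught

    # Legacy code overwrites the pointer in memory without updating bounds.
    new_ptr = 0x9000
    print(in_bounds(new_ptr + 0x40, table.load(slot, new_ptr)))  # overflow passes
```

The second access goes unchecked because the mismatch silently widens the bounds, which is the interoperability trade-off the report describes.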

Just wanted to say thank you for publishing this. I ran across your micro-benchmark numbers a few days ago through another avenue and found them incredibly helpful. I also found your explanations of the various instruction semantics to be much easier to understand than the Intel docs. Thank you; this has saved me a lot of time and has triggered some experimental follow-up work of my own.

Philip