Implementing Linux randstruct plugin for clang?

Hi all,

Recently our team was asked about the possibility of implementing functionality equivalent to the Linux kernel's randstruct gcc plugin for clang. Essentially, the plugin modifies the AST to put the members of structs marked with "__attribute__((randomize_layout))" in a random order (the order is chosen at compile time and controlled by a provided seed). The idea is that this provides security hardening by making it harder for an attacker to guess where a field is stored in memory. See "Randomizing structure layout" on LWN.net for more details.
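For anyone who hasn't seen the attribute in use, here is a minimal sketch of what annotated code looks like. The struct and function names (task_info, make_task, check_task) and the RANDOMIZE_LAYOUT macro are made up for illustration; the macro falls back to nothing when the compiler does not report support for the attribute. The point is that code touching such a struct has to go through field names and designated initializers, never positional assumptions.

```c
#include <assert.h>

/* Hypothetical wrapper macro: expands to the attribute only if the
 * compiler reports support for it, otherwise to nothing. */
#if defined(__has_attribute)
#  if __has_attribute(randomize_layout)
#    define RANDOMIZE_LAYOUT __attribute__((randomize_layout))
#  endif
#endif
#ifndef RANDOMIZE_LAYOUT
#  define RANDOMIZE_LAYOUT
#endif

/* Illustrative struct; with randomization enabled the member order in
 * memory is not the declaration order. */
struct task_info {
    int pid;
    void (*handler)(int);
    unsigned long flags;
} RANDOMIZE_LAYOUT;

/* Designated initializers name each field, so they stay correct no
 * matter how the members end up ordered. */
struct task_info make_task(int pid)
{
    struct task_info t = { .pid = pid, .handler = 0, .flags = 0 };
    return t;
}

/* Access by name is likewise independent of the final layout. */
void check_task(void)
{
    struct task_info t = make_task(42);
    assert(t.pid == 42);
    assert(t.flags == 0);
}
```

Anything that depends on declaration order (positional initializers, casting to the first member's type) is what breaks once the layout is shuffled.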

I can see the following possible approaches for implementing this with clang:

1. A source-rewriting plugin which generates a new version of the source code with reordered structs. This probably doesn't require any changes to clang itself, but it introduces a bunch of complexity interacting with the build system for a project.

2. Some new kind of plugin which hooks deeply into semantic analysis; not sure what this would look like.

3. Modifying clang's structure layout code. This is probably easiest to write the code for, but merging it to the clang repo would require consensus that this is actually generally useful.

Has anyone else looked at this?

-Eli

I thought a little about implementing PAX_RANDSTRUCT for clang a couple months back but I didn’t get to the point where I implemented anything. My notes indicate that my two ideas at the time I left it were: (1) see if there’s a point between parsing and codegen that a clang plugin could insert a TreeTransform to rewrite struct definitions, or (2) perform randomization as an LLVM pass that rewrites types and the GEPs, GVs, etc. referencing those types.

For idea (2), the LLVM pass, you would need to hook into offsetof(): Linux defines offsetof() as both __builtin_offsetof() and ((size_t) &((TYPE *)0)->MEMBER) in different places. The latter would transform into a GEP and shouldn’t require special casing, while the former might mean it’s not doable as an LLVM pass alone. Thinking about it now, if you wanted to implement the unoptimized version of RANDSTRUCT (where elements are randomized across cache lines), you would need to hook into sizeof() as well, since reordering members can change padding and therefore the size of the struct. I might be missing other code constructs that would further complicate the pass approach.
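To make the two offsetof() spellings concrete, here is a small sketch that puts them side by side. The struct (creds) and the macro names OFFSETOF_BUILTIN / OFFSETOF_NULLPTR are invented for the example; only the two expansions come from the kernel definitions mentioned above. The null-pointer form is not strictly conforming C, but GCC and clang both accept it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, used only for illustration. */
struct creds {
    int uid;
    void *ops;
    long flags;
};

/* Form 1: the compiler builtin. */
#define OFFSETOF_BUILTIN(TYPE, MEMBER) __builtin_offsetof(TYPE, MEMBER)

/* Form 2: address of the member through a null pointer of the struct type. */
#define OFFSETOF_NULLPTR(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)

/* Both forms must report the same offset for every member, whatever
 * layout the compiler ends up choosing. */
void check_offsetof_forms(void)
{
    assert(OFFSETOF_BUILTIN(struct creds, uid) ==
           OFFSETOF_NULLPTR(struct creds, uid));
    assert(OFFSETOF_BUILTIN(struct creds, ops) ==
           OFFSETOF_NULLPTR(struct creds, ops));
    assert(OFFSETOF_BUILTIN(struct creds, flags) ==
           OFFSETOF_NULLPTR(struct creds, flags));
}
```

Any reordering scheme has to keep those two spellings in agreement, which is the part that looks awkward for a pass that only sees IR.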

The very latest point you can reorder the members of a struct without imposing weird restrictions is in RecordLayoutBuilder (which is run as part of semantic analysis). After that, constant-folding of sizeof/offsetof/etc. starts happening.
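As a rough illustration of why the layout has to be settled that early, here are a few places where C requires sizeof/offsetof to be integer constant expressions, so the front end needs final offsets before any IR exists. The struct and the names (packet, header_copy, PACKET_SIZE, header_size) are hypothetical.

```c
#include <assert.h>   /* provides static_assert in C11 */
#include <stddef.h>

/* Hypothetical struct, used only for illustration. */
struct packet {
    char tag;
    int length;
    char payload[8];
};

/* An array bound at file scope must be a constant expression, so the
 * offset of payload has to be known during semantic analysis. */
char header_copy[offsetof(struct packet, payload)];

/* Static assertions are evaluated entirely by the front end. */
static_assert(sizeof(struct packet) >= offsetof(struct packet, payload) + 8,
              "payload must fit inside the struct");

/* Enumerator values are likewise fixed at compile time. */
enum { PACKET_SIZE = sizeof(struct packet) };

/* Returns the size of the array whose bound was derived from offsetof. */
size_t header_size(void)
{
    return sizeof header_copy;
}
```

If the members were shuffled after these values had been computed, the array bound and the enumerator would silently disagree with the real layout.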

-Eli

How does this work across translation units?

  -Hal

The way the gcc plugin works is that there's a global seed, generated by the build system, which is passed to the compiler. The order of a struct only varies based on the global seed and properties of the struct itself, so every translation unit will consistently shuffle a given struct the same way.
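A minimal sketch of that idea, to show why the result is the same in every translation unit. This is not the gcc plugin's actual algorithm: the hash (FNV-1a), the generator (splitmix64), and the choice of the struct name as the only "property of the struct" are all assumptions made for illustration, and shuffle_fields is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* FNV-1a over a string, continuing from an existing hash value. */
static uint64_t fnv1a(const char *s, uint64_t h)
{
    for (; *s; ++s) {
        h ^= (unsigned char)*s;
        h *= 1099511628211ULL;
    }
    return h;
}

/* One step of the splitmix64 generator. */
static uint64_t next_random(uint64_t *state)
{
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Fill order[0..n-1] with a permutation of the field indices that
 * depends only on the build seed and the struct name. */
void shuffle_fields(const char *build_seed, const char *struct_name,
                    size_t *order, size_t n)
{
    uint64_t state = fnv1a(struct_name,
                           fnv1a(build_seed, 14695981039346656037ULL));
    for (size_t i = 0; i < n; ++i)
        order[i] = i;
    /* Fisher-Yates shuffle (the small modulo bias is ignored here). */
    for (size_t i = n; i > 1; --i) {
        size_t j = (size_t)(next_random(&state) % i);
        size_t tmp = order[i - 1];
        order[i - 1] = order[j];
        order[j] = tmp;
    }
}

/* Two "translation units" given the same seed and struct compute the
 * same order without sharing any state. */
void check_consistent_across_units(void)
{
    size_t unit_a[5], unit_b[5];
    shuffle_fields("build-seed", "task_struct", unit_a, 5);
    shuffle_fields("build-seed", "task_struct", unit_b, 5);
    assert(memcmp(unit_a, unit_b, sizeof unit_a) == 0);
}
```

Because the permutation is a pure function of the seed and the struct, nothing has to be communicated between compiler invocations beyond the seed itself.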

-Eli