FYI, we've posted a component of Spectre mitigation on llvm-commits

Sending a note here as this seems likely to be of relatively broad interest.

Thread:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180101/513630.html

Review link:
https://reviews.llvm.org/D41723#

It seems the review link is getting wider coverage (Reddit, HN, ...)
and Phabricator is struggling under the load.

Best,

Alex

The folks working on phab are busily propping it up. It should be relatively healthy now.

It looks like this is producing code of the following form.

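  call next          ; pushes the address of `loop` as the return address
loop:
  pause              ; only reached speculatively, if `ret` is predicted to return here
  jmp loop
next:
  mov [rsp], r11     ; overwrite the pushed return address with the real target in r11
  ret                ; architecturally "returns" to the target in r11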

As I understand it, the busy loop is to cause the speculative execution to be trapped in the loop. Was something like ud2 considered? I presume that would stop the speculative execution without involving any of the execution units the way the busy loop does.

Many thanks to those good people!

Have you considered developing the patch description into a blog post
for blog.llvm.org, maybe after the patch lands?

Best,

Alex

The pause instruction will also avoid tying up execution resources in speculative contexts, so I wouldn’t expect it to be significantly different.

We can try to put something together. There are a bunch of blog posts going out, and this whole thing was rushed a bit at the end, so it may take us a week or so to get things in order.

Got it. The Intel Software Developer's Manual isn't entirely clear on this point (to me at least), and IACA shows a number of ports in use during the 4 or 5 uops that pause takes.

Thank you,

Steve

That’s if it executes. But pause doesn’t get speculatively executed at all.

Sadly, this makes IACA and other static analysis tools challenging to use here. Instead, you want to construct a tight, careful benchmark, control the CPU frequency and everything else you can, and look at the exact rates at which instructions retire.

But if you find instruction sequences that reliably benchmark as faster, please share this on the review thread. =D
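As a starting point for the kind of tight benchmark described above, here is a minimal sketch in C that times a chain of dependent indirect calls. The names `add_one` and `ns_per_indirect_call` are made up for illustration, and `clock()` only gives coarse CPU time; it does not control frequency or read retirement counters, so treat it as scaffolding to build on rather than a measurement methodology.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical call target: stands in for whatever code an indirect
 * branch would reach in a real workload. */
static uint64_t add_one(uint64_t x) { return x + 1; }

typedef uint64_t (*target_fn)(uint64_t);

/* Runs `iters` dependent indirect calls and returns the average CPU time
 * per call in nanoseconds. The final accumulator is written to *sink so
 * the loop has an observable result. */
static double ns_per_indirect_call(uint64_t iters, uint64_t *sink)
{
    assert(iters > 0 && sink != NULL);

    /* The volatile pointer is reloaded on every iteration, so the compiler
     * cannot turn the indirect call into a direct or inlined one. */
    target_fn volatile fn = add_one;
    uint64_t acc = 0;

    clock_t start = clock();
    for (uint64_t i = 0; i < iters; ++i)
        acc = fn(acc);              /* each call depends on the previous one */
    clock_t end = clock();

    *sink = acc;
    return (double)(end - start) * 1e9 / (double)CLOCKS_PER_SEC / (double)iters;
}
```

One would presumably build this twice, once with the mitigation enabled and once without, pin it to a core, and compare the per-call figures over many runs before drawing any conclusion.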

Thanks for the notification, Chandler.

I also wanted to note that I’ve just posted another component for Spectre mitigation (variant 1), see https://reviews.llvm.org/D41760 and https://reviews.llvm.org/D41761.
I believe this is completely complementary to the retpoline mitigation you pointed to at https://reviews.llvm.org/D41723#, which is targeted at mitigating variant 2.

Thanks,

Kristof

Awesome, replied.

We’ve been working on similar things, but didn’t have them ready to publish because the urgency was slightly lower: there are reasonable ways to locally mimic these kinds of things in sensitive areas like the Linux kernel, and even finding code patterns for variant #1 is substantially harder. We have some significantly different APIs we’d like to discuss here, based on experience trying to implement these on x86 and deploy them to a reasonably large body of code. Hopefully more details soon as folks have time.
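For readers unfamiliar with variant #1, here is a rough sketch in C of one way such a local mitigation is commonly done by hand: clamp the index with a branch-free mask after the bounds check, so that a mispredicted check cannot feed an out-of-range index to the load. This is not the API proposed in either review; `index_mask` and `load_clamped` are hypothetical names, and the mask assumes `size` is at most `SIZE_MAX / 2`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Returns all ones if index < size, and zero otherwise, without branching.
 * Assumption: size <= SIZE_MAX / 2, so the top bit of an in-range index
 * and of (size - 1 - index) is always clear. */
static size_t index_mask(size_t index, size_t size)
{
    /* For an out-of-range index, either index itself has the top bit set
     * or (size - 1 - index) wraps around and sets it. */
    size_t top = (index | (size - 1 - index)) >> (sizeof(size_t) * CHAR_BIT - 1);
    return top - 1;   /* top == 0 -> all ones, top == 1 -> zero */
}

/* Bounds-checked load whose index is also masked, so a speculatively
 * bypassed check still reads element 0 rather than out-of-bounds memory. */
static uint8_t load_clamped(const uint8_t *array, size_t size, size_t index)
{
    assert(array != NULL);
    if (index >= size)
        return 0;
    index &= index_mask(index, size);   /* data dependency, not a branch */
    return array[index];
}
```

A production version would also need to stop the compiler from folding the mask back into the comparison (typically with inline assembly or a barrier), which this sketch does not attempt.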

I would be very interested in discussing these approaches. This is something we're currently putting a lot of thought into as well and have come up with a number of potential variants. Comparing notes would be worthwhile.

Philip