Improving support for the "Cold" Calling Convention

Hi all,

I'm interested in fixing PR14481 (http://llvm.org/PR14481), regarding
teaching LLVM that coldcc functions do not clobber any registers.

As brief motivation: this can have significant performance benefits
and is very useful for instrumenting code with many calls that are
executed only very rarely. Ubsan's runtime calls are a good example
of this.

Attached is my attempt at fixing this from the caller's perspective
for X86/X86-64.

The patch works for me (review welcome), but there are still some
large pieces missing that I'd appreciate your thoughts and guidance
on:

* How to best implement the other half: generating function
definitions that actually preserve all registers. Ideally this would
be done in a way that only preserves *clobbered* registers, but
naively saving/restoring all registers regardless would be a good
starting point.

* Changing the behavior of a calling convention will likely break
code that uses it and expects the existing behavior, which presently
seems to be an analysis hint. Is this something to be concerned
about? Does anyone know of code that uses coldcc in this manner, or
at all?

* A convention that "preserves all" seems like it could be neatly done
in a target-agnostic manner. I'm not sure if there is sufficient
interest to justify such a solution, and am unfortunately not
sufficiently familiar with the CodeGen architecture to propose a
solution, but thought I'd mention it. Might even be easier to tackle
it this way than with the approach taken in the patch :).

Thank you for your time,

~Will

0001-Add-support-for-ColdCC-on-X86-X86-64.patch (3.55 KB)

Hi Will,

Thanks for working on this, I think it would be great to have better support for ColdCC.

Unfortunately, the description in CallingConv.h of ColdCC as "preserves all registers" is probably a bit naïve. Some more work is required before it makes sense.

As a trivial example: your patch claims that ColdCC calls preserve the instruction pointer, but if that were true, execution would resume at the call itself and you would have an infinite loop.

It is only possible to preserve registers that can actually be saved. The ColdCC function may also be calling other functions that do not use the ColdCC calling convention, so it would need to save any registers they may clobber.
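
To make the point about the save set concrete, here is a minimal C++ sketch of how it could be computed for a ColdCC function. This is a toy model, not LLVM code: the Reg enum, the RegMask type, and the helpers normalCallClobbers, unsavable, and registersToSave are all hypothetical names, and the caller-saved set is only assumed to resemble the x86-64 SysV general-purpose registers.

```cpp
#include <cstdint>
#include <initializer_list>

// Hypothetical toy register file; not LLVM's real register enumeration.
enum Reg : unsigned { RAX, RCX, RDX, RBX, RSI, RDI, R8, R9, R10, R11, RIP, NumRegs };

using RegMask = std::uint32_t;

inline RegMask bit(Reg r) { return RegMask{1} << r; }

inline RegMask maskOf(std::initializer_list<Reg> regs) {
  RegMask m = 0;
  for (Reg r : regs) m |= bit(r);
  return m;
}

// Assumption: registers a call to an ordinary (non-ColdCC) function may
// clobber, modelled on the x86-64 SysV caller-saved GPRs.
inline RegMask normalCallClobbers() {
  return maskOf({RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11});
}

// Registers that cannot meaningfully be saved and restored around a call.
inline RegMask unsavable() { return bit(RIP); }

struct ColdFunction {
  RegMask writtenDirectly;    // registers the function body itself writes
  bool callsNormalFunctions;  // true if it calls any non-ColdCC function
};

// The save set is everything the function may clobber, directly or through
// its callees, minus the registers that cannot be saved at all.
inline RegMask registersToSave(const ColdFunction &f) {
  RegMask clobbered = f.writtenDirectly;
  if (f.callsNormalFunctions)
    clobbered |= normalCallClobbers();
  return clobbered & ~unsavable();
}
```

In this model a leaf function that only writes %rbx needs to save just %rbx, while the same function with a single call to a normal function has to save the whole caller-saved set as well. The caller-side preserved mask would then be the complement of what can actually be clobbered, never "all registers".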

You also need to figure out how ColdCC functions return values. In %rax? Does a ColdCC function need to preserve the argument registers?

The x87 floating point stack registers and the aliasing MMX registers have their own set of problems that probably aren't worth trying to fix.

Finally, I don't think it is possible to preserve the AVX registers either. If your ColdCC function is compiled without AVX support, it won't clobber the high part of the %ymm registers itself, but you could be calling other functions that were compiled with AVX support. Those functions may clobber %ymm registers, and the ColdCC function has no way of saving them.
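
The AVX problem can be illustrated with a small C++ simulation. It is only a sketch under stated assumptions: a %ymm register is modelled as four 64-bit lanes, and Ymm, XmmSpill, saveLowHalf, restoreLowHalf, avxCalleeClobber, and coldFunctionWithoutAvx are hypothetical names invented for the illustration. No real vector instructions are involved.

```cpp
#include <array>
#include <cstdint>

// Toy model of one 256-bit %ymm register as four 64-bit lanes.
// Lanes 0-1 are the low half (the %xmm view); lanes 2-3 are the high half.
using Ymm = std::array<std::uint64_t, 4>;

// All a function compiled WITHOUT AVX can spill: the 128-bit %xmm view.
struct XmmSpill {
  std::uint64_t lo0;
  std::uint64_t lo1;
};

inline XmmSpill saveLowHalf(const Ymm &r) { return {r[0], r[1]}; }

inline void restoreLowHalf(Ymm &r, const XmmSpill &s) {
  r[0] = s.lo0;
  r[1] = s.lo1;
}

// Stand-in for a callee compiled WITH AVX that overwrites the full register.
inline void avxCalleeClobber(Ymm &r) { r.fill(0); }

// Hypothetical ColdCC function built without AVX: it saves what it can see,
// calls the AVX-enabled callee, then restores what it saved.
inline void coldFunctionWithoutAvx(Ymm &r) {
  XmmSpill spill = saveLowHalf(r);
  avxCalleeClobber(r);
  restoreLowHalf(r, spill);
}
```

After coldFunctionWithoutAvx returns, the low two lanes hold their original values, but the high two lanes hold whatever the callee left there, so the caller cannot assume the full %ymm register survived the call.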

/jakob