Implementing stack probes

I am trying to implement stack probes for our SHAVE target, and I see that the compiler injects references to ‘__stack_chk_guard’ and ‘__stack_chk_fail’. The code that gets generated is horribly wrong, but in order to understand how to fix it I was wondering if there is a clear statement of how the mechanism is supposed to work?

The variable ‘__stack_chk_guard’ appears to be a pointer to an unsigned integer. Where is this supposed to reside, and what value should it contain? And the function ‘__stack_chk_fail’ is called when the test fails - presumably this just aborts.

We have done nothing to support these hooks, so the junk instructions that come out are unsurprising. I am presuming that we need to handle their lowering in a target specific way, but before I can do that I am wondering if there is a clear definition of the semantics of how the probe is supposed to work?

Thanks,

MartinO

I am trying to implement stack probes for our SHAVE target, and I see that
the compiler injects references to ‘__stack_chk_guard’ and
‘__stack_chk_fail’. The code that gets generated is horribly wrong, but in
order to understand how to fix it I was wondering if there is a clear
statement of how the mechanism is supposed to work?

__stack_chk_guard is loaded and the resulting value is stored on the
stack. The location on the stack contains the "canary". Before the
function returns, the canary and __stack_chk_guard are compared again. If
they compare unequal, __stack_chk_fail is called. Typically, the
implementation of __stack_chk_fail is expected to abort the program.

I believe your libc is responsible for implementing these symbols.
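To make the sequence above concrete, here is a minimal sketch in portable C that models what the inserted prologue and epilogue do. It is a simulation, not what the compiler emits: `demo_guard`, `demo_chk_fail`, `demo_frame` and `demo_protected_copy` are hypothetical stand-ins for `__stack_chk_guard`, `__stack_chk_fail` and a protected function's frame, and the failure handler counts instead of aborting so the behaviour can be observed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __stack_chk_guard: one word holding the
   reference value. A real runtime would pick it at startup. */
static uintptr_t demo_guard = (uintptr_t)0x5AFEC0DEu;

/* Hypothetical stand-in for __stack_chk_fail. A real one would abort;
   this one only counts so the sketch can be exercised. */
static int demo_fail_count = 0;
static void demo_chk_fail(void) { demo_fail_count++; }

/* Models a protected frame: the local buffer sits below the canary
   slot, so a linear overrun of buf reaches the canary first. */
struct demo_frame {
    char      buf[8];
    uintptr_t canary;
};

/* Copies n bytes into the frame with no bounds check on buf, then runs
   the epilogue check. Returns 1 if the canary survived, 0 otherwise. */
static int demo_protected_copy(const char *src, size_t n)
{
    struct demo_frame frame;

    frame.canary = demo_guard;           /* prologue: plant the canary  */

    assert(n <= sizeof frame);           /* keep the demo well-defined  */
    memcpy(&frame, src, n);              /* body: may overrun buf       */

    if (frame.canary != demo_guard) {    /* epilogue: compare again     */
        demo_chk_fail();
        return 0;
    }
    return 1;
}
```

Copying 8 bytes leaves the canary intact; copying `sizeof(struct demo_frame)` bytes overwrites it and triggers the failure path. Anything that corrupts memory without touching the canary slot passes the check.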

In our case we have to implement our own libc, as this is for an embedded multicore device, and this is why I need to understand the semantics of the mechanism. What the compiler inserts and what the runtime library support expects have to work together, and I do not understand the mechanism well enough to implement them.

However, from what you say, the “canary” is just a value (could be anything) that is planted in the function’s own stack frame, and on return if it is altered then the check decides that a stack overflow occurred and aborts.

But this means that a stack overflow that corrupts an adjacent data region which will be visited at another time will not be detected unless by coincidence something writes to that region during the lifetime of the function. It also means that if the function’s own stack is corrupted by overlapping writes, then its own return address may also be clobbered along with the canary, and there is no guarantee it will resume control at all.

This seems like a very error prone approach. Or am I misunderstanding the semantics?

In a simple runtime context like ours, the stack is a simple area of memory. We have a programmer specified stack begin address, and the programmer also determines how much stack to allocate to the program (specified at link-time).

What I would really like to do is test whether there is enough room on the stack before I use the reservation, and abort if there is not. Our stack grows from high address to low, so currently the function prologue and epilogue do something like:

    SP -= #bytes to reserve
    execute code
    SP += #bytes to reserve
    return

I would like:

    SP -= #bytes to reserve
    OPTIONAL: update SP high-water mark
    IF: SP < __top_of_stack THEN abort
    execute code
    SP += #bytes to reserve
    return

The symbol ‘__top_of_stack’ can be provided by the linker when laying out memory for the program. This approach would work well as there is no chance that the function’s stack frame will clobber data as the test and abort happen before the reserved stack is used.
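The proposed limit check can be sketched in C as a small model of the stack pointer. Everything here is an assumption made for illustration: `demo_stack`, `demo_reserve` and `demo_release` are hypothetical names, `limit` plays the role of the linker-provided `__top_of_stack`, and the model returns 0 where the real prologue would abort. One deliberate difference from the pseudocode: the test runs before SP is adjusted, so a very large reservation cannot wrap the subtraction around zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a downward-growing stack. */
struct demo_stack {
    uintptr_t sp;        /* current stack pointer                        */
    uintptr_t limit;     /* lowest usable address (__top_of_stack role)  */
    uintptr_t low_water; /* lowest SP seen so far, i.e. peak stack usage */
};

/* Prologue model: reserve `bytes` only if they fit above `limit`.
   Assumes sp >= limit on entry. Returns 1 on success, 0 where the
   real prologue would abort; SP is left unchanged on failure. */
static int demo_reserve(struct demo_stack *s, uintptr_t bytes)
{
    assert(s->sp >= s->limit);
    if (s->sp - s->limit < bytes)
        return 0;                  /* would dip below the limit */

    s->sp -= bytes;
    if (s->sp < s->low_water)
        s->low_water = s->sp;      /* optional usage tracking */
    return 1;
}

/* Epilogue model: give the reservation back. */
static void demo_release(struct demo_stack *s, uintptr_t bytes)
{
    s->sp += bytes;
}
```

With `sp = 0x2000` and `limit = 0x1000`, reserving `0x800` succeeds, a further `0x900` is rejected without moving SP, and after releasing the first reservation `low_water` still records `0x1800` as the deepest point reached.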

I don’t see how the ‘__stack_chk_*’ approach can do this, or ensure that the frame does not corrupt adjacent memory.

Thanks,

MartinO

Your understanding matches mine. It is not a particularly robust check, but can help detect some programming mistakes. PS4 has it on by default and we’ve caught some real bugs with it.

You could force it off for your target if you really don’t want it. Look for OPT_fstack_protector in Driver.

–paulr

Hi, it’s not about turning it on or off. I am trying to implement it for our target and it does not work yet, and I want to understand what it is supposed to do before I implement either the lowering or the runtime library support for it.

Thanks,

MartinO