Apparently broken ASM with Boost/Python/Clang-14?

I’m diagnosing a SEGV that we only see with clang on our CI systems, and I’m trying to work out whether I’ve hit a bug in clang or whether we are doing something wrong. There is a Godbolt reproduction here. It needs -O3 to reproduce on Godbolt, although for some reason it already happens at -O2 on our systems.

Basically, with Clang 13 and 15 the code works fine and, with my limited knowledge of assembly, the output looks okay. With Clang 14 there are two very early calls to _Py_Dealloc, on output lines 12 and 15:

        xorps   xmm0, xmm0
        movups  xmmword ptr [rsp + 16], xmm0
        mov     qword ptr [rsp + 32], 0
        mov     qword ptr [rsp + 8], offset .L.str.1
        call    _Py_Dealloc

Am I correct in interpreting these as being wrong? Aside from the positioning, %rdi isn’t set up to pass the argument before these calls. It is set up for the later calls, which also check for empty pointers (lines 36, 43), and I understood that to be part of the calling convention.
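For reference, this is the shape I would expect each call to have. On x86-64 Linux (System V AMD64 ABI) the first pointer argument is passed in %rdi, so a call to _Py_Dealloc should be preceded by a null check, a refcount decrement, and a load of the object pointer into %rdi. A minimal sketch of that pattern follows; FakeObject, fake_dealloc and fake_xdecref are hypothetical stand-ins I made up so that it compiles without the CPython headers, and it only roughly mirrors what Py_XDECREF does:

```cpp
#include <cstddef>

// Hypothetical stand-in for PyObject: only the reference count matters here.
struct FakeObject {
    std::ptrdiff_t refcnt;
};

// Bookkeeping so the behaviour can be observed; not part of any real API.
static FakeObject* g_last_dealloc_arg = nullptr;
static int g_dealloc_calls = 0;

// Stand-in for _Py_Dealloc. Under the System V AMD64 ABI, `obj` is the first
// integer/pointer argument and is therefore passed in %rdi.
void fake_dealloc(FakeObject* obj) {
    g_last_dealloc_arg = obj;
    ++g_dealloc_calls;
}

// Roughly the shape of Py_XDECREF: null check, decrement, and a dealloc call
// only when the count reaches zero.
void fake_xdecref(FakeObject* obj) {
    if (obj != nullptr && --obj->refcnt == 0) {
        fake_dealloc(obj);
    }
}
```

Calling fake_xdecref twice on an object whose count starts at 2 should reach fake_dealloc exactly once, with the object's address as the argument. The early calls in the Clang 14 output have neither the check nor the argument setup.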

This only happens when -DNDEBUG is defined, so normally I’d put it down to the library (e.g. Boost), but the fact that it goes away in Clang 15 complicates that explanation. Is this a compiler bug that was fixed (and if so, how would I work out which one)? A problem in the Boost library? Undefined behaviour we are somehow relying on?

Which version of clang 14 do you have? I notice Godbolt is still using 14.0.0, while I have 14.0.6 locally (from Debian), and the failure doesn’t reproduce for me with 14.0.6. However, I don’t know whether this is due to other environmental differences or whether it’s actually a change between 14.0.0 and 14.0.6.
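If it’s awkward to check the exact point release on the CI machines, one option is to have the build report it itself. A small sketch, assuming you can compile and run an extra translation unit there; compiler_version is a hypothetical helper name, while the __clang_major__, __clang_minor__ and __clang_patchlevel__ macros are predefined by Clang:

```cpp
#include <string>

// Returns the version of the compiler that built this translation unit.
// Falls back to a placeholder string when not compiled with Clang.
std::string compiler_version() {
#if defined(__clang__)
    return "clang " + std::to_string(__clang_major__) + "." +
           std::to_string(__clang_minor__) + "." +
           std::to_string(__clang_patchlevel__);
#else
    return "not clang";
#endif
}
```

Printing that string from the failing job would show whether it was built with 14.0.0 or a later 14.0.x.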