Strange assembly behavior

Hi folks, gurus and experts –

I am not sure this is the proper list to post to, but I'll try nevertheless. I am getting strange .o output from an assembly file assembled by clang.

From:
.text
.text
.globl _ATL_UGEMV
.align 6
_ATL_UGEMV:

[…]
      addpd %xmm0, %xmm1
      movaps %xmm1, 112-128(%r9)

      sub $-128, %r9
      sub $-128, %rdx
      sub $8*2, %rbx
      jnz LOOPM

      cmp $0, %rcx
      jz MCLEAN

      mov %rcx, %rbx
LOOPMCU:
      movsd -128(%r9), %xmm1
      movsd -128(%rdx), %xmm0
[…]

      dec %rbx
      jnz LOOPMCU

MCLEAN:
      prefetchnta 12*8+64(%r8)
      add $12*8, %r8
      add %r15, %rdx
      mov %r11, %r9
      mov %rdi, %rbx
      sub $12, %rsi
      jnz LOOPN

      movq -8(%rsp), %rbp
      movq -16(%rsp), %rbx
      movq -24(%rsp), %r12
      movq -32(%rsp), %r13
      movq -40(%rsp), %r14
      movq -48(%rsp), %r15
      ret

I get this after ‘clang -x assembler’ (clang version 3.3 (trunk 173279)):

[…]
dmvn_sse.o[0x6f8]: addpd %xmm0, %xmm1
dmvn_sse.o[0x6fc]: movaps %xmm1, -16(%r9)
dmvn_sse.o[0x701]: subq $-128, %r9
dmvn_sse.o[0x705]: subq $-128, %rdx
dmvn_sse.o[0x709]: subq $16, %rbx
dmvn_sse.o[0x70d]: jne 0xc7 ; ATL_UGEMV + 199
dmvn_sse.o[0x713]: cmpq $0, %rcx
dmvn_sse.o[0x717]: je 0x71d ; ATL_UGEMV + 1821
dmvn_sse.o[0x71d]: movq %rcx, %rbx
dmvn_sse.o[0x720]: movsd -128(%r9), %xmm1
dmvn_sse.o[0x726]: movsd -128(%rdx), %xmm0
[…]
dmvn_sse.o[0x7e7]: decq %rbx
dmvn_sse.o[0x7ea]: jne 0x720 ; ATL_UGEMV + 1824

The ‘jz MCLEAN’ has been replaced by a jump to the next instruction, and the code after MCLEAN: is discarded and does not even appear in the .o file (if I am to believe lldb), so the function ends abruptly at the ‘jnz LOOPMCU’. Needless to say, this code fails to run. Could anybody tell me what’s wrong?

Thanks a lot!
Vincent

Hi Vincent,

A jump to the next instruction is often a side effect of looking at a non-relocated object that still has relocations to apply. Have you run this through objdump with the -r flag (objdump -dr $FILE) and seen if there is a relocation to apply to the jump?
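To illustrate the arithmetic behind that symptom, here is a minimal sketch in Python. It assumes the `je` at 0x717 was emitted in the 6-byte near form (opcode 0F 84 followed by a 32-bit displacement) and that the displacement bytes are still zero because the relocation has not been applied. Both points are assumptions, although they are consistent with the listing, where the next instruction starts at 0x71d. The helper name `jcc_near_target` is made up for this example.

```python
import struct

def jcc_near_target(insn_addr: int, encoding: bytes) -> int:
    """Return the branch target of a 6-byte near Jcc (0F 8x rel32)."""
    if len(encoding) != 6 or encoding[0] != 0x0F or (encoding[1] & 0xF0) != 0x80:
        raise ValueError("not a near Jcc rel32 encoding")
    # rel32 is a signed little-endian displacement.
    (rel32,) = struct.unpack("<i", encoding[2:6])
    # The displacement is relative to the address of the *next* instruction.
    return insn_addr + len(encoding) + rel32

# Assumed bytes for the `je` at 0x717: opcode 0F 84, displacement not yet filled in.
unrelocated_je = bytes([0x0F, 0x84, 0x00, 0x00, 0x00, 0x00])
print(hex(jcc_near_target(0x717, unrelocated_je)))  # 0x71d
```

With a zero displacement the computed target is simply the address of the following instruction, which is what a disassembler prints when it ignores the pending relocation.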

I’m concerned about the lack of the code after MCLEAN:, but it could be that the debugger is just not showing it to you for some reason - I’d check with objdump first, as that never lies :-)

Cheers,

James

Hi James!

Thanks. You’re right, of course: everything is there when I use objdump. I don’t understand why lldb won’t show the epilogue code. Thanks so much for the hint; by the way, -dr does not work: you have to type -d -r.

This is good news for clang, but not for me: I still have to figure out why this code fails. Maybe I’ll be back later…

Have a great day,
Cheers!
Vincent