lldb test failures on 32bit

Hi Mike,

I think I’ve tracked down the sources of both of these problems.

The problem with not being able to call functions in the target seems to be a failure in the MCJIT relocation mechanism. Because LLDB is generating IR with absolute addresses for function calls, the JITed code contains relocations with absolute values rather than symbols. This is a problem I fixed a short time ago, but it seems to have come undone again (at least in this particular case). The attached ‘reloc-fix-32.patch’ (to be applied to the LLVM repository) should fix that.

I need to do a bit of investigation to settle some questions about why this condition came back or was specific to the 32-bit case before I commit this, but I think this is correct.

The problem where you lose source after stepping seems to be a matter of incorrect stack unwinding. There were two problems lurking here.

First, the RegisterContext::ConvertBetweenRegisterKinds() function wasn’t making any provision for a 32-bit inferior running on a 64-bit host. The way the x86-64 register context class is implemented, it defines 64-bit registers and 32-bit registers in the same RegisterInfo structure, and there is some overlap in how these get mapped to DWARF/GDB/GCC register numbers. RegisterContext::ConvertBetweenRegisterKinds() was just iterating through the list and returning the first match it found, which was the 64-bit register.

I added a special case to call RegisterContext::ConvertRegisterKindToRegisterNumber() when the target kind is eRegisterKindLLDB. This invokes the RegisterContext_x86_64 overload of that method which knows how to distinguish the 32-bit and 64-bit registers. I’m not convinced that this is the best way to solve this problem, but it works.
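To make the first-match problem concrete, here is a minimal sketch. The table, struct, and function names are hypothetical stand-ins, not LLDB's real RegisterInfo or its API; only the DWARF numbers are taken from the System V x86-64 and i386 ABIs. It shows how DWARF register 4 resolves to rsi when the 64-bit entries are searched first, while a lookup that knows the inferior is 32-bit resolves it to esp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, heavily simplified stand-in for a combined register table.
struct RegInfo {
    const char *name;
    uint32_t    dwarf_num;  // DWARF number in this register's own ABI
    bool        is_32bit;   // true for the i386 view of the register set
    uint32_t    lldb_num;   // index in the combined table
};

constexpr uint32_t kInvalidRegnum = UINT32_MAX;

// 64-bit entries come first, then the 32-bit entries (assumed ordering).
static const RegInfo kRegs[] = {
    {"rsi", 4, false, 0}, {"rsp", 7, false, 1}, {"rip", 16, false, 2},
    {"esp", 4, true, 3},  {"ebp", 5, true, 4},  {"eip", 8, true, 5},
};

// First-match lookup: ignores whether the inferior is 32-bit.
uint32_t FirstMatch(uint32_t dwarf_num) {
    for (const RegInfo &r : kRegs)
        if (r.dwarf_num == dwarf_num)
            return r.lldb_num;
    return kInvalidRegnum;
}

// Target-aware lookup: only considers entries for the inferior's word size.
uint32_t TargetAwareMatch(uint32_t dwarf_num, bool inferior_is_32bit) {
    for (const RegInfo &r : kRegs)
        if (r.is_32bit == inferior_is_32bit && r.dwarf_num == dwarf_num)
            return r.lldb_num;
    return kInvalidRegnum;
}
```

With this table, FirstMatch(4) yields the index of rsi, whereas TargetAwareMatch(4, true) yields the index of esp, which is what an unwinder reading i386 call frame information needs.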

The second issue was that the ABIMacOSX_i386 plug-in (which also gets used for 32-bit inferiors on Linux) was rejecting call frame addresses that weren’t 8-byte aligned whereas, at least on Linux, 4-byte alignment is allowed. If 32-bit processes on MacOSX require 8-byte alignment then we’ll need to do some additional checking, but for now I just modified it to only check for 4-byte alignment.
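As a rough illustration of the difference between the two alignment requirements, the sketch below uses a hypothetical helper (CfaIsAligned is not an LLDB function) that takes the required alignment as a parameter. The example address is made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when `cfa` is a multiple of `alignment`.
// `alignment` must be a non-zero power of two.
bool CfaIsAligned(uint64_t cfa, uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (cfa & (alignment - 1)) == 0;
}
```

A call frame address such as 0xffe770ec passes a 4-byte check but fails an 8-byte one, so an 8-byte check would reject a frame that is acceptable under the 4-byte rule described above.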

Both of the stack unwinding issues should be fixed by the attached ‘stack-fix-32.patch’ file.

Can you try out these patches and verify that they work for you?

Thanks,

Andy

reloc-fix-32.patch (579 Bytes)

stack-fix-32.patch (1.55 KB)

I applied both patches, and the 'expr (int)printf("blah\n")' statement now works, but “n” over the printf() statement in the code still throws me somewhere else entirely. I’m on the call printf() asm instruction down below, I type “ni”, and I wind up at address 0x80486f0.

Is this working for you?

And thanks for looking at this Andrew. Very cool we can call functions now in 32-bit targets…
-Mike

Hmmm…

The program I was testing with passed a format argument to printf, and that works. If I just pass a string, I do see the failure you describe below.

The logging output says it ran into an invalid RegisterContext while unwinding. I’ll take a closer look on Monday.

-Andy

I think the i386 problems are due to RegisterContext_i386 not doing its job:

bool
RegisterContext_i386::ReadAllRegisterValues(DataBufferSP &data_sp)
{
    return false;
}

bool
RegisterContext_i386::WriteAllRegisterValues(const DataBufferSP &data)
{
    return false;
}

It also seems that WriteGPR() and WriteFPR() are not implemented.

The ReadAllRegisterValues() call is used to back up the entire register state prior to running expressions, and WriteAllRegisterValues() is used to restore it after running expressions.
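The save-and-restore pattern described here can be sketched as follows. FakeRegisterContext and RunWithSavedRegisters are hypothetical names used only for illustration; they are not LLDB classes. The point is that if the read step returns false, as the stubs above do, the expression cannot be run safely.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a register context; not LLDB's real class.
struct FakeRegisterContext {
    std::vector<uint8_t> regs = std::vector<uint8_t>(16, 0);

    // Copies the whole register block out; returns false on failure.
    bool ReadAll(std::vector<uint8_t> &out) const {
        out = regs;
        return true;
    }

    // Writes a previously saved block back; rejects a size mismatch.
    bool WriteAll(const std::vector<uint8_t> &in) {
        if (in.size() != regs.size())
            return false;
        regs = in;
        return true;
    }
};

// Runs `clobber` (standing in for expression evaluation), then restores state.
template <typename Fn>
bool RunWithSavedRegisters(FakeRegisterContext &ctx, Fn clobber) {
    std::vector<uint8_t> saved;
    if (!ctx.ReadAll(saved))
        return false;  // a stub that always returns false stops here
    clobber(ctx);
    return ctx.WriteAll(saved);
}
```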

Actually, RegisterContext_i386 doesn't get used in the case of a 32-bit inferior on a 64-bit host. In that scenario we use RegisterContext_x86_64 and do some mapping under the covers for 32-bit targets.

-Andy

So we have two code paths for i386? One for native 32 on 32 and one for cross 32 on 64? Can/should this be fixed?

Greg

I think the issue is the behavior of ptrace. The data structures we need with ptrace are determined by the host regardless of the target. This is, of course, assuming a local target. I believe we can be sure we have that by the time the RegisterContext is created.

-Andy

Hi Mike,

I investigated this a little further and it seems that the problem is that when LLDB tries to unwind the stack from the point of the ‘jmp’ instruction in the printf stub it incorrectly calculates the call frame address. The log output indicates that it is using ESP+12 as the call frame address based on information from the FDE table. Consequently, it is looking for the return address at ESP+8, but that’s wrong. (I believe it’s actually at ESP-4.) It’s possible that this is still a matter of the register context getting some registers confused because of the 32-bit mapping but if so the point of failure is less obvious.

It turns out that the variation of the code that I had that seemed to be working was actually working for the wrong reason. It was still looking for the frame 1 pc in the wrong location, but that location just happened to contain a non-zero value so the validity check passed over it.

That’s as far as I’ve gotten in my investigation. I’m going to be on vacation for a week after tomorrow and I have some other things I need to get done before then, so if you want to pick this up from here feel free to do so. If not, I’ll try to get back to it in early August.

-Andy

I’m starting to look at this now. I think there are some symbol issues on 32-bit as well. When running the code below in x64, the “disassemble -n main” recognizes the symbol stub for printf for the call statement, and the “disassemble -a addr” works as well.

I’ll continue looking at this next week. Thanks Andy.
-Mike

mikesart@mikesart64:~/data/src/blah/build$ lldb -- hello_world
Current executable set to 'hello_world' (i386).

(lldb) b main
Breakpoint 1: where = hello_world`main + 24 at hello_world.cpp:6, address = 0x080485b8

(lldb) r
Process 5933 launched: '/home/mikesart/data/src/blah/build/hello_world' (i386)
Process 5933 stopped

* thread #1: tid = 5933, 0x080485b8 hello_world`main(argc=1, argv=0xffe770f4) + 24 at hello_world.cpp:6, name = 'hello_world', stop reason = breakpoint 1.1
    frame #0: 0x080485b8 hello_world`main(argc=1, argv=0xffe770f4) + 24 at hello_world.cpp:6
    3
    4 int main( int argc, char *argv[] )
    5 {
    -> 6 printf("hello world.\n");
    7 }

(lldb) disassemble -n main
hello_world`main at hello_world.cpp:5:
0x80485a0: pushl %ebp
0x80485a1: movl %esp, %ebp
0x80485a3: subl $0x18, %esp
0x80485a6: movl 0xc(%ebp), %eax
0x80485a9: movl 0x8(%ebp), %ecx
0x80485ac: leal 0x80486a0, %edx
0x80485b2: movl %ecx, -0x4(%ebp)
0x80485b5: movl %eax, -0x8(%ebp)
-> 0x80485b8: movl %edx, (%esp)
0x80485bb: calll 0x80484d0
0x80485c0: movl $0x0, %ecx
0x80485c5: movl %eax, -0xc(%ebp)
0x80485c8: movl %ecx, %eax
0x80485ca: addl $0x18, %esp
0x80485cd: popl %ebp
0x80485ce: ret

(lldb) disassemble -a 0x80484d0
error: Could not find function bounds for address 0x80484d0

(lldb) disassemble -s 0x80484d0
0x80484d0: jmpl *0x804a008
0x80484d6: pushl $0x10
0x80484db: jmp 0x80484a0 ; hello_world…plt + 0
hello_world`_start + 64:
0x80484e0: xorl %ebp, %ebp

In ObjectFileMachO, we tend to make up symbols for all of the PLT stubs that we have, and we give them a type of eSymbolTypeTrampoline. We do this by parsing the data in the Mach-O binary and making synthetic symbols. We actually take all undefined symbols (which are useless to us in the debugger) and turn them into trampoline symbols to make the symbols useful. This sounds like a fix that needs to happen in ObjectFileELF.cpp.

Greg

For the Linux 64-bit hello_world, it looks like the below. Is this what you would expect?

(lldb) disassemble -n main
hello_world`main at hello_world.cpp:5:
0x400770: pushq %rbp
0x400771: movq %rsp, %rbp
0x400774: subq $0x20, %rsp
0x400778: leaq 0x40089c, %rax
0x400780: movl %edi, -0x4(%rbp)
0x400783: movq %rsi, -0x10(%rbp)
-> 0x400787: movq %rax, %rdi
0x40078a: movb $0x0, %al
0x40078c: callq 0x400660 ; symbol stub for: printf
0x400791: movl $0x0, %ecx
0x400796: movl %eax, -0x14(%rbp)
0x400799: movl %ecx, %eax
0x40079b: addq $0x20, %rsp
0x40079f: popq %rbp
0x4007a0: ret

(lldb) disassemble -a 0x400660
hello_world`symbol stub for: printf:
0x400660: jmpq *0x20099a(%rip) ; _GLOBAL_OFFSET_TABLE_ + 24
0x400666: pushq $0x0
0x40066b: jmpq 0x400650 ; hello_world…plt + 0

For reference, gdb 7.6 looks like this:

(gdb) disassemble main
Dump of assembler code for function main(int, char**):
0x0000000000400770 <+0>: push rbp
0x0000000000400771 <+1>: mov rbp,rsp
0x0000000000400774 <+4>: sub rsp,0x20
0x0000000000400778 <+8>: lea rax,ds:0x40089c
0x0000000000400780 <+16>: mov DWORD PTR [rbp-0x4],edi
0x0000000000400783 <+19>: mov QWORD PTR [rbp-0x10],rsi
=> 0x0000000000400787 <+23>: mov rdi,rax
0x000000000040078a <+26>: mov al,0x0
0x000000000040078c <+28>: call 0x400660 <printf@plt>
0x0000000000400791 <+33>: mov ecx,0x0
0x0000000000400796 <+38>: mov DWORD PTR [rbp-0x14],eax
0x0000000000400799 <+41>: mov eax,ecx
0x000000000040079b <+43>: add rsp,0x20
0x000000000040079f <+47>: pop rbp
0x00000000004007a0 <+48>: ret
End of assembler dump.

(gdb) disassemble 0x400660
Dump of assembler code for function printf@plt:

Yes, it looks like 64 bit ELF is doing this correctly. So we probably need to modify the code to handle 32 bit and we are all set.

For the “n” statement not stepping over printf() in an i386 hello_world app, it looks like we don’t have unwind info for the printf plt.

The unwind entries do exist in the binary:

mikesart@mikesart-rad:~/data/src/blah/build32$ readelf --debug-dump=frames --wide hello_world
Contents of the .eh_frame section:

00000000 00000014 00000000 CIE
Version: 1
Augmentation: "zR"
Code alignment factor: 1
Data alignment factor: -4
Return address column: 8
Augmentation data: 1b

DW_CFA_def_cfa: r4 (esp) ofs 4
DW_CFA_offset: r8 (eip) at cfa-4
DW_CFA_nop
DW_CFA_nop

00000018 00000020 0000001c FDE cie=00000000 pc=080484a0..080484e0
DW_CFA_def_cfa_offset: 8
DW_CFA_advance_loc: 6 to 080484a6
DW_CFA_def_cfa_offset: 12
DW_CFA_advance_loc: 10 to 080484b0
DW_CFA_def_cfa_expression (DW_OP_breg4 (esp): 4; DW_OP_breg8 (eip): 0; DW_OP_lit15; DW_OP_and; DW_OP_lit11; DW_OP_ge; DW_OP_lit2; DW_OP_shl; DW_OP_plus)

But lldb either can’t find them or they’ve failed to load.
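For reference, the last rule in that FDE can be worked through by hand. The sketch below is a direct transcription of the DW_CFA_def_cfa_expression into C++, with hypothetical function names; it is only meant to show what the expression computes for addresses in the stub region of the .plt, and it relies on the CIE's rule that eip is saved at cfa-4.

```cpp
#include <cassert>
#include <cstdint>

// Evaluates the DW_CFA_def_cfa_expression from the FDE above, step by step:
//   DW_OP_breg4 (esp): 4    -> push esp + 4
//   DW_OP_breg8 (eip): 0    -> push eip
//   DW_OP_lit15; DW_OP_and  -> eip & 15 (offset inside a 16-byte stub)
//   DW_OP_lit11; DW_OP_ge   -> 1 if that offset is >= 11, else 0
//   DW_OP_lit2;  DW_OP_shl  -> 0 or 4
//   DW_OP_plus              -> esp + 4 + (0 or 4)
uint32_t PltStubCfa(uint32_t esp, uint32_t eip) {
    uint32_t past_push = ((eip & 15u) >= 11u) ? 1u : 0u;
    return esp + 4u + (past_push << 2);
}

// The CIE says the return address (eip) is stored at cfa-4.
uint32_t ReturnAddressSlot(uint32_t esp, uint32_t eip) {
    return PltStubCfa(esp, eip) - 4u;
}
```

Read this way, at the first instruction of the printf stub (0x80484d0) the CFA is esp+4 and the return address slot is at esp. At the final jmp of the stub (0x80484db), after the pushl, the CFA is esp+8.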

Unwind info does exist for addresses in main(), and all of this works as
expected in x64.

I'll start debugging where this is failing...

For x86 ELF files, the plt_entsize wasn't being rounded to the proper
alignment. This was causing the .plt symbols to be incorrect, along with
unwind info, etc. This patch fixes that:

http://llvm-reviews.chandlerc.com/D1189
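A small sketch of why the entry size matters for the synthesized .plt symbols. PltStubAddress is a hypothetical helper, and the layout (one reserved header entry followed by equally sized stubs) is an assumption made for illustration. The base address 0x80484a0 and the printf stub address 0x80484d0 come from the disassembly earlier in this thread; the sizes 4 and 16 are the i386 sh_entsize and the actual stub size reported later in the thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: address of the n-th PLT stub (0-based), assuming the
// section starts with one reserved header entry of the same size.
uint64_t PltStubAddress(uint64_t plt_base, uint64_t entry_size, uint64_t n) {
    assert(entry_size != 0);
    return plt_base + (n + 1) * entry_size;
}
```

With a 16-byte entry size, stub 2 lands on 0x80484d0, which matches the call target in main. With an entry size of 4, the same stub would be placed at 0x80484ac, so the symbol would not cover the address being called.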

The next problem is that we're using the x64 register set, but then calling
into the i386 ABI. That is, this call:

addr_t pc;
if (!ReadGPRValue (eRegisterKindGeneric, LLDB_REGNUM_GENERIC_PC, pc))   // <-- current line
{

Winds up here:

ExecutionContext exe_ctx(m_thread.shared_from_this());
Process *process = exe_ctx.GetProcessPtr();
if (have_unwindplan_regloc == false)
{
    // If a volatile register is being requested, we don't want to forward the next frame's register contents
    // up the stack -- the register is not retrievable at this frame.
    ABI *abi = process ? process->GetABI().get() : NULL;
    if (abi)
    {
        const RegisterInfo *reg_info = GetRegisterInfoAtIndex(lldb_regnum);   // <-- current line
        if (reg_info && abi->RegisterIsVolatile (reg_info))
        {
            UnwindLogMsg ("did not supply reg location for %d (%s) because it is volatile",
                          lldb_regnum, reg_info->name ? reg_info->name : "??");
            return UnwindLLDB::RegisterSearchResult::eRegisterIsVolatile;
        }
    }

Which calls into this function:

bool
ABIMacOSX_i386::RegisterIsCalleeSaved (const RegisterInfo *reg_info)
{
    if (reg_info)
    {
        // Saved registers are ebx, ebp, esi, edi, esp, eip
        const char *name = reg_info->name;
        if (name[0] == 'e')
        {

reg_info->name is "rip", and so ABIMacOSX_i386::RegisterIsCalleeSaved() is
returning false.

ABIMacOSX_i386.cpp looks like it does several things using register names.
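The effect of a name-based check can be sketched in isolation. IsCalleeSavedByName is a hypothetical, simplified function written in the spirit of the excerpt above, not the actual ABIMacOSX_i386 code; the register list is taken from the comment in that excerpt.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, simplified name-based check: only names starting with 'e'
// can match, so a 64-bit name such as "rip" is never treated as saved.
bool IsCalleeSavedByName(const char *name) {
    if (name == nullptr || name[0] != 'e')
        return false;
    static const char *const kSaved[] = {"ebx", "ebp", "esi", "edi", "esp", "eip"};
    for (const char *saved : kSaved)
        if (std::strcmp(name, saved) == 0)
            return true;
    return false;
}
```

Under this scheme "eip" is reported as saved but "rip" is not, even though both would refer to the program counter of a 32-bit inferior when the x86_64 register context is in use.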

Actually, RegisterContext_i386 doesn't get used in the case of a 32-bit
inferior on a 64-bit host. In that scenario we use RegisterContext_x86_64
and do some mapping under the covers for 32-bit targets.

Does this mean this is an issue with RegisterContext_x86_64 returning "rip"
and not "eip"?

Thanks.
-Mike

I realized Andrew’s reloc-fix-32.patch & stack-fix-32.patch weren’t checked in and I didn’t have them. Applying both of those with my patch below allows me to step over the 32-bit printf() calls now.

Are those patches what you hope to check in at some point Andrew?

And please let me know if it’s ok to check this in:

http://llvm-reviews.chandlerc.com/D1189

Thanks!
-Mike

Hi Mike,

A variation of reloc-fix-32.patch was committed today by Richard Mitton.

I need to do something with stack-fix-32.patch. If I’m reading the documentation correctly, it appears that stack frames really do need to be 8-byte aligned in Darwin even for 32-bit code.

Unfortunately, we’re using the same ABI plugin for 32-bit x86 code on Darwin and Linux and at the level where this check is performed we don’t know which we’re looking at. As far as I know, there really isn’t a significant difference between the two scenarios, so I don’t think we’d want a totally separate ABI plug-in. I just need to see if there’s an easy way to give it a little target information when it’s created so it can handle the special cases.

As for your patch, am I correct in thinking that I should ignore the history in that review? Can you explain the change to me? What values were you seeing for sh_entsize and sh_addralign?

-Andy

As for your patch, am I correct in thinking that I should ignore the history in that review?

Yeah, sorry. I need to clear arc out.

Can you explain the change to me? What values were you seeing for sh_entsize and sh_addralign?

On x64, both were 16. On i386, entsize was 4 and addralign was 16. Each .plt entry is 16 bytes in both cases.

Thanks!

OK. It seems like whatever generated that sh_entsize value for i386 is wrong, but I suppose it’s best to handle it as you have anyway.

The only problem I see with your patch is that according to the ELF spec zero is a valid value for sh_addralign, and if you pass that to llvm::RoundUpToAlignment it will get a divide-by-zero error. If you check for zero and substitute 1 in that case it should be good to commit.
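The guard being suggested might look roughly like this. RoundUp and PltEntrySize are hypothetical helpers written for this sketch (the division-based rounding is an assumption about how such a helper behaves, not a copy of the patch).

```cpp
#include <cassert>
#include <cstdint>

// Rounds `value` up to the next multiple of `align`; `align` must be non-zero.
uint64_t RoundUp(uint64_t value, uint64_t align) {
    assert(align != 0);
    return ((value + align - 1) / align) * align;
}

// Treats sh_addralign == 0 as "no alignment constraint", the same as 1,
// so the division above can never see a zero divisor.
uint64_t PltEntrySize(uint64_t sh_entsize, uint64_t sh_addralign) {
    uint64_t align = (sh_addralign == 0) ? 1 : sh_addralign;
    return RoundUp(sh_entsize, align);
}
```

With the values reported above, PltEntrySize(4, 16) gives 16 for i386 and PltEntrySize(16, 16) gives 16 for x64, while a zero sh_addralign leaves the entry size unchanged.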

-Andy

The one remaining big issue is that we are using the x86_64 register context, which waters itself down for i386. The DWARF and GCC register numbers need to use the i386 register numbering schemes; otherwise all info parsed from EH frame and DWARF will be incorrect when they don't match up. Is this fixed already, or does it still need to be fixed?

The x86_64 register context attempts to remap the registers on the fly when we're debugging a 32-bit target.

There is a problem in ConvertBetweenRegisterKinds where this can return the incorrect result (because it returns the first match and ignores the target).

I have a patch that adds a special case to handle that if the target kind is eRegisterKindLLDB, but I don't think that's in trunk yet. I suspect that the conversion will fail for other register kinds too, but that's the only one we've seen.

I think we need to put a short-term fix like that in place until we have time to re-work the i386 register context.

-Andy