symbol lookup bug

Hi everyone,

I found a bug in the symbol lookup in lldb, and as far as I can tell it's a serious design flaw in the way symbols are handled.
However, I can only reproduce it when debugging an Android target remotely, and I don't understand the symbol lookup mechanism sufficiently to be able to reproduce it on a local target. The bug relies on a certain number of symbol lookups and then another symbol lookup in very specific circumstances.
I am hoping that describing the bug in detail will allow someone who knows the mechanism to understand the issue and create a local test case.

Symbols are stored in a vector, but they are referenced and passed around as pointers. When a new symbol is added, it is appended to the symbol vector. If the vector does not have enough capacity, it reallocates its storage, which invalidates every pointer to the symbols already in it.
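
To illustrate the mechanism in isolation, here is a minimal sketch that does not use any lldb code. ToySymbol and AppendMovesStorage are hypothetical names made up for this example. It fills a std::vector to capacity, appends one more element, and checks that the storage moved, which is what leaves earlier pointers dangling.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

namespace realloc_demo {

// Hypothetical stand-in for lldb_private::Symbol; not the real class.
struct ToySymbol {
    std::string name;
};

// Returns true if appending past the current capacity moved the vector's
// storage, i.e. any ToySymbol* taken before the append would now dangle.
inline bool AppendMovesStorage() {
    std::vector<ToySymbol> symbols;
    symbols.push_back(ToySymbol{"main"});

    // Fill the vector exactly to its capacity so the next append must reallocate.
    while (symbols.size() < symbols.capacity())
        symbols.push_back(ToySymbol{"filler"});
    assert(symbols.size() == symbols.capacity());

    // Record the address as an integer so the stale pointer is never used.
    const auto before = reinterpret_cast<std::uintptr_t>(symbols.data());
    symbols.push_back(ToySymbol{"lazily_added"});
    const auto after = reinterpret_cast<std::uintptr_t>(symbols.data());

    return before != after;
}

} // namespace realloc_demo
```

The old address is only compared as an integer. Dereferencing a pointer taken before the append would be undefined behaviour, which matches the garbage name seen in the crash.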

In my example I use the "si" command to step into a function call. lldb then tries to dump the assembly of the new function. During that call new symbols are resolved and added, because symbols are resolved lazily, as I understand it. However, this happens in the middle of Instruction::Dump (in its call to CalculateMnemonicOperandsAndCommentIfNeeded). Instruction::Dump takes two SymbolContexts, which contain pointers to Symbols, and passes them to Debugger::FormatDisassemblerAddress. By the time that function is called, the pointers to Symbols are invalid. GetName is called on them, but the name reads as 0xfeeefeee, so it crashes.

To check my hypothesis, I reserved a lot of space for the m_symbols vector initially, and that stopped the crash from happening.
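
The effect of that experiment can be sketched outside lldb as follows. Again, ToySymbol and PointersSurviveWithReserve are hypothetical names for illustration only. Once enough capacity is reserved, appends that stay within it do not reallocate, so a pointer taken earlier stays valid.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace reserve_demo {

// Hypothetical stand-in for lldb_private::Symbol; not the real class.
struct ToySymbol {
    std::string name;
};

// Returns true if a pointer to the first element is still valid after
// `extra` further appends, given capacity was reserved up front.
inline bool PointersSurviveWithReserve(std::size_t extra) {
    std::vector<ToySymbol> symbols;
    symbols.reserve(extra + 1); // room for the first symbol plus all later ones
    assert(symbols.capacity() >= extra + 1);

    symbols.push_back(ToySymbol{"first"});
    const ToySymbol *first = &symbols[0];

    // None of these appends exceeds the reserved capacity, so none reallocates.
    for (std::size_t i = 0; i < extra; ++i)
        symbols.push_back(ToySymbol{"sym" + std::to_string(i)});

    return first == &symbols[0] && first->name == "first";
}

} // namespace reserve_demo
```

This only hides the problem: it holds as long as the number of symbols never exceeds the reserved capacity, so it is a way to confirm the diagnosis rather than a fix.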

I've included below the callstacks to the place where the vector is resized, as well as where it crashes.

Please let me know if you need any more information.
Thanks very much!

symbol vector resizing:
lldb_private::Symtab::AddSymbol
ObjectFileELF::ResolveSymbolForAddress
lldb_private::Module::ResolveSymbolContextForAddress
lldb_private::Address::Dump
DisassemblerLLVMC::SymbolLookup
DisassemblerLLVMC::SymbolLookupCallback
llvm::MCExternalSymbolizer::tryAddingSymbolicOperand
llvm::MCDisassembler::tryAddingSymbolicOperand
tryAddingSymbolicOperand
translateImmediate
translateOperand
translateInstruction
llvm::X86Disassembler::X86GenericDisassembler::getInstruction
DisassemblerLLVMC::LLVMCDisassembler::GetMCInst
InstructionLLVMC::CalculateMnemonicOperandsAndComment
lldb_private::Instruction::CalculateMnemonicOperandsAndCommentIfNeeded
lldb_private::Instruction::Dump
lldb_private::Disassembler::PrintInstructions
lldb_private::Disassembler::Disassemble

crash:
lldb_private::ConstString::operator bool
lldb_private::Mangled::GetDemangledName
lldb_private::Mangled::GetName
lldb_private::Symbol::GetName
lldb_private::Debugger::FormatDisassemblerAddress
lldb_private::Instruction::Dump
lldb_private::Disassembler::PrintInstructions
lldb_private::Disassembler::Disassemble

A repro test case (even if Android) or repro instructions would give
us much more to work with than your description.

Ok, I've filed a bug for this, which includes steps to reproduce:

https://llvm.org/bugs/show_bug.cgi?id=23515

  Verena

So symbols currently can't be resolved lazily, and if they are, then that is the bug. As you noticed, "Symbol *" pointers are handed out, so the vector can never be resized afterwards.

The symbol table is fetched during:

    virtual Symtab *
    ObjectFile::GetSymtab () = 0;

Each ObjectFile subclass must hand out a complete symbol table when it first parses and hands out a Symtab.
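
As a rough sketch of that contract (this is not LLDB's actual code; ToySymbol, ToySymtab, Finalize and LateAddIsRejected are hypothetical names), a table can accept symbols only during parsing and refuse additions once pointers may have been handed out:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace eager_demo {

// Hypothetical stand-in for lldb_private::Symbol; not the real class.
struct ToySymbol {
    std::string name;
};

// Hypothetical table: filled completely while parsing, then frozen.
class ToySymtab {
public:
    // Only allowed before the table is finalized; returns false afterwards.
    bool AddSymbol(ToySymbol symbol) {
        if (m_finalized)
            return false;
        m_symbols.push_back(std::move(symbol));
        return true;
    }

    // Called once parsing is done, before any pointer is handed out.
    void Finalize() { m_finalized = true; }

    // Hands out pointers only once the table can no longer grow.
    const ToySymbol *SymbolAtIndex(std::size_t idx) const {
        if (!m_finalized || idx >= m_symbols.size())
            return nullptr;
        return &m_symbols[idx];
    }

private:
    std::vector<ToySymbol> m_symbols;
    bool m_finalized = false;
};

// Returns true if a late addition is refused and the earlier pointer still works.
inline bool LateAddIsRejected() {
    ToySymtab symtab;
    const bool added = symtab.AddSymbol(ToySymbol{"main"});
    assert(added);
    symtab.Finalize();

    const ToySymbol *main_sym = symtab.SymbolAtIndex(0);
    const bool rejected = !symtab.AddSymbol(ToySymbol{"late"});
    return rejected && main_sym != nullptr && main_sym->name == "main";
}

} // namespace eager_demo
```

Because the vector never grows after Finalize, pointers returned by SymbolAtIndex stay valid for the lifetime of the table.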

Please fix the plug-in that is lazily adding symbols.

Greg