Bug ID 23515
Summary pointer to symbol invalidated when symbol vector size is increased
Product lldb
Version unspecified
Hardware PC
OS All
Status NEW
Severity normal
Priority P
Component All Bugs
Assignee lldb-dev@cs.uiuc.edu
Reporter verena@codeplay.com
Classification Unclassified
Created attachment 14321
Executable
When debugging the attached application remotely on an Android emulator, lldb
crashes because a pointer to a symbol has been invalidated.
To reproduce:
Requirements: adb, Android SDK with emulator.exe and an x86 device image
Steps:
* Start the emulated Android device
* Create a folder /data/symbol_bug on the device
* Move a.out and libsimple.so to /data/symbol_bug by typing
adb push a.out /data/symbol_bug
adb push libsimple.so /data/symbol_bug
* Set the permissions of the a.out file by typing
adb shell chmod 777 /data/symbol_bug/a.out
* Run
launch_tool.py -e <path_to_emulator.exe> -s <path_to_adb.exe> -d <name_of_device>
name_of_device will default to "Nexus_5_API_21_x86"
* Run lldb
* Type
gdb-remote 1234
disassemble -m
* Find the first instruction after the infinite loop (i.e. the line
"if (handle)")
* Type
register write eip <instruction_hex>
* Type "si" a few times
* After stepping into the call to dlclose (when it tries to print the
function's assembly) lldb crashes.
Explanation:
Symbols are stored in a vector, but they are referenced and passed around as
raw pointers. When a new symbol is added, it is appended to the symbol vector.
If the vector does not have enough capacity, it reallocates its storage, which
invalidates all existing pointers to the symbols.
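The following is a minimal sketch of that mechanism, not code taken from lldb.
The Symbol struct and the function name are hypothetical stand-ins. It records
the address of the first element as an integer, appends until the vector has to
grow, and reports whether the element now lives at a different address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for lldb_private::Symbol; only a name is needed here.
struct Symbol {
    std::string name;
};

// Appends symbols until the vector has to grow, then reports whether the
// address of element 0 changed. The old address is kept as an integer so the
// stale pointer itself is never dereferenced.
bool first_symbol_moved_after_growth(std::vector<Symbol> &symbols) {
    assert(!symbols.empty());
    const auto before = reinterpret_cast<std::uintptr_t>(&symbols[0]);
    const std::size_t old_capacity = symbols.capacity();

    // push_back reallocates once size() would exceed capacity().
    while (symbols.capacity() == old_capacity) {
        symbols.push_back(Symbol{"added_later"});
    }

    const auto after = reinterpret_cast<std::uintptr_t>(&symbols[0]);
    return before != after;
}
```

Any Symbol* taken before the growth would point into the old, freed buffer
afterwards, which matches the situation described in this report.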
In my example I use the "si" command to step into a function call. lldb then
tries to dump the assembly of the new function. During that call new symbols
are resolved and added because, as I understand it, symbols are resolved
lazily. However, this happens in the middle of Instruction::Dump, inside its
call to CalculateMnemonicOperandsAndCommentIfNeeded. Instruction::Dump takes
two SymbolContexts, which contain pointers to Symbols, and passes them to
Debugger::FormatDisassemblerAddress. By the time that function is called, the
pointers to Symbols are invalid. GetName is called on them, but the name reads
as 0xfeeefeee (the pattern the Windows heap writes over freed memory), so lldb
crashes.
To check my hypothesis, I reserved a lot of space for the m_symbols vector
initially, and that stopped the crash from happening.
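The effect of that experiment can be sketched as follows. This is an
illustration under assumed names (Symbol and pointer_survives_with_reserve are
hypothetical), not the change made to m_symbols. With enough capacity reserved
up front, later appends do not reallocate, so a pointer taken earlier stays
valid.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for lldb_private::Symbol.
struct Symbol {
    std::string name;
};

// Reserves room for all symbols up front, takes a pointer to the first one,
// appends `extra` more, and checks that the pointer still refers to the
// first element.
bool pointer_survives_with_reserve(std::size_t extra) {
    std::vector<Symbol> symbols;
    symbols.reserve(extra + 1);  // capacity() is now at least extra + 1

    symbols.push_back(Symbol{"first"});
    const Symbol *first = &symbols[0];

    for (std::size_t i = 0; i < extra; ++i) {
        // size() never exceeds the reserved capacity, so no reallocation.
        symbols.push_back(Symbol{"later"});
    }

    return first == &symbols[0] && first->name == "first";
}
```

This only hides the problem while the reserved capacity is large enough. Once
more symbols are added than were reserved, the vector reallocates again and the
pointers become invalid as before.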
I've included below the callstacks to the place where the vector is resized, as
well as where it crashes.
symbol vector resizing:
lldb_private::Symtab::AddSymbol
ObjectFileELF::ResolveSymbolForAddress
lldb_private::Module::ResolveSymbolContextForAddress
lldb_private::Address::Dump
DisassemblerLLVMC::SymbolLookup
DisassemblerLLVMC::SymbolLookupCallback
llvm::MCExternalSymbolizer::tryAddingSymbolicOperand
llvm::MCDisassembler::tryAddingSymbolicOperand
tryAddingSymbolicOperand
translateImmediate
translateOperand
translateInstruction
llvm::X86Disassembler::X86GenericDisassembler::getInstruction
DisassemblerLLVMC::LLVMCDisassembler::GetMCInst
InstructionLLVMC::CalculateMnemonicOperandsAndComment
lldb_private::Instruction::CalculateMnemonicOperandsAndCommentIfNeeded
lldb_private::Instruction::Dump
lldb_private::Disassembler::PrintInstructions
lldb_private::Disassembler::Disassemble
crash:
lldb_private::ConstString::operator bool
lldb_private::Mangled::GetDemangledName
lldb_private::Mangled::GetName
lldb_private::Symbol::GetName
lldb_private::Debugger::FormatDisassemblerAddress
lldb_private::Instruction::Dump
lldb_private::Disassembler::PrintInstructions
lldb_private::Disassembler::Disassemble