I’m working on a system for analyzing the binary size of mobile applications, and I’m trying to match literal strings to the object files or archives they came from.
The link map alone is not sufficient for this, because multiple objects can provide redundant copies of the same symbol, only one of which will be used.
I need to get all of the symbols and literal strings from each object and then compare those with the values in the link map.
I can already properly attribute symbols because they have a stable identifier which I can get from the symbol table.
But for string literals, the link map has something like:
0x1136B3371 0x00000005 [ 32] literal string: host\\n\tport
The output from llvm-readobj -p, on the other hand, has something like:
[ 5a] host.port
llvm-readobj is mangling the whitespace characters, replacing each one with a period, so the two representations can’t be compared directly. Is llvm-readobj the right tool here? Or is there another option I haven’t considered?
The linker obviously has to read these strings itself, so it must be possible, but I’m hoping I don’t have to reverse-engineer LLD just to get the strings from an object.
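
To make the intended comparison concrete, here is a minimal Python sketch, not a working solution. It assumes the raw bytes of an object's string-literal section have already been obtained by some other means (for example a hex dump rather than a string dump). The function names are hypothetical, the link map line format is inferred only from the single example above, and the escaping convention is a guess that would need checking against real link map output.

```python
import re

# Pattern inferred from the one link map line shown above (an assumption):
#   <address> <size> [<file index>] literal string: <text>
LINKMAP_LITERAL = re.compile(
    r"^(0x[0-9A-Fa-f]+)\s+(0x[0-9A-Fa-f]+)\s+\[\s*(\d+)\]\s+literal string: (.*)$"
)

# Assumed escaping convention; the link map's real convention may differ.
_ESCAPES = {0x09: "\\t", 0x0A: "\\n", 0x0D: "\\r"}


def parse_linkmap_literal(line):
    """Return (address, size, file_index, text) or None if the line does not match."""
    match = LINKMAP_LITERAL.match(line)
    if match is None:
        return None
    address, size, file_index, text = match.groups()
    return int(address, 16), int(size, 16), int(file_index), text


def split_cstrings(section_bytes):
    """Split raw section bytes into (offset, raw_string) pairs on NUL terminators.

    Whitespace and other control bytes are kept exactly as stored.
    Empty strings are skipped in this sketch.
    """
    strings = []
    offset = 0
    for chunk in section_bytes.split(b"\0"):
        if chunk:
            strings.append((offset, chunk))
        offset += len(chunk) + 1  # +1 accounts for the NUL terminator
    return strings


def escape_for_compare(raw):
    """Render raw bytes with explicit escapes instead of collapsing them to '.'."""
    parts = []
    for byte in raw:
        if byte in _ESCAPES:
            parts.append(_ESCAPES[byte])
        elif 0x20 <= byte < 0x7F:
            parts.append(chr(byte))
        else:
            parts.append("\\x%02x" % byte)
    return "".join(parts)


if __name__ == "__main__":
    # Made-up section contents for illustration only.
    section = b"host\n\tport\0ok\0"
    for offset, raw in split_cstrings(section):
        print(hex(offset), escape_for_compare(raw))
```

The point of working from raw bytes is that newline and tab survive as distinct characters, so each string can be escaped in whatever form the link map turns out to use before comparing.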