Extraneous SYMTAB entries, what am I missing?

(If anyone knows of a better list/spot to ask, please feel free to refer me, I don’t want to add unnecessary noise here.)

I have a custom-written tool that parses the symbol table entries in dSYMs for cases where DWARF data is unavailable for a given symbol (I'm doing symbolication on Linux, but that's beside the point).

I’m finding certain cases where there will be multiple nlist entries in a row with the same n_value (in fact, all of the fields are the same), except for the index into the string table. It seems that the first entry points to the correct symbol name in the string table, then the following entry (or entries) point to what look like bogus names, i.e. things like “m_72b”, “m_72c”, “m_72e”, etc.
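A minimal sketch of how such runs could be detected, assuming 32-bit little-endian nlist records of 12 bytes each and that the raw symbol table and string table bytes have already been extracted from the dSYM. The helper names `read_name` and `find_duplicate_entries` are hypothetical, not taken from the actual tool.

```python
import struct
from collections import defaultdict

# Assumed layout of a 32-bit little-endian `struct nlist`:
# n_strx (uint32), n_type (uint8), n_sect (uint8), n_desc (int16), n_value (uint32)
NLIST_32 = struct.Struct("<IBBhI")


def read_name(strtab, n_strx):
    """Return the NUL-terminated string at offset n_strx in the string table."""
    end = strtab.index(b"\x00", n_strx)
    return strtab[n_strx:end].decode("utf-8", errors="replace")


def find_duplicate_entries(symtab, strtab):
    """Group entries that match in every field except n_strx.

    Returns {(n_type, n_sect, n_desc, n_value): [names in table order]}
    for every key that occurs more than once.
    """
    groups = defaultdict(list)
    last_start = len(symtab) - NLIST_32.size
    for offset in range(0, last_start + 1, NLIST_32.size):
        n_strx, n_type, n_sect, n_desc, n_value = NLIST_32.unpack_from(symtab, offset)
        key = (n_type, n_sect, n_desc, n_value)
        groups[key].append(read_name(strtab, n_strx))
    return {key: names for key, names in groups.items() if len(names) > 1}
```

Because names are appended in table order, the first element of each list is the entry that appears first in the symbol table.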

When I symbolicate using atos, it spits out the first (correct) symbol name. My tool gets this wrong, because the subsequent entries overwrite the entry in my search data structure for that particular n_value.
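A sketch of the "first entry wins" alternative, which would match the output I see from atos. That atos works this way is only my guess; the function name and the (name, n_value) input format are hypothetical.

```python
def build_address_map(entries):
    """Map each n_value to the first name seen for it, in table order.

    `entries` is an iterable of (name, n_value) pairs in symbol table order.
    """
    address_to_name = {}
    for name, n_value in entries:
        # setdefault stores the name only if n_value is not already present,
        # so later entries at the same address never replace the first one.
        address_to_name.setdefault(n_value, name)
    return address_to_name
```

With plain assignment (`address_to_name[n_value] = name`) the last entry for an address wins instead, which is the behavior my tool currently has.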

I checked this against what the lldb symbolication framework exposes via the Python interface, and it produces the same output as my tool.

This leads me to think that atos is coded to ignore all but the first entry it finds for a given address, whereas whatever symbolication engine lldb is exposing is following the same behavior my tool is.

Anyone have any idea what would be producing those extra nlist entries, or what they are?

Yes, there are many functions that can be "aliased" to one another using linker flags, and assembly code can create multiple visible labels at the same address.

When in doubt, dump the symbol table with "nm" and see what it says about these symbols. It will often include extra text to describe what the symbol is.

Greg

I would have liked to, however I’ve been unable to run nm on the dsym due to the good old:

…malformed object (offset field of section 0 in LC_SEGMENT command 3 not past the headers of the file)

error. I believe we spoke about this off list once before: it is caused by the fact that the dSYM does not include all of the data that was in the binary (of course), so nm performs some overzealous checks that fail. Unfortunately, even the version of nm that ships with Xcode 4.6.3 suffers from this deficiency.

Nonetheless, I’ve run my program outputting all of the nlist data for each entry:

nlist: 0x93c70, n_strx: 78191, name:_m_728, value: 0x4b9fc, desc: 0x0000, type: 0x000e, sect: 0x0001
nlist: 0x93c7c, n_strx: 78198, name:m_SVManager_Start, value: 0x4ba68, desc: 0x0000, type: 0x000e, sect: 0x0001
nlist: 0x93c88, n_strx: 78215, name:_m_729, value: 0x4ba68, desc: 0x0000, type: 0x000e, sect: 0x0001
nlist: 0x93c94, n_strx: 78222, name:m_SVManager_Disperse, value: 0x4bd98, desc: 0x0000, type: 0x000e, sect: 0x0001
nlist: 0x93ca0, n_strx: 78242, name:_m_72a, value: 0x4bd98, desc: 0x0000, type: 0x000e, sect: 0x0001
nlist: 0x93cac, n_strx: 78249, name:m_SVManager_Collect, value: 0x4d128, desc: 0x0000, type: 0x000e, sect: 0x0001
nlist: 0x93cb8, n_strx: 78268, name:_m_72b, value: 0x4d128, desc: 0x0000, type: 0x000e, sect: 0x0001

And as I said, these symbols differ only in their index into the string table. If these were aliases, wouldn’t the n_type field for the bogus-looking symbols be N_INDR (0xa) and not 0xe? Also, is it not meaningful that lldb’s symbolication returns the last referenced symbol name in the file, just as my tool does? (I’d be happy to include my code that uses lldb to do the lookups as a pastebin or similar.)
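For reference, a small sketch of how I am reading the n_type byte, using the mask and type values from <mach-o/nlist.h>. The function name `describe_n_type` and the returned dictionary layout are just illustrative.

```python
# Masks from <mach-o/nlist.h>.
N_STAB = 0xE0  # any of these bits set: debugging (stab) entry
N_PEXT = 0x10  # private external bit
N_TYPE = 0x0E  # mask for the type bits
N_EXT = 0x01   # external bit

# Values of the masked type bits.
TYPE_NAMES = {
    0x0: "N_UNDF",
    0x2: "N_ABS",
    0xA: "N_INDR",
    0xC: "N_PBUD",
    0xE: "N_SECT",
}


def describe_n_type(n_type):
    """Break an n_type byte into its stab, type, and visibility parts."""
    if n_type & N_STAB:
        # For stab entries the whole byte is a stab type, not a bit field.
        return {"stab": True, "type": None, "external": False, "private_external": False}
    masked = n_type & N_TYPE
    return {
        "stab": False,
        "type": TYPE_NAMES.get(masked, hex(masked)),
        "external": bool(n_type & N_EXT),
        "private_external": bool(n_type & N_PEXT),
    }
```

By this reading, the 0x0e in the entries above decodes as N_SECT with neither the external nor the private external bit set, for both the real names and the odd ones.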