PDB questions

Andrew_Kelley1 · August 31, 2018, 6:17am

Zachary,

Thanks for the help on IRC earlier. I’ve got code that can capture a stack trace and then, for each address, discover its module, function, source index, line, and column.

I still have a couple of loose ends though. Do you know what’s going on here?

There appear to be 8 bytes before every LineFragmentHeader. Here’s some of my own debug output, which matches llvm-pdbutil’s output. You can see it says “unknown bytes: …”.

read C13 line info 136720 bytes
unknown bytes: f2 00 00 00 60 00 00 00
LineFragmentHeader{ .RelocOffset = 0, .RelocSegment = 5, .Flags = LineFlags{ .LF_HaveColumns = true, .unused = 0 }, .CodeSize = 52 }
has column: true
LineBlockFragmentHeader{ .NameIndex = 0, .NumLines = 6, .BlockSize = 84 }
LineNumberEntry{ .Offset = 0, .Flags = 101 } Flags{ .Start = 101, .End = 17, .IsStatement = false }

<snip some LineNumberEntry’s>
ColumnNumberEntry{ .StartColumn = 5, .EndColumn = 0 }
ColumnNumberEntry{ .StartColumn = 30, .EndColumn = 0 }
unknown bytes: f2 00 00 00 f0 00 00 00
LineFragmentHeader{ .RelocOffset = 64, .RelocSegment = 5, .Flags = LineFlags{ .LF_HaveColumns = true, .unused = 0 }, .CodeSize = 366 }
has column: true
LineBlockFragmentHeader{ .NameIndex = 8, .NumLines = 18, .BlockSize = 228 }
LineNumberEntry{ .Offset = 0, .Flags = 53 } Flags{ .Start = 53, .End = 20, .IsStatement = false }
LineNumberEntry{ .Offset = 20, .Flags = 54 } Flags{ .Start = 54, .End = 24, .IsStatement = false }

Do you know what’s going on with these 8 bytes? I have scoured llvm-pdbutil’s source but I cannot find where these bytes are coming from.
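One way to test a guess about these bytes is to decode them as two little-endian 32-bit integers and compare the second one against the size of what follows. The sketch below does that for the values in the dump above. It assumes a 12-byte LineFragmentHeader (4-byte RelocOffset, 2-byte RelocSegment, 2-byte Flags, 4-byte CodeSize), which is an assumption about the layout rather than something stated in this thread. `read_u32_le`, `PrefixGuess`, and `check_prefix` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: read a little-endian 32-bit value from raw bytes.
inline std::uint32_t read_u32_le(const unsigned char *p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Assumed on-disk size of LineFragmentHeader: u32 + u16 + u16 + u32.
constexpr std::uint32_t kLineFragmentHeaderSize = 12;

struct PrefixGuess {
    std::uint32_t first;        // first 4 bytes, possibly a kind tag
    std::uint32_t second;       // next 4 bytes, possibly a byte length
    bool second_matches_size;   // second == header size + BlockSize?
};

// Decode the 8 "unknown" bytes and check whether the second word equals the
// size of the LineFragmentHeader plus the BlockSize of the block after it.
inline PrefixGuess check_prefix(const unsigned char *bytes,
                                std::uint32_t block_size) {
    assert(bytes != nullptr);
    PrefixGuess g;
    g.first = read_u32_le(bytes);
    g.second = read_u32_le(bytes + 4);
    g.second_matches_size = (g.second == kLineFragmentHeaderSize + block_size);
    return g;
}
```

For both dumps above the check succeeds: 0x60 is 12 + 84 and 0xf0 is 12 + 228. That suggests the second word is a length covering the fragment that follows, and that the constant first word (0xf2) is a kind tag, which would fit an 8-byte kind/length record header such as the DebugSubsectionHeader quoted later in this thread.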

Is there a simpler way to find out which is the /names (string table) stream index without porting the entire hash table implementation?

Andrew_Kelley1 · August 31, 2018, 6:49am

One more:

For the purpose of mapping a source file index to a string, I found this code:

Expected<codeview::DebugChecksumsSubsectionRef>
ModuleDebugStreamRef::findChecksumsSubsection() const {
  codeview::DebugChecksumsSubsectionRef Result;
  for (const auto &SS : subsections()) {
    if (SS.kind() != DebugSubsectionKind::FileChecksums)
      continue;

    if (auto EC = Result.initialize(SS.getRecordData()))
      return std::move(EC);
    return Result;
  }
  return Result;
}

Subsections() is populated here:

  if (auto EC = Reader.readSubstream(C13LinesSubstream, C13Size))
    return EC;

  BinaryStreamReader SymbolReader(SymbolsSubstream.StreamData);
  if (auto EC =
          SymbolReader.readArray(SymbolArray, SymbolReader.bytesRemaining()))
    return EC;

  BinaryStreamReader SubsectionsReader(C13LinesSubstream.StreamData);
  if (auto EC = SubsectionsReader.readArray(Subsections,
                                            SubsectionsReader.bytesRemaining()))
    return EC;

So it looks like there should be one of these just after the C13Lines substream:

struct DebugSubsectionHeader {
  support::ulittle32_t Kind;   // codeview::DebugSubsectionKind enum
  support::ulittle32_t Length; // number of bytes occupied by this record.
};

But when I look there with my own code I only see zeroes:

read C13 line info 142964 bytes
DebugSubsectionHeader{ .Kind = DebugSubsectionKind.None, .Length = 0 }
DebugSubsectionHeader{ .Kind = DebugSubsectionKind.None, .Length = 0 }
DebugSubsectionHeader{ .Kind = DebugSubsectionKind.None, .Length = 0 }
DebugSubsectionHeader{ .Kind = DebugSubsectionKind.None, .Length = 0 }
DebugSubsectionHeader{ .Kind = DebugSubsectionKind.None, .Length = 0 }
DebugSubsectionHeader{ .Kind = DebugSubsectionKind.None, .Length = 0 }
DebugSubsectionHeader{ .Kind = DebugSubsectionKind.None, .Length = 0 }

Any clues?

Zachary_Turner1 · August 31, 2018, 2:06pm

For the first and third questions, the easiest thing to do would be to run llvm-pdbutil under a debugger and step through the code. Code that looks simple and innocuous can often have a lot of stuff hidden behind it. For example, you could step through the loop that iterates the debug subsections, look at the value of Reader.getOffset() each time, and see whether it matches your own code (probably it doesn’t). Or you could dump the entire contents of the C13 substream and see whether the bytes match between llvm-pdbutil and your own implementation. It looks like you’re reading all 0s, so maybe you’re just not reading the right data.
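To make that comparison concrete, here is a minimal sketch of walking {Kind, Length} headers through a buffer and recording the offset of each one, so the offsets can be checked against what the debugger shows. It assumes the buffer holds exactly the bytes of the C13 lines substream and that the walk starts at its first byte, since the quoted LLVM code builds SubsectionsReader over C13LinesSubstream.StreamData itself, not over the data that follows it. It also assumes that Length already includes any padding, so the next header begins right after Length payload bytes. `SubsectionInfo`, `read_u32_le`, and `walk_subsections` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One {Kind, Length} header found at a given byte offset of the C13 data.
struct SubsectionInfo {
    std::size_t offset;
    std::uint32_t kind;
    std::uint32_t length;
};

// Hypothetical helper: read a little-endian 32-bit value from raw bytes.
inline std::uint32_t read_u32_le(const std::uint8_t *p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Walk consecutive 8-byte headers, skipping Length bytes of payload each time.
// Assumption: Length already accounts for padding between subsections.
inline std::vector<SubsectionInfo>
walk_subsections(const std::vector<std::uint8_t> &c13) {
    std::vector<SubsectionInfo> found;
    std::size_t pos = 0;
    while (c13.size() - pos >= 8) {
        const std::uint32_t kind = read_u32_le(c13.data() + pos);
        const std::uint32_t length = read_u32_le(c13.data() + pos + 4);
        const std::size_t payload = pos + 8;
        if (length > c13.size() - payload)
            break;  // header claims more bytes than remain; stop, don't overrun
        found.push_back(SubsectionInfo{pos, kind, length});
        pos = payload + length;
        assert(pos <= c13.size());
    }
    return found;
}
```

If the recorded offsets diverge from the ones seen in the debugger, or every header decodes as zero, the starting offset of the buffer is the first thing to suspect.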

For the second question, unfortunately I don’t know of a better way. If the /names stream starts with a magic header, maybe you could walk each stream looking for that. But it may be possible to get a rare false positive that way.
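A minimal sketch of that scan follows, assuming each stream has already been read into its own byte buffer. The magic value 0xEFFEEFFE is an assumption here: I believe it matches the string table signature in LLVM's PDB headers, but check it against your LLVM version. `starts_with_magic` and `find_names_candidates` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed magic for the string table header; verify against LLVM's sources.
constexpr std::uint32_t kStringTableMagic = 0xEFFEEFFE;

// Hypothetical helper: true if the buffer begins with the magic, read as a
// little-endian 32-bit value.
inline bool starts_with_magic(const std::uint8_t *data, std::size_t size) {
    assert(data != nullptr || size == 0);
    if (size < 4)
        return false;
    const std::uint32_t value =
        static_cast<std::uint32_t>(data[0]) |
        (static_cast<std::uint32_t>(data[1]) << 8) |
        (static_cast<std::uint32_t>(data[2]) << 16) |
        (static_cast<std::uint32_t>(data[3]) << 24);
    return value == kStringTableMagic;
}

// Return the indices of all streams that start with the magic. More than one
// hit is possible, so the caller still has to pick among the candidates.
inline std::vector<std::size_t> find_names_candidates(
    const std::vector<std::vector<std::uint8_t>> &streams) {
    std::vector<std::size_t> candidates;
    for (std::size_t i = 0; i < streams.size(); ++i) {
        if (starts_with_magic(streams[i].data(), streams[i].size()))
            candidates.push_back(i);
    }
    return candidates;
}
```

Returning every match rather than the first keeps the false-positive case visible: if the list has more than one entry, further validation of the header fields would be needed before trusting any of them.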

BTW, have you considered just using LLVM’s library rather than porting it? It certainly seems like less work.

Andrew_Kelley1 · August 31, 2018, 4:52pm

Thanks for the advice. I’ll examine llvm-pdbutil’s behavior with a debugger.

> BTW, have you considered just using LLVM’s library rather than porting it? It certainly seems like less work.

The point of these stack traces is that they go into the userland runtime code. So if I did this, it would cause my users’ programs to depend on LLVM. I don’t think that’s right. I already have working stack traces for Linux and macOS that don’t depend on any libraries and don’t add more than ~20KB to the runtime size.

Andrew_Kelley1 · September 2, 2018, 10:34pm

Thanks again for your help, Zachary.

Here’s a screenshot of stack traces working on Windows!
https://i.imgur.com/eOQO0GT.png

I owe you some PDB documentation patches.
