You probably don't want to go down the same route that clang goes through
to write the object file. If you think yaml2coff is convoluted, the way
clang does it will just give you a headache. There are multiple
abstractions involved to account for different object file formats (ELF,
COFF, MachO) and output formats (Assembly, binary file). At least with
yaml2coff
I think your sentence got cut off there, but yeah, I just found AsmPrinter.cpp and
it is convoluted.
It's true that yaml2coff is using the COFFParser structure, but if you
look at the writeCOFF function in yaml2coff it's pretty bare-metal. The
logic you need will be almost identical, except that instead of checking
the COFFParser for the various fields, you'll check the existing
COFFObjectFile, which should have similar fields.
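To make the "bare-metal" part concrete, here is a minimal sketch of serializing one section-table entry (in Python for brevity; the real code would be C++ against LLVM's object library, and the helper name here is made up). The 40-byte field layout is taken from the PE/COFF spec.

```python
import struct

# 40-byte COFF section header, per the PE/COFF spec layout:
# Name[8], VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData,
# PointerToRelocations, PointerToLinenumbers, NumberOfRelocations,
# NumberOfLinenumbers, Characteristics.
SECTION_HEADER = struct.Struct("<8s6I2HI")

def write_section_header(name, virtual_size, virtual_address, size_of_raw_data,
                         pointer_to_raw_data, pointer_to_relocations=0,
                         pointer_to_linenumbers=0, number_of_relocations=0,
                         number_of_linenumbers=0, characteristics=0):
    """Serialize one section-table entry, little-endian as it sits in the file."""
    return SECTION_HEADER.pack(
        name.ljust(8, b"\x00"), virtual_size, virtual_address,
        size_of_raw_data, pointer_to_raw_data, pointer_to_relocations,
        pointer_to_linenumbers, number_of_relocations, number_of_linenumbers,
        characteristics)

# Example: a header for an injected .debug$H section.
hdr = write_section_header(b".debug$H", 0, 0, 0x80, 0x200)
```

The values you feed in would come straight from the fields of the existing COFFObjectFile (or, for the new entry, from whatever dumpbin shows for a clang-generated .debug$H).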
The only thing you need to do differently is, when writing the section table and
section contents, to insert a new entry. Since you're injecting a
section into the middle, you'll also probably need to push back the file
pointer of all subsequent sections so that they don't overlap. (E.g. if
the original sections are 1, 2, 3, 4, 5 and you insert between 2 and 3,
then the original sections 3, 4, and 5 each need their
PointerToRawData offset by the size of the new section.)
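That fix-up can be sketched like this (illustrative Python; `Section` and `insert_section` are hypothetical names, and a real implementation would also have to shift PointerToRelocations and friends the same way):

```python
from dataclasses import dataclass

@dataclass
class Section:
    # Just the two header fields the fix-up cares about.
    name: str
    pointer_to_raw_data: int
    size_of_raw_data: int

def insert_section(sections, index, new_section):
    """Insert new_section at `index`, then push back the file pointer of
    every later section by the new section's raw-data size so that the
    raw data regions don't overlap."""
    out = sections[:index] + [new_section] + sections[index:]
    for sec in out[index + 1:]:
        sec.pointer_to_raw_data += new_section.size_of_raw_data
    return out

# Insert between sections 2 and 3; the new section takes over the old
# file pointer of section 3, which then moves back by the new size.
secs = [Section("s1", 0x100, 0x40), Section("s2", 0x140, 0x40),
        Section("s3", 0x180, 0x40)]
new = Section(".debug$H", 0x180, 0x20)
out = insert_section(secs, 2, new)
```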
I have the PE/COFF spec open here, and I'm glad I read a bit of it so
I actually know what you are talking about... yeah, it doesn't seem too
complicated.
If you need to know what values to put for the other fields in a section
header, run `dumpbin /headers foo.obj` on a clang-generated object file
that has a .debug$H section already (e.g. run clang with
-emit-codeview-ghash-section, and look at the properties of the .debug$H
section and use the same values).
Thanks I will do that and then also look at how the CodeView part of the
code does it if I can't understand some of it.
The only invariant that needs to be maintained is that
Section[N]->PointerToRawData == Section[N-1]->PointerToRawData + Section[N-1]->SizeOfRawData.
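A quick way to sanity-check that invariant over a rewritten section table (illustrative Python; it assumes every section actually has raw data, so sections with PointerToRawData == 0, e.g. uninitialized data, would need to be skipped first):

```python
def check_contiguous(headers):
    """headers: (pointer_to_raw_data, size_of_raw_data) pairs in
    section-table order. True iff each section's raw data starts exactly
    where the previous section's raw data ends."""
    return all(ptr2 == ptr1 + size1
               for (ptr1, size1), (ptr2, _) in zip(headers, headers[1:]))

# Contiguous layout passes; a gap after the first section fails.
ok = check_contiguous([(0x200, 0x80), (0x280, 0x40), (0x2C0, 0x100)])
```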
Well, that, and all the sections need to end up in the final file... But I'm
hopeful.
Does anyone have timings for linking a big project like Chrome with this, so
that at least I know what kind of performance to expect?
My numbers are something like:
1 PDB per obj file: link.exe takes ~15 minutes and 16 GB of RAM;
lld-link.exe takes 2:30 minutes and ~8 GB of RAM.
around 10 PDBs per folder: link.exe takes 1 minute and 2-3 GB of RAM;
lld-link.exe takes 1:30 minutes and ~6 GB of RAM.
/DEBUG:FASTLINK: link.exe takes 40 seconds, but then there's a 20-second
load at the first breakpoint in the debugger, and we lose DIA support for
listing symbols.
incremental: link.exe takes 8 seconds, but it only kicks in for very minor
changes.
We have a non-negligible number of symbols used by some runtime systems.