I have just checked: startup.elf and realmode.elf are fine. Only a few
changes are required for the mainline kernel; one commit has to be
reverted from lld and a few patches have to be applied.
The only step where I used BFD is linking vmlinux; I manually set the LD
variable in the vmlinux_link() function. The vmlinux produced by lld doesn't
work yet. I will compare it to the one produced by GNU ld and try to figure
out what is wrong (maybe you can suggest some useful objdump flags?)
With objdump I would recommend looking at the program headers, in
particular the PT_LOADs, and at the dynamic symbol table. Anything in the
dynamic table is also worth scrutinizing. One thing to keep an eye out for
is addresses/offsets that look "weird"; e.g. maybe the LLD version thinks a
symbol has address 0 or some insane value, vs BFD/gold which has a more
reasonable value.
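As a rough sketch of that PT_LOAD comparison, the Python below reads the program headers straight out of two images and lists the fields that differ. It assumes little-endian ELF64 files (as for an x86-64 vmlinux); the names load_segments and diff_loads are made up for this example, and it is meant to complement the objdump output, not replace it.

```python
import struct
from typing import List, NamedTuple

PT_LOAD = 1
# ELF64 little-endian file header and program header layouts.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
PHDR = struct.Struct("<IIQQQQQQ")


class Load(NamedTuple):
    offset: int
    vaddr: int
    paddr: int
    filesz: int
    memsz: int


def load_segments(image: bytes) -> List[Load]:
    """Return the PT_LOAD program headers of a little-endian ELF64 image."""
    fields = EHDR.unpack_from(image, 0)
    ident = fields[0]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    phoff, phentsize, phnum = fields[5], fields[9], fields[10]
    loads = []
    for i in range(phnum):
        p_type, _flags, offset, vaddr, paddr, filesz, memsz, _align = \
            PHDR.unpack_from(image, phoff + i * phentsize)
        if p_type == PT_LOAD:
            loads.append(Load(offset, vaddr, paddr, filesz, memsz))
    return loads


def diff_loads(lld: List[Load], bfd: List[Load]) -> List[str]:
    """Describe field-by-field differences between two PT_LOAD lists."""
    notes = []
    if len(lld) != len(bfd):
        notes.append(f"PT_LOAD count: lld={len(lld)} bfd={len(bfd)}")
    for i, (a, b) in enumerate(zip(lld, bfd)):
        for name in Load._fields:
            x, y = getattr(a, name), getattr(b, name)
            if x != y:
                notes.append(f"PT_LOAD[{i}].{name}: lld={x:#x} bfd={y:#x}")
    return notes
```

Feed it the raw bytes of each vmlinux, e.g. open(path, "rb").read() for the LLD-linked and the BFD-linked file. Not every difference is a bug, since the two linkers may lay things out differently, but a zero or wildly different paddr/vaddr is a good place to start reading.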
Also, set up your system so that you rebuild/reinstall the bootloader too,
so that you can add printfs in there to home in on where the boot is going
wrong. The following workflow might be useful:
Step 1: add a printf to the bootloader to try to home in on the exact place
where things are going wrong
Step 2: rebuild/reinstall/reboot the new bootloader with the LLD-linked
kernel
Step 3: boot and observe the prints (or maybe things crashed before
reaching your print, which is just as useful to know)
Step 4: think about what you observed in Step 3, then go to Step 1, using
these results to inform the next set of prints to add
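If the prints go to a log you can capture (a serial console log from qemu, say), the "observe" part of Step 3 can be scripted. This is only a sketch: the marker strings are whatever you chose to print in Step 1, and last_marker_reached is a made-up helper name.

```python
from typing import Optional, Sequence


def last_marker_reached(log: str, markers: Sequence[str]) -> Optional[str]:
    """Return the last marker, in expected boot order, that appears in log.

    Stops at the first missing marker, so the result brackets the failure:
    the boot got at least this far and died before the next marker.
    """
    reached = None
    for marker in markers:
        if marker not in log:
            break
        reached = marker
    return reached
```

A result of None means the boot died before the first print, which (as noted in Step 3) is just as useful to know.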
With appropriate scripts (and a nice qemu setup), one iteration of this may
take about 10 minutes. You may have to repeat it around 20 times to pinpoint
the exact place where things are going wrong (e.g. "the bootloader is
crashing in the memcpy for the second PT_LOAD" or "the boot is failing
because the bootloader is reading from a bogus address that it got from
this part of the binary"). That is about 200 minutes, which isn't too bad.
One thing to keep in mind is that this is not like debugging a race
condition or other nasty nondeterministic bug. This should be quite
deterministic so you just have to be systematic and keep narrowing down
until you find where things go wrong. It just requires determination.
Once you have narrowed it down, you should have a clear indication of where
to look in the binary. Compare that part with the gold/bfd output, and
hopefully the discrepancy is pretty clear. Then you "just" need to figure
out why LLD produces this result and what to change to avoid the problem.
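One cheap way to do that comparison at the symbol level is to run nm on both binaries and diff the results. The sketch below assumes nm's default three-column output ("address type name") and only flags the two cases that are most likely to matter: a symbol that is 0 with LLD but nonzero with BFD, and a symbol that is missing from the LLD link. The function names are made up for this example.

```python
from typing import Dict, List


def parse_nm(text: str) -> Dict[str, int]:
    """Map symbol name to address from default `nm` output."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # undefined symbols have no address column
        addr, _kind, name = parts
        symbols[name] = int(addr, 16)
    return symbols


def suspicious_symbols(lld_nm: str, bfd_nm: str) -> List[str]:
    """List symbols that look wrong in the LLD link relative to BFD."""
    lld, bfd = parse_nm(lld_nm), parse_nm(bfd_nm)
    notes = []
    for name, addr in sorted(bfd.items()):
        if name not in lld:
            notes.append(f"{name}: missing from the LLD link")
        elif lld[name] == 0 and addr != 0:
            notes.append(f"{name}: 0 with LLD, {addr:#x} with BFD")
    return notes
```

Plain address differences are deliberately ignored here, since the two linkers need not produce the same layout; once the bootloader prints have pointed you at a region, a full address diff of just that region is more readable.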
One amazing tool if you are working with object files is "010 Editor" with
a "binary template" for ELF files. I think there is an ELF "binary
template" for 010 Editor floating
around the net, but the best one is Michael's one that he has evolved over
the years (ask him for it). If you haven't done so already, I recommend
that you sit down at Michael's desk one day and work with him to debug one
of these nasty "what is wrong with this binary and why?" issues so you can
see him do his thing; he's amazingly good at it.
Also, if you need a quick refresher about this x86 boot stuff (to be
somewhat oriented about the environment in which all this stuff is
happening), you may want to skim:
-- Sean Silva