Is lld the linker we need for our project?

Hi,

We are currently developing an LLVM-based compilation toolchain
for a microcontroller, and would like some advice on whether
we should use lld as the linker.

So far we have managed to write a basic target handler that reads
ELF files generated by llc and links them (relocations seem to
be applied correctly).
However, we have some target-specific requirements:
- the program will be loaded into memory as-is, there is no
   runtime linker
- data and code must be located in different areas of memory
   so we should be able to specify a different base address for
   different sections
- parts of the code (such as interrupt vectors) must be located
   at specific addresses
- we will also need to be able to specify (in a map file) the
   address of any function or static variable

Are any of these requirements incompatible with lld? If not, do
you have any hints about how to implement them (or pointers to
lld documentation I might have missed)?

Thanks a lot in advance,
Rod

It sounds like you need linker scripts and objcopy -O binary.

lld currently has very limited linker script support. Support for
layout using linker scripts would need to be completed. It is
currently stubbed out in DefaultLayout.h:class ScriptLayout. I've
cc'ed Shankar who knows more about how that is intended to work.

Both Shankar and I intend to implement linker scripts eventually, but
it's not a high priority for me right now.

So, to answer your question: if you need a linker right now, lld isn't
really going to work for you, but your use case is definitely in scope
for lld.

- Michael Spencer

Hi,

Thanks a lot for your answer. It seems lld is still the best
solution, even if it does not work "right out of the box" for
us today.

We already have a solution for the "objcopy" part (added the
required output format to llvm-objdump).

The ScriptLayout class seems to be empty for now (on the master
branch at least), but we do not need linker scripts today.
All that is required for now is to be able to assign a fixed
address to a few atoms (the ones that will hold the reset &
interrupt vectors) and place code/data sections in code/data
memory (so we can simulate generated code and fix and optimize
our LLVM target).
I guess that can be done by adapting the DefaultLayout code in
our own Layout class, but any hint or documentation about how
to do this in a clean manner is welcome.

I have another question: are there any plans concerning debug
information? In v3.4, the documentation says lld does not
support DWARF info, although there are debug-related sections
in lld's output.

Thanks a lot in advance,
Rod

> Hi,
>
> Thanks a lot for your answer. It seems lld is still the best
> solution, even if it does not work "right out of the box" for
> us today.
>
> We already have a solution for the "objcopy" part (added the
> required output format to llvm-objdump).

Interesting solution. Would it be much work to implement an
llvm-objcopy that only supports -O binary? I plan to fully implement
objcopy one of these days, but having just that as a start would be
very useful.

> The ScriptLayout class seems to be empty for now (on the master
> branch at least), but we do not need linker scripts today.
> All that is required for now is to be able to assign a fixed
> address to a few atoms (the ones that will hold the reset &
> interrupt vectors) and place code/data sections in code/data
> memory (so we can simulate generated code and fix and optimize
> our LLVM target).
> I guess that can be done by adapting the DefaultLayout code in
> our own Layout class, but any hint or documentation about how
> to do this in a clean manner is welcome.

If you're willing to hard-code your target into lld, that works fine.
It's what we are currently doing for the default linker script for
glibc/linux. There's not really any documentation about how to do
this. I'll try and figure out the basics tomorrow, as it's quite late.
I do know that there will be some issues with getting the base segment
to not start at the ELF file header. Shankar should also be familiar
with this.

> I have another question: are there any plans concerning debug
> information ? In v3.4, the documentation says lld does not
> support DWARF info, although there are debug-related sections
> in lld's output.

lld currently passes through DWARF sections correctly and executables
are debuggable. However, it does not merge debug info or construct any
acceleration tables. So debugging works fine; you just get giant debug
sections.

> Thanks a lot in advance,
> Rod

- Michael Spencer

Hi again,

>> We already have a solution for the "objcopy" part (added the
>> required output format to llvm-objdump).
>
> Interesting solution. Would it be much work to implement an
> llvm-objcopy that only supports -O binary? I plan to fully implement
> objcopy one of these days, but having just that as a start would be
> very useful.

I must admit I didn't try to implement a clean solution for this:
we only needed the code in some sort of hex format (and in the format
accepted by Verilog's $readmemh system task) to be used as input
for our ISS, and adding an option and the associated dump function
to llvm-objdump was very easy.

Writing a simple llvm-objcopy tool (with the necessary output formats)
would indeed be a cleaner way of solving that problem and is probably
not a lot of work (I will have time for this next week).
I'm not sure how it should cope with non-adjacent sections in the
binary format, though...
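
One possible way to cope with non-adjacent sections, sketched purely
as an illustration (this is not code from any existing tool): sort the
sections by load address, start the image at the lowest address, pad
each gap with a fill byte, and refuse to produce an image that would
grow past a size limit. The Section struct, flattenToBinary, the 0xFF
fill value and the 1 MiB limit are all assumptions made for this
example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical in-memory view of one loadable section.
struct Section {
  uint64_t address;            // load address of the first byte
  std::vector<uint8_t> bytes;  // section contents
};

// Flatten sections into one raw image that starts at the lowest load
// address. Gaps are padded with `fill`; overlapping sections and images
// larger than `maxSize` are rejected.
std::vector<uint8_t> flattenToBinary(std::vector<Section> sections,
                                     uint8_t fill = 0xFF,
                                     uint64_t maxSize = 1u << 20) {
  std::vector<uint8_t> image;
  if (sections.empty())
    return image;

  std::sort(sections.begin(), sections.end(),
            [](const Section &a, const Section &b) {
              return a.address < b.address;
            });

  const uint64_t base = sections.front().address;
  for (const Section &s : sections) {
    const uint64_t offset = s.address - base;
    if (offset < image.size())
      throw std::runtime_error("overlapping sections");
    if (offset + s.bytes.size() > maxSize)
      throw std::runtime_error("image too large (gap between sections?)");
    image.resize(offset, fill);  // pad the gap, if any
    image.insert(image.end(), s.bytes.begin(), s.bytes.end());
  }
  assert(image.size() <= maxSize);
  return image;
}
```

With two 2-byte sections at 0x100 and 0x110, this yields an 18-byte
image with 14 fill bytes in between. The size limit is the crude part:
code and data living in distant memories would need either one image
per memory or a format that carries addresses.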

>> [...]

> lld currently passes through DWARF sections correctly and executables
> are debuggable. However, it does not merge debug info or construct any
> acceleration tables. So debugging works fine; you just get giant debug
> sections.

That's fine for us (and excellent news, since I believed debug info
was not processed at all).

Thanks a lot,
Rod

Rodolphe,

Your work in this direction is much appreciated, as I will eventually
be using lld for a microcontroller project as well. So far
I’m working on the compiler back-end and using the vendor’s linker,
but switching to lld will be the next logical step.

Cheers, Kuba Ober

>> Hi,
>>
>> Thanks a lot for your answer. It seems lld is still the best
>> solution, even if it does not work "right out of the box" for
>> us today.
>>
>> We already have a solution for the "objcopy" part (added the
>> required output format to llvm-objdump).
>
> Interesting solution. Would it be much work to implement an
> llvm-objcopy that only supports -O binary? I plan to fully implement
> objcopy one of these days, but having just that as a start would be
> very useful.
>
>> The ScriptLayout class seems to be empty for now (on the master
>> branch at least), but we do not need linker scripts today.
>> All that is required for now is to be able to assign a fixed
>> address to a few atoms (the ones that will hold the reset &
>> interrupt vectors) and place code/data sections in code/data
>> memory (so we can simulate generated code and fix and optimize
>> our LLVM target).
>> I guess that can be done by adapting the DefaultLayout code in
>> our own Layout class, but any hint or documentation about how
>> to do this in a clean manner is welcome.
There are two ways to get this to work:

(a)
The requirement is that an output section can be given a fixed address by using --section-start.

The way I thought of going about this was to have two classes:

class AtomSectionAttributes {
public:
  virtual ~AtomSectionAttributes() {}
  virtual StringRef getOutputSectionName() = 0;
};

class MergedSectionAttributes {
public:
  virtual ~MergedSectionAttributes() {}
  virtual uint64_t getAddress() = 0;
  virtual StringRef name() = 0;
  virtual bool needsNewProgramHeader() = 0;
};

When the user says --section-start .text=0xf000000, lld could create a merged section up front and fill in its MergedSectionAttributes.

assignVirtualAddress would then change to consult the MergedSectionAttributes and set the appropriate virtual address.
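
To make the idea in (a) concrete, here is a minimal sketch in plain
C++ that is independent of lld's real classes: fixed start addresses
are collected into a map, and the address assignment loop consults
that map before falling back to "next aligned address". OutputSection,
parseSectionStart and assignVirtualAddresses are hypothetical names
invented for this example, the NAME=ADDRESS argument form is assumed,
and no overlap checking is done.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a merged output section.
struct OutputSection {
  std::string name;
  uint64_t size;
  uint64_t align;    // must be a power of two
  uint64_t address;  // filled in by assignVirtualAddresses
};

// Parse a "NAME=ADDRESS" argument such as ".text=0xf000000".
std::pair<std::string, uint64_t> parseSectionStart(const std::string &arg) {
  const std::size_t eq = arg.find('=');
  if (eq == std::string::npos || eq == 0 || eq + 1 == arg.size())
    throw std::invalid_argument("expected NAME=ADDRESS: " + arg);
  // Base 0 lets std::stoull accept decimal, octal and 0x-prefixed hex.
  const uint64_t addr =
      static_cast<uint64_t>(std::stoull(arg.substr(eq + 1), nullptr, 0));
  return std::make_pair(arg.substr(0, eq), addr);
}

// Walk the sections in order. A section listed in `fixed` starts at the
// requested address; any other section follows the previous one, rounded
// up to its alignment.
void assignVirtualAddresses(std::vector<OutputSection> &sections,
                            const std::map<std::string, uint64_t> &fixed,
                            uint64_t base) {
  uint64_t cursor = base;
  for (OutputSection &s : sections) {
    assert(s.align != 0 && (s.align & (s.align - 1)) == 0);
    const auto it = fixed.find(s.name);
    if (it != fixed.end())
      cursor = it->second;  // user-requested start address
    else
      cursor = (cursor + s.align - 1) & ~(s.align - 1);
    s.address = cursor;
    cursor += s.size;
  }
}
```

For example, with .text fixed at 0xf000000 and .data fixed at
0x20000000, a .rodata section listed after .text is placed right
behind it, and a .bss section listed after .data follows .data. A real
implementation would also have to decide when a fixed address forces a
new program header, which is what needsNewProgramHeader() hints at.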

(b)

The hierarchy that we have is as follows:

DefaultLayout
  >
ScriptLayout
  >
TargetLayout

All the ELF writers have a TargetLayout object contained in them.

To use a linker script, you need to parse the SECTIONS command and override the following functions in the ScriptLayout class to honor the user's intent as specified in the linker script:

getSectionOrder
getSectionName
hasOutputSegment
addAtom
assignVirtualAddress
assignFileOffsets

Does this help?

-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation

Hi,

>> We already have a solution for the "objcopy" part (added the
>> required output format to llvm-objdump).
>
> Interesting solution. Would it be much work to implement an
> llvm-objcopy that only supports -O binary? I plan to fully implement
> objcopy one of these days, but having just that as a start would be
> very useful.

I have started a tiny llvm-objcopy utility that can read an object file
and output Intel HEX or $readmemh format (which is exactly what we need
for now). It also has limited support for binary output, provided the
sections are ordered by address and the gaps between them are not too
large (this should not be difficult to fix, but I don't know your exact
requirements for this option).
Sources are available here: https://github.com/RodAtDISA/llvm-objcopy
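
For readers unfamiliar with the two text formats, the following sketch
shows roughly what they look like. It is not taken from the repository
above: intelHexRecord and readmemhDump are names made up for this
example, only 16-bit record addresses are handled (no extended address
records), and the $readmemh output assumes a byte-wide memory.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Format one Intel HEX record: ':' + byte count + 16-bit address +
// record type + data + checksum, all as upper-case hex digits.
// The checksum is the two's complement of the low byte of the sum of
// all preceding bytes in the record.
std::string intelHexRecord(uint8_t type, uint16_t address,
                           const std::vector<uint8_t> &data) {
  assert(data.size() <= 255);  // the count field is a single byte
  std::vector<uint8_t> raw;
  raw.push_back(static_cast<uint8_t>(data.size()));
  raw.push_back(static_cast<uint8_t>(address >> 8));
  raw.push_back(static_cast<uint8_t>(address & 0xFF));
  raw.push_back(type);
  raw.insert(raw.end(), data.begin(), data.end());

  unsigned sum = 0;
  for (uint8_t b : raw)
    sum += b;
  raw.push_back(static_cast<uint8_t>(0x100 - (sum & 0xFF)));

  std::string out = ":";
  char buf[3];
  for (uint8_t b : raw) {
    std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(b));
    out += buf;
  }
  return out;
}

// Emit a $readmemh-style dump: an "@address" line followed by one hex
// byte per line (assumes the target memory is one byte wide).
std::string readmemhDump(uint64_t address, const std::vector<uint8_t> &data) {
  char buf[32];
  std::snprintf(buf, sizeof buf, "@%llX\n",
                static_cast<unsigned long long>(address));
  std::string out = buf;
  for (uint8_t b : data) {
    std::snprintf(buf, sizeof buf, "%02X\n", static_cast<unsigned>(b));
    out += buf;
  }
  return out;
}
```

A data record (type 0x00) holding the bytes 02 33 7A at address 0x0030
comes out as ":0300300002337A1E", and the end-of-file record (type
0x01, no data) as ":00000001FF".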

Best regards,
Rod