[lld] driver and options questions

Michael,

I'm looking at fleshing out the mach-o driver and targetinfo.

Can we rename the "ld64" flavor to "darwin"? The command line tool on Mac OS X is called "ld", just like on Unix. The name ld64 is the current source repository name for the linker. Once lld takes over, the term ld64 won't mean anything.

I've worked through adding DarwinOpts.td and a new DarwinDriver class, but have some questions about wiring it up. Currently the instantiated Driver transforms the command line arguments into "core" arguments, which are passed to generatedOptions() to construct a LinkerOptions object.

Is the plan for LinkerOptions to contain the superset of all flavors' options? That seems like it won't scale well. In particular, if you are using lld as a library and you want to programmatically create a LinkerOptions, it is unclear which options need to be set for a particular flavor.

It seems like the concrete subclass of TargetInfo will ultimately hold the flavor specific options. So can DarwinDriver get a copy of the MachOTargetInfo object and set its ivars based on the command line options? Previously, I thought of LinkerOptions as the options needed by the core-linking phase (resolver), and the WriterOptions were flavor specific.

Here is how I see it currently works:

1) The flavor determines the driver class instantiated.
2) The driver transforms flavor specific options into a "core" ArgList;
3) LinkerOptions constructor requires a core ArgList and sets ivars based on the ArgList.
4) The LinkerInvocation object is constructed from LinkerOptions object.
5) The LinkerInvocation object instantiates a Target object from the LinkerOptions, which also creates a TargetInfo object and passes ownership of the TargetInfo to the Target object. This last step seems convoluted. Couldn't the Target constructor create the TargetInfo ivar?
6) Problem: there is no way to connect flavor specific options to the BlahTargetInfo object.

It seems like there are too many classes involved. I think it would be simpler to have:
1) The flavor determines the Driver class instantiated.
2) The driver creates a TargetInfo subclass object. The base class TargetInfo contains all the fields that used to be in LinkerOptions.
3) The driver looks at each command line option and either uses it to set something in the TargetInfo object or passes it to the super class to handle TargetInfo base class options.
4) The LinkerInvocation object is constructed using the TargetInfo object.

In summary, my proposed model merges the Target class and LinkerOptions class into the TargetInfo class. The LinkerInvocation class runs a link based on a TargetInfo object. The TargetInfo object is programmatically configured. In the command line case, it is created by a Driver instance and configured based on command line options.
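
The proposed model can be pictured with a minimal C++ sketch. This is only an illustration under stated assumptions: the class names come from this thread, but every member, signature, and the createDriver() helper are hypothetical and are not taken from the lld sources. The -no_compact_unwind option is the example option used elsewhere in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: class names follow this thread, not the lld sources.

// Base class holding the fields that used to live in LinkerOptions.
struct TargetInfo {
  virtual ~TargetInfo() = default;
  std::string outputPath = "a.out";
  std::vector<std::string> inputPaths;
};

// Flavor-specific settings live in the subclass.
struct MachOTargetInfo : TargetInfo {
  bool compactUnwind = true;
};

class Driver {
public:
  virtual ~Driver() = default;
  virtual std::unique_ptr<TargetInfo>
  parse(const std::vector<std::string> &args) = 0;

protected:
  // Handles base-class options. Returns how many arguments were consumed,
  // or 0 if args[i] is not a common option.
  static std::size_t parseCommon(TargetInfo &ti,
                                 const std::vector<std::string> &args,
                                 std::size_t i) {
    assert(i < args.size());
    if (args[i] == "-o" && i + 1 < args.size()) {
      ti.outputPath = args[i + 1];
      return 2;
    }
    if (!args[i].empty() && args[i][0] != '-') {
      ti.inputPaths.push_back(args[i]);
      return 1;
    }
    return 0;
  }
};

class DarwinDriver : public Driver {
public:
  std::unique_ptr<TargetInfo>
  parse(const std::vector<std::string> &args) override {
    auto ti = std::make_unique<MachOTargetInfo>();
    std::size_t i = 0;
    while (i < args.size()) {
      if (args[i] == "-no_compact_unwind") { // flavor-specific option
        ti->compactUnwind = false;
        ++i;
        continue;
      }
      std::size_t used = parseCommon(*ti, args, i);
      i += used ? used : 1; // a real driver would diagnose unknown options
    }
    return ti;
  }
};

// The flavor determines the Driver class instantiated.
std::unique_ptr<Driver> createDriver(const std::string &flavor) {
  if (flavor == "darwin")
    return std::make_unique<DarwinDriver>();
  return nullptr; // other flavors omitted from this sketch
}

// The invocation is constructed directly from a configured TargetInfo.
class LinkerInvocation {
public:
  explicit LinkerInvocation(std::unique_ptr<TargetInfo> ti)
      : info(std::move(ti)) {}
  const TargetInfo &targetInfo() const { return *info; }

private:
  std::unique_ptr<TargetInfo> info;
};
```

A library client would skip the Driver entirely, fill in a MachOTargetInfo by hand, and pass it to LinkerInvocation.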

If we really need all the classes, can you explain the purpose of each (in doxygen comments)? Thanks.

-Nick

My take on it would be to keep LinkerOptions and TargetInfo separate, because they mean totally different things.

Right now, targetOptions are used only to select the type of object to create (little endian/big endian, {32,64}), but I see a lot more changes coming in that direction (like TargetHandler).

Different targets could consume LinkerOptions and answer yes or no, to indicate whether the target sees them as valid options.

Thanks

Shankar Easwaran

Michael,

I'm looking at fleshing out the mach-o driver and targetinfo.

Can we rename the "ld64" flavor to "darwin"? The command line tool on Mac OS X is called "ld", just like on Unix. The name ld64 is the current source repository name for the linker. Once lld takes over, the term ld64 won't mean anything.

Sounds fine.

We also need to figure out how to determine when ld means binutils ld
and when it means darwin ld.

I've worked through adding DarwinOpts.td and a new DarwinDriver class, but have some questions about wiring it up. Currently the instantiated Driver transforms the command line arguments into "core" arguments, which are passed to generatedOptions() to construct a LinkerOptions object.

Is the plan for LinkerOptions to contain the superset of all flavors' options? That seems like it won't scale well. In particular, if you are using lld as a library and you want to programmatically create a LinkerOptions, it is unclear which options need to be set for a particular flavor.

It seems like the concrete subclass of TargetInfo will ultimately hold the flavor specific options. So can DarwinDriver get a copy of the MachOTargetInfo object and set its ivars based on the command line options? Previously, I thought of LinkerOptions as the options needed by the core-linking phase (resolver), and the WriterOptions were flavor specific.

Not allowing Drivers to touch anything except for core args has a very
important side effect. We will always be able to test everything
through -core and dump how to run it with -###. LLVM and Clang both
handle options like this, and it seems to scale fine.

As an alternative, I think we should split the target specific options
up in LinkerOptions by adding sub objects. This will simplify user
created LinkerOption setup.
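
One way to picture the sub-object alternative is the sketch below. It is an assumption-laden illustration, not the actual LinkerOptions definition: the field names, the parseCoreArgs() helper, and the core option spellings (-macho-no-compact-unwind, -elf-static) are all invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch; field and option names are placeholders.

// Target-specific options are grouped into sub-objects.
struct MachOOptions {
  bool compactUnwind = true;
};

struct ELFOptions {
  bool isStatic = false;
};

struct LinkerOptions {
  std::string outputPath = "a.out";
  std::vector<std::string> inputPaths;
  MachOOptions macho; // only meaningful for Mach-O links
  ELFOptions elf;     // only meaningful for ELF links
};

// Builds LinkerOptions from an already-lowered "core" argument list.
// The core option spellings here are invented for illustration.
LinkerOptions parseCoreArgs(const std::vector<std::string> &core) {
  LinkerOptions opts;
  for (std::size_t i = 0; i < core.size(); ++i) {
    const std::string &arg = core[i];
    if (arg == "-o" && i + 1 < core.size())
      opts.outputPath = core[++i];
    else if (arg == "-macho-no-compact-unwind")
      opts.macho.compactUnwind = false;
    else if (arg == "-elf-static")
      opts.elf.isStatic = true;
    else if (!arg.empty() && arg[0] != '-')
      opts.inputPaths.push_back(arg);
  }
  return opts;
}
```

A library user who only links Mach-O would then touch the common fields and opts.macho, and could ignore opts.elf.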

Here is how I see it currently works:

1) The flavor determines the driver class instantiated.
2) The driver transforms flavor specific options into a "core" ArgList;
3) LinkerOptions constructor requires a core ArgList and sets ivars based on the ArgList.
4) The LinkerInvocation object is constructed from LinkerOptions object.
5) The LinkerInvocation object instantiates a Target object from the LinkerOptions, which also creates a TargetInfo object and passes ownership of the TargetInfo to the Target object. This last step seems convoluted. Couldn't the Target constructor create the TargetInfo ivar?
6) Problem: there is no way to connect flavor specific options to the BlahTargetInfo object.

It seems like there are too many classes involved. I think it would be simpler to have:
1) The flavor determines the Driver class instantiated.
2) The driver creates a TargetInfo subclass object. The base class TargetInfo contains all the fields that used to be in LinkerOptions.
3) The driver looks at each command line option and either uses it to set something in the TargetInfo object or passes it to the super class to handle TargetInfo base class options.
4) The LinkerInvocation object is constructed using the TargetInfo object.

In summary, my proposed model merges the Target class and LinkerOptions class into the TargetInfo class. The LinkerInvocation class runs a link based on a TargetInfo object. The TargetInfo object is programmatically configured. In the command line case, it is created by a Driver instance and configured based on command line options.

If we really need all the classes, can you explain the purpose of each (in doxygen comments)? Thanks.

-Nick

I do agree that the Target class is now unneeded. Its original
purpose was to translate between LinkerOptions and
{Reader,Writer}Options{ELF,MachO,PECOFF}.

So my alternative would be:

1) The flavor determines the driver class instantiated.
2) The driver transforms flavor specific options into a "core" ArgList;
3) LinkerOptions constructor requires a core ArgList and sets data
members based on the ArgList.
4) The LinkerInvocation object is constructed from LinkerOptions object.
5) LinkerInvocation creates a TargetInfo object from a LinkerOptions object.

- Michael Spencer

We can still test any option (for example a darwin specific option) like:
    lld -flavor darwin -no_compact_unwind ...

Clang is different from ld in that it currently (from my understanding) just supports the gcc command line options. It does not support completely different command line languages the way binutils ld and darwin's ld do.

The darwin linker has ~100 command line options. For most of those we will need to make up some core option name: a name which no one will ever use, but which exists solely to go through a command line bottleneck.

Looking forward to when lld is used as a library, I think we should have a (non-string based) programmatic interface. That is, some big (structured) struct with fields for all the linking configuration settings. The linking is driven by this struct. Command line links should then be layered on top of this. That is, the driver's job should be to convert command line args into this big struct. And, for debugging, the driver should have a way to take an instance of the big struct and dump it into command line args (like -###).

Given that model, the question is, is there one big struct that is the union of all options from all flavors? Or a base struct and a subclass for each flavor. I prefer the subclass per flavor approach.
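
A rough sketch of the struct-plus-dump idea, using the subclass-per-flavor variant, is shown here. All names (LinkConfig, DarwinLinkConfig, dumpArgs, dumpCommandLine) are hypothetical and chosen for this illustration only; -flavor darwin and -no_compact_unwind are the spellings used in this thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: a base configuration struct plus one subclass per
// flavor, each able to dump itself back into command line arguments.
struct LinkConfig {
  virtual ~LinkConfig() = default;
  std::string outputPath = "a.out";
  std::vector<std::string> inputPaths;

  // Appends the arguments that would reproduce the common settings.
  virtual void dumpArgs(std::vector<std::string> &out) const {
    out.push_back("-o");
    out.push_back(outputPath);
    for (const std::string &path : inputPaths)
      out.push_back(path);
  }
};

struct DarwinLinkConfig : LinkConfig {
  bool compactUnwind = true;

  // Emits the flavor-specific arguments first, then the common ones.
  void dumpArgs(std::vector<std::string> &out) const override {
    out.push_back("-flavor");
    out.push_back("darwin");
    if (!compactUnwind)
      out.push_back("-no_compact_unwind");
    LinkConfig::dumpArgs(out);
  }
};

// Joins the dumped arguments into one line, in the spirit of -###.
// Arguments containing spaces are not quoted in this sketch.
std::string dumpCommandLine(const LinkConfig &config) {
  std::vector<std::string> args;
  config.dumpArgs(args);
  std::string line = "lld";
  for (const std::string &arg : args)
    line += " " + arg;
  return line;
}
```

With this shape the struct is the source of truth, and the command line is just one serialization of it.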

-Nick

I've worked through adding DarwinOpts.td and a new DarwinDriver class, but have some questions about wiring it up. Currently the instantiated Driver transforms the command line arguments into "core" arguments, which are passed to generatedOptions() to construct a LinkerOptions object.

Is the plan for LinkerOptions to contain the superset of all flavors' options? That seems like it won't scale well. In particular, if you are using lld as a library and you want to programmatically create a LinkerOptions, it is unclear which options need to be set for a particular flavor.

It seems like the concrete subclass of TargetInfo will ultimately hold the flavor specific options. So can DarwinDriver get a copy of the MachOTargetInfo object and set its ivars based on the command line options? Previously, I thought of LinkerOptions as the options needed by the core-linking phase (resolver), and the WriterOptions were flavor specific.

Not allowing Drivers to touch anything except for core args has a very
important side effect. We will always be able to test everything
through -core and dump how to run it with -###. LLVM and Clang both
handle options like this, and it seems to scale fine.

We can still test any option (for example a darwin specific option) like:
    lld -flavor darwin -no_compact_unwind ...

The problem with this is that the driver is allowed to look at its
environment (the file system, environment vars, etc.) to figure
things out. -core isn't. It is only allowed to look at the command
line. For example, the line above would target whatever default
target triple lld was configured for.

Clang is different from ld in that it currently (from my understanding) just supports the gcc command line options. It does not support completely different command line languages the way binutils ld and darwin's ld do.

Clang has many command line options that only affect a single
platform. Also, binutils-ld targets both Windows and Darwin in
addition to ELF systems.

The darwin linker has ~100 command line options. For most of those we will need to make up some core option name: a name which no one will ever use, but which exists solely to go through a command line bottleneck.

Lots of these options probably have equivalents in other flavors. It's
also very easy to use TableGen to automatically forward a lot of these
options. This is what clang does for a large part of the -cc1 options.

Looking forward to when lld is used as a library, I think we should have a (non-string based) programmatic interface. That is, some big (structured) struct with fields for all the linking configuration settings. The linking is driven by this struct. Command line links should then be layered on top of this. That is, the driver's job should be to convert command line args into this big struct. And, for debugging, the driver should have a way to take an instance of the big struct and dump it into command line args (like -###).

I agree. And this is a non-string based interface.

As for debugging, having -### give the -core command line is
important. It allows a dev to debug a crash on a totally unrelated
system to the user, as the entire context of the link (except the
individual files) is there on the command line.

Given that model, the question is, is there one big struct that is the union of all options from all flavors? Or a base struct and a subclass for each flavor. I prefer the subclass per flavor approach.

-Nick

Sub classing for each format would actually be fine. And would make it
easy to give unused argument errors for -core command lines that mix
flags.
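
As a sketch of how per-format subclasses could surface unused arguments when a -core command line mixes flags, consider the following. It assumes each format subclass decides which arguments it recognizes; the class names, the consume() and parseCore() functions, and the use of -static as an ELF-only core option are illustrative assumptions, not existing lld interfaces.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: each format subclass recognizes only its own
// options, so anything left over can be reported as unused.
struct CoreOptions {
  virtual ~CoreOptions() = default;
  std::vector<std::string> inputPaths;

  // Returns true if this format recognizes the argument.
  virtual bool consume(const std::string &arg) {
    if (!arg.empty() && arg[0] != '-') {
      inputPaths.push_back(arg);
      return true;
    }
    return false;
  }
};

struct MachOCoreOptions : CoreOptions {
  bool compactUnwind = true;

  bool consume(const std::string &arg) override {
    if (arg == "-no_compact_unwind") {
      compactUnwind = false;
      return true;
    }
    return CoreOptions::consume(arg);
  }
};

struct ELFCoreOptions : CoreOptions {
  bool isStatic = false;

  bool consume(const std::string &arg) override {
    if (arg == "-static") {
      isStatic = true;
      return true;
    }
    return CoreOptions::consume(arg);
  }
};

// Feeds every argument to the format's options object and returns the
// ones it did not recognize, so the caller can emit unused-argument errors.
std::vector<std::string> parseCore(CoreOptions &opts,
                                   const std::vector<std::string> &args) {
  std::vector<std::string> unused;
  for (const std::string &arg : args)
    if (!opts.consume(arg))
      unused.push_back(arg);
  return unused;
}
```

Passing an ELF-only flag to a MachOCoreOptions instance leaves that flag in the returned list, which is where a diagnostic would be issued.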

- Michael Spencer