Proposal: split tools/opt/opt.cpp into OptTool and a smaller main()

Hello,

I started some refactoring work on tools/opt/opt.cpp in r201116, but conceptually this is part of a larger effort. I’d like to consult with llvmdev@ about the best way to move this forward.

Background: opt is a very useful Swiss-army-knife tool, and its capabilities may be useful to custom tools. However, as it stands now opt is not very modular - almost all its functionality is contained within its main opt.cpp file.

I think this is also the cause of some code duplication that currently exists between different LLVM tools. For example, some code from opt.cpp is duplicated in llc.cpp; the same is true for the logic of adding optimization passes between opt.cpp and Clang (a comment in opt.cpp even admits it “duplicates llvm-gcc behaviour”, showing its age :-)). A lot of cl::opt definitions are duplicated between different tools and have to be kept in sync for consistency.

What I’d like to see is most of opt’s functionality moving outside of opt.cpp to make it reusable in other tools. For example, this could be encapsulated in a class named OptTool, or just a namespace with a bunch of utility functions, as well as cl::opt definitions.

Such a transition can be done in steps: the first step would be to create this OptTool within tools/opt. This would immediately enable building custom opt-based tools by linking in the code from tools/opt, leaving tools/opt/opt.cpp out. A more ambitious step would be to move the functionality to a library (lib/Tools?) - enabling code reuse between different LLVM tools as well as easier use within custom tools.
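As a rough illustration of the split described above, here is a minimal sketch in plain C++. It is not taken from the patches or from LLVM: `opttool`, `OptTool`, `Pass`, `Module` and `addStandardPasses` are hypothetical names, and `Module` is only a stand-in for `llvm::Module`. The point is the shape: the reusable class owns the pipeline and knows nothing about flags or file I/O, which stay in each tool's own small main().

```cpp
// Hypothetical sketch only: none of these names exist in LLVM.
#include <functional>
#include <string>
#include <utility>
#include <vector>

namespace opttool {

// Stand-in for an IR module; a real tool would use llvm::Module.
struct Module {
  std::string text;
};

// A "pass" here is just a named transformation on the stand-in module.
struct Pass {
  std::string name;
  std::function<void(Module &)> run;
};

// Reusable part: owns the pipeline and runs it. No flag parsing, no I/O.
class OptTool {
public:
  void addPass(Pass p) { passes_.push_back(std::move(p)); }

  // Runs every pass in order and returns the names of the passes that ran.
  std::vector<std::string> run(Module &m) const {
    std::vector<std::string> ran;
    for (const Pass &p : passes_) {
      p.run(m);
      ran.push_back(p.name);
    }
    return ran;
  }

private:
  std::vector<Pass> passes_;
};

// Shared helper that several tools could call instead of copying the logic.
inline void addStandardPasses(OptTool &tool) {
  tool.addPass({"strip-spaces", [](Module &m) {
                  std::string out;
                  for (char c : m.text)
                    if (c != ' ')
                      out.push_back(c);
                  m.text = out;
                }});
}

} // namespace opttool
```

Under this assumed layout, a custom tool's main() would construct an `OptTool`, call `addStandardPasses`, append its own passes, and then call `run`, without copying anything out of opt.cpp.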

Any opinions / suggestions welcome. I’ll be happy to send out piecemeal patches that implement this refactoring.

Eli

+1. With this change, the llvm build wouldn't need to optionally
reference polly. Instead, the polly build could create a one-off
version of opt that includes the static-linked polly library (perhaps
named spopt?). The polly build would use that executable instead of
opt for its unit tests.

From a package management perspective, llvm would be a standalone
package, polly would depend on it, and clang would depend on both
polly and llvm. Good stuff.

-Greg

+1

To make opt+llc more aligned with the clang driver would be very helpful!

/Patrik Hägglund

Hello,

I started some refactoring work on tools/opt/opt.cpp in r201116, but
conceptually this is part of a larger effort. I'd like to consult with
llvmdev@ about the best way to move this forward.

Background: opt is a very useful swiss-army-knife tool, and its capabilities
may be useful to custom tools. However, as it stands now opt is not very
modular - almost all its functionality is contained within its main opt.cpp
file.

I think this is also the cause of some code duplication that currently
exists between different LLVM tools. For example, some code from opt.cpp is
duplicated in llc.cpp; the same is true for the logic of adding optimization
passes between opt.cpp and Clang (a comment in opt.cpp even admits it
"duplicates llvm-gcc behaviour", showing its age :-)). A lot of cl::opt
definitions are duplicated between different tools and have to be kept in
sync for consistency.

*nod*

What I'd like to see is most of opt's functionality moving outside of
opt.cpp to make it reusable in other tools. For example, this could be
encapsulated in a class named OptTool, or just a namespace with a bunch of
utility functions, as well as cl::opt definitions.

Such a transition can be done in steps: the first step would be to create
this OptTool within tools/opt. This would immediately enable building custom
opt-based tools by linking in the code from tools/opt, leaving
tools/opt/opt.cpp out. A more ambitious step would be to move the
functionality to a library (lib/Tools?) - enabling code reuse between
different LLVM tools as well as easier use within custom tools.

I'd prefer just going for the latter to minimize churn etc.
Alternately we can leave it separated into tools/opt for now until we
get the right interfaces. I think the existing code has been around
enough that moving it into a separate library will be fine.

Any opinions / suggestions welcome. I'll be happy to send out piecemeal
patches that implement this refactoring.

I'll happily review.

-eric

> Hello,
>
> I started some refactoring work on tools/opt/opt.cpp in r201116, but
> conceptually this is part of a larger effort. I'd like to consult with
> llvmdev@ about the best way to move this forward.
>
> Background: opt is a very useful swiss-army-knife tool, and its capabilities
> may be useful to custom tools. However, as it stands now opt is not very
> modular - almost all its functionality is contained within its main opt.cpp
> file.
>
> I think this is also the cause of some code duplication that currently
> exists between different LLVM tools. For example, some code from opt.cpp is
> duplicated in llc.cpp; the same is true for the logic of adding optimization
> passes between opt.cpp and Clang (a comment in opt.cpp even admits it
> "duplicates llvm-gcc behaviour", showing its age :-)). A lot of cl::opt
> definitions are duplicated between different tools and have to be kept in
> sync for consistency.

*nod*

>
> What I'd like to see is most of opt's functionality moving outside of
> opt.cpp to make it reusable in other tools. For example, this could be
> encapsulated in a class named OptTool, or just a namespace with a bunch of
> utility functions, as well as cl::opt definitions.
>
> Such a transition can be done in steps: the first step would be to create
> this OptTool within tools/opt. This would immediately enable building custom
> opt-based tools by linking in the code from tools/opt, leaving
> tools/opt/opt.cpp out. A more ambitious step would be to move the
> functionality to a library (lib/Tools?) - enabling code reuse between
> different LLVM tools as well as easier use within custom tools.
>

I'd prefer just going for the latter to minimize churn etc.
Alternately we can leave it separated into tools/opt for now until we
get the right interfaces.

Yep, this is my main motivation to do it stepwise. The best interfaces may
not be immediately obvious and polishing it within tools/opt in this case
is the less-churn approach because once you go to a library and have a
bunch of dependencies things are harder to change and test.

I think the existing code has been around
enough that moving it into a separate library will be fine.

> Any opinions / suggestions welcome. I'll be happy to send out piecemeal
> patches that implement this refactoring.
>

I'll happily review.

Thanks.

Eli

I thought that opt was supposed to basically be a way to run a PassManager over some IR? Seems like the new PM should standardize the appropriate things and have them be a natural part of its API.

– Sean Silva

I thought that opt was supposed to basically be a way to run a PassManager
over some IR?

See some of my patches over the past week or so. opt has grown quite a bit
of functionality, some of which gets duplicated in other tools simply
because tools/opt/opt.cpp can't just be added to other tools.

Seems like the new PM should standardize the appropriate things and have
them be a natural part of its API.

I don't intend to modify anything about how the PM works, so ISTM these
changes are orthogonal. All I'm doing is encapsulating functionality from
the main opt.cpp file into self-contained utility modules that can also be
linked into other tools (existing and custom), avoiding code duplication.

Eli

What exactly do you have in mind? The reusable part of what opt does is already factored into the pass manager builder. Everything else is file I/O, debugging, or flags management.

Put another way, opt is a collection of all the boiler plate you could ever want to use, but that's not a good line to refactor along. Having all this stuff -- let's take the flags as one example -- is sensible for opt because it's an internally facing tool for compiler developers, but it's not a reusable part.

Eli Bendersky wrote:

Patrik Hägglund H wrote:

+1

To make opt+llc more aligned with the clang driver would be very helpful!

I don't think that's one of the possible outcomes. Clang has its own flag processing, and llvm parts cannot depend on clang parts, so opt can't use clang's flag processing. Clang can't use llvm's flag processing either, because llvm keeps a single global flag state, which breaks the model of having clang linked into your program as a library and processing different files with different sets of flags.
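To make the global-state point concrete, here is a small sketch in plain C++. It deliberately does not use LLVM's cl::opt API; `global_style`, `instance_style`, `Options` and `describe` are hypothetical names chosen only to contrast one process-wide flag with options passed per invocation.

```cpp
// Illustration only; this is not LLVM's cl::opt API.
#include <string>

// Style 1: one process-wide flag, shared by every caller in the program.
namespace global_style {
inline unsigned &optLevel() {
  static unsigned level = 0;
  return level;
}

inline std::string describe(const std::string &file) {
  return file + " at -O" + std::to_string(optLevel());
}
} // namespace global_style

// Style 2: options travel with each invocation, so two files processed in
// the same program can use different settings.
namespace instance_style {
struct Options {
  unsigned optLevel = 0;
};

inline std::string describe(const std::string &file, const Options &opts) {
  return file + " at -O" + std::to_string(opts.optLevel);
}
} // namespace instance_style
```

With the global style, setting the level once affects every later call, whichever file it is for; with the per-invocation style, each call carries its own `Options`. The second shape is what a library embedded in a long-running program needs.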

Nick

What exactly do you have in mind? The reusable part of what opt does is
already factored into the pass manager builder. Everything else is file
I/O, debugging, or flags management.

Put another way, opt is a collection of all the boiler plate you could
ever want to use, but that's not a good line to refactor along. Having all
this stuff -- let's take the flags as one example -- is sensible for opt
because it's an internally facing tool for compiler developers, but it's
not a reusable part.

This email outlined the general approach, but the more concrete steps are
best viewed through the patches I was sending this week. Yes, opt is a
collection of a lot of boilerplate and yes, some of it really does belong
in the main opt.cpp file; but some other parts could be separated into
standalone modules (starting with h/cpp pairs in tools/opt and maybe at
some point moving to a library if that starts making sense).

The benefits are twofold. One is removing code duplication within existing
LLVM tools:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140210/204847.html
is a good example.

Another is enabling opt-like tools to be written without duplicating a lot
of opt. Today, if I want to write a custom tool based on LLVM libraries
that runs some custom passes as well as adding LLVM optimization passes,
ISTM I have to copy-paste a lot of code from opt.cpp; my goal is to reduce
this amount - not to zero, which isn't realistic, but to at least reduce
it. When all of my patches are applied in my local branches, the size of
opt.cpp already goes down from ~900 LOC to half that size, and much of the
remainder is irreducible, like flags and main() logic.

Eli

FWIW, it will. I don't want to try to sequence these things though.

I don't think any of my changes contradict future PM work. On the contrary,
if they reduce code duplication in places, that could only have benefits.

I have a feeling some folks overestimate the extent of these changes. All
I'm doing is trying to take all the cruft that has been dumped into opt.cpp
over many years and modularize it a bit, making it more convenient to reuse.

P.S. is there an ETA on the new PM work?

Eli

ETAs on software are moderately... challenging.

When I'm not stuck in a committee, I'm working on it. I hope that the
driver related bits will be done quite soon. The first bits are already
there.

What exactly do you have in mind? The reusable part of what opt does is
already factored into the pass manager builder. Everything else is file
I/O, debugging, or flags management.

Put another way, opt is a collection of all the boiler plate you could
ever want to use, but that's not a good line to refactor along.

This is a great way to put it.

-- Sean Silva