Adding sub-commands and option classes to the command line library.

There was discussion on IRC about merging all of the compiler hacker
tools (llvm-as, bugpoint, llc, lli, etc.) into a single llvm
megatool. This tool would work similarly to most version control CLI
programs: one would call 'llvm as bitcode.ll' instead of 'llvm-as
bitcode.ll'. The main reason for this is to improve link time, but it
also reduces the total file size (especially in the case of static
linking) and makes it easier to discover the available tools.

I've begun work on this, but quickly ran into two problems. The first
is how to hook up each cl::opt to the command it belongs to. As it
stands, with all of the tools in one executable, every option would be
exposed to every command. The following is a sample of the API I
propose to solve this first problem. A new cl::subcommand class would
be added. This class contains the information for all sub-commands,
including what entry point to call when they are used. When the parser
encounters this option, it sets Command to the specified entry point
and switches to recognizing only options that have the
cl::sub("<toolname>") modifier.

Example usage (I've not written any code yet, so please forgive any
errors in the following):

======== driver.h ==================
typedef int (*MainFunctionT)(int, char**);

int main_eat(int, char**);
int main_order(int, char**);

======== driver.cpp ================
#include "driver.h"

static cl::subcommand<MainFunctionT> Command(
  cl::desc("Driver commands"),
    "eat", main_eat, "Eat all the food",
    "order", main_order, "Order the food");

static cl::opt<std::string> Restaurant("restaurant",
  cl::desc("The name of the restaurant to use"));

int main(int argc, char **argv) {
  cl::ParseCommandLineOptions(argc, argv, "Restaurant driver!\n");

  // Shared setup code...

  return Command(argc - 1, argv + 1);
}

======== eat.cpp ===================
#include "driver.h"

static cl::opt<std::string> Utensil("utensil",
  cl::desc("What to eat with"),
  cl::sub("eat"));

int main_eat(int argc, char **argv) {
  // Some code...
  return 0;
}

======== order.cpp =================
#include "driver.h"

static cl::opt<std::string> Size("size",
  cl::desc("Size of food to order"),
  cl::sub("order"));

int main_order(int argc, char **argv) {
  // Some code...
  return 0;
}

==== end ====

This would output the following:

$ driver
Restaurant driver!
USAGE: driver [-restaurant] <command> [<args>]

  -restaurant - The name of the restaurant to use

Driver commands:
    eat - Eat all the food
    order - Order the food

See 'driver help <command>' for more information on a specific command.
$ driver help order
USAGE: driver order [-size] <food>...

  -size - Size of food to order
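
The scoping rule described above can be sketched outside of the cl
library. This is only a minimal model, not the proposed implementation:
OptionDecl, ParseResult and parseArgs are hypothetical names, and the
"-name=value" syntax is an assumption made to keep the example short.
It shows the parser recognizing driver options until a sub-command is
seen, and afterwards only options tagged with that sub-command.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a cl::opt carrying cl::sub("<toolname>").
// An empty Sub marks an option that belongs to the driver itself.
struct OptionDecl {
  std::string Name;
  std::string Sub;
};

struct ParseResult {
  std::string Command;                       // selected sub-command, "" if none
  std::map<std::string, std::string> Values; // recognized -name=value options
  std::vector<std::string> Rejected;         // options outside the active scope
  std::vector<std::string> Positional;       // everything else
};

ParseResult parseArgs(const std::vector<OptionDecl> &Options,
                      const std::vector<std::string> &Commands,
                      const std::vector<std::string> &Args) {
  ParseResult R;
  for (const std::string &Arg : Args) {
    if (Arg.size() > 1 && Arg[0] == '-') {
      // Split "-name=value" into its name and (possibly empty) value.
      const std::string::size_type Eq = Arg.find('=');
      const bool HasValue = Eq != std::string::npos;
      const std::string Name = HasValue ? Arg.substr(1, Eq - 1) : Arg.substr(1);
      const std::string Value = HasValue ? Arg.substr(Eq + 1) : std::string();

      // Recognize the option only if it is tagged with the active scope:
      // "" before a sub-command is seen, the sub-command name afterwards.
      const bool Known = std::any_of(
          Options.begin(), Options.end(), [&](const OptionDecl &O) {
            return O.Name == Name && O.Sub == R.Command;
          });
      if (Known)
        R.Values[Name] = Value;
      else
        R.Rejected.push_back(Name);
    } else if (R.Command.empty() &&
               std::find(Commands.begin(), Commands.end(), Arg) !=
                   Commands.end()) {
      R.Command = Arg; // switch scope to this sub-command
    } else {
      R.Positional.push_back(Arg);
    }
  }
  // Invariant: a selected command is always one of the registered ones.
  assert(R.Command.empty() ||
         std::find(Commands.begin(), Commands.end(), R.Command) !=
             Commands.end());
  return R;
}
```

With the options from the example registered as {"restaurant", ""},
{"utensil", "eat"} and {"size", "order"}, parsing
'-restaurant=Luigi order -size=large -utensil=fork pizza' selects
order, accepts -restaurant and -size, and rejects -utensil because it
belongs to eat. In this sketch driver options given after the
sub-command are rejected as well; whether they should remain visible
is a design choice the proposal leaves open.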

The second problem is handling library options. With the megatool, the
union of the libraries needed by each command is linked in, and many of
these libraries define options. As it currently stands, the megatool
would expose all of these options, which would be very confusing to
the user. I propose that we add the concept of an option class. All
library options would be given a class depending on what they actually
affect. For example, -x86-asm-syntax would have the class backend, or
target. The call to cl::ParseCommandLineOptions would take a defaulted
classes argument. If an option has a class, and it isn't one of these,
it would be ignored.
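
The filtering rule can be sketched in a few lines. Again this is a
rough model under stated assumptions rather than library code:
LibraryOption and visibleOptions are hypothetical names, and the
"passes" class and the -example-pass-flag option are made up for
illustration. An option with no class is always kept; an option with
a class is kept only if that class was passed to the parse call.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a library option. An empty Class means the
// option was never assigned to an option class.
struct LibraryOption {
  std::string Name;
  std::string Class;
};

// Return the names of the options a tool would recognize, given the
// set of classes it passed to the (hypothetical) parse call.
std::vector<std::string>
visibleOptions(const std::vector<LibraryOption> &Registered,
               const std::set<std::string> &EnabledClasses) {
  std::vector<std::string> Visible;
  for (const LibraryOption &O : Registered) {
    // Unclassified options are always visible; classified ones only
    // when their class is enabled. Everything else is ignored.
    if (O.Class.empty() || EnabledClasses.count(O.Class) != 0)
      Visible.push_back(O.Name);
  }
  return Visible;
}
```

For example, with -x86-asm-syntax in class backend, -example-pass-flag
in class passes, and an unclassified -help, a tool that enables only
backend would see -x86-asm-syntax and -help.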

- Michael Spencer

Why don't you use --enable-shared for this? Does this not solve the file size and the link time problem?

I am a little worried that the additional complexity in the command line system and other parts of LLVM is not worth the benefits. At the moment tools like 'opt' are nice and simple examples of how to use LLVM. They allow new people to jump into LLVM and understand its structure easily.

Also, your proposal may increase the link and compilation time for me. I mainly develop opt passes and often want to recompile the opt tool, but don't really care about anything else. Following your proposal, linking the megatool would draw in a lot more libraries than opt uses today and would therefore take significantly longer. I may solve this with --enable-shared, but I would still have the problem that unrelated libraries need to be recompiled whenever some global header changes. Today only the necessary libraries are recompiled, which should be a lot faster.


One advantage of the current system (especially when building with cmake)
is that we have an explicit list of dependencies, and tools that use only a
subset of them. For example, that is what recently exposed a problem with the
IL parser depending on codegen.

I agree with Tobias that having better support for shared libraries is probably
a better solution for incremental builds.