One interesting issue with moving away from the current system of static initializers for cl::opt is that we lose the automatic registration of all the options. Without it, -help can no longer print everything available, and in general we cannot issue an error for an “unknown command line option” (without calling into any other code).
Not automatic, no, but in the proposal Chris puts the addOption call inside the pass initializer, which is called before ParseCommandLineOptions. This means you’ll still get options listed as you currently do, so long as you continue to call the pass initializers before parsing (something you have to do anyway to get the pass name visible to the command line).
The auto-registration is fundamentally tied with the globalness and the static initializers; pondering this has led me down an interesting path that has made me understand better my suggestion in the other thread. As I see it, there are two very different sorts of uses of llvm::cl in LLVM:
1. For regular command line processing. E.g. if a tool accepts an output file, then we need something that will parse the argument from the command line.
2. As a way to easily set up a conduit from A to B, where A is the command line and B is some place “deep” inside the LLVM library code that will do something in response to the command line.
(and, pending discussion, someday point A might include a proper programmatic interface (i.e. in a way other than hijacking the command line processing))
That would be nice. I just suggested in another thread that we expose ParseCommandLineOptions to the C API to hack around this, but a nice clean interface would of course be better.
llvm::cl does a decent job for #1 and that is what it was designed for AFAICT; these uses of llvm::cl live outside of library code and everything is pretty happy, despite them being global and having static initializers.
llvm::cl is not very well suited to #2, yet it is used for #2, and that is the real problem. We need a solution to problem #2 which does not use llvm::cl. Thus, I don’t think that the solution you propose here is the right direction.
The first step is to clearly differentiate between #1 and #2. I will say “command line options” for #1 and “configuration/tweak points” for #2. (maybe “library options” is better for #2; neither is perfect terminology)
The strawman I suggested in http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075503.html was a stab at #2. There is no way to dodge being stringly typed since command lines are stringly typed, so really it is just a question of how long a solution stays stringly typed.
My thought process for staying stringly typed “the whole time” (possibly with some caching) comes from these two desires:
- adding a c/t point should require adding just one call into the c/t machinery (this is both for convenience and for DRY/SPOT), and
Right. The current point is in the pass initializer.
There is another point in the pass constructor to read the option value. This is the only point at which something will change from being a string to its actual type and value.
- this change should be localized to the code being configured/tweaked
This is the thought process:
Note that llvm::cl is stringly typed until it parses the options. llvm::cl gives the appearance of a typed interface because it uses static initialization as a backdoor to globally transport the knowledge of the expected type to the option parsing machinery (very early in the program lifetime). Without this backdoor, we need to stay stringly typed longer, at least until we reach the “localized” place where the single call into the c/t machinery is made; this single call is the only place that has the type information needed for the c/t value to become properly typed. But there is no way to know how long it will be until we reach that point (or even if we reach that point; consider passes that are not run on this invocation).
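To make the “backdoor” concrete, here is a minimal sketch of the mechanism in plain C++. It is not LLVM’s actual implementation: `ToyOpt`, `optionTable`, `parseArg` and the `unroll-count` option are hypothetical names chosen for illustration. The point is only that a global object’s constructor runs before `main()`, so the parser already knows each option’s type by the time it sees any strings.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <sstream>
#include <string>

using Parser = std::function<bool(const std::string &)>;

// Hypothetical global option table (not LLVM's real data structure).
// A function-local static avoids initialization-order problems between
// translation units.
inline std::map<std::string, Parser> &optionTable() {
  static std::map<std::string, Parser> Table;
  return Table;
}

// Toy analogue of a typed option: constructing it registers a parser that
// knows the option's type.
template <typename T> struct ToyOpt {
  T Value;
  ToyOpt(const std::string &Name, T Default) : Value(Default) {
    optionTable()[Name] = [this](const std::string &Text) {
      std::istringstream In(Text);
      T Tmp{};
      if (!(In >> Tmp) || !In.eof())
        return false; // the text does not parse as a T
      Value = Tmp;
      return true;
    };
  }
};

// Lives "deep" in library code. Its constructor runs before main(), which
// is the backdoor that carries the type to the parser.
static ToyOpt<int> UnrollCount("unroll-count", 4);

// Handles one argument of the form -name=value.
inline bool parseArg(const std::string &Arg) {
  std::string::size_type Eq = Arg.find('=');
  if (Arg.size() < 2 || Arg[0] != '-' || Eq == std::string::npos)
    return false;
  auto It = optionTable().find(Arg.substr(1, Eq - 1));
  if (It == optionTable().end())
    return false; // "unknown command line option"
  return It->second(Arg.substr(Eq + 1));
}
```

Because every option is already in the table when parsing starts, both the unknown-option error and the type check happen up front. The cost is the global state and the static initializer.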
The current proposal exposes the type in addOption (as well as later when we get the option). So the type continues to be known to the command line parser. Whether you want to actually type check in the command line parser is a point I’m open to discussing. Personally, I want a command line option to be type checked because it was registered, even if no one actually gets the value of the option later.
Hence my suggestion of just putting a stringly typed key-value store (or whatever) in an easily accessible place (like LLVMContext), and just translating any unrecognized command line options (ones that are not for #1) into that stringly typed storage.
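A rough sketch of that suggestion, assuming a standalone store rather than anything attached to LLVMContext. `TweakStore`, `splitArgs` and the option names are hypothetical; nothing here is an existing LLVM API.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stringly typed store for configuration/tweak points.
struct TweakStore {
  std::map<std::string, std::string> Values;

  // The single call a tweak point makes. Only here is the type known, so
  // only here does the string become a typed value. Missing or unparsable
  // entries fall back to Default.
  template <typename T>
  T get(const std::string &Key, T Default) const {
    auto It = Values.find(Key);
    if (It == Values.end())
      return Default;
    std::istringstream In(It->second);
    T Parsed{};
    if ((In >> Parsed) && In.eof())
      return Parsed;
    return Default;
  }
};

// Arguments of the form -key=value that the tool does not own are not
// rejected; they are forwarded to the store unparsed. Everything else is
// returned for the tool's regular command line processing.
inline std::vector<std::string>
splitArgs(const std::vector<std::string> &Args,
          const std::set<std::string> &ToolOptions, TweakStore &Store) {
  std::vector<std::string> ForTool;
  for (const std::string &Arg : Args) {
    std::string::size_type Eq = Arg.find('=');
    bool LooksLikeOption =
        Arg.size() > 1 && Arg[0] == '-' && Eq != std::string::npos;
    std::string Key = LooksLikeOption ? Arg.substr(1, Eq - 1) : std::string();
    if (!LooksLikeOption || ToolOptions.count(Key))
      ForTool.push_back(Arg);
    else
      Store.Values[Key] = Arg.substr(Eq + 1);
  }
  return ForTool;
}
```

The trade-off is visible in the sketch: a mistyped key or an ill-typed value is silently stored, and nothing notices until the tweak point is actually reached, if it ever is.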
I’m against it being in the context because you may want to set up and reuse passes multiple times with the same options, and use that configuration to compile multiple LLVMContexts. But I do agree that having a store with some lifetime is useful.
I think the current proposal is to have the store be a singleton, but there’s nothing to stop further work to have the storage for options be one per thread for example. If you wanted to have one pass manager per thread with its own set of passes, configured (currently) via their own call to ParseCommandLineOptions then that would be possible with little work beyond the current proposal.
I agree with Rafael that “constructor arguments to passes” are not c/t points. In the future, there might be some way to integrate the two (from the referenced post, you can probably tell that I kind of like the idea of doing so), but for now, I think the clear incremental step is to attack #2 and solve it without llvm::cl. I have suggested a way to do this that I think makes sense.
If you change the current proposal so that it doesn’t read cl::opt, then I think your suggestion is much like what is being proposed now. Really, it’s creating a string->string map with addOption, and getting the values with getOption. The passes don’t care (or know) whether the options are set via the command line or any other API. I hope I’ve understood your proposal correctly here. Please correct me otherwise.
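To illustrate the shape I have in mind, here is a minimal sketch. The names addOption and getOption follow the wording of this thread, but the signatures, the `ToyOptionStore` class and `setOption` are assumptions made up for the example, not the API in the actual patch.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <sstream>
#include <string>

// Hypothetical option store: a string->string map plus a type check that is
// captured when the option is registered.
class ToyOptionStore {
  struct Entry {
    std::string Text;                                 // current value, as text
    std::function<bool(const std::string &)> IsValid; // check for the registered type
  };
  std::map<std::string, Entry> Entries;

  template <typename T> static bool parse(const std::string &Text, T &Out) {
    std::istringstream In(Text);
    return (In >> Out) && In.eof();
  }

public:
  // Called from a pass initializer, i.e. before the command line is parsed.
  template <typename T>
  void addOption(const std::string &Name, const T &Default) {
    std::ostringstream Out;
    Out << Default;
    Entries[Name] = Entry{Out.str(), [](const std::string &Text) {
                            T Tmp{};
                            return parse(Text, Tmp);
                          }};
  }

  // Called by the command line parser or by any other client API. Unknown
  // names and ill-typed values are rejected here, even if no pass ever
  // reads the option afterwards.
  bool setOption(const std::string &Name, const std::string &Text) {
    auto It = Entries.find(Name);
    if (It == Entries.end() || !It->second.IsValid(Text))
      return false;
    It->second.Text = Text;
    return true;
  }

  // Called from a pass constructor; the string becomes a typed value here.
  template <typename T> T getOption(const std::string &Name) const {
    auto It = Entries.find(Name);
    assert(It != Entries.end() && "option was never registered");
    T Value{};
    bool Ok = parse(It->second.Text, Value);
    assert(Ok && "option read with a different type than it was registered with");
    (void)Ok;
    return Value;
  }
};
```

In this sketch the store is an ordinary object rather than a singleton, so the question of its lifetime (one per process, per thread, or per pass manager) is left to whoever owns it.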
Thanks,
Pete