[GSoC][clang] Bash completion for clang project

Hello,

My name is Yuka Takahshi and I would like to ask a few questions regarding the GSoC project: bash autocompletion for clang.
We are now trying to build completion for the part of a flag that we call a "value".
E.g. in -std=c++11, c++11 is a "value", and in -analyzer-checker=alpha.cplusplus, alpha.cplusplus is a "value".

We are planning to implement most of the code in OptTable.cpp, in order to reuse the OptTable, which is built from Options.inc.
Options.inc is generated by TableGen from Options.td, so we are planning to add the value information to Options.td.

I would like to ask for advice on how to implement completion for flags like "-std=" and "-analyzer-checker=".
These flags are special because their value information already exists elsewhere: in LangStandards.def for "-std=", and in Checkers.td for "-analyzer-checker=".
We are thinking of reusing this information and adding it to Options.inc, so that we can handle flag completion in a unified manner.
This approach has benefits beyond this GSoC project: it lets us generate documentation more simply, and it reduces custom handling of each value and code duplication.

The problem is that we are not sure what the best way to implement this is.
For flags other than "-std=" and "-analyzer-checker=", we decided to add a class to Options.td that holds the value information, e.g. ArgValues<"libc++, libstdc++, platform"> for "-stdlib=".
So we are looking for a way to generate something like "libc++, libstdc++, platform" from LangStandards.def and Checkers.td for "-std=" and "-analyzer-checker=".

Regards,
Yuka
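
To make the ArgValues idea concrete, here is a minimal C++ sketch of how a comma-separated value string such as "libc++, libstdc++, platform" could drive completion. The helper names splitValues and completeValue are hypothetical and are not part of OptTable; the sketch only assumes that the value list reaches the completion code as a single string.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split "libc++, libstdc++, platform" into entries,
// dropping the spaces around each one.
std::vector<std::string> splitValues(const std::string &Values) {
  std::vector<std::string> Result;
  std::size_t Pos = 0;
  while (Pos < Values.size()) {
    std::size_t Comma = Values.find(',', Pos);
    if (Comma == std::string::npos)
      Comma = Values.size();
    std::size_t Begin = Pos, End = Comma;
    while (Begin < End && Values[Begin] == ' ')
      ++Begin;
    while (End > Begin && Values[End - 1] == ' ')
      --End;
    if (Begin < End)
      Result.push_back(Values.substr(Begin, End - Begin));
    Pos = Comma + 1;
  }
  return Result;
}

// Hypothetical helper: return every value that starts with what the user
// has typed after the '=' so far. An empty prefix matches all values.
std::vector<std::string> completeValue(const std::string &Values,
                                       const std::string &Typed) {
  std::vector<std::string> Matches;
  for (const std::string &V : splitValues(Values))
    if (V.compare(0, Typed.size(), Typed) == 0)
      Matches.push_back(V);
  return Matches;
}
```

Under these assumptions, completeValue("libc++, libstdc++, platform", "lib") yields libc++ and libstdc++, which is what the shell would offer for `-stdlib=lib<tab>`.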

One way to do this would be to rewrite LangStandards.def as a .td file, and
extend clang-tblgen to generate something like the current .def file from
that .td file. Then you can include LangStandards.td into Options.td and
somehow specify that all of the LangStandard records define possible values
for the -std= flag. (You could use something like
ArgValueClass<"LangStandard"> for this case, ArgValueClass<"Checker"> for
-analyzer-checker=, and so on. Your tablegen backend is able to do
arbitrary queries over the tablegen records, so there are a variety of ways
to express this.)

Similar things can be done for other options that take values from .def
files, such as -fsanitize= (Basic/Sanitizers.def).

I don't know if rewriting LangStandards.def just for shell autocompletion
is worth the effort, unless there's another reason to do that. It seems you
can define the LANGSTANDARD macro before including this file to get a list
of all possible language options. You can use that information to return all
possible options when `-std=<tab>` is hit, can't you?
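
For readers unfamiliar with the .def pattern, the macro approach can be sketched in C++ as follows. To keep the example self-contained it does not include the real LangStandards.def; DEMO_LANG_STANDARDS is a hypothetical stand-in, and its two-field (id, name) entry shape is a simplifying assumption, since the real entries carry more fields.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the contents of a .def file. With the real
// file you would define LANGSTANDARD and then #include the .def instead.
#define DEMO_LANG_STANDARDS(LANGSTANDARD) \
  LANGSTANDARD(c99, "c99")                \
  LANGSTANDARD(c11, "c11")                \
  LANGSTANDARD(cxx11, "c++11")            \
  LANGSTANDARD(cxx14, "c++14")

// Collect the spelling of every entry by expanding the list with a macro
// that keeps only the name field.
std::vector<std::string> allStdValues() {
  std::vector<std::string> Names;
#define COLLECT_NAME(id, name) Names.push_back(name);
  DEMO_LANG_STANDARDS(COLLECT_NAME)
#undef COLLECT_NAME
  return Names;
}
```

This works in C++ because the preprocessor expands the list at compile time; the same list is not visible to anything that does not run the C preprocessor.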

You can, but we don't run the C preprocessor on tablegen files, which means
you couldn't generate this data from tablegen without some auxiliary work.
We also want the "valid values" data for generated help text and
command-line argument summary documentation; it's not only desired for tab
completion.

Are you saying that you want to have the "valid values" data not only for
"-std=" and a few other options but for all options such as
-flto={thin,full} or -fopenmp={libomp,libgomp,libiomp}?

Ideally, yes, we'd have the "valid values" data for all options that have a
(small) enumerated set of options, for tab-completion, reference
documentation, and any other similar use cases that people might have (UI
for configuring compiler flags?).

The advantage outside of autocompletion would be that we can reduce the number of StringSwitch cases handling the argument values in the driver and centralize them in a common place. This would also give us the opportunity to enhance diagnostics such as `error: invalid value "c+-11" for flag "-std="` with `Did you mean one of c++11, c++14, c++17` and the like.
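
As an illustration of the "Did you mean" idea, here is a minimal C++ sketch that ranks a centralized list of valid values by Levenshtein edit distance. The function names editDistance and suggestValues, the distance threshold, and the value list in the usage note are all assumptions made for the example; this is not how the clang driver currently diagnoses invalid values.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein distance between A and B, computed with a single row.
std::size_t editDistance(const std::string &A, const std::string &B) {
  std::vector<std::size_t> Row(B.size() + 1);
  for (std::size_t J = 0; J <= B.size(); ++J)
    Row[J] = J;
  for (std::size_t I = 1; I <= A.size(); ++I) {
    std::size_t Diag = Row[0]; // distance for prefixes (I-1, J-1)
    Row[0] = I;
    for (std::size_t J = 1; J <= B.size(); ++J) {
      std::size_t Above = Row[J]; // distance for prefixes (I-1, J)
      std::size_t Subst = Diag + (A[I - 1] == B[J - 1] ? 0 : 1);
      Row[J] = std::min({Above + 1, Row[J - 1] + 1, Subst});
      Diag = Above;
    }
  }
  return Row[B.size()];
}

// Hypothetical helper: return the valid values within MaxDistance edits of
// what the user typed, in the order they appear in Valid.
std::vector<std::string> suggestValues(const std::string &Typed,
                                       const std::vector<std::string> &Valid,
                                       std::size_t MaxDistance) {
  std::vector<std::string> Suggestions;
  for (const std::string &V : Valid)
    if (editDistance(Typed, V) <= MaxDistance)
      Suggestions.push_back(V);
  return Suggestions;
}
```

With the illustrative list {c99, c++11, c++14, c++17, gnu99} and a threshold of 2, suggestValues("c+-11", ...) returns c++11, c++14 and c++17, matching the message above.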

Ah, that sounds nice. One concern I had is that, if we enumerate all
possible values for each option in .td files, we have to modify both a C++
file and a .td file when we add a new option value for an existing flag, but
that's probably not that bad.

Yes, indeed. I also see it this way: the documentation for options goes in the .td files, and the implementation goes in the driver…

Thank you for your replies.

It looks like we should:

  • Hardcode the possible ArgValues in Options.td for simple flags (e.g. -stdlib=).

  • Convert *.def files to *.td files, in order to generate flag values from the definition files (e.g. LangStandards.def and Analyses.def).

  • Reduce StringSwitch cases and enhance error messages.

I will start implementing the first item.

I think getting rid of the StringSwitches and gathering the possible flag values into one file is a big advantage beyond this GSoC project, so it seems reasonable to do this. I believe it will be useful not only for my GSoC project but for everyone.

Thanks!

Yuka