[RFC] Adding a default file location to config file support

I support this proposal.
I originally thought this would add yet another wrinkle for developers analyzing user reports, but then I realized we have to accept that vendor customizations exist.
(GCC provides a lot of --enable-* and every Linux distro may do something differently.)
(In the initial thread on configuration files, "Configuration files", Joerg expressed strong concern about a magic user default, but he thought a system default was fine.)

Currently there are a bunch of CMake variables customizing Clang's behavior:

  • CLANG_DEFAULT_PIE_ON_LINUX
  • CLANG_DEFAULT_LINKER
  • CLANG_DEFAULT_STD_C
  • CLANG_DEFAULT_STD_CXX
  • CLANG_DEFAULT_CXX_STDLIB
  • CLANG_DEFAULT_RTLIB
  • CLANG_DEFAULT_OBJCOPY
  • CLANG_DEFAULT_OPENMP_RUNTIME
  • CLANG_DEFAULT_UNWINDLIB
  • (CLANG_CONFIG_FILE_SYSTEM_DIR)
  • (CLANG_CONFIG_FILE_USER_DIR)
  • CLANG_OPENMP_NVPTX_DEFAULT_ARCH
  • CLANG_SPAWN_CC1

And new variables are coming (e.g. default stack protector level). Many of them can be overridden by a driver option.

A default configuration file location could render many of the variables above unnecessary.

Per the Clang Compiler User’s Manual (Clang 16.0.0git documentation), a vendor can make clang a shell script which executes
a {prefix}-clang and let {prefix}-clang load {prefix}.cfg. So if a vendor wants to have a default configuration file,
there is already a way. (Options in a configuration file are not subject to -Wunused-command-line-argument.)
This proposal will make that usage less hacky for some vendors.
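
To make the wrapper workaround concrete, here is a minimal sketch of it. The post describes a shell script; this is an equivalent written in Python, not something taken from the manual. The prefix "vendor" and the helper names are made up for illustration, and it assumes a {prefix}-clang executable (e.g. a symlink to the real clang) is on PATH.

```python
import os
import sys

# Hypothetical vendor prefix: clang invoked as "vendor-clang" would look for "vendor.cfg".
PREFIX = "vendor"


def wrapper_argv(argv, prefix=PREFIX):
    """Build the argv for re-executing as {prefix}-clang, keeping the user's arguments."""
    return [f"{prefix}-clang", *argv[1:]]


def main():
    """Entry point for the wrapper installed as plain "clang" (not called here)."""
    argv = wrapper_argv(sys.argv)
    # Replace the current process; assumes {prefix}-clang can be found on PATH.
    os.execvp(argv[0], argv)
```

Installed as clang, such a wrapper would turn `clang -c a.c` into `vendor-clang -c a.c`, which is the invocation name that triggers the prefix-based config lookup.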

Some random opinions:

For the default configuration file location, one idea is to use $resource_dir/etc as the directory.
Downstream clang tools need to know the resource directory but may not know the executable path.

We have GCCMode, GXXMode, CPPMode, CLMode, FlangMode, DXCMode.
Perhaps each mode should load different configuration files. (clang++.cfg perhaps doesn’t help users doing clang --driver-mode=g++)

It looks like the same effect as a default config file can be obtained if clang tries to load two config files: the first deduced from the driver mode (clang.cfg, clang++.cfg, gcc.cfg, etc.) and the second defined by the target prefix. The driver mode file could set some default values, which may be overridden in the target-prefix file. The driver mode file thus works as the default config file.

This solution has some advantages over a simple default config file:

  • Different tools (clang, clang++, etc.) can have separate configuration files. At the same time they can share content by using @file constructs.
  • The configuration may be overridden by the usual mechanism (a file in the user directory takes precedence over the system one).

To get an invocation that loads no config files, a new option is required. Such an invocation is necessary in some cases, so a solution for this must exist.

I suppose the most complete solution would involve something like:

  1. Trying to load <prefix>-<mode>.cfg first.
  2. Trying to load <prefix>.cfg and <mode>.cfg.
  3. Trying to load a base file as fallback.

However, I’m not sure if we should have two bases for fallback from missing prefix file and for fallback from missing mode file, or some other model.
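
The three-step order above can be sketched roughly as follows. This is only an illustration of the proposal as worded, not the driver's code: the function names are invented, the single fallback name "base.cfg" is a placeholder (the open question about one or two bases is not resolved here), and file existence is modelled as a plain set of names.

```python
def candidate_groups(prefix, mode, base="base"):
    """Return groups of config names in the order they would be tried.

    Every existing file of the first group that has a match is loaded.
    "base" is a hypothetical fallback name.
    """
    return [
        [f"{prefix}-{mode}.cfg"],            # step 1: combined file
        [f"{prefix}.cfg", f"{mode}.cfg"],    # step 2: separate files
        [f"{base}.cfg"],                     # step 3: fallback
    ]


def select_configs(prefix, mode, existing, base="base"):
    """Pick the config files to load, given the set of names that exist."""
    for group in candidate_groups(prefix, mode, base):
        found = [name for name in group if name in existing]
        if found:
            return found
    return []
```

With only `x86_64-pc-linux-gnu.cfg` and `base.cfg` present, a `clang++` invocation for that prefix would load just the prefix file under this model; the base file is reached only when neither the prefix nor the mode file exists.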

Personally, I don’t think I need all that but I can see potential use cases e.g. for specifying host-specific options, as well as mode-specific options.

We also need to use that option in the test suite. Otherwise, if we enable the system config location during builds, the user’s configuration may affect the test results.

We already have CLANG_CONFIG_FILE_SYSTEM_DIR, though for some reason it isn’t a cache variable. I think it’d be useful to have some default, at least for Unix derivatives, so that users wouldn’t end up being confused by every Linux distribution choosing its own directory.

As mentioned on the Phab review too - as for default-included config files, it would be great to have it look for e.g. <triple>.cfg too, if clang is invoked as clang -target <triple>. Currently this is done if clang is invoked via a symlinked name such as <triple>-clang, but not if the triple is set explicitly on the command line. (This would be very helpful wrt setting defaults like CLANG_DEFAULT_RTLIB and CLANG_DEFAULT_CXX_STDLIB etc, where you may want to have such defaults set for cross targets but not when operating as a native compiler.)

Yes I have suggested this before - I think that would be very helpful for cross-compile toolchains.

I’ve submitted ⚙ D134018 [clang] [Driver] Add an option to disable default config filenames to add a --no-default-config option. Hopefully that’d help moving things forward.

@mgorny has commandeered ⚙ D109621 [clang][Driver] Default to loading clang.cfg if config file not specified to add the support. Thanks for stepping up!

Cc some folks @davidchisnall @reinterpret_cast @rengolin @rnk from Configuration files for attention.

To be honest, I’ve only done the absolute minimum to make the current config useful for vendors, and covering the proverbial 99% of use cases, i.e. unless I’m mistaken, it makes it possible to install matching config files for all clang symlinks we install.

There are many things I dislike about the current logic, but I’m not sure how much of that can be changed without breaking backwards compatibility (and whether we even need to worry about it), and I honestly doubt I want to be rewriting it.

The things I dislike that I’ve observed so far are, in no particular order:

  • There’s no way to actually refer to the “user directory” in user configuration directory (e.g. tilde expansion doesn’t work).
  • There’s no way to refer to XDG configuration dirs.
  • The logic is almost entirely based on clang executable filename, i.e. clang --driver-mode=... does the wrong thing, so does clang --target=....
  • The “default” config as added by my patch is used only when no target prefix is used, rather than as a generic fallback.
  • There’s a hack for -m32 that substitutes arch name in prefix.
  • The current logic tries every filename in all directories before trying the next filename, e.g. x86_64-clang++.cfg in system dir overrides x86_64.cfg in user dir. Not sure if it shouldn’t be the other way around.
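
To illustrate the last point, here is a small sketch contrasting the two search orders. The directory names "user" and "system" and the function names are placeholders, and existence is modelled as a set of paths; it only mirrors the behaviour as described in this post, not the actual driver code.

```python
def filename_major(dirs, names):
    """Order described above: each filename is tried in all directories first."""
    return [f"{d}/{n}" for n in names for d in dirs]


def directory_major(dirs, names):
    """Alternative order: all filenames are tried in a directory before the next one."""
    return [f"{d}/{n}" for d in dirs for n in names]


def first_match(candidates, existing):
    """Return the first candidate path that exists, or None."""
    return next((path for path in candidates if path in existing), None)


dirs = ["user", "system"]                      # user dir has precedence
names = ["x86_64-clang++.cfg", "x86_64.cfg"]   # most specific name first
existing = {"system/x86_64-clang++.cfg", "user/x86_64.cfg"}

current = first_match(filename_major(dirs, names), existing)
alternative = first_match(directory_major(dirs, names), existing)
```

Under these assumptions `current` is the system-wide `x86_64-clang++.cfg`, while `alternative` is the user's `x86_64.cfg`, which is exactly the trade-off in question.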

Ok, I think I’ve hit my first roadblock so far: discontinuity between options supported by various clang drivers.

So far we’ve considered using the following options in the system config file:

Right now setting these options in the main configuration file means that a variety of driver invocations will fail due to incorrect options. To be honest, I don’t think that it’s really a good solution to require users to supply two dozen configuration files for various driver mode-target combinations to make sure every one gets the right set of options. I see two potential solutions here. Either:

  1. Make more of the useful options "CoreOption"s to have them accepted across all drivers (and optionally emit “option unused” warnings for them — config file support already silences these warnings, so that should be OK).
  2. Make the config support silently ignore all invalid options.

To be honest, I think 1. seems more correct but I’m happy enough with either option. On top of that, I think it would also make sense to add an option like --verify-config that disables the “unknown option silencing” behavior of config parsing, to let users check if their configs are valid.

If clang searched for two config files, one for the target and one for the driver mode, could it reduce the number of configuration files in your use case?

Yes, that would be helpful. However, FWIU we currently allow only a single configuration file, don’t we?

If we went that way, we should also probably improve the logic to respect the actual target and actual driver mode independently of the executable name. I’m not sure how much backwards compatibility we need, though.

Yes, but several users have asked for a more flexible configuration file solution, so it would be nice to improve the existing mechanism.

Allowing two config files probably means supporting two --config options, or using some other way to specify the target and driver config files separately. This option is used to bypass the file search mechanism, which might be complex, and we need something similar in any case.

If you could work on that, that’d be most helpful. I’m afraid my time is limited (and I’ve already spent too much on it), yet given the need to test changes prior to the Clang 16.x release, it’d be nice to have some working solution ASAP. Hence, my patch. However, OTOH I wouldn’t want people to start relying on it if the final config support is going to be different.

Hmm, maybe here’s a better description of what we would like to have.

Basically, we’d like to have something like gentoo-runtimes.cfg with the equivalent of:

-fuse-ld=...
-rtlib=...
-stdlib=...
-unwindlib=...

and gentoo-gcc-install.cfg:

--gcc-install-dir=...

and we’d like to be able to reasonably include these configs so that they would be appropriately picked up by all clang invocations.

I think the best way to provide such flexibility is to have a single configuration file that can consult available data (e.g., driver name used, host platform, target, etc…) and alter configuration accordingly (frequently via some form of conditional inclusion). Something like:

!if $driver_name ~ *clang
!include clang.cfg
!elif $driver_name ~ *clang-cl
!include clang-cl.cfg
!endif

!if exists($target.cfg)
!include $target.cfg
!endif

!if $host_platform ~ zos* and $target ~ zos*
-L'//SYS1.SCEELIB'
!endif

The benefits of such an approach are:

  1. Less guessing about which config file(s) are being consulted.
  2. Support for distribution specific conventions.
  3. Support for composition (e.g., dropping new .cfg files into the configuration directory).
  4. Support for future extensions.

I’ve been working on the original idea, and I have two potentially interesting diffs on review right now:

  1. ⚙ D134270 [clang] [Driver] Support multiple configuration files
  2. ⚙ D134337 [clang] [Driver] More flexible rules for loading default configs (WIP)

The first one makes it possible to load multiple configuration files, both via multiple --config arguments and via combining the default-loaded file with explicit files. The second one is a WIP in extending the standard list of “default” files to include fallback to separate per-target and per-mode files, as well as including the effective triple and driver mode in the search (rather than focusing on the executable name).

I think you are fundamentally right. Configuration files were implemented as a simple feature for a simple task: specifying driver options. One part of this functionality, default configuration files, works well for simple tasks but is difficult to use in more complex cases. Replacing the existing pattern-matching algorithm with a directive-based implementation would be a more powerful solution. For example, the set of libraries may depend not only on the target but also on the ABI variant or object file format. Configuring this case now requires preparing a separate file for each combination, but with conditional includes available it is just one more branch in the same file.

Well, I’m not exactly opposed to that idea but generally I’m not a fan of creating new microlanguages. In either case, this is a complex solution that may or may not be worth the effort, and I’m certainly not capable of working on it.

As a general update: after some more updates, main branch now features more complete config support that should remain roughly compatible with existing use cases and at the same time gives Gentoo the flexibility it needs.

Config files are now loaded by default (unless --no-default-config is passed on the command-line or CLANG_NO_DEFAULT_CONFIG envvar is set to a non-empty value), also for plain clang, clang++, etc. calls without explicit triple in filename.

The default config lookup starts by trying <triple>-<driver>.cfg. If such a file is not found, it falls back to loading <triple>.cfg and <driver>.cfg, where neither has to exist.
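
As a rough model of the behaviour described in the last two paragraphs, the selection could be sketched like this. The function name and the set-based existence check are simplifications for illustration; only the option name, the environment variable and the file name patterns come from the description above.

```python
def default_configs(triple, driver, existing, args=(), env=None):
    """Return the default config file names that would be loaded.

    existing: set of config file names assumed to be present.
    args/env: command-line arguments and environment of the invocation.
    """
    env = {} if env is None else env
    # Default configs are skipped if the option is passed or the variable is non-empty.
    if "--no-default-config" in args or env.get("CLANG_NO_DEFAULT_CONFIG"):
        return []
    combined = f"{triple}-{driver}.cfg"
    if combined in existing:
        return [combined]
    # Fallback: load whichever of the two separate files exist (possibly none).
    return [n for n in (f"{triple}.cfg", f"{driver}.cfg") if n in existing]
```

For example, with only `clang++.cfg` installed, a plain `clang++` call would pick up that file regardless of the triple, and adding `--no-default-config` would yield an empty list.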

I think the changes are too intrusive for an upstream 15.x backport but I’m planning to backport them into the next 15.x release in Gentoo via our patchset, given that we haven’t enabled config file support before, so there’s nothing to be broken ;-).

What I’d still like to happen is proper support for specifying user configuration directory — one that can account for XDG_CONFIG_HOME and the home directory. However, I don’t have a good idea how to support that.
