Environment variables

Is there an LLVM-ish way to handle environment variables? Specifically,
I want to check existence and/or value of an environment variable and
take appropriate action.

I was kind of surprised that LLVM doesn't seem to have any
special/optimized way to deal with environment variables. The one
Stack Overflow answer I found on it suggested using getenv().

Thanks!

                  -David

llvm::sys::Process::GetEnv looks like it does the right thing.

I think we added it to deal with Unicode on Windows, though. We have plenty of calls to getenv that are mostly looking for ‘1’, ‘0’, or variable presence, and they pretty much work.
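The "'1', '0', or variable presence" pattern could be wrapped up roughly as follows. This is a minimal sketch using only the standard library; EnvFlag and readEnvFlag are hypothetical names for illustration, not LLVM APIs, and treating every value other than "0" as "on" is an assumption.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical three-state result: the variable is absent, explicitly "0",
// or set to anything else.
enum class EnvFlag { Unset, Off, On };

// Reads Name from the environment with plain std::getenv.
// Assumption: only the exact string "0" means "off".
EnvFlag readEnvFlag(const char *Name) {
  const char *Raw = std::getenv(Name);
  if (!Raw)
    return EnvFlag::Unset;
  return std::strcmp(Raw, "0") == 0 ? EnvFlag::Off : EnvFlag::On;
}
```

A caller that only cares about presence can compare the result against EnvFlag::Unset.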

Ok, thanks! I'm not dealing with UTF-8 so I don't think Process::GetEnv
will work. I was looking for something that caches calls to getenv so
checks could be put into tight(-ish) loops without too much performance
impact.

Would such a utility be of interest to the community?

                           -David
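For concreteness, a caching utility of the kind described might look something like the sketch below. It assumes C++17 (for std::optional). EnvCache is a hypothetical class, not something that exists in LLVM. Whether a mutex plus a hash lookup actually beats a direct getenv call would need measuring.

```cpp
#include <cstdlib>
#include <mutex>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical helper, not an LLVM API: remembers the first result of
// std::getenv for each name, so later lookups skip the environment scan.
class EnvCache {
public:
  // Returns the value of Name, or std::nullopt if it is unset.
  std::optional<std::string> get(const std::string &Name) {
    std::lock_guard<std::mutex> Lock(Mutex);
    auto It = Cache.find(Name);
    if (It != Cache.end())
      return It->second;
    const char *Raw = std::getenv(Name.c_str());
    std::optional<std::string> Value;
    if (Raw)
      Value = std::string(Raw);
    Cache.emplace(Name, Value);
    return Value;
  }

  // True if Name is set at all, even to an empty string.
  bool isSet(const std::string &Name) { return get(Name).has_value(); }

private:
  std::mutex Mutex;
  std::unordered_map<std::string, std::optional<std::string>> Cache;
};
```

One consequence of this design is that changes made to the environment after the first lookup of a name are not seen, which is usually what you want for debugging switches read at startup.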

Reid Kleckner via llvm-dev <llvm-dev@lists.llvm.org> writes:

Sorry for the snarky answer but we already have that:

// outside of loop: getenv is called exactly once here
const bool enableFooBar = getenv("ENABLE_FOO_BAR") != nullptr;
while (...) {
    // the environment is not re-checked on every loop iteration:
    if (enableFooBar) { /* ... */ }
}

Generally we don't really look at env vars today (I think for clang you can mostly affect some search paths with them) and IMO it is a good thing to force being explicit on the command line instead...

- Matthias

Yes, but in your example getenv is called every time enableFooBar needs
to be initialized. What if your code is itself wrapped inside another
loop you can't see (for example, the PassManager invoking passes)?

Maybe I'm being overly pedantic.
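One way to cover the outer-loop case without any new infrastructure is a function-local static, sketched below. isFooBarEnabled is a hypothetical accessor for the ENABLE_FOO_BAR variable from the example above; the only assumption is a C++11 or later compiler, where initialization of local statics is thread-safe.

```cpp
#include <cstdlib>

// Hypothetical accessor. The function-local static is initialized on the
// first call only, so std::getenv runs once per process no matter how many
// enclosing loops (visible or not) end up calling this function.
bool isFooBarEnabled() {
  static const bool Enabled = std::getenv("ENABLE_FOO_BAR") != nullptr;
  return Enabled;
}
```

The cost after the first call is a guard check and a load, but the value is frozen for the lifetime of the process.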

We use a lot of environment variables in our compiler because it's
really super annoying and takes a lot of developer time to have to
update customer Makefiles to include the debugging options we want to
use to debug customer problems. These are huge customer codes with
often many Makefiles which may be generated by custom tools we don't
understand at all. :) It's much easier to use the compiler flags that
are in the Makefiles and set some environment variables to override
things during "make."

It seems odd that cl::ParseEnvironmentOptions exists but there is no
"official" way to get at environment variables.

If this isn't something the community wants or needs, that's fine. I
was just asking if a contribution would be welcomed if we end up
developing something.

                       -David

Matthias Braun <mbraun@apple.com> writes:

I can definitely relate to third party Makefiles being a huge pain to manipulate. And env vars can be an okay tool to help debugging (see also CCC_OVERRIDE_OPTIONS in clang for example). I also don't want to dispute that they may be the right solution in some cases.
That said, in my opinion we should not make it look like using environment variables is a good or encouraged thing, because there are downsides:

- The bug reproduction instructions and file bundles that clang produces when it crashes do not show environment variables, meaning you possibly cannot reproduce a bug.
- Similarly, when producing .ll files, going through Jenkins output, etc., you typically have no good way to see the environment variables in effect. Even if you do, there are typically hundreds of them and it may be hard to know which ones affect the compiler.
- Env vars only work reliably if they are used sparingly. As soon as they have interesting or desirable effects, you will see people using them in their makefiles as well. That makes the whole thing brittle again: now you have to wonder whether someone overrides, resets, or manipulates the env vars in their makefiles, which brings you back to square one, where you have to understand and change complex third-party makefiles.

- Matthias

Env vars that change compiler output make it impossible to write tools such as ccache or distcc. Including the entire env in the hash value that determines whether ccache has a cache hit (as well as the compiler command line and the preprocessed source file) would be ridiculous and would result in very few cache hits.

I agree with all of this. We have many environment variable hooks in
the compiler but they're all for internal use and have very specific
debugging purposes (turn off this-or-that, tune this parameter, etc.).
We don't share them with customers at all.

They are a very useful debugging tool but need to be used judiciously.

                           -David

Matthias Braun <mbraun@apple.com> writes:

Sure, but we're not using this with ccache and other such things. We're
very specifically using them for debugging purposes. It's very
special-case and so we don't expect them to interact well with general
tools. They're not meant for day-to-day use.

                          -David

Bruce Hoult <brucehoult@sifive.com> writes:

Perhaps it’d make sense to just have one such environment variable entry point - perhaps in Clang’s driver (& it’d use the environment variable as part of constructing the -cc1 command line - thus crash reports, etc, would still encode everything needed for reproducing the failure). Ensuring that all the options/configuration points are exposed at least via -mllvm style cl options (these are cheap to add - don’t have to be plumbed through all the different layers, etc - intended for compiler-engineer tweaking/experiments/investigation).

& that means not having lots of environment variable reading/testing all over the codebase, so probably not much need for generic/helper utilities
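A rough sketch of that single entry point, assuming the driver keeps its arguments in a vector of strings: appendFlagsFromEnv is a hypothetical helper, not Clang's actual driver code. The variable name is left to the caller, and quoting or escaping inside the value is deliberately not handled.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Appends whitespace-separated flags from the environment variable VarName
// to Args. Because the flags become part of the constructed command line,
// anything that records that command line also records their effect.
void appendFlagsFromEnv(const char *VarName, std::vector<std::string> &Args) {
  const char *Raw = std::getenv(VarName);
  if (!Raw)
    return;
  std::istringstream Stream(Raw);
  std::string Flag;
  while (Stream >> Flag)
    Args.push_back(Flag);
}
```

With this shape, the rest of the codebase only ever sees command-line options, and the environment is read in exactly one place.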

David Blaikie <dblaikie@gmail.com> writes:

That sounds like a very reasonable approach. We've done similar things
in the past with other projects. The code I'm referencing is 30+ years
old, so the same design decisions almost certainly don't apply to much
newer code.

                                 -David