Textual IR value names

I like my LLVM IR text to have nice value names, e.g.

%add = add ...
%mul = mul ...

And this all works well if the build has asserts enabled. If the build
does not have asserts enabled, it's not so nice:

%1 = add ...
%2 = mul ...

I understand the use for obfuscating names, but the choice to make this
dependent on whether or not asserts are enabled seems odd to me. At the
very least it's surprising. It took some time for me to figure out why
some of our builds behaved differently than others.

Is this an intentional design choice? If so, what's the rationale? If
not, would it make sense to add a CMake option to specify whether
textual IR preserves names or not rather than overloading
ENABLE_ASSERTIONS?

                            -David

You can use `opt -instnamer`.

And the clang behavior can be controlled with -fdiscard-value-names/-fno-discard-value-names

Davide Italiano <davide@freebsd.org> writes:

Is this an intentional design choice? If so, what's the rationale? If
not, would it make sense to add a CMake option to specify whether
textual IR preserves names or not rather than overloading
ENABLE_ASSERTIONS?

You can use `opt -instnamer`.

Ok, but that's somewhat cumbersome with clang and it still doesn't
really answer why things are set up they way they are. Why should
building with asserts change the observable behavior of a (correct)
compiler? That's surprising and not in a good way.

                            -David

Craig Topper via llvm-dev <llvm-dev@lists.llvm.org> writes:

And the clang behavior can be controlled with -
fdiscard-value-names/-fno-discard-value-names

That's a little better for users, but again, why should the default
behavior change based on whether or not asserts are enabled?

                          -David

The names are dropped to save memory when a release build of the compiler is being used. This is what you probably want on a release compiler you intend to ship since it should be faster. The NDEBUG check is an easy way to tell the difference between release and debug builds. People probably don’t want to have to remember to set an additional cmake option to make a release compiler faster.

The -fdiscard-value-names/-fno-discard-value-names was added about a year ago https://reviews.llvm.org/D42887 not sure if there is more discussion about the asserts behavior discussed there.

Also, the textual IR is really a developer/debugging tool, not something an end-user would be expected to look at, so the textual names would be largely pointless in a release build.

If you are writing lit tests that look at textual IR, and they are failing with release builds, you can add the “REQUIRES: asserts” directive to the lit test so that the test won’t be run for a release build.

–paulr

Craig Topper via llvm-dev <llvm-dev@lists.llvm.org> writes:

The names are dropped to save memory when a release build of the
compiler is being used. This is what you probably want on a release
compiler you intend to ship since it should be faster. The NDEBUG
check is an easy way to tell the difference between release and debug
builds. People probably don't want to have to remember to set an
additional cmake option to make a release compiler faster.

The CMake option could override the current behavior. Then users
wouldn't be forced to remember to set an option to make release
compilers faster.

In the end it's not a huge deal, I just found keying off asserts to be
quite surprising and it took some time for me to figure out what was
going on.

                            -David

via llvm-dev <llvm-dev@lists.llvm.org> writes:

Also, the textual IR is really a developer/debugging tool, not
something an end-user would be expected to look at, so the textual
names would be largely pointless in a release build.

Our customers using our (non-LLVM-based) Fortran compiler have the
ability to dump internal IR(-ish) and examine it. Some find it useful
to understand how the compiler transforms their code. So it's not quite
right to say such dumps are pointless. It depends on the user.

                          -David

While I understand the value of turning it off, it is somewhat annoying when you're trying to run a clang -S -emit-llvm to discover that you're not getting any value names. I would suggest that, irrespective of the kind of build you use, -S -emit-llvm should cause the value names to be defaulted to on as a principle of least surprise to the user.

From: David Greene [mailto:dag@cray.com]
Sent: Thursday, January 10, 2019 3:07 PM
To: Robinson, Paul
Cc: craig.topper@gmail.com; llvm-dev@lists.llvm.org
Subject: Re: [llvm-dev] Textual IR value names

via llvm-dev <llvm-dev@lists.llvm.org> writes:

> Also, the textual IR is really a developer/debugging tool, not
> something an end-user would be expected to look at, so the textual
> names would be largely pointless in a release build.

Our customers using our (non-LLVM-based) Fortran compiler have the
ability to dump internal IR(-ish) and examine it. Some find it useful
to understand how the compiler transforms their code. So it's not quite
right to say such dumps are pointless. It depends on the user.

It's not an intended use-case, and the format has no stability guarantee.
If you want to provide a textual-IR feature to your users, nobody will
stop you. I'm just explaining why it is the way it is.
--paulr

From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of
Cranmer, Joshua via llvm-dev
Sent: Thursday, January 10, 2019 3:14 PM
To: David Greene; Craig Topper
Cc: llvm-dev@lists.llvm.org
Subject: Re: [llvm-dev] Textual IR value names

While I understand the value of turning it off, it is somewhat annoying
when you're trying to run a clang -S -emit-llvm to discover that you're
not getting any value names. I would suggest that, irrespective of the
kind of build you use, -S -emit-llvm should cause the value names to be
defaulted to on as a principle of least surprise to the user.

Sounds like you want to file a feature-request bugzilla. And it's more
or less the same thing that David Greene seems to want.
--paulr

+1! I didn’t realize that you could even change that in a release build, but I concur that it should be enabled by default at least when emitting textual asm. The efficiency rationale for having it default off doesn’t really seem to apply to that case.

Having it by default for -S -emit-llvm seems like a good default to me.
Is it enough to make clang’s behavior independent of the build settings?