TBAA

Are there any studies or examples of TBAA optimization for C showing substantive
performance wins?

I have not been able to find any papers or even compelling examples.

If this is off-topic let me know.

vy

Are there any studies or examples of TBAA optimization for C showing
substantive performance wins? I have not been able to find any
papers or even compelling examples.

Consider:

void
square (float *arr, int *bound)
{
  int i;
  
  for (i = 0; i <= *bound; i++)
    arr[i] = arr[i] * arr[i];
}

without TBAA you can't hoist the reference to *bound out of the loop. That
will prevent almost all loop optimizations.

The programmer can get this with a simple change, no?

void
square (float *arr, int *bound)
{
int i;
const int b = *bound;

for (i = 0; i <= b ; i++)
arr[i] = arr[i] * arr[i];
}

The programmer can get this with a simple change, no?

Sure. The programmer can do all sorts of optimizations themselves, in
fact almost all of them. But the purpose of an optimizer is to do these
things so that the programmer doesn't have to do that.

I would think that in a language like C, the role of the optimizer should be to do optimizations that are difficult or impossible or time consuming for the programmer.
My concern is that TBAA has a trade-off in terms of the complexity of the language for the programmer so I’d like to know if the tradeoff is really worth it.
BTW: the programmer based hoisting does not depend on the type of bound being different from the types off arr - so it is more robust.

vy

I would think that in a language like C, the role of the optimizer
should be to do optimizations that are difficult or impossible or
time consuming for the programmer.

That's a philosphical question. Keep in mind that very few
programmers nowadays, even in C, have a good idea of the underlying
machine model (or even the fact that there *is* such a model), so they
don't really even know what to do. Also, think of macro expansion and
such. And maintenance costs over time if you add all of the
"clutter". I would argue against that view.

My concern is that TBAA has a trade-off in terms of the complexity of the
language for the programmer

I don't understand. TBAA implements the rules of the programming language
with respect to aliasing. It doesn't change the language in any way, so
I don't see how it can increase the complexity for the programmer.

Perhaps you're arguing that anti-aliasing rules in languages aren't
worth the complexity, but that's a different issue and one that language
design groups look at, not compiler writers.

There’s a lot of overlap, and it’s probably good for language design groups to be talking to compiler writers (as in this thread) to understand what really matters in practice - if the answer was “there aren’t significant benefits to using TBAA” then maybe that’d inform language design groups to remove such requirements from their specifications, etc. I expect that’s where Victor is coming at this from - to understand/gather implementation experience to inform language design.

Or potentially to inform compiler implementations - violating TBAA is UB, so it’s valid for an implementation to define that behavior (eg: could change the default to be -fno-strict-aliasing).

Exactly. The C Standards committee has justified complex rules for programmers and compiler writers ( which seem to often cause programmers to yell at compiler writers) on the basis of optimization gains - but without much documentation. I am looking for information on whether those are significant enough to justify the tradeoff.

thanks
vy

I don’t have great data, but if you have any relevant benchmarks of your own you can try them -f{no-}strict-aliasing to compare their performance

Thanks. I am working on it. So far I cannot find cases other than the obvious loop hoisting ones - which C programmers can do themselves.

vy

I am working on it. So far I cannot find cases other than the obvious
loop hoisting ones - which C programmers can do themselves.

Again, the fact that they *can* doesn't at all mean they know they should.
But indeed optimizations are most effective for loops, so that's where
you'll find the major benefits. It's easy to construct other cases,
but you'd just argue that the C programmer should create the temporary
themselves.

Once way to look at it is that what's being said is "if you follow
reasonable aliasing rules (which, by the way, will result in cleaner
code), you won't have to manually create new variables in order for your
program to be more efficient: you can write it in whatever way is easiest
for you".

I am working on it. So far I cannot find cases other than the obvious
loop hoisting ones - which C programmers can do themselves.

Again, the fact that they can doesn’t at all mean they know they should.

It is easy to write bad code in C, for sure. Perhaps I have a minority opinion, but I think programmers who don’t get the memory model or don’t want to think about it have better alternatives than C. I really like C, but not for all purposes. There are good reasons for languages like Java, Go, Dart, etc. where much of the low level memory stuff is left to the compiler. But there is a good reason for C, especially
when we need to work with the representations of values.

reasonable aliasing rules (which, by the way, will result in cleaner
code), you won’t have to manually create new variables in order for your
program to be more efficient: you can write it in whatever way is easiest
for you".

The flip side is that many things that are common in C practice, even essential for writing C library functions and operating systems, are (silently!) disabled
or made shaky by C alias rules - which is why Linux, for example, uses fno_strict_aliasing and musl relies on may_alias. The canonical examples of problems are memcpy, allocators like malloc/free, and network packet classifiers. Or see https://blog.regehr.org/archives/1307

The original C aliasing rules were kind of slipped into the standard just a year or so after a pretty stiff dispute between the ANSI committee and Dennis Ritchie about aliasing, and, quite obviously, they were not completely thought out (e.g. the necessity of a hurried addition of the character pointer exception). Anyways, my goal right now is to understand the tradeoffs better.
vy