Question about GlobalOpt

Hi,

GlobalOpt may not consider demoting globals to locals in the “main” function when the source language is C. Prior to commit r253168 it considered “main” specifically, for both C and C++. Since r253168, the check for the norecurse attribute may prevent “main” from being considered. This happens because the Function Attributes pass will not add norecurse to a function that calls library functions not themselves marked norecurse, such as putchar. Even a call to llvm.lifetime.start will prevent a function from being deduced as non-recursive, because llvm.lifetime.start isn’t marked with the “norecurse” attribute.
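To illustrate why the demotion depends on norecurse, here is a minimal sketch in plain C. It is not taken from GlobalOpt itself: the names entry_global, entry_localized and depth_seen are hypothetical, and entry_global merely stands in for a “main” that is allowed to call itself. With recursion, turning the global into a local changes the observable result.

```c
/* Global touched by only one function, so it looks like a
   candidate for global-to-local demotion (hypothetical name). */
static int depth_seen;

/* Stand-in for a recursive "main": every activation shares the one
   global, so the innermost write is what the outer frames read. */
int entry_global(int n) {
    depth_seen = n;
    if (n > 0)
        entry_global(n - 1);   /* inner activation overwrites depth_seen */
    return depth_seen;         /* reads the innermost write, i.e. 0 */
}

/* What a naive demotion to a local would compute instead:
   each activation now has its own private copy. */
int entry_localized(int n) {
    int depth_seen_local = n;
    if (n > 0)
        entry_localized(n - 1);  /* inner write is invisible here */
    return depth_seen_local;     /* still n */
}
```

For any n > 0 the two versions disagree, which is why the transformation is only safe when the function is known not to recurse.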

We have a C workload that benefits from this demotion with LTO, as some hot functions get inlined into main.

The comment in tools/clang/lib/CodeGen/CodeGenFunction.cpp explains the reason for marking “main” with the norecurse attribute in C++:

// If we're in C++ mode and the function name is "main", it is guaranteed
// to be norecurse by the standard (3.6.1.3 "The function main shall not be
// used within a program").

No such restriction exists in the C standard, as far as I can tell.

Is there anything that can be done to alleviate this restriction in C? Can we make the Function Attributes pass more aggressive, for example? Or mark certain library functions as “norecurse”, although I don’t see how this can be guaranteed.

Thanks,

Sanjin

No such restriction exists in the C standard, as far as I can tell.

This seems to be your problem.

Is there anything that can be done to alleviate this restriction in C?

Short of a source-level attribute or a clang command-line flag, I don't see how.
Write your main in C++, maybe?

Can we make the Function Attributes pass more aggressive, for example?

Are you suggesting breaking C semantics, or am I misunderstanding what you mean?

Or mark certain library functions as “norecurse”, although I don’t see how this can be guaranteed.

This is pretty recent, and we don't have good support for libcalls and norecurse. I think Chandler also found some conceptual issues in getting it to work properly.

I think the conceptual issues have largely been sorted out; it is mostly that norecurse is much harder to deduce than it might superficially seem.

Is annotating “known” libcalls with “norecurse” something that is just “work in progress”, or are we blocked because we may have a user-defined strlen, for example?
What about intrinsics?

Is there a specific thread / email I can look at to read about what
the issues were?

-- Sanjoy

Hi,

On my phone right now but I’ll fish out the pertinent threads when I get to the office. Keyword searches for ‘norecurse’ on llvm-dev will probably get most of them.

Indeed, this correctness improvement caused a performance regression on some programs. There is a way to revert to the old, broken behaviour: ‘-mllvm -force-attribute=main:norecurse’. Given how many people run old C code that relies on this property, I wouldn’t be against adding an appropriate frontend option for this either, but I am not a clang dev, so they might object more :)

Many library functions can be implemented in a recursive fashion. The issue is the same as we’ve had elsewhere in LLVM: is there a defined visibility boundary between user and library code? The same problem can be seen in the malloc attribute annotations (I forget the attribute name) that Vaiva created: having one arbitrary visibility barrier breaks down when libraries are LTO'd (bare-metal or OpenCL systems being examples).

Norecurse as a concept is a trade-off between ease of inference and ease of definition. Norecurse is indeed hard for the compiler to infer, but the definition is precise.

There may be other, superior options. Suggestions welcome! :)

Cheers,

James

By adding the attribute only on libcall declarations, you solve most of the issues: if you’re in the translation unit that exposes the internal implementation of malloc, you won’t have the attribute (we could imagine someone implementing “strlen” split across multiple files, but that’s quite unlikely).

That could work!

But the bigger problem (that I’ve just remembered) is hooks and callbacks. If a function can exit, abort or malloc, it could call back into user code.

That said, those functions (strtok etc) are a smaller subset of the library.
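As a rough sketch of the hook problem in plain C: lib_register_hook and lib_finish below are hypothetical stand-ins for a library routine that runs a user-registered handler (in the spirit of exit running atexit handlers); they are not real libc or LLVM APIs. From the caller's side the call looks like an ordinary library call, yet control comes back into the calling function.

```c
#include <stddef.h>

/* ---- hypothetical "library" side ---- */
typedef void (*hook_fn)(void);
static hook_fn registered_hook;              /* like a registered exit handler */

void lib_register_hook(hook_fn h) { registered_hook = h; }

void lib_finish(void) {
    if (registered_hook != NULL) {
        hook_fn h = registered_hook;
        registered_hook = NULL;              /* run the hook only once */
        h();                                 /* library calls into user code */
    }
}

/* ---- user side ---- */
static int entries;                          /* activations of user_entry */

void user_entry(void);

/* The hook re-enters the function that made the library call. */
static void user_hook(void) { user_entry(); }

void user_entry(void) {
    entries++;
    lib_finish();   /* looks like a leaf library call, but may recurse */
}

/* Registers the hook, enters once, and reports how many
   activations of user_entry actually happened. */
int run_demo(void) {
    entries = 0;
    lib_register_hook(user_hook);
    user_entry();
    return entries;
}
```

Here user_entry is entered twice even though it never calls itself directly, so marking the library routine norecurse based on its declaration alone would be unsound in this scenario.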

James

Hi Mehdi,

You are right – modifying the Function Attributes pass to mark “main” as norecurse would break the C standard (unless it has a similar statement regarding “main” that the C++ standard has – I cannot find it), so that’s a no-go. Looks like there was an attempt to bypass library calls in the Function Attributes pass for the purpose of detecting norecurse functions: http://reviews.llvm.org/D14769. I’ll look for other threads on this topic.

Thanks,

Sanjin

The only problem we ran into is that many library functions cannot be deduced as norecurse.

I think to deduce ‘main’ as norecurse you would need to have language specific knowledge, but the way the frontend could encode that knowledge would be to annotate main with the norecurse attribute directly. ;] So that seems to obviate the need to deduce anything for library functions.