alias.scope and local restricted C pointers

Concerning slide 16 of https://llvm.org/devmtg/2017-02-04/Restrict-Qualified-Pointers-in-LLVM.pdf

Specifically “Currently, LLVM only supports restrict on function arguments, although we have a way to preserve that information if the function is inlined.”

Is that statement still accurate? It would seem that https://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata should be sufficiently general to honor C’s restrict qualifier on local pointers, but it does not appear that Clang uses this part of LLVM’s IR for that purpose today and thus local restricts are ignored.

Thanks,

Troy

Concerning slide 16 of https://llvm.org/devmtg/2017-02-04/Restrict-Qualified-Pointers-in-LLVM.pdf

Specifically “Currently, LLVM only supports restrict on function arguments, although we have a way to preserve that information if the function is inlined.”

Is that statement still accurate?

Yes, correct (I was actually just working on restrict, noalias, and alias.scope). The inliner also propagates them correctly.

It would seem that https://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata should be sufficiently general to honor C’s restrict qualifier on local pointers, but it does not appear that Clang uses this part of LLVM’s IR for that purpose today and thus local restricts are ignored.

I think that’s correct, but I haven’t come up with any scenarios involving local variables or memory that can _not_ be solved by AA, since BasicAA is able to solve most of the local cases, including malloc and some memory intrinsics.

Best
Bekket

Yes. It could in some cases, and in fact that was my original thought, but making it work well seems to require a bunch of analysis in the frontend. The problem is that, in order to conclude that two pointers don’t alias based on restrict, you need to know that one pointer is based on the restrict-qualified pointer and the other, within the appropriate scope, is not. Doing this relies on data-flow and aliasing analysis (which would need to be pretty simple in the frontend), but also becomes pessimistic when pointers are passed to functions (because they might be captured). Especially in C++ (where restrict is often used as an extension), the function-call-capturing problem can be significant because all of the overloaded operators are function calls (and so on).

When we have restrict on function parameters, we add noalias at the LLVM IR level to those parameters, and if we later inline that function, we add this noalias metadata, but at that point, we’ve already done IPO nocapture inference and memory-to-register conversion, and have access to LLVM’s AA infrastructure.

I’d proposed an llvm.noalias intrinsic in order to address this problem (and the patches for this, including frontend patches, are on Phabricator). Originally, I had wanted to make the intrinsic fairly transparent, such that it would have minimal interference with other optimizations. Part of the problem is that I wanted the intrinsic to fix itself to a particular place in the CFG in order to be able to differentiate, as a particular design point, between noalias pointers within one loop iteration vs. across all loop iterations (based on whether the intrinsic was inside or outside the loop). This means that the intrinsic had to be annotated such that it could not be hoisted.

In any case, after testing, we found problems with the transparency of the intrinsic because, both literally and effectively during the recursive AA-query-processing algorithm, we could break the dependence of a pointer value on the intrinsic from which it was originally derived. That must not happen if the scheme is to remain correct. Thus, we need to make it hard for analysis to look through the intrinsic, and so it has an even larger impact on other optimizations. Several people have been thinking about this problem, but I wouldn’t say that we have a clearly good answer yet. Maybe we want to start with the intrinsic and then translate it later into metadata, for example.

-Hal
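
As an illustration of the per-iteration versus all-iterations distinction mentioned above, here is a minimal C sketch. The function name running_sum is hypothetical, and the comments reflect my reading of the C standard's block-scope rule for restrict, not how the proposed llvm.noalias patches actually behave.

```c
#include <stddef.h>

/* Hypothetical example: in-place running sum over an array. */
void running_sum(int *a, size_t n) {
    for (size_t i = 0; i + 1 < n; ++i) {
        /* p and q are declared inside the loop body, so the restrict
           promise covers only a single execution of this block. */
        int *restrict p = &a[i];
        int *restrict q = &a[i + 1];
        *q += *p;  /* a[i + 1] is written only via q; a[i] is only read */
    }
}
```

The element written through q in one iteration is the one read through p in the next. A no-alias fact that holds within one iteration would therefore be wrong if it were extended across iterations, which is why the placement of the marker relative to the loop matters.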

    int *restrict x = some_external_function();
    int *restrict y = some_other_external_function();

This is one of the fundamental use cases for restrict, and BasicAA has nothing to offer in this regard. In other words, it’s a mechanism for encoding an interface contract.

-Hal
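
To make the interface-contract point concrete, here is a small self-contained sketch of what the two declarations permit. The function bodies and the buffers buf_a and buf_b are hypothetical stand-ins so the example compiles; in the scenario described, the definitions would be in another translation unit and invisible to alias analysis.

```c
/* Hypothetical stand-ins for functions the optimizer cannot see into. */
static int buf_a[4];
static int buf_b[4];
int *some_external_function(void) { return buf_a; }
int *some_other_external_function(void) { return buf_b; }

int use_local_restrict(void) {
    int *restrict x = some_external_function();
    int *restrict y = some_other_external_function();
    *x = 1;
    *y = 2;     /* restrict promises this store does not modify *x */
    return *x;  /* a compiler honoring local restrict may fold this to 1 */
}
```

Without the restrict information, the compiler has to assume the two returned pointers might be equal and reload *x after the store through y.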

Thanks to both of you for clarifying.

Bekket, local restrict is pretty common in my experience, at least with our users. It would be good for it to work as expected.

Hal, I appreciate the difficulty in doing the based-on analysis, but it needs to be understood that a C compiler was never really expected to do that analysis! Being able to do it would be great, of course, but it is quite daunting and was not the original intent of restrict. I have in fact worked with the original author of the restrict keyword and proposed WG14/N2260 to clarify its usage and intent.

Basically, the based-on terminology was the least bad way anyone thought of at the time to describe the semantics. The belief was that users would supply restrict on multiple pointers of the same type whenever none of them aliased, then use those pointers directly. Doing the based-on analysis is similar to what’s required to usefully interpret a single restrict-qualified pointer in isolation, and that’s considered unreasonably hard.

So I think this problem is quite solvable in the common (and recommended) case of the user specifying restrict liberally and accessing data directly via those pointers.

The use case that I normally see is a bunch of local restricted pointers initialized once and indexed by some increasing integer. The compiler would honor restrict if the user outlined the code into a separate function and the pointers were parameters instead. Sometimes that is the recommended workaround, but it shouldn’t be necessary.
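
A minimal sketch of that pattern and of the outlining workaround, assuming a hypothetical struct vectors and hypothetical function names (none of these come from real user code):

```c
#include <stddef.h>

/* Hypothetical container for the buffers; field names are illustrative. */
struct vectors {
    float *out;
    const float *in1;
    const float *in2;
    size_t n;
};

/* Local restrict: pointers initialized once, then indexed by increasing i. */
void add_local(struct vectors *v) {
    float *restrict out = v->out;
    const float *restrict a = v->in1;
    const float *restrict b = v->in2;
    for (size_t i = 0; i < v->n; ++i)
        out[i] = a[i] + b[i];
}

/* Workaround: outline the loop so the restrict pointers are parameters. */
static void add_outlined(float *restrict out, const float *restrict a,
                         const float *restrict b, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

void add_via_outlining(struct vectors *v) {
    add_outlined(v->out, v->in1, v->in2, v->n);
}
```

Both versions compute the same result. The difference is only in whether the no-alias information reaches the optimizer: per the discussion above, it does for the parameters of add_outlined, while the local declarations in add_local are currently ignored.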

I’m happy to discuss further with whoever works on the related parts of Clang and LLVM. I am not familiar with who that might be.

-Troy

You’re proposing that there are some common cases, dealing specifically with multiple restrict pointers, where the based-on relationship is clear, and handling those using the existing metadata. I’m not sure how limiting this would be, but it certainly could be worth doing. We should discuss this in more detail.

I agree. I’d like this workaround not to be necessary.

I did most of the work associated with the llvm.noalias metadata and the proposed llvm.noalias patches. Others have been investigating this as well. I’m happy to discuss this with you (and, unless there’s some reason to do otherwise, we can discuss this here, for LLVM issues, or on cfe-dev, for issues of how Clang translates C into LLVM).

Thanks again,
Hal