[RFC] Modelling errno memory effects

Motivation

A large number of C library functions report errors by setting the errno variable. LLVM currently has no explicit way to model that a function can only write to errno in particular, so we have to make very conservative assumptions about which memory can be clobbered by such library calls.

In a previous discussion on the topic, I suggested to solve this by emitting int TBAA metadata on such FP libcalls, and this did end up being implemented. While this does somewhat work, it is something of a hack, and does not generalize well. For example, if the libcall can also legitimately access other memory (e.g. because it has pointer arguments), then this approach does not work.

Proposal

I propose to explicitly model errno by adding a new location kind to MemoryEffects. A function that only writes to errno would have memory effects memory(errno: write) or maybe memory(errnomem: write) to stick closer to the spelling of other locations. A function that can read arguments and write errno would be memory(argmem: read, errnomem: write).

errno TBAA

Unfortunately, we can not actually make a lot of assumptions about which pointers may alias errno. For all we know, a pointer passed as a function argument might actually be a pointer to errno.

For C-based languages with strict aliasing, one of the strongest guarantees we have is that errno needs to be accessed with int-compatible TBAA (which is also how the original TBAA-based hack come about). A complication here is that in the proposed represention using memory effects, we don’t actually know how the relevant TBAA looks like, as it can be freely determined by the frontend, and we presumably wouldn’t want to hardcode the specific format produced by Clang in LLVM.

As such, I’m also proposing to add named metadata !llvm.errno.tbaa, which specifies the TBAA metadata for an integer access:

!llvm.errno.tbaa = !{!7}
!7 = !{!"int", !8, i64 0}
!8 = !{!"omnipotent char", !9, i64 0}
!9 = !{!"Simple C++ TBAA"}

Then, if there is an access with, say, float TBAA for the same root, we would know that it does not alias with memory(errnomem: readwrite).

To support module merging, I think that !llvm.errno.tbaa should accept a list of TBAA nodes, and we can conclude no alias if we don’t alias for any one of them.

Other ways to determine no alias

Beyond TBAA, there are probably also a few other ways to determine that accesses do not alias with memory(errnomem: ...).

For example, we should be able to say that alloca memory never aliases errno, even if it escapes.

We can probably also say that an access larger than sizeof(int) does not alias errno (but a language expert would have to confirm that).

Auto-upgrade

Because errno is currently part of the other location, the semantics of existing textual IR remain unchanged.

For bitcode, we will explicitly upgrade MemoryEffects to copy the modref value of the other location to the errno location.

3 Likes

Sounds reasonable to me. I’d be happy to work on it, unless anybody is already working on it.

An implementation detail: we seem to fall back to APIs in AAResults::getModRefInfo to refine aliasing results, if previous AAs haven’t said anything particularly but being conservative. I wonder if having something along these lines – before checking for inaccessible memory – could be still something to take into account:

  // Safely bail out if the Call only accesses errno, and the supplied pointer
  // happens to be errno memory.
  if (ME.onlyAccessesInaccessibleMemOrErrnoMem() &&
      isErrnoFn(getUnderlyingObject(Loc.Ptr)))
    return ModRefInfo::ModRef;
1 Like

In the discussion that you link to, @jdoerfert points out that you can analyze aliasing with errno also without TBAA. This seems to be what GCC is doing, because if I modify the original example to use int access instead of float access, GCC still succeeds to eliminate the second load: Compiler Explorer

Do you expect an errno memory effect to only be combined with TBAA or also with other information?

Sure, I’d be happy if you work on this :slight_smile:

To be honest, I don’t really understand the question here. The code snippet you shared doesn’t really make sense to me. ModRef is always the default AA assumption, and we try to disprove it, not the other way around. So what AA would be doing is to mask out the errno location from MemoryEffects if it can prove Loc cannot alias errno.

I think you may be mixing things up here with the fact that some functions that do write errno are incorrectly marked as memory(inaccessiblemem: readwrite). But that particular problem needs to be addressed by changing their memory attribute, not by changing AA.

Yes, see the “other ways to determine no alias” section in the proposal. There are ways other than TBAA to exclude errno aliasing, with access / object size being a possible candidate.

I have a semi-related question, specifically w.r.t memory effects of intrinsics and address spaces. We have several intrinsics that are known to only touch memory for a specific address space or subset of address spaces. Has anyone considered modeling intrinsics related memory effects, or memory effects in general on a per address space basis?

This seems reasonable and useful to me. I like the fact that it gives us explicit information about errno being the thing that is being accessed so that we can improve reasoning based on that specific information.

I think this is useful even after considering the alternative that @brunodf mentions. If I further modify that example to have the loop reference a single integer location rather than an array, gcc does reload the value after the function call but if I modify it to access a single float value it does not. So gcc seems to be doing analysis based on both type and object size. Going even further, if I modify the example to use a single global integer value, the value is not reloaded after the call, but if the global is an extern it is reloaded. I’m not quite sure what to conclude from that.

Oh, I think that was my source of confusion. We surely wish to move from top to bottom of the lattice (NoModRef, in this case) – yet, I was mistakenly tricked into thinking that we may want to stop this by earlier returning ModRef (indeed, because we currently mark malloc as memory(inaccessiblemem: readwrite)). Agree that this needs to be addressed not by changing AA, thanks for clarifying this!

I’ve had the same idea, but I would like to extend it to a wider range of floating point environment effects, such as rounding mode read/write. This would assist with a better implementation of strictfp

2 Likes

I want errno to be handled, I am not sure I want it to be a top-level category but I’m OK with it.
Errno is somewhere at the intersection of two cases I wanted us to add:

memory(threadprivate:...)
memory(symbols: [X, Y] read, [Z] write)

My initial preference would have been to implement one or both of these extensions, hence:

  1. threadprivate as top-level category
  2. a list of named symbols as top-level category
    The latter should be strictly stronger than errnomem (we would name the symbol __errno or sth), and the former would often be as good as errnomem but more often applicable.

Anyway, I rather have something than nothing, just hoping to get more than just handling of a single “variables”/use case.

(To go @arsenm’s route, maybe: memory(environment/compiler/internal: [errno, fpenv] write), but symbols should work too and be more generic
.

1 Like

I guess this is a naive question, but what makes errno different from any other thread-local variable that might or might not be aliased with a pointer argument or whatever? Maybe this is the same thing as what @jdoerfert is already saying, but it seems weird to make errno such a special case.

This is leaving out consideration for platforms (if any still remain) which don’t have thread-local errno.

For historical reasons, on most targets, errno is thread-local, but not thread_local; the address is hidden behind some function call. So we need an abstraction.

Some microcontroller targets don’t have thread-local errno… but they also don’t have threads, so it doesn’t really make a difference.

I feel like the errno case is quite distinct from a generic “symbol” location, and it would be best not to conflate them.

Part of the problem is that, as Eli mentioned, errno is not actually a symbol, it’s a concept. There is not going to be any actual global with the name “errno”. We could of course still shoehorn it into the same concept, but then it just becomes a magic string.

The other part is that, while there is a small bit of commonality between AA handling for errno and proper symbols (e.g. they both don’t alias with alloca/malloc), most of the logic for errno is going to be bespoke, such as the TBAA interaction outlined in my initial post.


Though to be clear, I do support further expanding memory(...) to more precisely describe other locations as well. That’s why I introduced this representation, after all. errno is something of a test run to see how such an extension works out in practice.

IMHO, it already is a magic string no matter what. We look at functions names and decide if they modify the magic “errno” concept, e.g. *__errno_location() = 42;.
My concern is that we enshrine this one-off idea as a top-level category even though it behaves like a (hidden) symbol, as far as I can tell, and we have other concepts that are similar, e.g. @arsenm 's fp env stuff).

Is there a technical argument why we could not just treat the “errno concept” as if there is a __errno symbol, rather than making it a one-off?
The effect is the same, we look at the name of sin, __errno_location, etc., and decide that it will write the concept or return the “concept’s address” (or better symbols address).
This will not only provide a clear path for other symbols, but I think we need to look at it as a symbol at the end of the day because it acts as if it is a hidden global.
What I mean is that

void write_errno() { errno = 42; }             // errnomem: write
int *return_errno_loc() { return &errno; }.    // readnone
void write_int(int *i) { *i = 17; }            // argmem: write
void test() { write_int(return_errno_loc()); } // errnomem: write

are all fine functions and we need to ensure we now track “errno” across uses, or give up accordingly.

At least it’s clear to me that in the future we want to annotate a function foo with memory(symbols: [A,B,errno] write) to infer it won’t access symbol C.
If you want to be restrictive for now, which I get, we can restrict the symbol list to __errno, and if the idea turns out to be problematic for some reason, we can scrap the high-level symbol category again and go with errnomem.

So, I’d very much prefer us to do some sort of this (spelling is exemplary!):

memory(symbol: read/write : __errno)

No, we’re not actually interested in detecting such patterns. The goal here is to prove that accesses/locations aren’t clobbered by errno-writing libcalls, to the extent this is possible. This involves providing that certain accesses/locations cannot alias errno.

While showing that an access must alias errno (which is what would require the kind of libc-specific pattern matching you mention) is something that would nominally improve accuracy of the inferred memory effects under this proposal, I don’t think it’s actually useful for anything in a practical sense, and is not something I’m interested here.

To be clear, I think that the extension to FP environment representation should also be via a top-level category. I’d say modelling that independently of generic “symbols” is even more important than in the errno case, because the FP environment is part of what we currently call “inaccessible memory” rather than “other memory”. As such, the FP environment is currently disjoint from the memory space of normal “symbols”. Mixing these will muddy the waters.

I tried to explain this already. errno is not a symbol, it requires bespoke alias analysis, and does not materially benefit from being treated as a generic symbol.

From a pure implementation perspective, the benefit of having first-class support vs using a magic string is the same reason why we have enum attributes while we could just be encoding everything with string attributes: It’s a lot more efficient.

A first-class location is just two bits in MemoryEffects. It can be queried efficiently. More than that, it can me modified efficiently. AA can just mask out the location when it proves aliasing with errno is not possible. Doing the same thing in a generic “symbols” representation would be very inefficient. (It depends on implementation details, but most likely it would involve creating a new hash set without the errno pseudo-symbol and running it through uniquing infrastructure.)

I feel like this may be the core misunderstand here. This is indeed the kind of analysis you’d be interested in for generic symbols, but (as explained at the start of this post) this is not what we’re interested in when it comes to errno.

That is possible. I went back to the first post and there is no example, could you provide one or two, maybe also one that you think is out of scope?
I’m particularly interested in how you can tell “accesses/locations aren’t clobbered by errno-writing libcalls” if the pointer might be based on __errno_location(). Maybe you only want to distinguish “identified object” vs “libcall”?

Sure, sorry for not providing any examples in the first place.

This should do store-to-load-forwarding based on TBAA:

define float @test(ptr %p, float %f) {
  store float 0.0, ptr %p, !tbaa !0
  call float @sinf(float %f)
  %v = load float, ptr %p, !tbaa !0
  ret float %v
}

declare float @sinf(float) memory(errnomem: write)

!0 = !{!"float", !1, i64 0}
!1 = !{!"omnipotent char", !2, i64 0}
!2 = !{!"Simple C++ TBAA"}

This should forward because an alloca can’t be errno:

define float @test(float %f) {
  %p = alloca float
  call void @escape(ptr %p)
  store float 0.0, ptr %p
  call float @sinf(float %f)
  %v = load float, ptr %p
  ret float %v
}

declare float @sinf(float) memory(errnomem: write)
declare void @escape(ptr %p)

This should forward because we’re accessing something that is larger than errno:

define double @test(ptr %p, float %f) {
  call void @escape(ptr %p)
  store double 0.0, ptr %p
  call float @sinf(float %f)
  %v = load double, ptr %p
  ret double %v
}

declare float @sinf(float) memory(errnomem: write)

Here is a negative example. This one should not forward, because for all we know %p might be a pointer to errno here:

define float @test(ptr %p, float %f) {
  store float 0.0, ptr %p
  call float @sinf(float %f)
  %v = load float, ptr %p
  ret float %v
}

declare float @sinf(float) memory(errnomem: write)

No worries.

In all the examples, you have sinf calls, why do we need errnomem since we already pattern match sinf anyway. Is the only reason we want to move it to a top-level category such that for this case we have a quick lookup? I mean, we don’t need errnomem for any of the examples, do we, we know sinf (that’s how we got errnomem).

Do we have an idea how many (Rust/C/…) codes have non-leaf functions that we would identify as errnomem: ..., and additionally only inaccesiblemem and argmemonly. I guess if we have any other (current) category, errnomem is not helpful anymore, right?

Right, having a list of all the errno-affecting libcalls is in AA is something we could do, but I don’t think it’s a particularly great solution.

Generally speaking, we always try to handle libcalls by describing their behavior with (inferred) attributes, and then having generic analyses only work on attributes. This keeps the architecture orthogonal.

The benefit of describing things with attributes is that we aren’t limited to a hardcoded list of functions. If you create a thin wrapper around sinf, we’ll propagate the attributes, and it will see the same optimization behavior as a direct call.

Sorry, no idea. It’s worth noting though (and what actually triggered this RFC) that even something like malloc can write errno (which we currently just ignore and miscompile). And yes, I don’t think errnomem provides useful information if there are memory effects on the other location.

1 Like

100% in favor. We still have the name switch but at least the pipeline should not need to know about details like function names. I guess I take it even one step further, I don’t want anything to need to know about details like “errno”, which means I would like to have a general concept rather then errnomem. I get that AA would not need to know about errnomem since it just does the intersect of the memory attributes, but if we want to ever go down the road of properly tracking errno (what you said above is not the initial goal), we’d need to treat it differently than other “symbols”.

I think we actually are pretty much in agreement here about the options, we just have different opinions based on what we want to optimize for.

The way I see it we can:

  1. Make errno queries special and really cheap for simple cases, e.g., access vs only-errno function, and give up on more complex cases. Let symbols tracking rest for another day.
  2. Use errno as the reason to do the more complex encoding scheme that tracks “symbols”, potentially restricted to errno for now. The cost of a lookup would be higher as we need at least one integer per accessed symbol. The check would depend on the other memory. If it is a “symbol”, it’s a set lookup/intersection. If the other memory is a non-symbol identified object, it’s trivial. If the other memory is something else, we could find the (virtual) definition of the symbols, e.g., int __errno, and do the TBAA or size checks you did in your examples.

I see the benefits of 1 and the complexity of 2.

My main point is that I want to be able to replace sinf in all your examples with

int GlobalError = 0;
float myfunc(float f) {
  if (!isValid(f)) GlobalError = 1;
  return f * 3.14;
}

and the AA queries should still work the same. I always want to make things “less special”, so sinf and my own implementation of sinf that is exactly the same code should result in the same optimizations.

Finally, I believe there is substantial potential in global-only functions (e.g., we have const and pure) and knowing the size + tbaa of the globals is hopefully helping a lot.

Anyway, we seem to be at an impasse. I will not block errnomem but maybe others should chime and “vote” for fast special case(s) vs generic encoding.

How do functions end up annotated with the errno information? I can imagine attributes on source code, but libc headers won’t have said annotations.

Ths obvious answer is a long list of functions with known behaviour (hopefully ignored for ffreestanding), but at that point we could use the function names directly as suggested above.

Unless this is planned as a fixpoint thing where the leaves are a list of known function calls and we annotate the callers, which will work at the cost of annotating essentially everything, and will mostly fail to handle function pointers.

I want better errno modelling but am not immediately convinced that the attribute is significantly more useful than the list of magic function names it also seems to require.