[PSA] Swapping out `_or_null` with `_if_present`

xref: Updates to Casting.h

dyn_cast_or_null and cast_or_null have been soft-deprecated since the update to Casting.h (there are comments to that effect, and all they do is forward to the _if_present variants). IMO, we want to have one way to perform the function “dyn_cast this thing if it exists” so as to not introduce possible foot-guns and keep things clean.

The if_present wording was chosen because it correctly describes the function’s capabilities in that it handles nulls, but also None optionals, nullable values (a la MLIR).

In preparation for marking the or_null variants deprecated in the code, I have patches up for the llvm directory, and will do the same with the other subprojects to make sure upstream is nice and clean to avoid compiler spew. Patches here: ⚙ D133089 [LLVM] Replace `dyn_cast_or_null` with `dyn_cast_if_present`, NFC., ⚙ D133090 [LLVM] Replace `cast_or_null` with `cast_if_present`, NFC.

Thanks for bumping this out to a forum thread.

Have you evaluated just making dyn_cast always tolerate “a null”, returning “null” if found? This would eliminate a footgun in the api, simplify things, and define away the naming problem. I think that cast_or_null and isa_or_null can also go away as well.

I think I was probably responsible for the original design. While it had some theoretical benefits, I don’t think it is worth the complexity it induces, I think we should strongly consider just shedding the complexity.

-Chris

Personally, I would strongly prefer we not change the semantics of dyn_cast(P) to silently pass through nullptr. For me, there’s a semantic distinction between dyn_cast(P) and dyn_cast_or_null(P). The later indicates a lack of an assert, and frankly, that’s often a code smell. I want that to be visible when looking at the code.

I prefer the or_null wording over or_present, but I feel much less strongly about that.

1 Like

Have you evaluated just making dyn_cast always tolerate “a null”, returning “null” if found? This would eliminate a footgun in the api, simplify things, and define away the naming problem. I think that cast_or_null and isa_or_null can also go away as well.

I thought about this early on, but my concern was what @preames brought up, namely, that some folks like having that _if_present be explicit. I personally agree that having fewer foot-guns is better but I think that might have to be a follow-on change :slight_smile:

dyn_cast asserting non-nullness is an important invariant for code readers and an important optimization. I strongly prefer keeping the check.

I can provide another data point that I am happy with _or_null and don’t find the new names more appealing, but I can live with a change.

3 Likes

As already commented on the review, I also find the _or_null naming clearer in the context of LLVM, where we are actually working with null pointers.

While having only one way to do things is generally a good guideline, I don’t think it needs to be followed dogmatically. Nothing wrong with having a more meaningful alias for a more specific use-case for such a widely used construct.

To fill out the matrix of possible opinions here:

Strong +1 on keeping the non-null assertion in dyn_cast. Strong invariants are extremely important for code readability and reviewability.

Slightly in favor of having only one spelling, and that spelling being if_present. (The reasoning being: dyn_cast’s behavior can be described as “dynamically either cast to the given type or return null”, which may lead some newcomers to be puzzled over what the difference is to dyn_cast_or_null. Using if in the name makes it clearer that an operation is being performed conditionally. But it’s a weak argument and we’re deep into bikeshed territory here :slight_smile: )

2 Likes

So to summarize the opinions here, I hear generally:

  • Keep dyn_cast and dyn_cast_(or_null|is_present) separate
  • Seems like we are roughly divided on or_null vs is_present

I personally would prefer to have one way to do the thing this function does, i.e. check for existence and then do the dyn_cast. The main reason for the is_present wording is just because it now supports many more classes of values than just pointers, and I was uncomfortable with or_null on (for example) Optional<T>.

To converge the discussion a little bit, I see 2 options given what’s been expressed so far.

  1. Keep or_null and if_present
  2. Replace or_null with if_present

Obviously I prefer option 2 for the reasons above, but if the community feels strongly about keeping option 1 then I can learn to live with that :slight_smile:

I think the error here was in allowing dyn_cast_if_null to share an implementation with dyn_cast_if_present. dyn_cast_if_null is working on a pointer type. Having an Optional silently slide through is surprising.

From a code readability perspective, dyn_cast_if_null and dyn_cast_if_present (on optional) are slightly different things.

So basically, my vote would be to keep both. Remove the ability for _of_null to handle optionals. Use _or_null on pointers, and _or_present in the handful of cases involving optional.

dyn_cast_if_null is working on a pointer type. Having an Optional silently slide through is surprising.

Agreed

From a code readability perspective, dyn_cast_if_null and dyn_cast_if_present (on optional) are slightly different things.

I’m not sure I agree with this statement - both are testing for “does this thing exist” before (maybe) casting.

Remove the ability for _of_null to handle optionals. Use _or_null on pointers, and _or_present in the handful of cases involving optional.

All types that go into if_present go through a struct called ValueIsPresent and we actually handle raw pointers, smart pointers, and types like mlir::Type through the same overload (they’re all ‘nullable’), so that’s definitely part of my reticence to split it up into 2 different things. That said, IMO it’s not great to have 2 things that do conceptually the same thing (check for existence, then cast) in slightly different ways. That feels like a recipe for code drift/maintainability problems.

I think the suggestion here is to just limit the functions using the “null” terminology to overloads accepting pointers. So if we currently have


template <class X, class Y> auto cast_or_null(const Y &Val) {
  return cast_if_present<X>(Val);
}

template <class X, class Y> auto cast_or_null(Y &Val) {
  return cast_if_present<X>(Val);
}

template <class X, class Y> auto cast_or_null(Y *Val) {
  return cast_if_present<X>(Val);
}

template <class X, class Y> auto cast_or_null(std::unique_ptr<Y> &&Val) {
  return cast_if_present<X>(std::move(Val));
}

then we can reduce this to just

template <class X, class Y> auto cast_or_null(Y *Val) {
  return cast_if_present<X>(Val);
}

This limits the use of cast_or_null (and friends) to just the case where the naming makes sense.

Also worth noting that next to the two functions already discussed we also have isa_and_nonnull / isa_and_present. The former is self-explanatory, while the latter is not.

+1.

I also prefer the new name. if_present is a superset of nonnull IMO. It doesn’t make sense to keep an extra function around just as an alias.

dyn_cast_if_present can also work on “pointer-like” types, i.e. PointerUnion. So I don’t think it makes sense to restrict it to just pointer types. Having two functions that do the same thing also creates friction for new users who aren’t “used” to a particular thing. Which one are they supposed to use? Is this buried somewhere in the style guide? Why are there two different functions anyways?

My 2c:

  • I think the “or_null” suffix was probably OK to keep even for other things where they aren’t exactly null - sorry I missed the “if_present” discussion, as I would’ve pushed back a bit there
  • I’d mildly +1 @clattner’s comment about removing the non-null variants & leaving the pass-through versions. (if someone’s got data to suggest they make a material difference to compiler/tool/LLVM performance, I’d be curious to see the data - it’d be a bit surprising to me) - and generally I’m all for making null/non-null distinctions super clear, but in this case not so much.

But sounds like the winds are headed in other directions.