[ADT] early_inc_iterator + pointer_iterator reference to temporary

The following code generates a warning in MSVC regarding an invalid temporary:

SmallVector<int> V{1, 2, 3, 4};

for (int *P : make_early_inc_range(make_pointer_range(V)))
  errs() << *P;

Warning generated:

llvm\include\llvm\ADT\STLExtras.h(630) : warning C4172: returning address of local variable or temporary

This seems to also randomly segfault, but it’s difficult to reproduce.

The problem seems to be an unfortunate combination of operator* details on the two synthesized iterator types.

template <...>
class pointer_iterator : ... {
  mutable T Ptr;

public:
...
  T &operator*() const { return Ptr = &*this->I; }
};

pointer_iterator returns a reference to a member variable containing the pointer value. This is fine when pointer_iterator is instantiated on the top scope because the pointer_iterator is kept live until the end of the statement. However, when we embed the pointer_iterator within an early_inc_iterator_impl, we end up with a problem:

template <typename WrappedIteratorT>
class early_inc_iterator_impl : ... {
...
  using BaseT::operator*;
  decltype(*std::declval<WrappedIteratorT>()) operator*() {
#if LLVM_ENABLE_ABI_BREAKING_CHECKS
    assert(!IsEarlyIncremented && "Cannot dereference twice!");
    IsEarlyIncremented = true;
#endif
    return *(this->I)++;
  }
...
};

The early_inc_iterator_impl::operator*() dereferences a post-increment result (i.e. a temporary pointer_iterator. The temp pointer_iterator temp goes out of scope and we’ve returned a reference to a temp that no longer exists.

Could the pointer_iterator simply return the pointer by value from its operator* (i.e. set the iterator’s reference type to T*)? What’s the purpose of the mutable member variable?

@timshen would you please comment?

Good question but I don’t know why nobody pay attention to.

SmallVector<int> V{1, 2, 3, 4};
for (int *P : make_early_inc_range(make_pointer_range(V)))
  errs() << *P;

We use int * not int *& to receive a return ref.So this code snipaste should not crash I think. (maybe in debug build will not crash but in release build crash due to some optimization inner,etc…)

SmallVector<int> V{1, 2, 3, 4};
for (int *& P : make_early_inc_range(make_pointer_range(V)))
  errs() << *P;

If we use like above will crash as expected.

#include <iostream>

struct A {
    int a{0};
    int &invoked() { return a = 123; }
};
int &foo() {
    return A{}.invoked();
}

struct B {
    int a{0};
    int &invoked() { return a = 123; }
};

int main() {
    int test1 = foo();         // UB
    int test2 = B{}.invoked(); // Not UB
    std::cout << test1 << test2 << '\n';
    return 0;
}

Ok I make some mistakes. Above can explain my problem.

The underlying issue might be that a lot of our range utilities don’t work with temporaries. We have similar issues with reverse in range for loops. Some newer utilities like zip* and enumerate solve it by moving temporaries to their storage.

I’m not sure if there’s a generic solution we could adopt, perhaps ranges from c++20 or other libraries have figured something out.

cc: @dblaikie @MaskRay @kazutakahirata

1 Like

Yeah, pretty much all that, @kuhar

Why the mutable member? Because it’s necessary to implement operator-> and otherwise meet the spec for iterators, unfortunately - as far as I recall/understand.

We could add more safety to other ranges helpers - but, yeah, it’s a sort of unsolved problem for range adapters generally.

The mutable member was a separate problem unique to enumerate where we want the tuples of references to be mutable, not the stored range.

I had a WIP PR to make reverse work with temporaries but discovered that one challenge of making these range adaptors work with temporaries is that a lot of existing code requires them to return llvm::iterator_range.

So It’s not because that the llvm follow MIT license, so we can’t copy the implement of pointer_iter in GNU license software/library?

Generally can’t copy from another project, no.

Why wouldn’t something like iterator_facade_base::PointerProxy solve the temporary problem for a pointer-type reference (assuming you don’t store the result of operator->() to a named variable)? I’m still not following why the mutable reference is necessary, unless we’re trying to make the following work (which seems broken to me anyway):

struct Foo {
  int Val;
  Foo(int V) : Val(V) {}
};

void test() {
  SmallVector<int> V{1, 2, 3, 4};
  auto R = make_pointer_range(V);

  auto ArrowResult = R.begin().operator->();

  ArrowResult->Val;
  //         ^^^^^^
  // Dangling reference.
}