subtle problem with inst_iterator

Hello, I think there's a rather subtle problem with the inst_iterator. It
declares its iterator category as std::bidirectional_iterator_tag but C++
standard requirements for forward iterator (which are included in
requirements for bidirection iterator), say that the type of expression

   *r;

should be T&, where 'r' is the iterator and T is its value type. The
inst_iterator, though, returns pointer to Instruction, not a reference to a
pointer to instruction.

The above might sound like I've gone crazy reading C++ standard, but my real
reason is that with the current code the following:

    for_each(inst_begin(f), inst_end(f), boost::bind(
        apply_phi_semantics, result, _1, instruction2vertex));

does not compile, where 'apply_phi_semantics' is my function and boost::bind
is a tool for binding function arguments described in

     Chapter 1. Boost.Bind - 1.80.0

The reason it does not compile is somewhat contrived. for_each passes the
result of *it to functional object returned by boost::bind. The operator() of
that object has the following signature:

   template<class Arg1>
   .... operator()(Arg& a1)

and since result of *it is considered to be rvalue it can't be accepted by
this operator. The complete discussion is in

    http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/2002/n1385.htm

I'd suggest to apply the following patch which makes operator* return
reference to pointer. It makes my code compile and the rest of LLVM compiles
too. Comments?

- Volodya

iterator.diff (1.22 KB)

Hrm, I'm not sure about this at all. In particular ilists have some funny
semantics that you should know about. In particular, the instruction list
is more like a std::list<Instruction> (but that allows polymorphism) than
a std::list<Instruction*>, so it's not possible to just assign a new
instruction* through an iterator. In other words: *iit = new
Instruction(...) doesn't make any sense.

Probably the right thing to do would be to make the inst_iterator return a
reference to the *instruction* itself, rather than a reference to the
pointer. This would make it work the same as standard ilist iterators.
The reason why we have it return pointers right now is to populate
worklists, which allows us to do:

std::set<Instruction*> WorkList(inst_begin(F), inst_end(F));

Which is nice.

What do you think?

-Chris

Chris Lattner wrote:

> and since result of *it is considered to be rvalue it can't be accepted
> by this operator. The complete discussion is in
>
> http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/2002/n1385.htm
>
> I'd suggest to apply the following patch which makes operator* return
> reference to pointer. It makes my code compile and the rest of LLVM
> compiles too. Comments?

Hrm, I'm not sure about this at all. In particular ilists have some funny
semantics that you should know about. In particular, the instruction list
is more like a std::list<Instruction> (but that allows polymorphism) than
a std::list<Instruction*>,

Yea, I've noticed that. However, it looks like inst_iterator is iterator over
pointers. Oh, wait a minite, that's the current code:

  inline IIty operator*() const { return BI; }
  inline IIty operator->() const { return operator*(); }

So operator* works as if value_type is Instruction*, but operator-> works as
if value_type is Instruction. Hmm :wink:

so it's not possible to just assign a new
instruction* through an iterator. In other words: *iit = new
Instruction(...) doesn't make any sense.

Probably the right thing to do would be to make the inst_iterator return a
reference to the *instruction* itself, rather than a reference to the
pointer. This would make it work the same as standard ilist iterators.

Yes, I though about this appoach as well but decided it can be too disturbing
for other code.

The reason why we have it return pointers right now is to populate
worklists, which allows us to do:

std::set<Instruction*> WorkList(inst_begin(F), inst_end(F));

Maybe this can be rewritten like;

    std::set<Instruction*> WorkList;
    std::transform(insn_begin(F), insn_end(F),
  inserter(WorkList, WorkList.begin(), take_address);

given proper definition of take_addres. Or maybe manual initialization will
okay.

So, if you think making inst_iterator's operator* return reference to
instruction is better than making it return reference to pointer, I can take
a look at this.

- Volodya

Yea, I've noticed that. However, it looks like inst_iterator is iterator over
pointers. Oh, wait a minite, that's the current code:

  inline IIty operator*() const { return BI; }
  inline IIty operator->() const { return operator*(); }

So operator* works as if value_type is Instruction*, but operator-> works as
if value_type is Instruction. Hmm :wink:

Yeah, fishy huh? :slight_smile:

> Probably the right thing to do would be to make the inst_iterator return a
> reference to the *instruction* itself, rather than a reference to the
> pointer. This would make it work the same as standard ilist iterators.

Yes, I though about this appoach as well but decided it can be too disturbing
for other code.

I think it's the most correct and consistent want to do it.

> The reason why we have it return pointers right now is to populate
> worklists, which allows us to do:
>
> std::set<Instruction*> WorkList(inst_begin(F), inst_end(F));

Maybe this can be rewritten like;

    std::set<Instruction*> WorkList;
    std::transform(insn_begin(F), insn_end(F),
  inserter(WorkList, WorkList.begin(), take_address);

given proper definition of take_addres. Or maybe manual initialization will
okay.

So, if you think making inst_iterator's operator* return reference to
instruction is better than making it return reference to pointer, I can take
a look at this.

I do think it's the best way. Once concern though, please change code
like this:

std::set<Instruction*> WorkList(inst_begin(F), inst_end(F));

into code like:
     std::set<Instruction*> WorkList;
     for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
       WorkList.insert(&*I);

... instead of using transform. It's the same # of lines of code and much
easier to understand for people not familiar with higher-order C++. It
also avoid having to pull in a "take_address" functor.

I don't think this really occurs often enough to be a serious problem
anyway. :slight_smile:

Thanks,

-Chris

Chris Lattner wrote:

> inline IIty operator*() const { return BI; }
> inline IIty operator->() const { return operator*(); }
>
> So operator* works as if value_type is Instruction*, but operator-> works
> as if value_type is Instruction. Hmm :wink:

Yeah, fishy huh? :slight_smile:

Yea, a bit. I've decided that before changing that I'd better find other
problems, if any, so I run inst_iterator via checks provided by
Boost.Iterators.

First problem is that inst_iterator (and actually InstIterator class template)
is not Assignable, because it has a reference data member, while standard
requires all iterators to be assignable.

Second InstIterator is not DefaultConstructile, which is required from
Forwarditerator.

Also, I get error because InstIterator::difference_type is not signed integer
type (its defined as unsigned), but in this case the current standard does
not say it's should be signed, though it looks reasonable and proposal for
new version of standard require that.

> > Probably the right thing to do would be to make the inst_iterator
> > return a reference to the *instruction* itself, rather than a reference
> > to the pointer. This would make it work the same as standard ilist
> > iterators.
>
> Yes, I though about this appoach as well but decided it can be too
> disturbing for other code.

I think it's the most correct and consistent want to do it.

Great.

> > The reason why we have it return pointers right now is to populate
> > worklists, which allows us to do:
> >
> > std::set<Instruction*> WorkList(inst_begin(F), inst_end(F));
>
> Maybe this can be rewritten like;
>
> std::set<Instruction*> WorkList;
> std::transform(insn_begin(F), insn_end(F),
> inserter(WorkList, WorkList.begin(), take_address);
>
> given proper definition of take_addres. Or maybe manual initialization
> will okay.
>
> So, if you think making inst_iterator's operator* return reference to
> instruction is better than making it return reference to pointer, I can
> take a look at this.

I do think it's the best way. Once concern though, please change code
like this:

std::set<Instruction*> WorkList(inst_begin(F), inst_end(F));

into code like:
     std::set<Instruction*> WorkList;
     for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
       WorkList.insert(&*I);

... instead of using transform. It's the same # of lines of code and much
easier to understand for people not familiar with higher-order C++. It
also avoid having to pull in a "take_address" functor.

I don't think this really occurs often enough to be a serious problem
anyway. :slight_smile:

I attach a new patch which fixes all of the problems. It's kinda large, must
most changes are technical. Everything builds for me.

- Volodya

InstIterator.diff (16 KB)

> Yeah, fishy huh? :slight_smile:

Yea, a bit. I've decided that before changing that I'd better find other
problems, if any, so I run inst_iterator via checks provided by
Boost.Iterators.

First problem is that inst_iterator (and actually InstIterator class template)
is not Assignable, because it has a reference data member, while standard
requires all iterators to be assignable.

Ok.

Second InstIterator is not DefaultConstructile, which is required from
Forwarditerator.

Ok, makes sense.

Also, I get error because InstIterator::difference_type is not signed integer
type (its defined as unsigned), but in this case the current standard does
not say it's should be signed, though it looks reasonable and proposal for
new version of standard require that.

Either way should work. I'm not opposed to changing it. :slight_smile:

I attach a new patch which fixes all of the problems. It's kinda large, must
most changes are technical. Everything builds for me.

Thanks! I've applied the patch:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20040426/014023.html
(through 014029.html).

The only thing I changed was to turn a few occurrences of (*P).f into
P->f, and adjust it a bit to apply to current CVS.

Thanks a lot!

-Chris

Chris Lattner wrote:

> I attach a new patch which fixes all of the problems. It's kinda large,
> must most changes are technical. Everything builds for me.

Thanks! I've applied the patch:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20040426/014023.
html (through 014029.html).

The only thing I changed was to turn a few occurrences of (*P).f into
P->f, and adjust it a bit to apply to current CVS.

Thanks for applying! Hopefully, one day I'll sumbit less "technical" patch :wink:

- Volodya