API design

Hi,

I've been running LLVM with _GLIBCXX_DEBUG (extra checks) turned on to
see what would happen, and it's been a complete disaster.

The major problem is the use of this API:

  new CallInst(V, &Args[0], Args.size());

repeated throughout LLVM. When Args is empty, Args[0] is invalid, even
if the next operation is taking the address. Trying to fix it
illustrates the depth of the problem. Here are our choices:

- Keeping the API
  a) Detect when the vector is empty and then pass a NULL pointer. This
would require that the constructor accept a null pointer when the
argument count is zero.
  b) Add an extra element to the vector, then subtract 1 from the size.
This causes run time memory bloat.

- Changing the API
  a) Template it to take two iterators. This causes code size bloat.
  b) Template it to take a container. This causes code size bloat.

We could also just ignore the problem. I'm not clear on what the effects
of that would be in the long term. Is this something we can expect will
be changed in C++0x?

I'd like to get the extra checks working so that they can help find our
more subtle bugs. Any idea what we should do here?

Nick Lewycky

I've been running LLVM with _GLIBCXX_DEBUG (extra checks) turned on to
see what would happen, and it's been a complete disaster.

The major problem is the use of this API:

new CallInst(V, &Args[0], Args.size());

repeated throughout LLVM. When Args is empty, Args[0] is invalid, even
if the next operation is taking the address.

Ick.

Trying to fix it illustrates the depth of the problem. Here are our choices:

- Keeping the API
a) Detect when the vector is empty and then pass a NULL pointer. This
would require that the constructor accept a null pointer when the
argument count is zero.

If the count is zero, the pointer is unused. The problem with this is that it would pollute the callers... :frowning:

b) Add an extra element to the vector, then subtract 1 from the size.
This causes run time memory bloat.

yuck. :frowning:

- Changing the API
a) Template it to take two iterators. This causes code size bloat.
b) Template it to take a container. This causes code size bloat.

'b' also prevents many important uses, like passing in a subset of a container.

We could also just ignore the problem. I'm not clear on what the effects
of that would be in the long term. Is this something we can expect will
be changed in C++0x?

I don't think that this issue causes any problems in practice. The biggest bad thing is that it makes the extra checking code apparently useless :frowning:

I'd like to get the extra checks working so that they can help find our
more subtle bugs. Any idea what we should do here?

I don't really have a good idea, but I also don't want to significantly uglify the sourcebase...

-Chris

> I'd like to get the extra checks working so that they can help find our
> more subtle bugs. Any idea what we should do here?

I don't really have a good idea, but I also don't want to significantly
uglify the sourcebase...

#define V_CALLINST(V, Args) new CallInst(V, Args.size() == 0 ? NULL :
&Args[0], Args.size())

:stuck_out_tongue:

-Keith

> > I'd like to get the extra checks working so that they can help find our
> > more subtle bugs. Any idea what we should do here?
>
> I don't really have a good idea, but I also don't want to significantly
> uglify the sourcebase...

#define V_CALLINST(V, Args) new CallInst(V, Args.size() == 0 ? NULL :
&Args[0], Args.size())

I'd actually prefer a new constructor here. One that doesn't take any
Args at all. API wise its a little more clear what's going on, so you'd
have:

if (Args.size())
  new CallInst(V, &Args[0], Args.size());
else
  new CallInst(V);

Or any other permutation using ternary operator and/or macros.

Reid.

Chris Lattner wrote:

I don't really have a good idea, but I also don't want to significantly
uglify the sourcebase...

The only one you didn't explicitly object to was templating on the
iterators. That was my favourite idea.

How bad is it to template here? Will we actually end up with two copies
of each function or can the compiler collapse them? I'm not too familiar
with optimizations at this level.

I'd be willing to go through the errors reported by _GLIBCXX_DEBUG and
fix them one at a time, but I don't want to experiment by going through
all of them just to profile the difference. That's why I'm trying to
decide the best fix (or whether to fix) up front.

Nick

The point is to not have conditional code.

-Chris

> I've been running LLVM with _GLIBCXX_DEBUG (extra checks) turned on to
> see what would happen, and it's been a complete disaster.

Well, that's a bit harsh, isn't it? It's finding bugs, just like it's
supposed to. :slight_smile:

I believe I've started to run into this one too. I hit a similar error and
was in the middle of tracking down the source.

> - Changing the API
> a) Template it to take two iterators. This causes code size bloat.

This seems like the right solution to me. Unless llvm is running on
extremely limited memory embedded systems, the extra code space
shouldn't be an issue.

> We could also just ignore the problem. I'm not clear on what the effects
> of that would be in the long term. Is this something we can expect will
> be changed in C++0x?

I don't think that this issue causes any problems in practice. The
biggest bad thing is that it makes the extra checking code apparently
useless :frowning:

The index operation can imply a dereference on some implementations,
otherwise the checking code wouldn't have caught it. In other words, the
standard requires that the memory being indexed by referenceable. Violating
that could cause all sorts of interesting things, like segmentation faults or
the sudden sprouting of rose bushes on Dan Gohman's head.

I agree that on common architectures it probably won't cause a problem.
But it still results in undefined behavior.

> I'd like to get the extra checks working so that they can help find our
> more subtle bugs. Any idea what we should do here?

I don't really have a good idea, but I also don't want to significantly
uglify the sourcebase...

Passing two iterators is in the spirit of the standard library, so I don't
consider that uglification. I understand others may disagree.

In any event, leaving this kind of code in is asking for trouble down the
road.

                                                   -Dave

- Changing the API
a) Template it to take two iterators. This causes code size bloat.

This seems like the right solution to me. Unless llvm is running on
extremely limited memory embedded systems, the extra code space
shouldn't be an issue.

Code size is always an issue.

What specifically are you proposing? I assume you want to eliminate/supplement this ctor:

CallInst::CallInst(Value *Func, Value* const *Args, unsigned NumArgs,
                    const std::string &Name, BasicBlock *InsertAtEnd)

Turning it into a template would require duplicating it (small), and CallInst::init (large).

I don't really think this is workable.

We could also just ignore the problem. I'm not clear on what the effects
of that would be in the long term. Is this something we can expect will
be changed in C++0x?

I don't think that this issue causes any problems in practice. The
biggest bad thing is that it makes the extra checking code apparently
useless :frowning:

The index operation can imply a dereference on some implementations,
otherwise the checking code wouldn't have caught it. In other words, the
standard requires that the memory being indexed by referenceable. Violating
that could cause all sorts of interesting things, like segmentation faults or
the sudden sprouting of rose bushes on Dan Gohman's head.

I agree that on common architectures it probably won't cause a problem.
But it still results in undefined behavior.

I agree about undefined behavior. As I said above, I don't think it causes any problems on any implementations we have seen, nor is it likely to in the future. That said, I obviously agree that relying on undefined behavior is bad :slight_smile:

I'd like to get the extra checks working so that they can help find our
more subtle bugs. Any idea what we should do here?

I don't really have a good idea, but I also don't want to significantly
uglify the sourcebase...

Passing two iterators is in the spirit of the standard library, so I don't
consider that uglification. I understand others may disagree.

I totally agree that it is clean, the question is, how do we make it work?

Here's a different suggestion that cloning all the code. Instead of doing that, why not add a new (templated) CallInst ctor, one which is very trivial. Then you could put the conditional code in it (to detect an empty range) and call into the non-inline stuff we already have.

It should be fine to only support random access iterators, so you could call into the existing stuff that takes a base + size.

-Chris

Nick Lewycky <nicholas@mxc.ca> writes:

Hi,

I've been running LLVM with _GLIBCXX_DEBUG (extra checks) turned on to
see what would happen, and it's been a complete disaster.

The major problem is the use of this API:

  new CallInst(V, &Args[0], Args.size());

Forgive me if I'm missing something, but why is it assumed that
&Args[0] must be a pointer to the entire contents of the vector?

This API works with sequential random access containers like vectors, arrays, smallvector, etc. It is not designed to work with arbitrary STL data structures, nor have we ever needed something like that.

-Chris

>>> - Changing the API
>>> a) Template it to take two iterators. This causes code size bloat.
>
> This seems like the right solution to me. Unless llvm is running on
> extremely limited memory embedded systems, the extra code space
> shouldn't be an issue.

Code size is always an issue.

To a greater or lesser degree. Context matters.

Here's a different suggestion that cloning all the code. Instead of doing
that, why not add a new (templated) CallInst ctor, one which is very
trivial. Then you could put the conditional code in it (to detect an
empty range) and call into the non-inline stuff we already have.

It should be fine to only support random access iterators, so you could
call into the existing stuff that takes a base + size.

That's pretty close to what I was thinking. I think it's a good solution.

                                                -Dave

You're right of course, let me rephrase what I meant. These are changes to VMCore. Any bloat to VMCore will affect anything that uses the LLVM IR. As such, bloat can prevent LLVM from being used by various things that it might otherwise be suitable for. We did quite a bit of work to reduce template bloat to reduce codesize for the opengl work we did, for example. Code size does matter :slight_smile:

-Chris

I'm happy to go do this but I'm wondering if we should include some
concept-checking code or something equivalent to ensure that we're
passed random access iterators.

What do you all think?

                                             -Dave

I've got a patch and it turns out I didn't need anything as fancy as concepts.
A simple call out to another init() routine that takes an iterator_category
object is sufficient.

I'll test and get this in ASAP.

                                             -Dave

Please keep it simple :). We can deal with ugly error messages if it explodes.

-Chris

> > - Changing the API
> > a) Template it to take two iterators. This causes code size bloat.

This seems like the right solution to me. Unless llvm is running on
extremely limited memory embedded systems, the extra code space
shouldn't be an issue.

It turns out this wasn't quite a simple as we'd thought. The problem is
the CallInst constructors that take two Value * arguments. They look
just like the teamplte versions that take iterators.

Unfortunately, there is at least one place in llvm where a CallInst is
constructed with two AllocaInst pointers. Because the template constructor
is a better match, it is selected over the Value * constructors.

To get around this problem, I've used SFINAE to remove the template
constructors from the overload set when passed pointers to classes derived
from Value:

  namespace detail {
    template<bool value>
    struct disable_if {};

    template<>
    struct disable_if<false> {
      typedef int type;
    };
  }

  /// CallInst - This class represents a function call, abstracting a target
  [...]

  /// Construct a CallInst given a range of arguments. InputIterator
  /// must be a random-access iterator pointing to contiguous storage
  /// (e.g. a std::vector<>::iterator). Checks are made for
  /// random-accessness but not for contiguous storage as that would
  /// incur runtime overhead.
  /// @brief Construct a CallInst from a range of arguments
  template<typename InputIterator>
  CallInst(Value *Func, InputIterator ArgBegin, InputIterator ArgEnd,
           const std::string &Name = "", Instruction *InsertBefore = 0,
           // This extra default argument use SFINAE to prevent the
           // template from being considered for pointers to
           // Value-derived types. It removes an unwanted
           // instantiation when the caller really wants one of the
           // Value * versions below. Because Value is a base class,
           // the template match is preferred for classes derived from
           // Value if this argument is not here. Note that if
           // InputIterator IS Value *, the non-template constructor
           // is a better match.
           typename detail::disable_if<
             is_base_and_derived<
               Value,
               typename remove_pointer<InputIterator>::type
             >::value
           >::type enabler = 0)
        : Instruction(cast<FunctionType>(cast<PointerType>(Func->getType())
                                         ->getElementType())->getReturnType(),
                      Instruction::Call, 0, 0, InsertBefore) {
    init(Func, ArgBegin, ArgEnd, Name,
         typename std::iterator_traits<InputIterator>::iterator_category());
  }

init() is the function that actually does the checking of the iterators for
an empty range.

This required that I pull in is_base_and_derived, remove_pointer, is_same and
remove_cv from Boost. boost::is_class already exists in Support/type_traits.h
so there is precedent for this.

Now, something is not quite right with the code as the overload doesn't seem
to get removed when passed an AllocaInst *. I'm working on figuring out why.
I wrote a small testcase which passes. I'm working on reducing
UpgradeParser.cc to a minimal testcase that shows the problem. I'm wondering
if I've stumbled on a gcc bug.

Is this an acceptable solution to the problem? Is importing these Boost
components ok? tr1 includes all of the Boost type traits so another option
is to just use those and impose a requirement that compilers used to build
llvm must support tr1 type traits. GNU libstdc++ already does.

                                    -Dave

- Changing the API
a) Template it to take two iterators. This causes code size bloat.

This seems like the right solution to me. Unless llvm is running on
extremely limited memory embedded systems, the extra code space
shouldn't be an issue.

It turns out this wasn't quite a simple as we'd thought. The problem is
the CallInst constructors that take two Value * arguments. They look
just like the teamplte versions that take iterators.

Wow, this sounds very complex :slight_smile:

template<typename InputIterator>
CallInst(Value *Func, InputIterator ArgBegin, InputIterator ArgEnd,
          const std::string &Name = "", Instruction *InsertBefore = 0,

Is it acceptable to just make the template argument be the container? That way you could pass:

std::vector.. V;
...

new CallInst(Callee, V, "foo");

etc. The disadvantage is that you lose the ability to pass a subrange. However, if you know you're using a subrange, you should be able to use another ctor form.

-Chris

I have to think about this a bit but one complication is that clients aren't
always using std-like containers. There are several places that look like
this:

   Value *array = { v1, v2, v3, v4};
   blah = new CallInst(...,&array[0], 4,...);

Since array doesn't have begin/end members we'd need some kind of
specialization for that case. Iterators nicely avoid this issue.

I agree that the SFINAE technique is complicated and esoteric. I'd rather
find something more intuitive. But then again, it is the way "modern C++" is
heading. It's actually a quite common technique in C++ generic programming.
So it's not really so out of the mainstream.

                                     -Dave

We should just keep the existing constructor, so this isn't a problem. These clients don't have the "dereference end" problem.

-Chris

Duh.

Yep, ok, this sounds good.

                                             -Dave