My first patch to clang

Hello,

as promised I've started to look at how clang works under the hood.

My first "kid job" was to implement a simple diagnostic improvement that suggests to use -> instead of . if the base expression is a pointer. You find the patch attached.

In the last mail, Douglas told me to send trivial patches to cfe-commits but I've prefered to post it here because it's the first one and I'm unsure about a couple of things.

1) Have I chosen the right locations for the caret, the range and the substitution? The original error used to put the caret _after_ the dot, but I put the caret _at_ the dot, because I think it's more reasonable.

2) Is it ok to add a new diagnostic string in this case? Are the string and the enum name appropriate?

3) Any other hint?

Thanks,

Nicola

member_pointer_suggestion.patch (2.2 KB)

Hello,

as promised I've started to look at how clang works under the hood.

My first "kid job" was to implement a simple diagnostic improvement that suggests to use -> instead of . if the base expression is a pointer. You find the patch attached.

Great idea! We have some similar-in-concept diagnostics/fixits, but not this one specifically.

I see a few issues with this patch, mostly minor.

The first is that we like to have reasonably high confidence in our suggestions. There are plenty of pointer base types where changing '.' into '->' won't actually help; for example, int*, struct foo**, etc. We shouldn't recommend using '->' unless the base is specifically a pointer to a record type.

The second is that fixits should be aligned with error-recovery whenever possible. In this case, that means that if we're doing to give a fixit for transforming the '.' into the '->', we should then proceed to do the member lookup, build the expression, etc. as if the user had typed '->'. It looks like you can make that work very easily: just (1) pass IsArrow to LookupMemberExpr by reference instead of by value, (2) try to diagnose this error *before* the block in LookupMemberExpr that calls LookupMemberInRecord, and (3) set IsArrow = true when you diagnose.

In the last mail, Douglas told me to send trivial patches to cfe-commits but I've prefered to post it here because it's the first one and I'm unsure about a couple of things.

1) Have I chosen the right locations for the caret, the range and the substitution? The original error used to put the caret _after_ the dot, but I put the caret _at_ the dot, because I think it's more reasonable.

Your locations look good. Pointing the caret at the operator is fine in this case, since the presumed error is with the operator, not with the member or the base expression.

You should test with -fixit to make sure your suggestion actually works.

2) Is it ok to add a new diagnostic string in this case? Are the string and the enum name appropriate?

They look fine.

Thanks!

John.

"pointer to a record type for which the field would be valid". We might as well test that the field makes sense as well.

Thanks again for working on this, it's great to get this enhancement,

-Chris

If we don't do the validity check, then we can double-recover from a mistyped:

   struct X {
     void setValue(int);
   };

   void f(X *xp) {
     xp.SetValue(17);
   }

:slight_smile:

  - Doug

Chris Lattner wrote:

"pointer to a record type for which the field would be valid". We might as well test that the field makes sense as well.

Thanks again for working on this, it's great to get this enhancement,

You need also to check if class has overloaded operator->.

Look at this example:

struct a
{
  int b;
};

struct c
{
  a * operator->();

  int d;
};

int main()
{
  c * c;
  c.d; // do not try to suggest replacement for c->d (!)
  return 0;
}

Ivan Sorokin wrote:

Chris Lattner wrote:

"pointer to a record type for which the field would be valid". We
might as well test that the field makes sense as well.

Thanks again for working on this, it's great to get this enhancement,

You need also to check if class has overloaded operator->.

No he doesn't. That's because, if you have a pointer to an object that
has an overloaded operator->(), using the -> operator on the pointer
doesn't cause the operator->() function to get called. I think it's best
shown with an example; continuing your example:

int main(void)
{
    c * c;

    c.d; // illegal, suggest replacement
    c->d; // dereferences c, accesses c::d
    (*c)->d;// dereferences c, calls c::operator->(), tries to
            // dereference a pointer to an object of type a and access
            // a::d (which fails)
    (*c)->b;// dereferences c, calls c::operator->(), dereferences a
            // pointer to an object of type a and accesses a::b

    return 0;
}

Chip

Even if he does, his patch is still a very nice improvement over what we have. Handling operator-> can be a follow-on :slight_smile:

-Chris

Charles Davis wrote:

No he doesn't. That's because, if you have a pointer to an object that
has an overloaded operator->(), using the -> operator on the pointer
doesn't cause the operator->() function to get called. I think it's best
shown with an example;

Yes. You are right.

That isn't quite what I meant. You've inserted it like this:

  if (const RecordType *RTy = BaseType->getAs<RecordType>()) {
+
+ if(!IsArrow && BaseType->isPointerType()) {

This can never be satisfied, because it's inside an 'if' statement that checks whether the type is a record type. A type can't be both a record type and a pointer type!

Yes, that's a stupid mistake..

Exactly. You should change the input to look exactly like it would look if the user had typed '->' instead. That's why you want your check to run before LookupMemberExpr tries to adjust the base type to the pointee type and so on.

I think I've understood what you mean. I've attached Yet Another Patch :slight_smile: In the meantime, I noticed that the case of a '.' used for a pointer was similar, so I generalized the suggestion. Now I think it works, because this source code:

#include <stdio.h>

struct type
{
  int member;
};

int main()
{
        struct type *p_struct;
        int *p_int;
        int a_int;
        struct type a_struct;

        p_struct.member = 7;
        p_int.member = 7;
        a_struct->member = 7;
        a_int->member = 7;

        return 0;
}

Produces this output:

prova.c:15:10: error: member reference type 'struct type *' is a pointer; maybe you meant to use '->'?
        p_struct.member = 7;
        ~~~~~~~~^
                ->
prova.c:16:8: error: member reference base type 'int *' is not a structure or union
        p_int.member = 7;
        ~~~~~ ^
prova.c:17:10: error: member reference type 'struct type' is not a pointer; maybe you meant to use '.'?
        a_struct->member = 7;
        ~~~~~~~~^~
                .
prova.c:18:9: error: member reference type 'int' is not a pointer
        a_int->member = 7;
        ~~~~~ ^
4 diagnostics generated.

I think it's the right output. The only thing I can't understand is: All the four errors use BaseExpr->getSourceRange() to get the range of the base expressions, but in the first and the third errors (the two added by my patch), the range is wrong because also includes the arrow (or the dot). Is it my fault?

Bye bye,

Nicola

member_pointer_suggestion.patch (3.86 KB)

Exactly! What I meant is that in these four diagnostics, the source range is always obtained with BaseExpr->getSourceRange() (and BaseExpr doesn't change in the meantime). However, in two of the four cases, the range highlighted in the output also includes the '->' or '.'. I think this isn't correct and I can't understand why this happens. Is it my fault?

Anyway, how about the patch in general? Is it ok?

Thanks,

Nicola

Exactly. You should change the input to look exactly like it would look if the user had typed '->' instead. That's why you want your check to run before LookupMemberExpr tries to adjust the base type to the pointee type and so on.

4 diagnostics generated.

This seems like good output.

I think it's the right output. The only thing I can't understand is: All the four errors use BaseExpr->getSourceRange() to get the range of the base expressions, but in the first and the third errors (the two added by my patch), the range is wrong because also includes the arrow (or the dot). Is it my fault?

The replacement fixit causes the original text to be underlined. Don't worry about it.

<member_pointer_suggestion.patch>

The functionality looks good! Please use tabs instead of spaces, and don't put spaces inside 'if' conditions like this:

+ else if ( BaseType->isRecordType() ) {

Do you need someone to commit this for you?

John.

Er, I think you swapped this. Clang uses spaces only.

Sebastian

you probably means use spaces instead of tabs, isn't it?

Yes, I mistyped this, thank you. No tabs, please.

John.