AST representation for parentheses inside types.

Hello.

We noticed some time ago that the AST representation used in clang does
not allow for the representation of those pairs of parentheses that can
occur inside a syntactic type. For instance, the following C code:

void ((((*fun_ptr))))(void);
int (*(int_arr_ptr))[15];
void (((*((*fun_ptr_arr_ptr)[10]))))(int, int);
int ((a));

is printed (using -ast-print) as follows

void (*fun_ptr)(void);
int (*int_arr_ptr)[15];
void (*(*fun_ptr_arr_ptr)[10])(int, int);
int a;

The printed parentheses are not directly represented in the AST: they
are inserted by the pretty printer (to respect precedence).
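
As a quick check that the redundant parentheses are purely syntactic, the following sketch declares each variable twice, once as written and once as printed, and asks the compiler whether the types agree. The `_w`/`_p` suffixes and the `same` helper are names made up for this example.

```cpp
#include <type_traits>

// As written in the source, with redundant parentheses (suffix _w).
void ((((*fun_ptr_w))))(void);
int (*(int_arr_ptr_w))[15];
void (((*((*fun_ptr_arr_ptr_w)[10]))))(int, int);
int ((a_w));

// As printed by -ast-print (suffix _p).
void (*fun_ptr_p)(void);
int (*int_arr_ptr_p)[15];
void (*(*fun_ptr_arr_ptr_p)[10])(int, int);
int a_p;

// Hypothetical helper: true when A and B are the same type.
template <typename A, typename B>
constexpr bool same() { return std::is_same<A, B>::value; }

// Each pair has the same type; only the spelling differs.
static_assert(same<decltype(fun_ptr_w), decltype(fun_ptr_p)>(), "function pointer");
static_assert(same<decltype(int_arr_ptr_w), decltype(int_arr_ptr_p)>(), "pointer to array");
static_assert(same<decltype(fun_ptr_arr_ptr_w), decltype(fun_ptr_arr_ptr_p)>(),
              "pointer to array of function pointers");
static_assert(same<decltype(a_w), decltype(a_p)>(), "plain int");
```

Since the types are identical, the only thing lost by the current representation is the source-level spelling and the locations of the parentheses.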

Are there plans to extend the AST in order to faithfully represent the
code above?

To our eyes, it seems that the Type hierarchy should be enriched by
adding a ParenType derived class (similar to the ParenExpr class for the
Expr hierarchy). A ParenType would always be a non-canonical type and
the corresponding ParenTypeLoc would provide locations for the two
parentheses. At the parser level, we would have another kind of
DeclaratorChunk.
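
To make the idea concrete, here is a toy model of such a hierarchy. It is only a sketch: `Type`, `paren`, `canonical` and `printDecl` are invented for this example and are not clang's classes, and the nesting order of the paren nodes is an assumption. A `Paren` node records a written pair of parentheses, `canonical` drops all of them, and the printer adds precedence parentheses only when no written pair is already there.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Type;
using TypePtr = std::shared_ptr<const Type>;

// Toy type node; NOT clang's Type class.
struct Type {
  enum Kind { Builtin, Pointer, Array, Paren };
  Kind kind;
  std::string name;  // Builtin only
  unsigned size;     // Array only
  TypePtr inner;     // pointee, element type, or parenthesized type
};

TypePtr make(Type::Kind k, std::string name, unsigned size, TypePtr inner) {
  return std::make_shared<Type>(Type{k, std::move(name), size, std::move(inner)});
}
TypePtr builtin(const std::string& n) { return make(Type::Builtin, n, 0, nullptr); }
TypePtr pointerTo(TypePtr t) { return make(Type::Pointer, "", 0, std::move(t)); }
TypePtr arrayOf(TypePtr t, unsigned n) { return make(Type::Array, "", n, std::move(t)); }
TypePtr paren(TypePtr t) { return make(Type::Paren, "", 0, std::move(t)); }

// A Paren node is never canonical: canonicalization strips every one of them.
TypePtr canonical(const TypePtr& t) {
  switch (t->kind) {
    case Type::Paren:   return canonical(t->inner);
    case Type::Pointer: return pointerTo(canonical(t->inner));
    case Type::Array:   return arrayOf(canonical(t->inner), t->size);
    case Type::Builtin: return t;
  }
  return t;
}

// Print a declaration of `decl` with type `t`, working from the name outwards.
std::string printDecl(const TypePtr& t, std::string decl) {
  switch (t->kind) {
    case Type::Builtin: return t->name + " " + decl;
    case Type::Pointer: return printDecl(t->inner, "*" + decl);
    case Type::Paren:   return printDecl(t->inner, "(" + decl + ")");
    case Type::Array:
      // Precedence parentheses, inserted only if none were written.
      if (!decl.empty() && decl[0] == '*') decl = "(" + decl + ")";
      return printDecl(t->inner, decl + "[" + std::to_string(t->size) + "]");
  }
  return decl;
}

// Small self-check of the round trip described above.
void checkParenRoundTrip() {
  TypePtr written = paren(pointerTo(paren(arrayOf(builtin("int"), 15))));
  assert(printDecl(written, "p") == "int (*(p))[15]");
  assert(printDecl(canonical(written), "p") == "int (*p)[15]");
}
```

With the paren nodes in place the printer reproduces the declaration as written; on the canonical type it falls back to the minimal parentheses that -ast-print emits today.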

Would this approach make sense?
Are there simpler ways to achieve the same effect?

Cheers,
Enea.

Are there plans to extend the AST in order to faithfully represent the
code above?

Not that I know of.

Would this approach make sense?

Yes, this approach makes sense to me.

  - Doug

OK, attached is a patch along the lines of the sketch above.
It passes all clang tests but one.

The failing test is related to ObjC, an area where I have very
little confidence. It seems that we should adjust the
expected output, which currently is as follows:

FunctionDecl:{ResultType void}{TypedText f}{LeftParen (}{Placeholder
^int(int x, int y)block}{RightParen )} (50)

whereas we now obtain an extra pair of parentheses:

FunctionDecl:{ResultType void}{TypedText f}{LeftParen (}{Placeholder int
(^)(int, int)}{RightParen )} (50)

We look forward to suggestions, corrections, etc.

Enea.

ParenType.patch (32.9 KB)

Looks great, with one comment below about the failing test.

This probably just means that we need to look through ParenTypeLocs within the FormatFunctionParameter routine in SemaCodeComplete.cpp.
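
A minimal sketch of what "looking through" the parentheses could mean for a block parameter such as `int (^)(int, int)`. The `Loc` struct and both functions are toy stand-ins invented for this example, not clang's TypeLoc API; the assumption is that the parenthesized location sits between the block pointer and the function prototype.

```cpp
#include <cassert>

// Toy stand-in for a chain of type locations; NOT clang's TypeLoc.
struct Loc {
  enum Kind { Paren, BlockPointer, FunctionProto, Other };
  Kind kind;
  const Loc* next;  // the wrapped (inner) location, or nullptr
};

// Skip any number of Paren wrappers.
const Loc* ignoreParens(const Loc* l) {
  while (l && l->kind == Loc::Paren) l = l->next;
  return l;
}

// Return the function prototype under a block pointer, or nullptr if the
// parameter is not a block. Parentheses are ignored at both levels.
const Loc* blockFunction(const Loc* param) {
  const Loc* l = ignoreParens(param);
  if (!l || l->kind != Loc::BlockPointer) return nullptr;
  l = ignoreParens(l->next);
  return (l && l->kind == Loc::FunctionProto) ? l : nullptr;
}

void checkLookThrough() {
  Loc fn{Loc::FunctionProto, nullptr};
  Loc parens{Loc::Paren, &fn};
  Loc block{Loc::BlockPointer, &parens};  // models int (^)(int, int)
  assert(blockFunction(&block) == &fn);
}
```

Without the `ignoreParens` step the lookup stops at the paren node and the caller falls back to printing the plain type, which would be consistent with the changed test output.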

How many parenthesized types do you see in a typical .c or .cpp file?

  - Doug

Right, we were missing that source file.
Now even that one test passes.

Well, of course the answer will depend on the meaning of "typical", but
anyway my guess is that they should be very few. Is it just curiosity or
are you really worried about the memory space impact?

Enea.

Great!

Just curious; you can go ahead and commit this. Parenthesized types aren't common enough to worry about.

  - Doug

Hah! So you say. I’ll see you that overconfidence and raise you a testcase from Boost’s boost/math/special_functions/fpclassify.hpp:

% cat t.cc
template <class T> inline bool (f)(T x) {
  return false;
}

template <class T> inline T g(const T& v) {
  if (!(::f)(v))
    return 0;
  return v;
}

void test() {
  int i;
  g(i);
}

Crashes after this patch. I’m looking into it.
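
For context, one common reason to parenthesize a function name, and as far as I know the reason headers like this one do it, is to keep a function-like macro of the same name from expanding. A minimal sketch, with `is_zero` as a made-up name:

```cpp
#include <cassert>

// A function-like macro that happens to share the function's name.
#define is_zero(x) (false)

// Here is_zero is followed by ')' rather than '(', so the macro does not
// expand and a real function is declared.
inline bool (is_zero)(int x) { return x == 0; }

void checkMacroSuppression() {
  assert(!is_zero(0));   // macro invocation: expands to (false)
  assert((is_zero)(0));  // parenthesized name: calls the function
}
```

So parenthesized declarators of this kind do turn up in real code, even if they are rare.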

It has been fixed in r121720.

Thanks for the report.