bug with USRs and fixed-length arrays?

Hi,

The strings returned by clang_getCursorUSR represent fixed-length arrays as pointers, so the USRs for “void Func( char[16] )” and “void Func( char[32] )” are identical and thus ambiguous. By contrast, clang_getCursorDisplayName appears to handle fixed-size arrays correctly.

There seem to be other issues with fixed-length arrays in USRs, for example “template <size_t N> void mystrlwr( char (&dst)[N] )” is represented as “c:@FT@>1#Nkmystrlwr#& #” (and specializations of this template are also similarly broken).

I thought I’d post here before submitting the above as a bug, in case I’m missing something (post 2 of 4 of this kind today ;) ).

Thanks!

iestyn Bleasdale-Shepherd

Hi Iestyn,

These two decls are just redeclarations of one function -- see C11 6.7.6.3p7:

A declaration of a parameter as ‘‘array of type’’ shall be adjusted to
‘‘qualified pointer to type’’ [...]

So in my opinion there is no reason for them to have different USRs.

Dmitri

Hi Dmitri - thanks for the response!

The line you quote ends with "...where the type qualifiers (if any) are those specified within the [ and ] of the array type derivation."

This qualification allows disambiguation between my two example functions - as it must, since they may have completely different implementations and may be passed as different function pointers.

Am I missing something here? We use these kinds of overloads in our code, in cases which depend on their resolving to *different* functions.

Thanks,

iestyn

> Hi Dmitri - thanks for the response!
>
> The line you quote ends with "...where the type qualifiers (if any) are
> those specified within the [ and ] of the array type derivation."
>
> This qualification allows disambiguation between my two example functions
> - as it must, since they may have completely different implementations and
> may be passed as different function pointers.

According to the rules of C++, the two declarations
  void Func(char array[32]);
and
  void Func(char array[16]);
declare the same function, which has one parameter of type char*. They
declare the same function as
  void Func(char *array);

It is an error to define both of them -- the following should generate a
diagnostic from any conforming C++ compiler:
  void Func(char array[32]) {}
  void Func(char array[16]) {}

And it does so with Clang:

/tmp/redecl.cc:2:6: error: redefinition of 'Func'
void Func(char array[16]) {}
     ^
/tmp/redecl.cc:1:6: note: previous definition is here
void Func(char array[32]) {}
     ^
1 error generated.

(It's true that arrays are not pointers, and pointers are not arrays, but
in the special case of function declarations there is no way to pass an
array by value, and the only way to constrain the size of a passed array is
to pass a pointer or reference to a particular size of array.)
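
A minimal sketch of this point, assuming a C++11 compiler. The names `bound` and `bound_selfcheck` are hypothetical and do not come from the thread; they only show that an array bound survives in a reference-to-array parameter but not in a by-value array parameter:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Array parameters are adjusted to pointers, so these function types are
// one and the same type.
static_assert(std::is_same<void(char[16]), void(char[32])>::value,
              "array bounds are dropped from by-value array parameters");
static_assert(std::is_same<void(char[16]), void(char*)>::value,
              "an array parameter is really a pointer parameter");

// A reference to an array keeps the bound, so these function types differ.
static_assert(!std::is_same<void(char (&)[16]), void(char (&)[32])>::value,
              "references to arrays of different sizes are different types");

// Hypothetical overloads: legal because the parameter types really differ.
std::size_t bound(char (&)[16]) { return 16; }
std::size_t bound(char (&)[32]) { return 32; }

// Hypothetical self-check exercising overload resolution on the array size.
void bound_selfcheck() {
    char small_buf[16] = {};
    char large_buf[32] = {};
    assert(bound(small_buf) == 16);
    assert(bound(large_buf) == 32);
}
```

Replacing the two `bound` overloads with `bound(char[16])` and `bound(char[32])` definitions would reproduce the redefinition error shown above.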

> Am I missing something here? We use these kinds of overloads in our code,
> in cases which depend on their resolving to *different* functions.

That's not possible in C++. Maybe you can show your actual code?

-- James

> Hi Dmitri - thanks for the response!
>
> The line you quote ends with "...where the type qualifiers (if any) are those specified within the [ and ] of the array type derivation."

I don't see how it is relevant -- there are no qualifiers within [].

> This qualification allows disambiguation between my two example functions - as it must, since they may have completely different implementations and may be passed as different function pointers.
>
> Am I missing something here? We use these kinds of overloads in our code, in cases which depend on their resolving to *different* functions.

This is not valid in either C or C++:
void func(char a[16]) {}
void func(char a[32]) {}

So the overloads you refer to probably have more than one argument and
they differ in those extra arguments.

Dmitri

Ah, you are right! There is an extra ingredient in our usage cases: templates*

For example, we use templates to implement ‘safe’ string functions which use the type of the destination buffer (a fixed-length character array) as the template parameter, rather than passing in the array size as a function parameter (which is error-prone – array size sometimes being confused with array length, for multi-byte characters).

However, here are the USRs that I get in a templatized example case:

template <typename buffer> void mystrlwr( buffer &dst ); // c:@FT@>1#Tmystrlwr#&t0.0#

template <> void mystrlwr<char[16]>( char (&dst)[16] ); // c:@F@mystrlwr<# >#&S0_#

template <> void mystrlwr<char[32]>( char (&dst)[32] ); // c:@F@mystrlwr<# >#&S0_#

template <> void mystrlwr<char[64]>( char (&dst)[64] ); // c:@F@mystrlwr<# >#&S0_#

Here, the buffer type gets reduced to “ ” as a template parameter and “S0_” as a function parameter.

The function parameter may be valid (though according to the quoted specification, shouldn’t it be “*C”?), but the template parameter definitely seems wrong – those are three different functions, all with the same USR.
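
To illustrate why these are different functions, here is a minimal sketch assuming C++11. The body of `mystrlwr` is a guess at what such a 'safe' lowercase routine might do, not our actual code, and `lower16`, `lower32` and `mystrlwr_selfcheck` are hypothetical names added for the example:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Assumed implementation: lowercase the string held in a fixed-length
// character array, never reading past the array's capacity.
template <typename Buffer>
void mystrlwr(Buffer &dst) {
    static_assert(std::is_array<Buffer>::value,
                  "destination must be a fixed-length array");
    const std::size_t capacity = sizeof(dst) / sizeof(dst[0]);
    for (std::size_t i = 0; i < capacity && dst[i] != '\0'; ++i) {
        dst[i] = static_cast<char>(
            std::tolower(static_cast<unsigned char>(dst[i])));
    }
}

// Each array size names a separate specialization with its own type.
void (*lower16)(char (&)[16]) = &mystrlwr<char[16]>;
void (*lower32)(char (&)[32]) = &mystrlwr<char[32]>;
static_assert(!std::is_same<decltype(lower16), decltype(lower32)>::value,
              "the specializations have different function types");

// Hypothetical self-check: both specializations lowercase in place.
void mystrlwr_selfcheck() {
    char a[16] = "MiXeD";
    char b[32] = "UPPER case";
    lower16(a);
    lower32(b);
    assert(std::strcmp(a, "mixed") == 0);
    assert(std::strcmp(b, "upper case") == 0);
}
```

Since `lower16` and `lower32` point at distinct functions, an index keyed on USRs would need distinct strings for them.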

iestyn

*[ I suspect my use of the Microsoft compiler was clouding the results of my simplified test cases. As you are all no doubt shocked to hear! ;) ]

I see. This was actually discussed previously:

http://lists.cs.uiuc.edu/pipermail/cfe-dev/2012-August/023628.html

Dmitri

Okiedoke. I have updated the associated bug (http://llvm.org/bugs/show_bug.cgi?id=13575) with this new example.

Thank you, sirs!

iestyn

Hi,

clang_isCursorDefinition returns false for definitions of non-specialized function templates, as well as for definitions of partially-specialized classes. In contrast, CXIdxDeclInfo::isDefinition is true in these cases.

Even if compilation needs to be deferred in these cases (until the set of full specializations is known), should these cursors not still be counted as definitions? (for the sake of identifying the 'defining cursor')

Again, posting here to double-check before submitting the above as a bug...

Thanks!

iestyn Bleasdale-Shepherd

FYI I discovered another USR error case and have added that to the bug too (is that reasonable, or is it preferred to split into multiple bugs?).

In this error case, the representation of function pointer types is too terse; there needs to be a terminator after the parameter list to avoid ambiguity. Here's my simple repro example:

  typedef void (*FuncPtrA_)();
  typedef void (*FuncPtrB_)( int );
  typedef FuncPtrA_ (*FuncPtrA)( int );
  typedef FuncPtrB_ (*FuncPtrB)();
  void Func( FuncPtrA ) {} // c:@F@Func#*F*FvI#
  void Func( FuncPtrB ) {} // c:@F@Func#*F*FvI#
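
A quick way to confirm that the two overloads really are distinct to the compiler is the sketch below, assuming C++11. It changes the return type of `Func` to `int` purely so the chosen overload can be observed; `func_overloads_selfcheck` is a hypothetical name. My reading of the USR (an assumption about the encoding, not something I have checked in the Clang source) is that `*F*FvI` can be parsed either as "pointer to function taking int and returning pointer to `void()`" or as "pointer to function taking nothing and returning pointer to `void(int)`", which is why a terminator after the parameter list is needed.

```cpp
#include <cassert>
#include <type_traits>

typedef void (*FuncPtrA_)();
typedef void (*FuncPtrB_)(int);
typedef FuncPtrA_ (*FuncPtrA)(int);  // takes int, returns pointer to void()
typedef FuncPtrB_ (*FuncPtrB)();     // takes nothing, returns pointer to void(int)

// The two pointer types are different, so overloading on them is legal.
static_assert(!std::is_same<FuncPtrA, FuncPtrB>::value,
              "FuncPtrA and FuncPtrB are distinct types");

// Return type changed from void to int for illustration only.
int Func(FuncPtrA) { return 1; }
int Func(FuncPtrB) { return 2; }

// Hypothetical self-check: overload resolution picks a different function
// for each pointer type.
void func_overloads_selfcheck() {
    FuncPtrA a = nullptr;
    FuncPtrB b = nullptr;
    assert(Func(a) == 1);
    assert(Func(b) == 2);
}
```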

iestyn