I’ve recently been looking into a bug in clang-format that incorrectly treats * as a TT_PointerOrReference:
void f() { operator+(a, b *b); }
Actually, the misinterpretation of * and & as BinaryOperator or PointerOrReference is a major source of bugs in clang-format. This is because we don’t have semantic information, and often we are looking very myopically at adjacent tokens, so there just isn’t enough context to disambiguate.
So whilst we see b * b and easily read it as (b x b),
to clang-format this is actually just
tok::identifier->tok::star->tok::identifier
Which is indistinguishable from
MyClass *b
which I think we’d all see as (class ptr b). In both cases we have no additional information, other than perhaps what we can find by scanning up and down the line for something relevant.
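To make the ambiguity concrete, here is a minimal sketch of a toy lexer that reduces a snippet to token kinds only. The Kind enum and lexKinds() are hypothetical names invented for this illustration and are not part of clang-format; the real lexer records far more, but none of it is semantic.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified token kinds (not clang's tok:: enum).
enum class Kind { Identifier, Star, Other };

// Reduce a snippet to its sequence of token kinds, skipping whitespace.
std::vector<Kind> lexKinds(const std::string &Text) {
  std::vector<Kind> Kinds;
  for (std::size_t I = 0; I < Text.size();) {
    unsigned char C = static_cast<unsigned char>(Text[I]);
    if (std::isspace(C)) {
      ++I;
    } else if (std::isalpha(C) || C == '_') {
      // Consume the whole identifier; its spelling is deliberately dropped.
      while (I < Text.size() &&
             (std::isalnum(static_cast<unsigned char>(Text[I])) ||
              Text[I] == '_'))
        ++I;
      Kinds.push_back(Kind::Identifier);
    } else {
      Kinds.push_back(C == '*' ? Kind::Star : Kind::Other);
      ++I;
    }
  }
  return Kinds;
}
```

With this sketch, lexKinds("b * b") and lexKinds("MyClass *b") compare equal: both are Identifier, Star, Identifier, so anything working purely on token kinds cannot tell the multiply from the pointer declaration.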
The identifier, star, identifier sequence literally drops out of the bottom of determineStarAmpUsage() with a
return TT_PointerOrReference;
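The shape of that fall-through can be sketched as below. This is not the real determineStarAmpUsage(), which has many more checks; Token, classifyStar() and isNumericLiteral() are hypothetical stand-ins, and the two early-exit rules are only assumed examples of the kind of local check that can fire before the default.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-ins for clang-format's token types.
enum TokenType { TT_BinaryOperator, TT_UnaryOperator, TT_PointerOrReference };

// Hypothetical minimal token: only the spelling of the neighbours is visible.
struct Token {
  std::string Text;
  const Token *Prev = nullptr;
  const Token *Next = nullptr;
};

bool isNumericLiteral(const Token *T) {
  return T && !T->Text.empty() &&
         std::isdigit(static_cast<unsigned char>(T->Text[0]));
}

TokenType classifyStar(const Token &Star) {
  // No left operand available: treat the star as a unary dereference.
  if (!Star.Prev || Star.Prev->Text == "(" || Star.Prev->Text == "," ||
      Star.Prev->Text == "=")
    return TT_UnaryOperator;
  // A numeric literal on either side can only mean multiplication.
  if (isNumericLiteral(Star.Prev) || isNumericLiteral(Star.Next))
    return TT_BinaryOperator;
  // identifier * identifier: nothing local disambiguates, so fall through.
  return TT_PointerOrReference;
}
```

In this sketch, 2 * b is caught by the literal rule, but b * b and MyClass *b both reach the final return, which mirrors how the ambiguous case ends up annotated as a pointer.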
For the bug I’m looking at, given the following example, the first case (the call inside f()) annotates the * as a pointer when it’s actually a multiply:
void f() { operator+(Bar, Foo *Foo); }
class A {
void operator+(Bar, Foo *Foo);
};
Effectively, these two uses of operator+ look the same. Looking at the token annotations, there is almost nothing that distinguishes between the two.
AnnotatedTokens(L=0):
M=0 C=0 T=Unknown S=1 F=0 B=0 BK=0 P=0 Name=void L=4 PPK=2 FakeLParens= FakeRParens=0 II=0x1baa688a118 Text=‘void’
M=0 C=1 T=FunctionDeclarationName S=1 F=0 B=0 BK=0 P=80 Name=identifier L=6 PPK=2 FakeLParens= FakeRParens=0 II=0x1baa68bd9c8 Text=‘f’
M=0 C=0 T=Unknown S=0 F=0 B=0 BK=0 P=23 Name=l_paren L=7 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=‘(’
M=0 C=0 T=Unknown S=0 F=0 B=0 BK=0 P=140 Name=r_paren L=8 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=‘)’
M=0 C=0 T=FunctionLBrace S=1 F=0 B=0 BK=1 P=23 Name=l_brace L=10 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=‘{’