clang-tidy FixItHint::CreateRemoval -- trailing semicolon stays

Hopefully a simple question. I’m removing a using namespace directive from the code in a simple check. so

using namespace std;

gets removed from a C++ file my code however ends up with

;

when I run

clang-tidy -checks=-,surelogic -fix-errors test1.cpp – -std=c++14
4 warnings generated.
/home/tim/Source/llvm-work/test1.cpp:8:1: warning: do not use namespace using-directives; use using-declarations instead [surelogic-readability-using]
using namespace std;
^
/home/tim/Source/llvm-work/test1.cpp:8:1: note: FIX-IT applied suggested code changes
using namespace std;
^
<>
clang-tidy applied 2 of 2 suggested fixes.

The non-boilerplate code is below:

void UsingCheck::registerMatchers(ast_matchers::MatchFinder *Finder) {
// Only register the matchers for C++; the functionality currently does not
// provide any benefit to other languages, despite being benign.
if (getLangOpts().CPlusPlus)
Finder->addMatcher(usingDirectiveDecl().bind(“usingNamespace”), this);
}

void
UsingCheck::check(const MatchFinder::MatchResult &Result) {
const auto *U = Result.Nodes.getNodeAs(“usingNamespace”);
SourceLocation Loc = U->getLocStart();
if (U->isImplicit() || !Loc.isValid())
return;

diag(Loc, "do not use namespace using-directives; "
“use using-declarations instead”) << FixItHint::CreateRemoval(U->getSourceRange());
}

(yes this is clone of the google check, just to learn)

But I’m wondering what to do to get the semicolon. I don’t see it in the AST dump output…below

<>

-FunctionDecl 0x482c160 <mod.cpp:5:1, col:37> col:5 used foo ‘int (class Point)’

-ParmVarDecl 0x482c098 <col:9, col:15> col:15 used p ‘class Point’
-CompoundStmt 0x482c2b0 <col:18, col:37> -ReturnStmt 0x482c298 <col:20, col:34>
-CXXMemberCallExpr 0x482c270 <col:27, col:34> 'int' -MemberExpr 0x482c238 <col:27, col:29> ‘’ .getX 0x482b178
-DeclRefExpr 0x482c210 <col:27> 'class Point' lvalue ParmVar 0x482c098 'p' 'class Point' << The using directive is next >> -UsingDirectiveDecl 0x482c2d0 <line:8:1, col:17> col:17 Namespace 0x409b938 'std' << next a function decl in my test code>> -FunctionDecl 0x482c340 <line:10:1, line:41:1> line:10:5 main ‘int (void)’
`-CompoundStmt 0x486d200 <col:12, line:41:1>

Any suggestions?

Unfortunately clang doesn’t have the location of the semicolon in statements, mostly because Statements are the base class of Expressions (for historical reasons that do not apply any more).

Thus, fixits that want to get to the semicolon need to use the Lexer to get at it. Looping in clang-tidy folks for whether we have infrastructure in clang-tidy for that already…

Thanks, after looking into this your answer makes sense to me. What I ended up doing is shown in the snippet below. As you might guess this code is working to remove using namespace and provide fixes to all uses in the translation unit (thus it guards against any changes in an included file – focusing just on the main translation unit).

Are there any good examples of applying multiple fixits for a problem (can I just stream many in?). Just wondering as I have not tried that out much yet.

Thanks again, Tim Halloran

My draft code to deal with “;”

bool UsingCheck::inHeader(clang::SourceLocation &Loc,
const clang::SourceManager *sm) {
return sm->getMainFileID() != sm->getFileID(Loc);
}

void UsingCheck::check(
const clang::ast_matchers::MatchFinder::MatchResult &Result) {
const UsingDirectiveDecl *U =
Result.Nodes.getNodeAs(“using”);
if (U) {
clang::SourceLocation Loc{U->getLocStart()};
if (U->isImplicit() || !Loc.isValid() ||
inHeader(Loc, Result.SourceManager))
return;

clang::SourceRange Range{U->getSourceRange()};

// There is a problem with the returned range – it misses the “;”.
// For example, for “using namespace std;” we don’t get the trailing “;”.
// So the code below searches for a semicolon starting at the last token
// of the node and expands the sourcerange so that the entire statement
// may be removed. We should always find a semicolon.
int SemiIndex = 1;
bool foundSemi = false;
while (!foundSemi) {
clang::SourceLocation PastEndLoc{
U->getLocEnd().getLocWithOffset(SemiIndex)};
clang::SourceRange RangeForString{PastEndLoc};
CharSourceRange CSR = Lexer::makeFileCharRange(
CharSourceRange::getTokenRange(RangeForString), *Result.SourceManager,
Result.Context->getLangOpts());
std::string possibleSemi =
Lexer::getSourceText(CSR, *Result.SourceManager,
Result.Context->getLangOpts())
.str();
if (possibleSemi == “;”) {
// Found a “;” so expand the range for our fixit below.
Range.setEnd(PastEndLoc);
foundSemi = true;
} else {
SemiIndex++;
}
}

diag(Loc, "do not use namespace using-directives; "
“use using-declarations instead”)
<< FixItHint::CreateRemoval(Range);
}
<>

Hi Tim,

Iterating through the SourceLocations is a nice solution, but I think the check might accidentally catch the semicolon in the comment IIUC.

Example:
using namespace std /* ; */ ;

To avoid that you might want to get tok::TokenKind and compare it against tok::semi, which corresponds to the semicolon. To see how this works, refer to clang-tidy readability-braces-around-statements check. It kind of deals with the same “problem”.

Though I’m dealing with the same at the moment, so I guess it’d be nice to get a “universal solution” and put it somewhere (in clang-tidy?) at some point, because some clang-tidy checks are going through that and having a common solution would reduce the amount of possible bugs caused by different implementations etc.

Thanks, after looking into this your answer makes sense to me. What I
ended up doing is shown in the snippet below. As you might guess this code
is working to remove using namespace and provide fixes to all uses in the
translation unit (thus it guards against any changes in an included file --
focusing just on the main translation unit).

Are there any good examples of applying multiple fixits for a problem (can
I just stream many in?). Just wondering as I have not tried that out much
yet.

You can supply many fixes to the same warning as long as they don't overlap.

Thanks again, Tim Halloran

My draft code to deal with ";"

bool UsingCheck::inHeader(clang::SourceLocation &Loc,
                          const clang::SourceManager *sm) {
  return sm->getMainFileID() != sm->getFileID(Loc);

It's an incorrect way to detect headers, since the check can run on the
header directly and then getMainFileID will correspond to the header.
There's a set of utilities being added to ClangTidy (
http://reviews.llvm.org/D16113) to use filename extensions for this purpose.

}

void UsingCheck::check(
    const clang::ast_matchers::MatchFinder::MatchResult &Result) {
  const UsingDirectiveDecl *U =
      Result.Nodes.getNodeAs<UsingDirectiveDecl>("using");
  if (U) {
    clang::SourceLocation Loc{U->getLocStart()};
    if (U->isImplicit() || !Loc.isValid() ||
        inHeader(Loc, Result.SourceManager))
      return;

    clang::SourceRange Range{U->getSourceRange()};

    // There is a problem with the returned range -- it misses the ";".
    // For example, for "using namespace std;" we don't get the trailing
";".
    // So the code below searches for a semicolon starting at the last
token
    // of the node and expands the sourcerange so that the entire statement
    // may be removed. We should always find a semicolon.
    int SemiIndex = 1;
    bool foundSemi = false;
    while (!foundSemi) {
      clang::SourceLocation PastEndLoc{
          U->getLocEnd().getLocWithOffset(SemiIndex)};
      clang::SourceRange RangeForString{PastEndLoc};
      CharSourceRange CSR = Lexer::makeFileCharRange(
          CharSourceRange::getTokenRange(RangeForString),
*Result.SourceManager,
          Result.Context->getLangOpts());
      std::string possibleSemi =
          Lexer::getSourceText(CSR, *Result.SourceManager,
                               Result.Context->getLangOpts())
              .str();

No need to convert this to std::string. There's actually a better way to
find the next semicolon, see the usage of Lexer::getRawToken in
NamespaceCommentCheck.cpp, for example. We should probably add something
more convenient for this purpose to clang-tidy/utils/LexerUtils.h.

      if (possibleSemi == ";") {

I ran into similar problem with semicolons a while ago.

I ended up doing something similar to your code snippet, but I copy-pasted some existing code from a different part of clang codebase:
lib/ARCMigrate/Transforms.cpp - see funcions findSemiAfterLocation() and findLocationAfterSemi().

Perhaps it would be a good idea to expose these functions as part of public API.

Are there any good examples of applying multiple fixits for a problem (can I just stream many in?). Just wondering as I have not tried that out much yet.

Yes, you can add many fix it hints for a given diagnostic. For example:

auto Diagnostic = diag(Loc, “diagnostic”);
Diagnostic.AddFixItHint(FixItHint::CreateInsertion(//));
Diagnostic.AddFixItHint(FixItHint::CreateRemoval(//));

This can also be achieved through overloaded operator<< which is just a wrapper for the same function calls.

Best regards,
Piotr Dziwinski

Yes, you can add many fix it hints for a given diagnostic. For example:

  auto Diagnostic = diag(Loc, "diagnostic");
  Diagnostic.AddFixItHint(FixItHint::CreateInsertion(/*...*/));
  Diagnostic.AddFixItHint(FixItHint::CreateRemoval(/*...*/));

This can also be achieved through overloaded operator<< which is just a
wrapper for the same function calls.

This works during the same call to "check" However if I stash the pointer
to the "Diagnostic" variable (e.g., in a map) and lookup and use the
reference later I'm getting a core dump. I'm guessing that is because the
object created by "diag()" call is not intended to be stashed.

Is there a way to get a clang::DiagnosticBuilder object that can be safely
stashed?

What I am doing is first removing somthing like

using namespace sl;

from the code

then adding qualifiers for each use, e.g., "foo" becomes "sl::foo"

so the "add qualifier" part gets identified in later calls to "check"

Best regards,
Piotr Dziwinski

Thanks!
Tim

As far as I know, diagnostics are meant to be generated on the fly and pushed out to the diagnostics handler as soon as possible.
I don’t know if it’s possible to store any long-lived reference to them.

As a workaround, you can try a different approach - store additional state in your check class and update it on each check() call.
Then, once you have complete context information about the code you want to change, create a full diagnostic with all needed data.

This is actually what I do in some of my checks, especially the localizing variables check that I’m still working on:
https://github.com/piotrdz/clang-tools-extra/blob/master/clang-tidy/readability/LocalizingVariablesCheck.cpp

Best regards,
Piotr Dziwinski

As a workaround, you can try a different approach - store additional state
in your check class and update it on each check() call.
Then, once you have complete context information about the code you want
to change, create a full diagnostic with all needed data.

This is actually what I do in some of my checks, especially the localizing
variables check that I'm still working on:

https://github.com/piotrdz/clang-tools-extra/blob/master/clang-tidy/readability/LocalizingVariablesCheck.cpp

Yes this is a good idea, thanks. Sadly I need to do it at the very end
(your code seems to be able to trigger output (call diag()) based upon
function declaration scope.

I'm trying out using the destruction of the check class to report all the
issues. So, for example:

clang::SourceLocation foo;

UsingCheck::~UsingCheck() {
  std::cerr << "** destroy of UsingCheck instance\n";
  diag(foo, "bogus error for testing");
}

where I just stash a SourceLocation. The code above works (a simple
try-it-out test). I'm not sure if a destructor of a ClangTidyCheck class is
allowed to do this. And I still need to stash SourceLocation and
SourceRange objects for this to work.

Best regards,

Piotr Dziwinski

Thank you so much for the input, and the cool code pointer.

Yes this is a good idea, thanks. Sadly I need to do it at the very end (your code seems to be able to trigger output (call diag()) based upon function declaration scope.

I'm trying out using the destruction of the check class to report all the issues. So, for example:

clang::SourceLocation foo;

UsingCheck::~UsingCheck() {
  std::cerr << "** destroy of UsingCheck instance\n";
  diag(foo, "bogus error for testing");
}

where I just stash a SourceLocation. The code above works (a simple try-it-out test). I'm not sure if a destructor of a ClangTidyCheck class is allowed to do this. And I still need to stash SourceLocation and SourceRange objects for this to work.

It's safe to stash SourceLocation and SourceRange objects, as long as the SourceManager object with which they are associated lives on. In clang-tidy, that would be until the check is finished on current translation unit and then destroyed.

But I wouldn't depend on such tricky behavior in the destructor. A better place to handle it would be in function onEndOfTranslationUnit() overridden from MatchFinder::MatchCallback.

Best regards,
Piotr Dziwinski

My draft code to deal with ";"

bool UsingCheck::inHeader(clang::SourceLocation &Loc,
                          const clang::SourceManager *sm) {
  return sm->getMainFileID() != sm->getFileID(Loc);

It's an incorrect way to detect headers, since the check can run on the
header directly and then getMainFileID will correspond to the header.
There's a set of utilities being added to ClangTidy (
http://reviews.llvm.org/D16113) to use filename extensions for this
purpose.

Thanks for the pointer, I looked this HeaderFileExtensionsUtils code over
(on trunk) and it seems to examine filename suffix. It also seem
configurable but not sure how (you set an option with "h, H" -- it would be
nice to have a comment showing how to do this or documentation -- albeit I
might not have found that yet).

However, that said, my check removes all included files, but, as you say,
would not handle being run directly on a header file. However, my check
seems necessary but not sufficient. If I changed it to use
HeaderFileExtensionsUtils I don't see how includes like "string" or
"iostream" would be excluded in my check?

To fix my code I probably should check that the extension of the main file
is "cpp" or "cc" or "cxx" -- is there a utility for this? In face looking
over HeaderFileExtensionsUtils I wonder why it isn't just
FileExtensionsUtils -- it could consider any extensions really.

Thanks for the pointer!

Best,
Tim

Uggh, disregard the below – it appears the clang-tidy infrastructure handles nested files. Is there some sort of AST filtering in matching?

Basically, I copied the code from (trunk) from google/UnnamedNamespaceInHeaderCheck and it seems to behave okay (even when targeted at header files.