Question about the RecursiveASTVisitor class

Dear all:

My name is Xiaohui Chen. I am a computer science student at UWO in Canada, and I am using Clang as the frontend of my project, but I am confused by the following statements.
PS: I am a newbie.

These tasks are done by three groups of methods, respectively:
///   1. TraverseDecl(Decl *x) does task #1.  It is the entry point
///      for traversing an AST rooted at x.  This method simply
///      dispatches (i.e. forwards) to TraverseFoo(Foo *x) where Foo
///      is the dynamic type of *x, which calls WalkUpFromFoo(x) and
///      then recursively visits the child nodes of x.
///      TraverseStmt(Stmt *x) and TraverseType(QualType x) work
///      similarly.
///   2. WalkUpFromFoo(Foo *x) does task #2.  It does not try to visit
///      any child node of x.  Instead, it first calls WalkUpFromBar(x)
///      where Bar is the direct parent class of Foo (unless Foo has
///      no parent), and then calls VisitFoo(x) (see the next list item).
///   3. VisitFoo(Foo *x) does task #3.
///
/// These three method groups are tiered (Traverse* > WalkUpFrom* >
/// Visit*).  A method (e.g. Traverse*) may call methods from the same
/// tier (e.g. other Traverse*) or one tier lower (e.g. WalkUpFrom*).
/// It may not call methods from a higher tier.

According to the above statement, the calling relationship between these
functions is, in general, organized like this:

Traverse*()
{
    WalkUpFrom*();
}

WalkUpFrom*()
{
    Visit*();
}

Am I right?

For this statement:

///   2. WalkUpFromFoo(Foo *x) does task #2.  It does not try to visit
///      any child node of x.  Instead, it first calls WalkUpFromBar(x)
///      where Bar is the direct parent class of Foo (unless Foo has
///      no parent), and then calls VisitFoo(x) (see the next list item).

What do you mean by “the direct parent class of Foo”?
Why do I need to visit the parent class before I visit the current class?
What is the purpose? Could you please give me a short example?

Thank you in advance!

Sincerely
xiaohui

What do you mean by “the direct parent class of Foo”?

For example, if you are currently traversing a CXXRecordDecl, the direct parent class is RecordDecl (http://clang.llvm.org/doxygen/classclang_1_1CXXRecordDecl.html).

Why do I need to visit the parent class before I visit the current class?

You don’t need to visit anything; RecursiveASTVisitor (RAV) does the visitation for you.
You only need to know those details if you plan to change the traversal.

What is the purpose? Could you please give me a short example?

This way you can implement the VisitRecordDecl(RecordDecl *) method and still get a call to it when the real type is a CXXRecordDecl.

There are two parent-child relationships at play.

  1. The Traverse* methods visit AST nodes that form a tree. CXXRecordDecl is the parent of each of its CXXMethodDecls and FieldDecls. This is what you’ll see with clang -cc1 -ast-dump.
  2. The WalkUpFrom* methods visit the class hierarchy of a single AST node. CXXRecordDecl inherits from RecordDecl, which inherits from TagDecl, and so on. Think of serialization: to serialize a CXXRecordDecl you’d first want to serialize everything from its base class, and so on recursively.
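
The two relationships in the list above can be imitated in a few lines of plain C++. This is only a toy model, not Clang code: the node classes, ToyVisitor and traverse_toy_class are hypothetical names, the class chain is shortened to Decl > RecordDecl > CXXRecordDecl (the real chain has more steps in between), and the Visit* methods return void here whereas the real ones return bool.

```cpp
#include <memory>
#include <string>
#include <vector>

// Toy stand-ins for Clang's node classes (assumption: heavily simplified).
struct Decl {
  virtual ~Decl() = default;
  std::vector<std::unique_ptr<Decl>> children;  // tree edges (relationship 1)
};
struct RecordDecl : Decl {};            // class-hierarchy edges (relationship 2)
struct CXXRecordDecl : RecordDecl {};
struct FieldDecl : Decl {};

// Hand-written visitor with the same three tiers as RecursiveASTVisitor.
struct ToyVisitor {
  std::vector<std::string> log;

  // Tier 3: Visit* does the per-class work.
  void VisitDecl(Decl *) { log.push_back("VisitDecl"); }
  void VisitRecordDecl(RecordDecl *) { log.push_back("VisitRecordDecl"); }
  void VisitCXXRecordDecl(CXXRecordDecl *) { log.push_back("VisitCXXRecordDecl"); }
  void VisitFieldDecl(FieldDecl *) { log.push_back("VisitFieldDecl"); }

  // Tier 2: WalkUpFrom* climbs the class hierarchy of ONE node, base first.
  void WalkUpFromDecl(Decl *d) { VisitDecl(d); }
  void WalkUpFromRecordDecl(RecordDecl *d) {
    WalkUpFromDecl(d);
    VisitRecordDecl(d);
  }
  void WalkUpFromCXXRecordDecl(CXXRecordDecl *d) {
    WalkUpFromRecordDecl(d);
    VisitCXXRecordDecl(d);
  }
  void WalkUpFromFieldDecl(FieldDecl *d) {
    WalkUpFromDecl(d);
    VisitFieldDecl(d);
  }

  // Tier 1: TraverseDecl dispatches on the dynamic type, then recurses
  // into the child nodes of the tree.
  void TraverseDecl(Decl *d) {
    if (auto *c = dynamic_cast<CXXRecordDecl *>(d))
      WalkUpFromCXXRecordDecl(c);
    else if (auto *r = dynamic_cast<RecordDecl *>(d))
      WalkUpFromRecordDecl(r);
    else if (auto *f = dynamic_cast<FieldDecl *>(d))
      WalkUpFromFieldDecl(f);
    else
      WalkUpFromDecl(d);
    for (auto &child : d->children)
      TraverseDecl(child.get());
  }
};

// Models `class C { int f; };` as a toy tree and returns the call order.
std::vector<std::string> traverse_toy_class() {
  CXXRecordDecl cls;
  cls.children.push_back(std::make_unique<FieldDecl>());
  ToyVisitor v;
  v.TraverseDecl(&cls);
  return v.log;
}
```

Calling traverse_toy_class() records VisitDecl, VisitRecordDecl, VisitCXXRecordDecl for the class node, then VisitDecl, VisitFieldDecl for its field. The first three entries come from walking up the class hierarchy of one node; the last two come from descending to a child in the tree.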

Thanks for your reply.

I am confused here:

  1. The Traverse* methods visit AST nodes that form a tree. CXXRecordDecl is the parent of each of its CXXMethodDecls and FieldDecls.

A CXXMethodDecl (or FieldDecl) is not a member of CXXRecordDecl and does not inherit from CXXRecordDecl either, so how can
you describe this as a parent-child relationship?

  2. The WalkUpFrom* methods visit the class hierarchy of a single AST node. CXXRecordDecl inherits from RecordDecl, which inherits from TagDecl, and so on.

Here TagDecl has three direct parents, so will WalkUpFrom be applied to all three classes?

sincerely
xiaohui

1. The Traverse* methods visit AST nodes that form a tree. CXXRecordDecl is
the parent of each of its CXXMethodDecls and FieldDecls.

A CXXMethodDecl (or FieldDecl) is not a member of CXXRecordDecl and does
not inherit from CXXRecordDecl either, so how can
you describe this as a parent-child relationship?

CXXRecordDecl contains CXXMethodDecls and RecordDecl contains FieldDecls.
The children are stored in the DeclContext base class and are exposed
through method_begin/method_end and field_begin/field_end.

2. The WalkUpFrom* methods visit the class hierarchy of a single AST node.
CXXRecordDecl inherits from RecordDecl, which inherits from TagDecl, and so on.

Here TagDecl has three direct parents, so will WalkUpFrom be applied to
all three classes?

TagDecl does have three superclasses, but only one of them is also an AST
node. The AST is built from declarations, statements and types. DeclContext
represents a declaration context, and structs/classes introduce one. They
are also redeclarable (forward declarations), which is what the Redeclarable
class keeps track of.
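
The point about TagDecl's three bases can be sketched the same way. This is a simplified, hypothetical model: the real TypeDecl derives from NamedDecl rather than directly from Decl, the real DeclContext and Redeclarable carry state, and TagWalker and walk_up_from_tag are names invented for this illustration.

```cpp
#include <string>
#include <vector>

// Toy stand-ins; only the shape of the inheritance matters here.
struct Decl { virtual ~Decl() = default; };
struct TypeDecl : Decl {};                     // the AST-node base
struct DeclContext {};                         // mixin: can contain declarations
template <typename T> struct Redeclarable {};  // mixin: tracks redeclarations

// Three direct bases, but only TypeDecl belongs to the Decl hierarchy.
struct TagDecl : TypeDecl, DeclContext, Redeclarable<TagDecl> {};

struct TagWalker {
  std::vector<std::string> log;

  void VisitDecl(Decl *) { log.push_back("Decl"); }
  void VisitTypeDecl(TypeDecl *) { log.push_back("TypeDecl"); }
  void VisitTagDecl(TagDecl *) { log.push_back("TagDecl"); }

  void WalkUpFromDecl(Decl *d) { VisitDecl(d); }
  void WalkUpFromTypeDecl(TypeDecl *d) {
    WalkUpFromDecl(d);
    VisitTypeDecl(d);
  }
  // Only the Decl-derived base is followed. The sketch has no
  // WalkUpFromDeclContext or WalkUpFromRedeclarable, because those
  // bases are not AST nodes.
  void WalkUpFromTagDecl(TagDecl *d) {
    WalkUpFromTypeDecl(d);
    VisitTagDecl(d);
  }
};

// Returns the order in which the Visit* methods fire for one TagDecl.
std::vector<std::string> walk_up_from_tag() {
  TagDecl tag;
  TagWalker w;
  w.WalkUpFromTagDecl(&tag);
  return w.log;
}
```

walk_up_from_tag() records Decl, TypeDecl, TagDecl: a single chain, even though the class has three direct bases.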

Hi all:

I am writing a source-to-source tool and adding new keywords to the Clang source (I only care about the AST).
Is there a way to compile just the Clang frontend sources, without the LLVM sources?

Sincerely
xiaohui

I don’t think so. Clang uses a lot of LLVM’s core libraries.

Jingyue

The best you can do is build Clang “out of source”, meaning the clang directory stands on its own rather than living in llvm/tools. This makes a difference if you’re generating a Visual Studio solution file, as you won’t have to load the LLVM projects. But LLVM must still be built.

I think I get your point, and that is what I expected.

You mean that:

  1. I make a standalone directory for Clang, not in llvm/tools

  2. I compile both LLVM and Clang as usual

  3. if I modify a Clang source file, I just run the Makefile inside the clang/ directory, so it
    will re-compile only the Clang files without checking the LLVM files

Are there any instructions for doing this?

Especially for step 2: if I compile this way, I guess I will break the dependency relationship in the “configure” file?

Best
xiaohui

I only know how to do this with cmake.

… I have no experience with cmake …

I use makefiles under Linux, so could anyone else help?

Why exactly do you want an out-of-source build if you’re using makefiles? It’s beneficial with Visual Studio, but I don’t see the reason for it if you’re building from the command line.

I am writing a source-to-source tool that needs some new keywords added to the Clang source, so that the tool can parse
input files containing my new keywords. The tool is still being developed and debugged, and each time I
recompile it I do not want to involve the LLVM code, because that wastes time…

How will it waste time if you don’t change it?

Yes, I know: make will skip compiling the code if I have not changed it, but I guess it will still
go through the directories and report “nothing to do”.

That is OK, I can use the current solution. Thank you!

I’m not sure how long make takes to run if there’s nothing to compile. But you should definitely check out cmake + ninja.

Yes, maybe I am a little out of date; I should dive into cmake and ninja.

Hi all:

I am looking into the main() function of llvm/tools/clang/tools/driver/driver.cpp.
I only added several llvm::outs() calls to the source code and re-compiled it.
Then I used the resulting clang to compile my input file (t.c) like this:

xchen422@dimsum:~$ clang t.c

the output is :

xchen422@dimsum:~$ clang t.c
clang main function start point 1
clang main function start point 3
clang main function start point 4
clang main function start point 1
clang main function start point 2
clang main function start point 5
clang main function start point 6
xchen422@dimsum:~$

It seems that the main() function is called twice, and there also seems to be
some magic in this call (line 468):
Res = TheDriver.ExecuteCompilation(*C, FailingCommands);

I do not understand this. Could someone give me a brief explanation?

PS:
1) I added llvm::outs() at lines 379, 418, 422, 466, 470 and 512.
2) The t.c file only contains an empty main() function.
3) I have pasted the main() function below.

int main(int argc_, const char **argv_) {
379   llvm::outs() << "clang main function start point 1" << '\n';
380
381
382   llvm::sys::PrintStackTraceOnErrorSignal();
383   llvm::PrettyStackTraceProgram X(argc_, argv_);
384
385   if (llvm::sys::Process::FixupStandardFileDescriptors())
386     return 1;
387
388   SmallVector<const char *, 256> argv;
389   llvm::SpecificBumpPtrAllocator<char> ArgAllocator;
390   std::error_code EC = llvm::sys::Process::GetArgumentVector(
391       argv, llvm::makeArrayRef(argv_, argc_), ArgAllocator);
392   if (EC) {
393     llvm::errs() << "error: couldn't get arguments: " << EC.message() << '\n';
394     return 1;
395   }
396
397   std::set<std::string> SavedStrings;
398   StringSetSaver Saver(SavedStrings);
399
400   // Determines whether we want nullptr markers in argv to indicate response
401   // files end-of-lines. We only use this for the /LINK driver argument.
402   bool MarkEOLs = true;
403   if (argv.size() > 1 && StringRef(argv[1]).startswith("-cc1"))
404     MarkEOLs = false;
405   llvm::cl::ExpandResponseFiles(Saver, llvm::cl::TokenizeGNUCommandLine, argv,
406                                 MarkEOLs);
407
408   // Handle -cc1 integrated tools, even if -cc1 was expanded from a response
409   // file.
410   auto FirstArg = std::find_if(argv.begin() + 1, argv.end(),
411                                [](const char *A) { return A != nullptr; });
412   if (FirstArg != argv.end() && StringRef(*FirstArg).startswith("-cc1")) {
413     // If -cc1 came from a response file, remove the EOL sentinels.
414     if (MarkEOLs) {
415       auto newEnd = std::remove(argv.begin(), argv.end(), nullptr);
416       argv.resize(newEnd - argv.begin());
417     }
418     llvm::outs() << "clang main function start point 2" << '\n';
419     return ExecuteCC1Tool(argv, argv[1] + 4);
420   }
421
422   llvm::outs() << "clang main function start point 3" << '\n';
423   bool CanonicalPrefixes = true;
424   for (int i = 1, size = argv.size(); i < size; ++i) {
425     // Skip end-of-line response file markers
426     if (argv[i] == nullptr)
427       continue;
428     if (StringRef(argv[i]) == "-no-canonical-prefixes") {
429       CanonicalPrefixes = false;
430       break;
431     }
432   }
433
434   // Handle CCC_OVERRIDE_OPTIONS, used for editing a command line behind the
435   // scenes.
436   if (const char *OverrideStr = ::getenv("CCC_OVERRIDE_OPTIONS")) {
437     // FIXME: Driver shouldn't take extra initial argument.
438     ApplyQAOverride(argv, OverrideStr, SavedStrings);
439   }
440
441   std::string Path = GetExecutablePath(argv[0], CanonicalPrefixes);
442
443   IntrusiveRefCntPtr<DiagnosticOptions> DiagOpts =
444       CreateAndPopulateDiagOpts(argv);
445
446   TextDiagnosticPrinter *DiagClient
447     = new TextDiagnosticPrinter(llvm::errs(), &*DiagOpts);
448   FixupDiagPrefixExeName(DiagClient, Path);
449
450   IntrusiveRefCntPtr<DiagnosticIDs> DiagID(new DiagnosticIDs());
451
452   DiagnosticsEngine Diags(DiagID, &*DiagOpts, DiagClient);
453   ProcessWarningOptions(Diags, *DiagOpts, /*ReportDiags=*/false);
454
455   Driver TheDriver(Path, llvm::sys::getDefaultTargetTriple(), Diags);
456   SetInstallDir(argv, TheDriver);
457
458   llvm::InitializeAllTargets();
459   ParseProgName(argv, SavedStrings);
460
461   SetBackdoorDriverOutputsFromEnvVars(TheDriver);
462
463   std::unique_ptr<Compilation> C(TheDriver.BuildCompilation(argv));
464   int Res = 0;
465   SmallVector<std::pair<int, const Command *>, 4> FailingCommands;
466   llvm::outs() << "clang main function start point 4" << '\n';
467   if (C.get())
468     Res = TheDriver.ExecuteCompilation(*C, FailingCommands);
469
470   llvm::outs() << "clang main function start point 5" << '\n';
471   // Force a crash to test the diagnostics.
472   if (::getenv("FORCE_CLANG_DIAGNOSTICS_CRASH")) {
473     Diags.Report(diag::err_drv_force_crash) << "FORCE_CLANG_DIAGNOSTICS_CRASH";
474     const Command *FailingCommand = nullptr;
475     FailingCommands.push_back(std::make_pair(-1, FailingCommand));
476   }
477
478   for (const auto &P : FailingCommands) {
479     int CommandRes = P.first;
480     const Command *FailingCommand = P.second;
481     if (!Res)
482       Res = CommandRes;
483
484     // If result status is < 0, then the driver command signalled an error.
485     // If result status is 70, then the driver command reported a fatal error.
486     // On Windows, abort will return an exit code of 3. In these cases,
487     // generate additional diagnostic information if possible.
488     bool DiagnoseCrash = CommandRes < 0 || CommandRes == 70;
489 #ifdef LLVM_ON_WIN32
490     DiagnoseCrash |= CommandRes == 3;
491 #endif
492     if (DiagnoseCrash) {
493       TheDriver.generateCompilationDiagnostics(*C, FailingCommand);
494       break;
495     }
496   }
497
498   // If any timers were active but haven't been destroyed yet, print their
499   // results now. This happens in -disable-free mode.
500   llvm::TimerGroup::printAll(llvm::errs());
501
502   llvm::llvm_shutdown();
503
504 #ifdef LLVM_ON_WIN32
505   // Exit status should not be negative on Win32, unless abnormal termination.
506   // Once abnormal termination was caught, negative status should not be
507   // propagated.
508   if (Res < 0)
509     Res = 1;
510 #endif
511
512   llvm::outs() << "clang main function start point 6" << '\n';
513   // If we have multiple failing commands, we return the result of the first
514   // failing command.
515   return Res;
516 }

Hi,

Please see the Clang FAQ (http://clang.llvm.org/docs/FAQ.html#id2)
which describes the difference between the Clang driver and the Clang
frontend.

Best regards
David
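
The checkpoint sequence reported in the question (1, 3, 4, 1, 2, 5, 6) is consistent with that driver/frontend split: the driver process reaches points 1, 3 and 4, runs the same binary again with -cc1 as a compile job (points 1 and 2), and then continues with points 5 and 6. Below is a minimal sketch of that control flow. It is not Clang code: toy_main and run_toy_clang are hypothetical names, and a plain recursive call stands in for the child process that the real driver launches from ExecuteCompilation.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for driver.cpp's main(). It records the numbered
// checkpoints from the post instead of printing them.
int toy_main(const std::vector<std::string> &args, std::vector<int> &points) {
  points.push_back(1);  // entered main()
  if (args.size() > 1 && args[1] == "-cc1") {
    points.push_back(2);  // frontend mode: do the compile and return
    return 0;
  }
  points.push_back(3);  // driver mode
  // The driver plans its jobs; the compile job is the same binary with -cc1.
  std::vector<std::string> job = {args[0], "-cc1", args.back()};
  points.push_back(4);
  // Assumption of this sketch: a recursive call replaces the child process.
  int res = toy_main(job, points);
  points.push_back(5);
  points.push_back(6);
  return res;
}

// Simulates `clang t.c` and returns the checkpoints in the order reached.
std::vector<int> run_toy_clang() {
  std::vector<int> points;
  toy_main({"clang", "t.c"}, points);
  return points;
}
```

run_toy_clang() returns 1, 3, 4, 1, 2, 5, 6, the same order as the output in the question. In the real tool the second "1, 2" pair is printed by a separate process, which is why main() appears to run twice.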

Hi Nikola:
I cannot dump the OpenMP nodes. Am I missing some options?
For example:

int main()
{
    int k = 90, l = 0;
#pragma omp parallel private(k, l)
    {
#pragma omp for
        for (int kk = 0; kk < 90; kk++)
        {
            l;
        }
    }
}

I use the following command to compile:

clang++ -Xclang -ast-dump -Xclang -fopenmp=libiomp5 t.cpp

The output is:

`-OMPParallelDirective 0x6febf50 <line:32:9, col:35>
  `-CapturedStmt 0x6febf10 <line:33:1, line:39:1>
    `-DeclRefExpr 0x6feb7f8 <line:37:3> 'int' lvalue Var 0x6fe8590 'l' 'int'

It does not say anything about the for loop or the private clause.

Sincerely
xiaohui