Infinite loop in GetOptionInfo() ?

Hi,

Sorry about cross-posting to this group and cfe-users, but I'm not sure if I'll get a reply on the user group.

I have a question about the clang-interpreter example distributed in the 3.3 code base. Should it be possible to add a loop to main() and execute the example multiple times (after modifying it to take the file to parse and interpret as an argument)? My slightly modified version of the example code is included below. To reproduce, all you need to do is create two test files, "test1.c" and "test2.c", in the same directory (my simple test files are also included below).

When I run the example it works fine for the first call into the code, but goes into an infinite loop upon the second. For example:

rdsuvideodrome:~/ciEx> ./ciEx.out
Result = 55
<premain>: CommandLine Error: Argument 'verify-loop-info' defined more than once!
<premain>: CommandLine Error: Argument 'verify-scev' defined more than once!
<premain>: CommandLine Error: Argument 'scalar-evolution-max-iterations' defined more than once!
<premain>: CommandLine Error: Argument 'struct-path-tbaa' defined more than once!
<premain>: CommandLine Error: Argument 'enable-tbaa' defined more than once!
<premain>: CommandLine Error: Argument 'enable-bc-uselist-preserve' defined more than once!
<premain>: CommandLine Error: Argument 'agg-antidep-debugmod' defined more than once!
<premain>: CommandLine Error: Argument 'agg-antidep-debugdiv' defined more than once!
<premain>: CommandLine Error: Argument 'tail-merge-size' defined more than once!
<premain>: CommandLine Error: Argument 'tail-merge-threshold' defined more than once!
<premain>: CommandLine Error: Argument 'enable-tail-merge' defined more than once!
<premain>: CommandLine Error: Argument 'stress-early-ifcvt' defined more than once!
<premain>: CommandLine Error: Argument 'early-ifcvt-limit' defined more than once!
<premain>: CommandLine Error: Argument 'view-edge-bundles' defined more than once!
<premain>: CommandLine Error: Argument 'ifcvt-branch-fold' defined more than once!
--------------- end snip, but this continues and continues -------------------

The "Result = 55" is the correct return value from test1.c (i.e. fib(10)). Processing test2.c causes the repeating <premain> complaints.

Could someone let me know whether libclang can be called multiple times, as this example does, or whether this is just a limitation of the simple clang-interpreter example?

From looking at the code and debugging, it appears that Clang.ExecuteAction(*Act) causes the global variable RegisteredOptionList (from lib/Support/CommandLine.cpp) to contain a huge (unbounded?) number of nodes. This becomes an issue when the list is processed in GetOptionInfo(), which is in the same file.
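
To make the symptom concrete, here is a minimal standalone sketch of one way a registration list can end up looking unbounded. It is not LLVM code: Node, ListHead, registerNode, WalkResult and walk are hypothetical names made up for illustration, and the idea that an option object gets linked into the list a second time is only a guess, not something I have confirmed in the debugger. The walk is capped at maxSteps so that the sketch itself terminates.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for a registered option; not llvm::cl::Option.
struct Node {
  std::string name;
  Node *next;
  explicit Node(const std::string &n) : name(n), next(nullptr) {}
};

// Hypothetical stand-in for a global, intrusive registration list.
static Node *ListHead = nullptr;

// Push a node onto the front of the list. Nothing here prevents the
// same node from being pushed a second time.
void registerNode(Node *n) {
  n->next = ListHead;
  ListHead = n;
}

struct WalkResult {
  std::size_t steps;      // nodes visited
  std::size_t duplicates; // names seen more than once
  bool hitLimit;          // true if the walk was cut off at maxSteps
};

// Walk the list and count names that were already seen, the way a
// "build a name -> option map" pass would complain about duplicates.
// maxSteps bounds the walk so a cyclic list cannot hang the sketch.
WalkResult walk(std::size_t maxSteps) {
  std::set<std::string> seen;
  WalkResult r = {0, 0, false};
  for (Node *n = ListHead; n; n = n->next) {
    if (r.steps == maxSteps) {
      r.hitLimit = true;
      break;
    }
    ++r.steps;
    if (!seen.insert(n->name).second)
      ++r.duplicates;
  }
  return r;
}
```

In this sketch, registering two distinct nodes and walking the list visits each exactly once. Registering the first node again while it is still linked makes the list cyclic, so an uncapped walk would report "defined more than once" forever, which at least matches the output above.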

Any ideas?

Thanks for your help,
Jay

//=============================================================================
/******* test1.c *********/
int fib( int x )
{
    if ( x == 0 )
        return 0;
    if ( x == 1 )
        return 1;

    return ( fib( x-1 ) + fib( x-2 ) );
}

int main( int argc, char** argv )
{
    return fib( 10 );
}

//=============================================================================
/******* test2.c *********/
int fib2( int x )
{
    if ( x == 0 )
        return 0;
    if ( x == 1 )
        return 1;

    return ( fib2( x-1 ) + fib2( x-2 ) );
}

int main( int argc, char** argv )
{
    return fib2( 12 );
}

//=============================================================================
/****** Modified clang-interpreter example. *********/

//===-- examples/clang-interpreter/main.cpp - Clang C Interpreter Example -===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//

#include "clang/CodeGen/CodeGenAction.h"
#include "clang/Basic/DiagnosticOptions.h"
#include "clang/Driver/Compilation.h"
#include "clang/Driver/Driver.h"
#include "clang/Driver/Tool.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Frontend/CompilerInvocation.h"
#include "clang/Frontend/FrontendDiagnostic.h"
#include "clang/Frontend/TextDiagnosticPrinter.h"
#include "llvm/ADT/OwningPtr.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/ExecutionEngine/ExecutionEngine.h"
#include "llvm/ExecutionEngine/JIT.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/Host.h"
#include "llvm/Support/ManagedStatic.h"
#include "llvm/Support/Path.h"
#include "llvm/Support/TargetSelect.h"
#include "llvm/Support/raw_ostream.h"
using namespace clang;
using namespace clang::driver;

// This function isn't referenced outside its translation unit, but it
// can't use the "static" keyword because its address is used for
// GetMainExecutable (since some platforms don't support taking the
// address of main, and some platforms can't implement GetMainExecutable
// without being given the address of a function in the main executable).
llvm::sys::Path GetExecutablePath(const char *Argv0) {
  // This just needs to be some symbol in the binary; C++ doesn't
  // allow taking the address of ::main however.
  void *MainAddr = (void*) (intptr_t) GetExecutablePath;
  return llvm::sys::Path::GetMainExecutable(Argv0, MainAddr);
}

static int Execute(llvm::Module *Mod, char * const *envp) {
  llvm::InitializeNativeTarget();

  std::string Error;
  OwningPtr<llvm::ExecutionEngine> EE(
    llvm::ExecutionEngine::createJIT(Mod, &Error));
  if (!EE) {
    llvm::errs() << "unable to make execution engine: " << Error << "\n";
    return 255;
  }

  llvm::Function *EntryFn = Mod->getFunction("main");
  if (!EntryFn) {
    llvm::errs() << "'main' function not found in module.\n";
    return 255;
  }

  // FIXME: Support passing arguments.
  std::vector<std::string> Args;
  Args.push_back(Mod->getModuleIdentifier());

  return EE->runFunctionAsMain(EntryFn, Args, envp);
}

//int main(int argc, const char **argv, char * const *envp) {
int interpretFile (int argc, const char *argv, char * const *envp) {
  void *MainAddr = (void*) (intptr_t) GetExecutablePath;
  // llvm::sys::Path Path = GetExecutablePath(argv[0]);
  llvm::sys::Path Path = GetExecutablePath(argv);
  IntrusiveRefCntPtr<DiagnosticOptions> DiagOpts = new DiagnosticOptions();
  TextDiagnosticPrinter *DiagClient =
    new TextDiagnosticPrinter(llvm::errs(), &*DiagOpts);

  IntrusiveRefCntPtr<DiagnosticIDs> DiagID(new DiagnosticIDs());
  DiagnosticsEngine Diags(DiagID, &*DiagOpts, DiagClient);
  Driver TheDriver(Path.str(), llvm::sys::getProcessTriple(), "a.out", Diags);
  TheDriver.setTitle("clang interpreter");

  // FIXME: This is a hack to try to force the driver to do something we can
  // recognize. We need to extend the driver library to support this use model
  // (basically, exactly one input, and the operation mode is hard wired).
  // SmallVector<const char *, 16> Args(argv, argv + argc);
  SmallVector<const char *, 16> Args(2, argv);
  Args.push_back("-fsyntax-only");
  OwningPtr<Compilation> C(TheDriver.BuildCompilation(Args));
  if (!C)
    return 0;

  // FIXME: This is copied from ASTUnit.cpp; simplify and eliminate.

  // We expect to get back exactly one command job; if we didn't, something
  // failed. Extract that job from the compilation.
  const driver::JobList &Jobs = C->getJobs();
  if (Jobs.size() != 1 || !isa<driver::Command>(*Jobs.begin())) {
    SmallString<256> Msg;
    llvm::raw_svector_ostream OS(Msg);
    C->PrintJob(OS, C->getJobs(), "; ", true);
    Diags.Report(diag::err_fe_expected_compiler_job) << OS.str();
    return 1;
  }

  const driver::Command *Cmd = cast<driver::Command>(*Jobs.begin());
  if (llvm::StringRef(Cmd->getCreator().getName()) != "clang") {
    Diags.Report(diag::err_fe_expected_clang_command);
    return 1;
  }

  // Initialize a compiler invocation object from the clang (-cc1) arguments.
  const driver::ArgStringList &CCArgs = Cmd->getArguments();
  OwningPtr<CompilerInvocation> CI(new CompilerInvocation);
  CompilerInvocation::CreateFromArgs(*CI,
                                     const_cast<const char **>(CCArgs.data()),
                                     const_cast<const char **>(CCArgs.data()) +
                                       CCArgs.size(),
                                     Diags);

  // Show the invocation, with -v.
  if (CI->getHeaderSearchOpts().Verbose) {
    llvm::errs() << "clang invocation:\n";
    C->PrintJob(llvm::errs(), C->getJobs(), "\n", true);
    llvm::errs() << "\n";
  }

  // FIXME: This is copied from cc1_main.cpp; simplify and eliminate.

  // Create a compiler instance to handle the actual work.
  CompilerInstance Clang;
  Clang.setInvocation(CI.take());

  // Create the compiler's actual diagnostics engine.
  Clang.createDiagnostics();
  if (!Clang.hasDiagnostics())
    return 1;

  // Infer the builtin include path if unspecified.
  if (Clang.getHeaderSearchOpts().UseBuiltinIncludes &&
      Clang.getHeaderSearchOpts().ResourceDir.empty())
    Clang.getHeaderSearchOpts().ResourceDir =
      CompilerInvocation::GetResourcesPath(argv, MainAddr);
  // CompilerInvocation::GetResourcesPath(argv[0], MainAddr);

  // Create and execute the frontend to generate an LLVM bitcode module.
  OwningPtr<CodeGenAction> Act(new EmitLLVMOnlyAction());
  if (!Clang.ExecuteAction(*Act))
    return 1;

  int Res = 255;
  if (llvm::Module *Module = Act->takeModule())
    Res = Execute(Module, envp);
  llvm::errs() << "Result = " << Res << "\n"; //Not an error, but just report Res to see what happened.

  // Shutdown.

  llvm::llvm_shutdown();

  return Res;
}

int main(int argc, const char **argv, char * const *envp) {
    // Pass along the environment handed to main(); a local, uninitialized
    // envp pointer must not be forwarded to runFunctionAsMain().
    interpretFile( 1, "test1.c", envp );
    interpretFile( 1, "test2.c", envp );
    return 0;
}