Compiling a C++ file into LLVM IR using clang::CompilerInstance

Hi! I’m trying to compile the following C++ file into LLVM IR for use as a function library for my toy programming language, but I can’t extract the llvm::Module out of my llvm::CodeGenAction.

Input.cpp:

#include <cstdio>

void print_i32(int value) {
	printf("%i\n", value);
}

int main() {
	print_i32(42);
}

Compilation code: 
#include <clang/Basic/Diagnostic.h>
#include <clang/Basic/FileManager.h>
#include <clang/Basic/FileSystemOptions.h>
#include <clang/Basic/SourceManager.h>
#include <clang/Basic/TargetInfo.h>
#include <clang/Frontend/CompilerInstance.h>
#include <clang/Frontend/TextDiagnosticPrinter.h>
#include <clang/Lex/PreprocessorOptions.h>
#include <llvm/TargetParser/Host.h>
#include <llvm/IR/Module.h>
#include <llvm/Support/raw_ostream.h>
#include <clang/CodeGen/CodeGenAction.h>
#include <iostream>
#include "clang/Basic/DiagnosticOptions.h"
#include <clang/Frontend/CompilerInvocation.h>

int main() {
	clang::CompilerInstance compiler;
	clang::DiagnosticOptions diagnosticOptions;
	diagnosticOptions.ShowColors = true;
	diagnosticOptions.ShowOptionNames = true;
	diagnosticOptions.setFormat(clang::DiagnosticOptions::Clang);
	clang::TextDiagnosticPrinter* diagClient = new clang::TextDiagnosticPrinter(llvm::errs(), &diagnosticOptions);
	compiler.createDiagnostics(diagClient, false);

	auto& invocation = compiler.getInvocation();
	invocation.getCodeGenOpts().DisableFree = false;
	invocation.getCodeGenOpts().UnrollLoops = true;
	compiler.getTargetOpts().Triple = llvm::sys::getDefaultTargetTriple();
	compiler.getLangOpts().CPlusPlus = 1;
	compiler.getLangOpts().CPlusPlus11 = 1;

	compiler.setTarget(clang::TargetInfo::CreateTargetInfo(compiler.getDiagnostics(), compiler.getInvocation().TargetOpts));

	const clang::FileSystemOptions fileSystemOptions;
	compiler.setFileManager(new clang::FileManager(fileSystemOptions));
	compiler.createSourceManager(compiler.getFileManager());
	clang::FileManager& fileManager = compiler.getFileManager();
	clang::SourceManager& sourceManager = compiler.getSourceManager();

	llvm::Expected<const clang::FileEntry*> inputFile = fileManager.getFileRef("C:/dev/projects/channel/compiler/input.cpp", true);
	if (!inputFile) {
	    llvm::Error err = inputFile.takeError();
	    std::cerr << "Error: " << llvm::toString(std::move(err)) << std::endl;
	    return 1;
	}
	const clang::FileEntry* file_entry = *inputFile;
	sourceManager.setMainFileID(sourceManager.createFileID(file_entry, clang::SourceLocation(), clang::SrcMgr::C_User));
	std::unique_ptr<clang::CodeGenAction> codeGenAction(new clang::EmitLLVMOnlyAction());

	if (!compiler.ExecuteAction(*codeGenAction)) {
	    std::cerr << "Error: Failed to compile the file to LLVM IR." << std::endl;
	    return 1;
	}

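	// Keep ownership in a unique_ptr: calling .get() on the temporary returned by
	// takeModule() would leave a dangling pointer once the temporary is destroyed.
	std::unique_ptr<llvm::Module> llvmModule = codeGenAction->takeModule();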

	if (llvmModule) {
	    llvm::errs() << *llvmModule;
	}
	else {
	    std::cerr << "Error: Failed to generate LLVM IR." << std::endl;
	    return 1;
	}

	return 0;
}

The part I’m stuck on is extracting the module: takeModule() gives me a nullptr. Everything else appears to work and I get no build errors. Any help or advice is very much appreciated.

I’ve done some more research on this and found the clang::tooling::runToolOnCode function, which looks like it does what I need, but I can’t get that to work either:

#include <fstream>
#include <iostream>
#include <memory>
#include <string>
#include <llvm/IR/LLVMContext.h>
#include <llvm/IR/Module.h>
#include <clang/CodeGen/CodeGenAction.h>
#include <clang/Frontend/FrontendActions.h>
#include <clang/Frontend/CompilerInstance.h>
#include <clang/Tooling/Tooling.h>

class LLVMCodeGenActionFactory : public clang::tooling::FrontendActionFactory {
public:
    LLVMCodeGenActionFactory(llvm::LLVMContext* context) : context(context) {}

    std::unique_ptr<clang::FrontendAction> create() override {
        return std::make_unique<clang::EmitLLVMOnlyAction>(context);
    }
private:
    llvm::LLVMContext* context;
};

int main() {
    std::ifstream file("input.cpp");
    std::string inputFile((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());

    // Initialize the Clang compiler
    clang::CompilerInstance compiler;
    llvm::LLVMContext context;
    LLVMCodeGenActionFactory factory(&context);
    clang::tooling::runToolOnCode(factory.create(), inputFile);

    // Get the LLVM module
    std::unique_ptr<llvm::Module> module = static_cast<clang::EmitLLVMOnlyAction*>(factory.create().get())->takeModule();
    if (!module) {
        std::cerr << "Failed to generate LLVM IR" << std::endl;
        return 1;
    }

    return 0;
}

Clang tooling often needs a compilation database (a compile_commands.json file). You can also ask the clang driver directly to emit IR:

clang -S -emit-llvm foo.c

will give you a foo.ll file.

Thanks for the suggestion. I probably didn’t explain it well enough: I’m trying to compile the file through the C++ Clang API. It does accept arguments, but in my experiments (so I might be wrong) the -emit-llvm flag had no visible effect. I’ve added a check to my call to runToolOnCodeWithArgs to see whether it finishes successfully, and it does, so the issue seems to be with my action.