Building A Project Against LLVM

I decided to start playing around with building my own programming language recently, and to use LLVM to handle the assembly-level details. I’m on Kubuntu 18.04, and I started out using LLVM 6.0 from Kubuntu’s packages. I put together code for dealing with my language, then went over the Kaleidoscope tutorials (which have been extremely helpful btw!). I was able to successfully get my own compiler to generate IR using LLVM, use PassManager to write that to a native .o file, use gcc to link that, and execute a tiny program written in my own language.

I also decided it was a good time to learn CMake, so I set up my project using that. The CMakeLists.txt file I’m using is essentially just taken from: https://llvm.org/docs/CMake.html#embedding-llvm-in-your-project - though originally it would not link. From scouring the internet I made 2 changes to get that working: replaced “support core irreader” with “all”, and replaced “${llvm_libs}” with just “LLVM”.

However as I was starting to play with setting up JIT, I hit more differences between the version of LLVM in Kubuntu and the version the examples and documentation were written against. So I decided to try to update to a newer version of LLVM… and this is where I’ve been stuck for several days now. Here are the steps I’ve taken:

  • Uninstalled any llvm packages I could find from Kubuntu’s package manager.
  • Followed the getting started guide: https://llvm.org/docs/GettingStarted.html - I git cloned LLVM, checked out the 10.0.0 tag, ran cmake as instructed, with the Release type. When that completed successfully I ran sudo ninja install.
  • I then went back to my project and adjusted a couple places to successfully compile against the new version.
  • At this point I restored “${llvm_libs}” in the CMakeLists.txt file to match the example. However, I then got a massive wall of link errors.
  • I assumed I must have built LLVM incorrectly somehow, so in an effort to undo that install, I deleted everything I could find under /usr/local that had LLVM in its name, downloaded the 10.0 release from https://releases.llvm.org/download.html for ubuntu 18.04, and extracted that all to /usr/local.
  • I can still successfully compile, but not link.

At this point I’m not sure what to try next. Is there additional documentation somewhere for how to “install” a current release of LLVM correctly?

For reference here’s my final CMakeLists.txt file:

cmake_minimum_required(VERSION 3.10)

project(CBreakCompiler)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED True)
add_compile_options(-Wall)

find_package(LLVM 10.0.0 REQUIRED CONFIG)

message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION}")
message(STATUS "Using LLVMConfig.cmake in: ${LLVM_DIR}")

include_directories(${LLVM_INCLUDE_DIRS})
add_definitions(${LLVM_DEFINITIONS})

add_executable(CBreakCompiler
src/main.cpp
src/Parser.cpp
src/SourceTokenizer.cpp
src/IRCompiler.cpp
src/CompiledOutput.cpp)

llvm_map_components_to_libnames(llvm_libs all)
target_link_libraries(CBreakCompiler ${llvm_libs})

And a snippet from the cmake output corresponding to those message lines:

-- Found LLVM 10.0.0
-- Using LLVMConfig.cmake in: /usr/local/lib/cmake/llvm

There are dozens of link errors… the first few and last few are:

CMakeFiles/CBreakCompiler.dir/src/main.cpp.o:(.data.rel+0x0): undefined reference to `llvm::DisableABIBreakingChecks'
CMakeFiles/CBreakCompiler.dir/src/main.cpp.o: In function `std::default_delete<llvm::LLVMContext>::operator()(llvm::LLVMContext*) const':
main.cpp:(.text._ZNKSt14default_deleteIN4llvm11LLVMContextEEclEPS1_[_ZNKSt14default_deleteIN4llvm11LLVMContextEEclEPS1_]+0x1e): undefined reference to `llvm::LLVMContext::~LLVMContext()'
...
CMakeFiles/CBreakCompiler.dir/src/CompiledOutput.cpp.o:(.data.rel.ro+0xe0): undefined reference to `llvm::raw_ostream::anchor()'
CMakeFiles/CBreakCompiler.dir/src/CompiledOutput.cpp.o:(.data.rel.ro+0xf8): undefined reference to `typeinfo for llvm::raw_pwrite_stream'
CMakeFiles/CBreakCompiler.dir/src/CompiledOutput.cpp.o:(.data.rel.ro+0x110): undefined reference to `typeinfo for llvm::raw_ostream'

Rarrum,

Kubuntu 20.04 LTS is available. You may be able to upgrade to 19.10, and then to 20.04 without reinstalling. It can be done on Xubuntu. A direct upgrade to 20.04 should become available. LLVM 10 then installs from the distribution packages. I put all this on a VM using KVM/QEMU to keep it isolated from my primary desktop environment. Building a 20.04 VM after upgrading to 20.04 appears to give a faster VM. Use llvm’s linker lld.

The cmake version on 20.04 is 3.16.3, which should help with meeting llvm’s recommended version.

Neil

Do you believe the errors listed have to do with the CMake version?

It seems like llvm_map_components_to_libnames didn’t populate ${llvm_libs} properly?
I’d start by printing ${llvm_libs} in your CMake to check the output of llvm_map_components_to_libnames. I don’t know how “all” works for external builds; you may have to list the components you need explicitly instead.
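
A minimal sketch of that check, written as a standalone CMakeLists.txt so it can be configured on its own. The project name and the variable names llvm_libs_all and llvm_libs_explicit are made up for illustration; the only assumption about LLVM is that find_package can locate an installed LLVMConfig.cmake. If the first message prints an empty list, nothing from LLVM is reaching the linker.

```cmake
cmake_minimum_required(VERSION 3.10)
project(LLVMLibsCheck)  # hypothetical project name, for illustration only

find_package(LLVM REQUIRED CONFIG)

# Map the "all" pseudo component and print the result at configure time.
llvm_map_components_to_libnames(llvm_libs_all all)
message(STATUS "Libs for 'all': ${llvm_libs_all}")

# Map a small explicit component list for comparison.
llvm_map_components_to_libnames(llvm_libs_explicit support core irreader)
message(STATUS "Libs for 'support core irreader': ${llvm_libs_explicit}")
```

Running cmake on this in an empty build directory should be enough; nothing needs to be compiled to see the two lists.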

I’d rather avoid updating my OS or setting up a VM at the moment. But the cmake comments were a hint… I suspect something is going wrong in there that keeps the actual library files from being added to the linker command line.

I added this to my CMakeLists.txt:
set(CMAKE_VERBOSE_MAKEFILE on)

Which shows no LLVM libs at all being passed in:

[ 8%] Linking CXX executable …/bin/CBreakCompiler
/usr/bin/c++ CMakeFiles/CBreakCompiler.dir/src/main.cpp.o CMakeFiles/CBreakCompiler.dir/src/Parser.cpp.o CMakeFiles/CBreakCompiler.dir/src/SourceTokenizer.cpp.o CMakeFiles/CBreakCompiler.dir/src/IRCompiler.cpp.o CMakeFiles/CBreakCompiler.dir/src/CompiledOutput.cpp.o CMakeFiles/CBreakCompiler.dir/src/JIT.cpp.o -o …/bin/CBreakCompiler

If I change this:
llvm_map_components_to_libnames(llvm_libs all)
to:
llvm_map_components_to_libnames(llvm_libs core)

Then I start to see libraries being added:
[ 8%] Linking CXX executable …/bin/CBreakCompiler
/usr/bin/c++ CMakeFiles/CBreakCompiler.dir/src/main.cpp.o CMakeFiles/CBreakCompiler.dir/src/Parser.cpp.o CMakeFiles/CBreakCompiler.dir/src/SourceTokenizer.cpp.o CMakeFiles/CBreakCompiler.dir/src/IRCompiler.cpp.o CMakeFiles/CBreakCompiler.dir/src/CompiledOutput.cpp.o CMakeFiles/CBreakCompiler.dir/src/JIT.cpp.o -o …/bin/CBreakCompiler /usr/local/lib/libLLVMCore.a /usr/local/lib/libLLVMBinaryFormat.a /usr/local/lib/libLLVMRemarks.a /usr/local/lib/libLLVMBitstreamReader.a /usr/local/lib/libLLVMSupport.a -lrt -ldl -ltinfo -lpthread -lm /usr/local/lib/libLLVMDemangle.a

Perhaps I can figure out which ones I need and manually specify them all, rather than using “all”.

I’ve managed to get 10.0.0 working now… there were a couple things I had to adjust.

The Kaleidoscope example had me doing this before creating the object file:
llvm::InitializeAllTargetInfos();
llvm::InitializeAllTargets();
llvm::InitializeAllTargetMCs();
llvm::InitializeAllAsmParsers();
llvm::InitializeAllAsmPrinters();

It turns out I can get away with just this, since I’m not (yet) worried about targeting other machines:
llvm::InitializeNativeTarget();
llvm::InitializeNativeTargetAsmPrinter();

Since “all” doesn’t work anymore for some reason, I ended up (through trial and error, guessing at different names shown by llvm-config --components) with this set of libnames:
llvm_map_components_to_libnames(llvm_libs core executionengine support nativecodegen)
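
As a possible shortcut for the guessing, here is a small sketch that prints what the installed LLVM package reports about itself. It assumes the installed LLVMConfig.cmake sets LLVM_AVAILABLE_LIBS, LLVM_TARGETS_TO_BUILD and LLVM_NATIVE_ARCH, which is worth verifying for your version. The libraries are listed under names like LLVMCore; as far as I can tell the component name is the same name without the LLVM prefix, matched case-insensitively.

```cmake
# Assumes find_package(LLVM REQUIRED CONFIG) has already run in this file.

# Library names known to this LLVM install (e.g. LLVMCore, LLVMSupport).
message(STATUS "LLVM libraries available: ${LLVM_AVAILABLE_LIBS}")

# Targets this LLVM was built with, and the one treated as native.
message(STATUS "LLVM targets built: ${LLVM_TARGETS_TO_BUILD}")
message(STATUS "LLVM native arch: ${LLVM_NATIVE_ARCH}")
```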

That left me with 2 link errors referring to llvm::raw_ostream and llvm::raw_pwrite_stream. After much more digging through similar complaints on the internet… the last fix turned out to be adding this to CMakeLists.txt:

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fno-rtti")

I am a little worried that the rtti flag may come back and bite me later when I get around to building this on windows again (since I do use exceptions), but that’s a problem for another day.
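
A hedged sketch of a more targeted variant: rather than appending -fno-rtti globally, only disable RTTI when the LLVM being linked was itself built without it. This assumes the installed LLVMConfig.cmake sets LLVM_ENABLE_RTTI, and it must come after the add_executable call. The MSVC branch is an untested guess at the equivalent flag for a later Windows build.

```cmake
# Assumes find_package(LLVM REQUIRED CONFIG) and
# add_executable(CBreakCompiler ...) appear earlier in this file.

# Match LLVM's own RTTI setting so that classes deriving from LLVM types
# (raw_ostream, for example) do not reference typeinfo that LLVM never emitted.
if(NOT LLVM_ENABLE_RTTI)
  if(MSVC)
    # Assumption: /GR- is the MSVC way to turn RTTI off.
    target_compile_options(CBreakCompiler PRIVATE /GR-)
  else()
    target_compile_options(CBreakCompiler PRIVATE -fno-rtti)
  endif()
endif()
```

With GCC and Clang, -fno-rtti does not turn off exceptions; that is controlled separately by -fno-exceptions.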

One last question… is there a good way to know which libnames I need, based on which #includes I pull in or which classes I’m using? I didn’t see anything obvious in the doxygen documentation.

Since “all” doesn’t work anymore for some reason

Did it use to work with LLVM 9? It may be worth investigating if this is the case.

I am a little worried that the rtti flag may come back and bite me later when I get around to building this on windows again (since I do use exceptions), but that’s a problem for another day.

You can build LLVM with RTTI enabled (configure it with -DLLVM_ENABLE_RTTI=ON) if you need it.

One last question… is there a good way to know which libnames I need, based on which #includes I pull in or which classes I’m using? I didn’t see anything obvious in the doxygen documentation.

LLVM includes are organized by subdirectory: in general, each directory under include/llvm/ matches a directory under lib/, and the lib name matches the component’s name. If you include “llvm/Support/raw_ostream.h” you need support, if you include “llvm/IRReader/IRReader.h” you need “irreader”, etc. For the IR-level libraries this is fairly straightforward I think, but it becomes a bit more complex for the backends, the JIT, and that kind of thing.
There are also “pseudo components”: providing the name of a target links all aspects of that target, while “AllTargetsAsmParsers” links only the assembly parsers for all available targets… Unfortunately I haven’t found good documentation on this.
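
To illustrate the mapping, a sketch of a component list annotated with the headers each entry is meant to cover. The per-line pairings are my reading of the rule above, not something taken from LLVM documentation. One exception I am aware of: the llvm/IR headers are built as LLVMCore, so their component is core rather than ir.

```cmake
# Assumes find_package(LLVM REQUIRED CONFIG) and
# add_executable(CBreakCompiler ...) appear earlier in this file.
llvm_map_components_to_libnames(llvm_libs
  core             # llvm/IR/*.h (built as LLVMCore)
  support          # llvm/Support/*.h
  irreader         # llvm/IRReader/*.h
  executionengine  # llvm/ExecutionEngine/*.h
  nativecodegen    # pseudo component: code generation for the host target
)
target_link_libraries(CBreakCompiler ${llvm_libs})
```

Dropping entries you do not include headers for (irreader here, if you never parse IR from a file) keeps the link line shorter.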

The “all” option did work on the LLVM 6 that was bundled with Ubuntu. Unfortunately I jumped from that straight to LLVM 10 (downloaded from https://releases.llvm.org/download.html and extracted into /usr/local), so I can’t say whether it worked on LLVM 9.