I guess you mean producing a single .bc file for an entire project that comprises potentially thousands of C/C++ files. If that’s the case, you definitely need to use LTO (Link-Time Optimization) to produce that kind of “merged” bitcode file. (Theoretically you can use `llvm-link`, but in my experience that tool doesn’t scale well.)
Here are the steps:
- First, add the following compiler flag: `-flto`. In the case of CMake, you can add something like `-DCMAKE_C_FLAGS="-flto"` / `-DCMAKE_CXX_FLAGS="-flto"` to your cmake invocation.
- Second, add the following linker flag: `-flto -Wl,--plugin-opt=-lto-embed-bitcode=post-merge-pre-opt`. In the case of CMake, you can add something like `-DCMAKE_EXE_LINKER_FLAGS="-flto -Wl,--plugin-opt=-lto-embed-bitcode=post-merge-pre-opt"` to your cmake invocation.
- Build the project.
- Let’s say an executable `foo` is now built. Run the following command to extract the (merged) LLVM bitcode embedded in the ELF: `objcopy foo --dump-section .llvmbc=foo.bc`. The extracted bitcode will be placed in `foo.bc`. You can use `llvm-dis` to get the textual LLVM IR.
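
The steps above can be put together roughly as follows. This is only a sketch under a few assumptions that are not part of the steps themselves: the project builds with CMake 3.13 or newer (for `-S`/`-B`), Clang is the compiler, and `objcopy` and `llvm-dis` are on your `PATH`. The function names `configure_and_build` and `extract_bitcode` are made up for illustration.

```shell
# Flags from the steps above.
LTO_CFLAGS="-flto"
LTO_LDFLAGS="-flto -Wl,--plugin-opt=-lto-embed-bitcode=post-merge-pre-opt"

# Configure and build a CMake project with LTO and embedded bitcode.
# $1 = source directory, $2 = build directory (both chosen by the caller).
# Assumption: clang/clang++ are the compilers to use.
configure_and_build() {
    cmake -S "$1" -B "$2" \
        -DCMAKE_C_COMPILER=clang \
        -DCMAKE_CXX_COMPILER=clang++ \
        -DCMAKE_C_FLAGS="$LTO_CFLAGS" \
        -DCMAKE_CXX_FLAGS="$LTO_CFLAGS" \
        -DCMAKE_EXE_LINKER_FLAGS="$LTO_LDFLAGS" &&
    cmake --build "$2"
}

# Dump the .llvmbc section of an ELF executable and disassemble it.
# $1 = executable, $2 = output bitcode file (e.g. foo.bc).
extract_bitcode() {
    objcopy "$1" --dump-section .llvmbc="$2" &&
    llvm-dis "$2" -o "${2%.bc}.ll"   # foo.bc -> foo.ll
}
```

For example, `configure_and_build . build` followed by `extract_bitcode build/foo foo.bc` would leave `foo.bc` and `foo.ll` in the current directory, assuming the executable ends up at `build/foo`.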
Note that, IIRC, `-lto-embed-bitcode` is only exposed after LLVM 14. Also, (full) LTO will consume A LOT of memory. One way to mitigate this is to use a better linker such as LLD (preferred) or gold. For instance, to use LLD, add `-fuse-ld=lld` to the linker flags after you have installed LLD.
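
As a small illustration of where `-fuse-ld=lld` goes, here is a sketch of a helper that assembles the linker flags with an optional linker choice. The helper name `lto_ldflags` is hypothetical; only the flags themselves come from the answer above.

```shell
# Print the LTO linker flags, optionally selecting a linker.
# $1 = linker name passed to -fuse-ld (e.g. lld or gold); empty keeps the default.
lto_ldflags() {
    linker=${1:-}
    flags="-flto -Wl,--plugin-opt=-lto-embed-bitcode=post-merge-pre-opt"
    if [ -n "$linker" ]; then
        flags="-fuse-ld=$linker $flags"
    fi
    printf '%s\n' "$flags"
}
```

With this, a cmake invocation could pass `-DCMAKE_EXE_LINKER_FLAGS="$(lto_ldflags lld)"`, or `$(lto_ldflags gold)` if you would rather use gold.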