Hi,
I am working on constructing an llvm::Module from a given bitcode string. The string is produced from an llvm::Module object using WriteBitcodeToFile(), and that module was compiled into IR through the ORC API ExecuteAction. However, I am aware that after ExecuteAction I can also take an LLVMContext object from the action object with action.takeLLVMContext, and that LLVMContext may include important information for this llvm::Module.
Nevertheless, I can’t find any way to serialize the LLVMContext. Suppose I declare a new one and parse the IR string like this:
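llvm::LLVMContext context;
llvm::SMDiagnostic err;
std::unique_ptr<llvm::Module> parsed = llvm::parseIR(llvm::MemoryBufferRef(ModuleString, "module"), err, context);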
Can I get an identical llvm::Module and compile it as if it had never been serialized?
If not, how can I serialize the LLVMContext, and why do we need an LLVMContext for parsing an IR file?
Thank you!
You’re not supposed to serialize it. It stores per-context data such as types, and in your use case you can just create one on demand.
For example, Type::getInt8PtrTy(LLVMContext& ctx) returns the same pointer whenever the same LLVMContext is passed in, so you can do fast pointer comparison when checking types.
You can. However, as explained above, you’ll need to redo the work if you rely on:
- custom MDKindID
- ValueHandlers
- Type comparisons
- …
Basically, you can’t and shouldn’t serialize an LLVMContext, or perform operations on IR units from different LLVMContexts.
Thank you for the answer!
Just want to check my understanding: an LLVMContext is a data structure maintaining metadata that LLVM uses to process programs. We can’t perform operations on IR units (llvm::Module) from different LLVMContexts because each LLVMContext owns some metadata related to its IR units.
However, we can declare a new LLVMContext and use it with the bitcode to construct an IR unit. Is this because the bitcode contains additional information, so that it doesn’t require the original LLVMContext?
LLVMContext is basically a container for various global tables that are used to work with IR; it does not contain any IR itself.
For example, the entire Type hierarchy in LLVM has the property that types are “uniqued”: if you call Type::getIntNTy twice with the same LLVMContext and the same integer bitwidth, the resulting pointers will be the same. To guarantee this property, it’s necessary to have a hash table somewhere that maps integer bitwidths to their corresponding Type instance, and you don’t want to construct them eagerly, because it would consume inordinate amounts of memory to keep around an i3245234 type that is legal to create in LLVM but that no one is likely to create. This hash table lives in the LLVMContext instance.
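To make the uniquing idea concrete, here is a minimal toy sketch in plain C++. It does not use LLVM at all: ToyContext, ToyIntType, sameType and demoUniquing are hypothetical names invented for illustration, and the real LLVMContext holds many more tables than this single map. It only shows the shape of a lazily filled table that hands out one instance per bitwidth.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>

// Toy stand-in for an integer type; NOT LLVM's real IntegerType.
struct ToyIntType {
  unsigned bits;
};

// Toy stand-in for LLVMContext: it owns a lazily filled uniquing table.
class ToyContext {
public:
  // Returns the single ToyIntType for this bitwidth, creating it on first use.
  const ToyIntType *getIntNTy(unsigned bits) {
    std::unique_ptr<ToyIntType> &slot = table_[bits];
    if (!slot)
      slot = std::make_unique<ToyIntType>(ToyIntType{bits});
    return slot.get();
  }

  // Number of distinct types created so far in this context.
  std::size_t numTypes() const { return table_.size(); }

private:
  std::unordered_map<unsigned, std::unique_ptr<ToyIntType>> table_;
};

// Within one context, type equality is a single pointer comparison.
inline bool sameType(const ToyIntType *a, const ToyIntType *b) {
  return a == b;
}

// Small self-check of the properties described above.
inline void demoUniquing() {
  ToyContext ctx;
  assert(ctx.numTypes() == 0);                              // nothing is built eagerly
  assert(sameType(ctx.getIntNTy(32), ctx.getIntNTy(32)));   // same width, same pointer
  assert(!sameType(ctx.getIntNTy(32), ctx.getIntNTy(64)));  // different widths differ
  assert(ctx.numTypes() == 2);                              // only the requested widths exist
}
```

In this toy model, two separate ToyContext objects hand out different pointers for the same bitwidth, which is one way to see why pointer-based comparisons stop being meaningful once objects from different contexts are mixed.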
Effectively, serialization means storing the sequence of C++ calls necessary to reconstruct the Module from scratch. However, none of those calls is going to be a member function of LLVMContext itself: LLVMContext is generally passed around as a parameter to the internal details of LLVM classes (so that they might use the global tables contained inside it), but it’s usually never used directly by user code. This means that there is nothing to actually serialize in an LLVMContext.
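The same point can be sketched with a toy model in plain C++, again without LLVM. All names here (ToyContext, ToyModule, ToyGlobal, serialize, deserialize, roundTripPreservesModule) are hypothetical, and the one-line-per-global text format is an assumption made for illustration; it is not how LLVM bitcode is laid out. The serialized form records only what to rebuild, and deserialization replays the construction against whichever context it is given.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for an integer type; NOT LLVM's real IntegerType.
struct ToyIntType {
  unsigned bits;
};

// Toy context: holds only a uniquing table and no "IR" of its own.
struct ToyContext {
  std::map<unsigned, std::unique_ptr<ToyIntType>> table;

  const ToyIntType *getIntNTy(unsigned bits) {
    std::unique_ptr<ToyIntType> &slot = table[bits];
    if (!slot)
      slot = std::make_unique<ToyIntType>(ToyIntType{bits});
    return slot.get();
  }
};

// Toy "module": a list of named globals, each pointing at a uniqued type.
struct ToyGlobal {
  std::string name;
  const ToyIntType *type;
};
struct ToyModule {
  std::vector<ToyGlobal> globals;
};

// Serialization records names and bitwidths, never pointers or the context.
std::string serialize(const ToyModule &m) {
  std::ostringstream out;
  for (const ToyGlobal &g : m.globals)
    out << g.name << ' ' << g.type->bits << '\n';
  return out.str();
}

// Deserialization replays the construction calls against the given context.
ToyModule deserialize(const std::string &text, ToyContext &ctx) {
  ToyModule m;
  std::istringstream in(text);
  std::string name;
  unsigned bits;
  while (in >> name >> bits)
    m.globals.push_back({name, ctx.getIntNTy(bits)});
  return m;
}

// Round-trips a module through text into a brand-new context.
bool roundTripPreservesModule() {
  ToyContext original;
  ToyModule m;
  m.globals.push_back({"counter", original.getIntNTy(32)});
  m.globals.push_back({"flag", original.getIntNTy(1)});
  const std::string text = serialize(m);

  ToyContext fresh;
  ToyModule copy = deserialize(text, fresh);
  assert(copy.globals.size() == m.globals.size());
  // Same content, but the types now belong to the fresh context.
  return serialize(copy) == text &&
         copy.globals[0].type != m.globals[0].type;
}
```

The rebuilt module serializes to the same text as the original even though none of its type pointers match, which mirrors the answer above: the context is rebuilt as a side effect of reconstructing the module, so there is nothing in it to save.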
Thank you so much! I think I got it.