TL;DR: Synthesize an automatic printf to print execution results in clang-repl, and generalize the approach with an object that bridges compiled and interpreted code, taking inspiration from what was done in Cling.
Introduction
The Cling interpreter is a unique interpretative technology for C++, based on Clang and developed by the high-energy physics (HEP) community. It is used to deliver reflection and type information for exabytes of scientific data and is heavily used during the analysis of data from the Large Hadron Collider (LHC) and other particle physics experiments.
In the RFC "Moving (parts of) the Cling REPL in Clang" we discussed the initial incremental compilation facilities, which have since shipped in LLVM mainline as clang-repl.
In this RFC we propose two distinct features and their interaction: automatic printf, and connecting compiled and interpreted C++ through a class called Value, an abstraction layer used to carry expression results and to support value pretty printing in clang-repl.
Goals
Automatic printf
One of the key aspects of interactive C++ is exploratory programming, which encourages showing execution results on screen easily. Typing printf or similar every time is laborious and annoying. Taking inspiration from Cling, we could achieve this effect with an extension that lives purely in libclangInterpreter. We propose a special mode that indicates when we want value pretty printing: an expression in the global scope without the trailing semicolon. Coincidentally, Rust takes a similar approach. In clang-repl this looks like:
clang-repl> int x = 42;
clang-repl> x // equivalent to calling printf("(int &) %d\n", x);
(int &) 42
clang-repl> std::vector<int> v = {1,2,3};
clang-repl> v // This syntax is fine after [D127284](https://reviews.llvm.org/D127284)
(std::vector<int> &) {1,2,3}
clang-repl> "Hello, interactive C++!"
(const char [24]) "Hello, interactive C++!"
In the RFC below we discuss at length how to make this technique extensible, versatile and efficient by introducing simple concepts that live in libclangInterpreter only. For example, we demonstrate how a simple pair of (clang type and execution result) can be used to write custom pretty printers.
The implementation is uncomplicated. Since the Clang parser is responsible for parsing code in clang-repl, we need to teach it to recognize this pattern and to propagate some flags that can be used later.
It might even become trivial after the patch addressing the RFC "Flexible Lexer Buffering for Handling Incomplete Input in Interactive C/C++".
Crossing the compiled/interpreted world
In some scenarios, we can embed a C++ interpreter in a C++ program. In the example below, we create an interpreter, then define and increment a variable p.
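#include "clang/Interpreter/Interpreter.h"
#include <utility>
#include <vector>
int main(int argc, char** argv) {
  std::vector<const char *> ClangArgs = {};
  auto CI = llvm::cantFail(clang::IncrementalCompilerBuilder::create(ClangArgs));
  // create() yields a std::unique_ptr<clang::Interpreter>, hence the arrow below.
  auto interp = llvm::cantFail(clang::Interpreter::create(std::move(CI)));
  // ParseAndExecute reports failures through llvm::Error.
  llvm::cantFail(interp->ParseAndExecute("int p=0; ++p;"));
}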
In many cases, it is useful to bring the execution result back to the compiled program. In particular, this lets us instantiate a template with a user type on demand and then use its value or call the symbol directly. Cling has relied on this for a decade now, allowing it to build patterns such as:
float Global = 3.141f;
float getGlobal() { return Global; }
void setGlobal(float val) { Global = val; }
void Demo(cling::Interpreter& interp) {
// We could use a header as well.
interp.declare("float getGlobal();\n"
"void setGlobal(float val);\n");
cling::Value res; // This will hold the result of the expression evaluation.
interp.process("getGlobal();", &res);
std::cout << "getGlobal() returned " << res.getAs<float>() << '\n';
setGlobal(1.); // We can modify the value in compiled code.
interp.process("getGlobal();", &res); // The interpreter can see it.
std::cout << "getGlobal() returned " << res.getAs<float>() << '\n';
// We modify using the interpreter, now the binary sees the new value.
interp.process("setGlobal(7.777);");
std::cout << "getGlobal() returned " << getGlobal() << '\n';
}
Here Cling introduces a concept called cling::Value to connect the compiled and interpreted worlds. The Value object carries the execution results, and we can pass it between the two sides. Supporting this feature is essential for interoperability, as it provides extended control over the object’s lifetime if requested.
Value Interface – An execution result – type pair
A Value is a container that can carry the arbitrary result of an expression in an endian-independent way, with a small buffer optimization. Its design is driven by performance-critical use cases (see Performance consideration). In addition, the Value container should support out-of-process or remote execution, to support microcontrollers such as the Arduino Due that cannot host the entire LLVM JIT infrastructure.
12 + 30 // A binary expression that yields a value whose type is int and whose content is 42.
In general, we should implement the interface below for Value:
class Value {
public:
clang::QualType* getType(); // Obtain the type information of the expression.
template<typename T>
T castAs(); // Cast the value to corresponding type.
void printType(llvm::raw_ostream& OS) const;
void printData(llvm::raw_ostream& OS) const;
void print(llvm::raw_ostream& OS) const;
void dump() const; // Dump the value; calls print(llvm::outs()) internally.
};
Note that the actual implementation is slightly more complex, as there are several optimizations to avoid repetitive, expensive operations such as getType.
Implementation
Note that the implementation we propose is inspired by Cling.
Stealing an execution result
After capturing the desired expression, we need to create the Value object. We achieve this through code generation, that is, by synthesizing Clang AST. First, we synthesize the wrapper function ValueGetter, which serves as the interface for passing the Value object out. Then we generate its body, which is a call to another function (SetValue) that constructs the Value object. SetValue is declared the first time we enter the REPL and is defined elsewhere in the library (exported using LLVM_EXTERNAL_VISIBILITY).
Lifetime and temporaries
Let’s consider this example:
clang-repl> struct S {};
clang-repl> S foo() { return S{};}
clang-repl> foo()
foo() here returns an rvalue, which means the temporary is destroyed at the end of the full expression that created it. So we can’t simply pass that Expr to SetValue, since that would leave a dangling reference. Therefore, we need to make a copy of the result of the original expression.
Thus, we need two branches to deal with objects:
- If the object is an lvalue, the Value does not take part in the lifetime management of the object; it only stores its address, which can be used later.
- If the object is an rvalue, or temporary, the Value allocates an internal buffer large enough to contain the object and uses placement new to construct the object in that buffer. In this case, we manually extend the lifetime of the temporary object, so it can be used later.
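The two branches above can be sketched in standalone C++. This is only an illustration under stated assumptions: ValueSketch, setLValue, setRValue, ownsObject, Probe and lifetimeDemo are hypothetical names, not part of clang-repl or Cling, and the buffer is a plain heap allocation that does not handle over-aligned types.

```cpp
#include <cassert>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the proposed Value; not the clang-repl class.
class ValueSketch {
public:
  ValueSketch() = default;
  ValueSketch(const ValueSketch &) = delete;
  ValueSketch &operator=(const ValueSketch &) = delete;
  ~ValueSketch() { clear(); }

  // lvalue branch: store the address only; the caller keeps ownership.
  template <typename T> void setLValue(T &Obj) {
    clear();
    Ptr = &Obj;
  }

  // rvalue branch: allocate a buffer and move-construct the temporary into
  // it with placement new, so the object outlives the full expression.
  template <typename T> void setRValue(T &&Tmp) {
    static_assert(!std::is_lvalue_reference<T>::value,
                  "use setLValue for lvalues");
    using U = typename std::decay<T>::type;
    clear();
    void *Mem = ::operator new(sizeof(U));
    Ptr = new (Mem) U(std::move(Tmp));
    // Remember how to destroy the object and release the buffer.
    Dtor = [](void *P) {
      static_cast<U *>(P)->~U();
      ::operator delete(P);
    };
  }

  template <typename T> T &getAs() { return *static_cast<T *>(Ptr); }
  bool ownsObject() const { return Dtor != nullptr; }

private:
  void clear() {
    if (Dtor)
      Dtor(Ptr);
    Ptr = nullptr;
    Dtor = nullptr;
  }
  void *Ptr = nullptr;
  void (*Dtor)(void *) = nullptr;
};

// Helper type that counts live instances through an external counter.
struct Probe {
  int *Live;
  explicit Probe(int *L) : Live(L) { ++*Live; }
  Probe(Probe &&O) : Live(O.Live) { ++*Live; }
  ~Probe() { --*Live; }
};

void lifetimeDemo() {
  int Live = 0;
  {
    ValueSketch V;
    V.setRValue(Probe(&Live)); // the temporary dies at the end of this line
    assert(Live == 1);         // but the moved-in object is still alive
    assert(V.ownsObject());
  }
  assert(Live == 0); // destroyed together with the ValueSketch

  Probe P(&Live);
  {
    ValueSketch V;
    V.setLValue(P);
    assert(!V.ownsObject());
    assert(&V.getAs<Probe>() == &P);
  }
  assert(Live == 1); // the lvalue is untouched by the ValueSketch
}
```

Storing a destructor callback next to the pointer is one way to let a non-template container clean up an object whose type it no longer knows; the actual Value may track this differently.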
Overview of the implementation in clang-repl
In general, based on its type, we transform:
clang-repl> x
into
clang-repl> void ValueGetter(void* OpaqueValue) {
// Exactly one of the following is emitted, depending on the type of x.
// 1. if x is a built-in type like int, float.
SetValueNoAlloc(OpaqueValue, xQualType, x);
// 2. if x is a struct, and an lvalue.
SetValueNoAlloc(OpaqueValue, xQualType, &x);
// 3. if x is a struct, but an rvalue (XType stands for the type of x).
new (SetValueWithAlloc(OpaqueValue, xQualType)) XType(x);
}
Then, in the interpreter, we can ask the JIT for a function pointer to ValueGetter:
auto* F = (void(*)(void*))Interp.getSymbolAddr("ValueGetter");
Value V;
(*F)((void*)&V);
V.dump(); // Do pretty printing or return the value to the user.
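To make the opaque-pointer handshake concrete, here is a minimal sketch without any JIT. Everything in it is an assumption made for illustration: MiniValue, SetValueNoAllocSketch, ValueGetterSketch, InterpretedX and captureResult are hypothetical names, the wrapper is written by hand rather than synthesized, and the function pointer is taken with & instead of being looked up as a symbol.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily reduced stand-in for Value: a type name plus an
// integer payload. The proposal carries a clang type instead of a string.
struct MiniValue {
  std::string TypeName;
  long long Data = 0;
};

// Stand-in for SetValueNoAlloc, case 1 (built-in types). The Value arrives
// as void* so the generated code needs no knowledge of its layout.
void SetValueNoAllocSketch(void *OpaqueValue, const char *TypeName,
                           long long Val) {
  auto *V = static_cast<MiniValue *>(OpaqueValue);
  V->TypeName = TypeName;
  V->Data = Val;
}

// State and wrapper that the interpreter would generate for the input `x`;
// written by hand here because this sketch has no JIT.
int InterpretedX = 42;
void ValueGetterSketch(void *OpaqueValue) {
  SetValueNoAllocSketch(OpaqueValue, "int", InterpretedX);
}

// Host side: call the wrapper through a plain function pointer and hand the
// filled-in value back to compiled code.
MiniValue captureResult(void (*Getter)(void *)) {
  MiniValue V;
  Getter(&V);
  return V;
}

void valueGetterDemo() {
  MiniValue V = captureResult(&ValueGetterSketch);
  assert(V.TypeName == "int");
  assert(V.Data == 42);
}
```

Passing void* keeps the signature of the wrapper identical for every expression type, which is what allows the host to call it through a single function-pointer type.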
Once we have the Value object, the pretty-printing logic can be implemented in its Value::dump() method.
STL types and user-defined class
In the implementation of Value::dump(), it’s pretty straightforward to support printing built-in types, and we can always support any arbitrary user-defined struct/class by printing its address. However, we want to achieve more!
- Is it possible to obtain more useful information for types that almost everybody knows and uses like STL containers?
- How can the user customize the behavior for their own types?
To address the issues above, we propose adding a fallback in Value::dump() for the cases clang-repl cannot handle itself. We synthesize another call, to a function like PrintValueRuntime, which lives in a header that the JIT processes ahead of time. Every type that is not a primitive type falls back to it, such as STL components and user-defined types. The different cases are told apart via overloads and SFINAE. By default, we provide a general implementation for standard library facilities and print only the address for unknown types. In this case, the equivalent code becomes:
clang-repl> std::vector<int> v {1,2,3};
clang-repl> #include "PrintValueRuntime.h"
clang-repl> PrintValueRuntime(v);
If users need to customize the behavior for their own types, like S, they only have to write a corresponding overload of the PrintValueRuntime function in their code:
clang-repl> struct S { int i = 42; };
clang-repl> S s;
clang-repl> s
(S&) 0x123
clang-repl> std::string PrintValueRuntime(S* s) {
return std::string("i = ") + std::to_string(s->i);
}
clang-repl> s // Picks up the custom pretty printer.
(S&) i = 42
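The overload-plus-SFINAE dispatch described in this section could look roughly like the sketch below. It is written against the standard library only and is not the proposed header: the IsRange trait, the const-qualified pointer parameters, the ", " separator and printValueDemo are assumptions made for this illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Fallback with the lowest priority: unknown types print only their address.
inline std::string PrintValueRuntime(const void *Ptr) {
  std::ostringstream OS;
  OS << Ptr;
  return OS.str();
}

// Overload for one built-in type; more would follow the same pattern.
inline std::string PrintValueRuntime(const int *Val) {
  return std::to_string(*Val);
}

// Detects types that provide begin() and end().
template <typename T, typename = void> struct IsRange : std::false_type {};
template <typename T>
struct IsRange<T, decltype(void(std::declval<const T &>().begin()),
                           void(std::declval<const T &>().end()))>
    : std::true_type {};

// Generic overload for containers, enabled through SFINAE. It beats the
// const void* fallback because no pointer conversion is required.
template <typename T>
typename std::enable_if<IsRange<T>::value, std::string>::type
PrintValueRuntime(const T *Range) {
  std::string Out = "{";
  bool First = true;
  for (const auto &Elem : *Range) {
    if (!First)
      Out += ", ";
    First = false;
    Out += PrintValueRuntime(&Elem); // dispatches per element type
  }
  return Out + "}";
}

// A user-defined type with a custom printer, as in the session above.
struct S {
  int i = 42;
};
inline std::string PrintValueRuntime(const S *Obj) {
  return std::string("i = ") + std::to_string(Obj->i);
}

// A type nobody wrote a printer for.
struct Opaque {};

void printValueDemo() {
  std::vector<int> v = {1, 2, 3};
  assert(PrintValueRuntime(&v) == "{1, 2, 3}");

  S s;
  assert(PrintValueRuntime(&s) == "i = 42");

  std::vector<S> vs(2); // found inside the template via ADL
  assert(PrintValueRuntime(&vs) == "{i = 42, i = 42}");

  Opaque o;
  std::ostringstream Addr;
  Addr << static_cast<const void *>(&o);
  assert(PrintValueRuntime(&o) == Addr.str());
}
```

Taking const pointers lets the same user overload serve both a direct call and elements of a container, which are visited through const references here.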
Performance consideration
We want this facility to be as fast as possible. Note that we are actually “interpreting” the language, which means compiling code while executing it. So the runtime performance will suffer if the compile time increases too much. This is likely to happen if there are too many overloads of PrintValueRuntime or heavy use of templates in the design of the Value class.
The implementation aims to be minimalistic because its header needs to be included during the interpreter’s runtime. This requirement prevents us from using standard facilities whose implementations are template heavy, such as std::any or std::variant. Therefore, when representing the internal structure of the Value class, we intentionally choose the combination of a union and an enum class. Here is a rudimentary, illustrative implementation of the idea: Compiler Explorer
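As a rough picture of the union-plus-enum layout, consider the sketch below. It is not the linked implementation: TaggedValue and its members are hypothetical, a string stands in for the clang type, and std::ostream replaces llvm::raw_ostream so the example stays self-contained.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Hypothetical sketch of a Value built from an enum class tag and a union.
class TaggedValue {
public:
  enum class Kind { Invalid, Int, Double, Pointer };

  TaggedValue() = default;
  explicit TaggedValue(int V) : K(Kind::Int), TypeName("int") {
    Data.Int = V;
  }
  explicit TaggedValue(double V) : K(Kind::Double), TypeName("double") {
    Data.Double = V;
  }
  TaggedValue(const char *Type, void *P) : K(Kind::Pointer), TypeName(Type) {
    Data.Pointer = P;
  }

  Kind getKind() const { return K; }
  long long getInt() const {
    assert(K == Kind::Int && "not an integer");
    return Data.Int;
  }

  void printType(std::ostream &OS) const { OS << TypeName; }
  void printData(std::ostream &OS) const {
    // The tag decides which union member is currently meaningful.
    switch (K) {
    case Kind::Int:
      OS << Data.Int;
      break;
    case Kind::Double:
      OS << Data.Double;
      break;
    case Kind::Pointer:
      OS << Data.Pointer;
      break;
    case Kind::Invalid:
      OS << "<invalid>";
      break;
    }
  }
  void print(std::ostream &OS) const {
    OS << '(';
    printType(OS);
    OS << ") ";
    printData(OS);
    OS << '\n';
  }

private:
  union Storage {
    long long Int;
    double Double;
    void *Pointer;
  };
  Kind K = Kind::Invalid;
  const char *TypeName = "<invalid>";
  Storage Data = {0};
};

void taggedValueDemo() {
  TaggedValue V(12 + 30);
  std::ostringstream OS;
  V.print(OS);
  assert(OS.str() == "(int) 42\n");
  assert(V.getKind() == TaggedValue::Kind::Int);
  assert(V.getInt() == 42);
}
```

The class needs no templates to store a value, which is the property the paragraph above is after; the price is that the tag and the union member must be kept in sync by hand.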
Apart from speed, the design can blow up memory usage when modules are in use, because it can trigger cross-module deserialization. However, this is a common problem for all functions overloaded across modules, and we would love to hear your feedback!