If you have a function with multiple returns, does LLVM have some way to make a section of code always run, regardless of which return is reached first? Not knowing much, I’m thinking you basically use something like a goto that runs before each return instruction and jumps to the tail of the function body.
Is that correct, and if so, does LLVM have such a concept?
I don’t think LLVM itself has such a concept. It would be up to the frontend to produce IR that did this. The example that comes to mind would be in C++, if a function has local variables that need to have destructors called, the frontend will probably generate that code in an exit block and all the “return” statements will branch to it.
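To make the exit-block idea concrete, here is a minimal hand-written sketch of what such IR could look like. The names @pick and @cleanup are made up for illustration (@cleanup stands in for whatever destructor calls a frontend would emit), and this is not claimed to be what any particular frontend produces.

```llvm
; Hypothetical stand-in for the destructor/cleanup calls a frontend would emit.
declare void @cleanup()

define i32 @pick(i1 %early) {
entry:
  ; Slot holding the value to return, written by each source-level "return".
  %retval = alloca i32, align 4
  br i1 %early, label %ret_early, label %ret_late

ret_early:                        ; source-level "return 1;"
  store i32 1, ptr %retval, align 4
  br label %exit

ret_late:                         ; source-level "return 2;"
  store i32 2, ptr %retval, align 4
  br label %exit

exit:                             ; every path out of the function goes through here
  call void @cleanup()
  %result = load i32, ptr %retval, align 4
  ret i32 %result
}
```

Each source-level return becomes a store to the return slot plus a branch, so the cleanup code exists once and runs on every path.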
That makes sense to have the returns jump to the exit block. I haven’t gotten this far but do LLVM functions even support more than 1 “ret” instruction or do you always need to jump to a single exit block if your function returns in multiple places?
According to the IR definition of ‘ret’ it terminates a block, but it doesn’t say anything about having only one per function.
That said, I’ve been playing around with some simple cases and in fact clang seems to emit IR to branch to a single ‘ret’ even when I’d think it would make sense not to. So perhaps I am wrong about this.
The abstraction level of LLVM IR is C with vectors. If you have several return statements in your C program, then you will have several ret instructions in LLVM IR (per function).
@tschuett Sorry, how is this relevant? I asked if you had an example C/C++ function that produced IR with multiple ‘ret’ instructions. Telling me about -emit-llvm is not helpful.
Even a function that has multiple RAII variables and a return in the middle of a for loop still generates a single exit point.
More likely, clang organizes its AST to a single exit point, or perhaps clang’s CodeGen just works that way.
Anyway, to the OP’s point: While clang seems to always produce a single exit point, I don’t know that it’s a property you can depend on, and LLVM does not require it. Hand-written IR with multiple ‘ret’ instructions passes the verifier just fine.
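For reference, a small hand-written function with three 'ret' instructions, each terminating its own block. The function name @sign is invented for this sketch. Running it through llvm-as, which verifies its input by default, should not produce any complaint.

```llvm
; Returns -1, 0 or 1 depending on the sign of %x, with one ret per case.
define i32 @sign(i32 %x) {
entry:
  %is_neg = icmp slt i32 %x, 0
  br i1 %is_neg, label %neg, label %nonneg

neg:
  ret i32 -1

nonneg:
  %is_zero = icmp eq i32 %x, 0
  br i1 %is_zero, label %zero, label %pos

zero:
  ret i32 0

pos:
  ret i32 1
}
```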
I’m finally trying this now and I see it doesn’t like multiple ret instructions. Does this look right?
define void @MultipleExit(i64 %0) {
entry:
  %i = alloca i64, align 8
  store i64 %0, ptr %i, align 4
  %i1 = load i64, ptr %i, align 4
  %int_slt_tmp = icmp slt i64 %i1, 1
  br i1 %int_slt_tmp, label %then, label %outer

then:                                             ; preds = %entry
  %const_tmp = alloca [255 x i8], align 1
  store [12 x i8] c"less than 1\00", ptr %const_tmp, align 1
  %1 = call i32 (ptr, ...) @printf(ptr %const_tmp)
  ret void
  br label %outer

else:                                             ; No predecessors!
  %const_tmp2 = alloca [255 x i8], align 1
  store [12 x i8] c"more than 1\00", ptr %const_tmp2, align 1
  %2 = call i32 (ptr, ...) @printf(ptr %const_tmp2)
  ret void
  br label %outer

outer:                                            ; preds = %else, %then, %entry
  ret void
}
Gives the error:
Terminator found in the middle of a basic block!
label %then
Terminator found in the middle of a basic block!
label %else
LLVM ERROR: Broken function found, compilation aborted!
Terminators are the last instruction of a basic block. ret and br are both terminators, so each may appear only as the final instruction of a block. You have a br after a ret, which is illegal. The br could never be reached anyway, because the ret before it has already left the function.
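One way the function could look once each block ends in exactly one terminator is sketched below. It rests on two assumptions not stated in the thread: that the false edge of the conditional branch was meant to go to %else (which has no predecessors in the posted IR), and that printf is declared as shown. Since both arms now return directly, the %outer block is dropped.

```llvm
; Assumed declaration; the posted snippet does not show one.
declare i32 @printf(ptr, ...)

define void @MultipleExit(i64 %0) {
entry:
  %i = alloca i64, align 8
  store i64 %0, ptr %i, align 8
  %i1 = load i64, ptr %i, align 8
  %int_slt_tmp = icmp slt i64 %i1, 1
  ; Assumption: the false edge was meant to reach %else rather than %outer.
  br i1 %int_slt_tmp, label %then, label %else

then:
  %const_tmp = alloca [255 x i8], align 1
  store [12 x i8] c"less than 1\00", ptr %const_tmp, align 1
  %1 = call i32 (ptr, ...) @printf(ptr %const_tmp)
  ret void                        ; terminator, so nothing follows it in this block

else:
  %const_tmp2 = alloca [255 x i8], align 1
  store [12 x i8] c"more than 1\00", ptr %const_tmp2, align 1
  %2 = call i32 (ptr, ...) @printf(ptr %const_tmp2)
  ret void                        ; second ret in the function, in its own block
}
```

If some code has to run after both arms, the alternative is to end each arm with br label %outer in place of ret void and keep a single ret in %outer.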