Hi,
As I understand the ownership, any given Expr is owned by the Expr it's
an argument for, which is owned by the Stmt it makes up, which is owned
by the function declaration, and so on, up to the translation unit.
Except for some rare cases, like VLA sizes being owned by the
VariableLengthArrayType, and function default argument expressions being
owned by their FunctionDecl.
Yep.
Expressions that are currently being built - e.g. the argument
parameters of ActOnBinOp - are essentially unowned.
They are owned, but the ownership is transfered to the ActOn method that gets them. It is up to that method to deallocate them or transfer ownership to the return value.
As such, if the Sema
function returns early, without building some object to wrap them, they
leak. In addition, they leak when the parser returns early and abandons
the ExprTys it was carrying around. This is seen as acceptable, because
the program will soon end anyway if there was an error.
This is not acceptable, Clang can be used by other clients that are long lived, and leaking memory is a really bad thing.
Still, not everyone seems to think so. Doug goes out of his way to not
leak these objects in the Sema routines, and asked me not to leak them,
either. Chris thought it great that Doug wasn't leaking, too.
I don't know that anyone thinks it good that we leak memory :). The history here is that much of Sema was built before there *was* a clear ownership model. This means that there is a lot of "bad" code in it that doesn't obey the newly defined ownership model. Also, the parser has a couple of places where it leaks as well. If the parser does error recovery and ends up obviating a parsed expr/statement/etc, it should call the Actions::DeleteExpr/DeleteStmt methods to free them. This happens in some cases, but not all of them.
To destroy a Stmt or its derived classes, you don't use delete, but the
Destroy member. This may be overridden in subclasses, and the base
version first recursively calls Destroy for every child statement, and
then commits suicide (delete this). Calling delete yourself is *not*
sufficient.
I think that statements (exprs are statements) are freed with delete currently:
void Sema::DeleteExpr(ExprTy *E) {
delete static_cast<Expr*>(E);
}
void Sema::DeleteStmt(StmtTy *S) {
delete static_cast<Stmt*>(S);
}
We're slowly migrating decls to use a Create/Destroy interface, in order to let them be pool allocated or use some other allocation mechanism. It would be good to also move statements this way eventually, but decls should be finished first.
Is this correct? Because I want to clean this up. I want to make the
Sema routines (and later the parser, too) not leak. So I need to know
exactly how this works.
This would be great. Probably the first place to start is to use audit the parser for places where early exits cause parsed stmts to be leaked. Once that is done, start going through sema. The pattern we want to use in Sema for statements is using OwningPtr to hold incoming arguments. For example, ActOnBinOp should look like:
ExprTy *LHS, ExprTy *RHS) {
BinaryOperator::Opcode Opc = ConvertTokenKindToBinaryOpcode(Kind);
OwningPtr<Expr> lhs = (Expr *)LHS, rhs = (Expr*)RHS;
...
// Build a built-in binary operation.
return CreateBuiltinBinOp(TokLoc, Opc, lhs.take(), rhs.take());
For routines that have more incoming values or are variadic, the pattern in ActOnCallExpr should be followed.
Incidentally, I'd like for C++ specific code to be moved to one of the C++ Sema files where it makes sense. For example, the Sema::ActOnBinOp contains a bunch of code for binary operator overloading. I'd prefer the code to look like this:
if (getLangOptions().CPlusPlus &&
(lhs->getType()->isRecordType() || lhs->getType()- >isEnumeralType() ||
rhs->getType()->isRecordType() || rhs->getType()- >isEnumeralType())) {
if (HandleCXXBinOpOverloading(...))
return result;
}
// Non overloading code.
-Chris