[PATCH] Extend the use of DeclGroup

This patch is the first step to solve the problem that no one is owning the
struct decl in the following code:

typedef struct s {} s_t;

Later a TypeSpecifier holding this struct decl will be added to DeclGroup as
discussed on the mailing list months before.

The root of changes is at Sema::FinalizeDeclaratorGroup. It returns
DeclGroupOwningRef instead of DeclTy*. A bunch of related methods now return
DeclGroupOwningRef instead of DeclTy*.

Top level declarations are DeclGroups instead of Decls. TranslationUnit
uses a vector of DeclGroupOwningRef to track all Decls. Most of the dtor of
TranslationUnit is disabled. The cleanup works should be done by the DeclGroup.

LinkageSpecDecl now uses a DeclGroup array to track all Decls.

Three ObjC static analysis test cases fail. I haven’t figured out the reasons.

dg.patch (41.6 KB)

Hi Zhongxing,

Thanks for taking this on! This is a big patch, so I'm taking some time to go through it. I had a quick question though about one thing that caught my attention:

  TranslationUnit::~TranslationUnit() {
+ for (std::vector<DeclGroupOwningRef>::reverse_iterator
+ I = TopLevelDecls.rbegin(), E = TopLevelDecls.rend(); I != E; ++I) {
+ I->clear();
+ }

Hi Zhongxing,

Thanks for taking this on! This is a big patch, so I’m taking some time to go through it. I had a quick question though about one thing that caught my attention:

TranslationUnit::~TranslationUnit() {

  • for (std::vector::reverse_iterator
  • I = TopLevelDecls.rbegin(), E = TopLevelDecls.rend(); I != E; ++I) {
  • I->clear();
  • }

+#if 0
if (OwnsDecls) {
llvm::DenseSet<Decl*> Killed;
for (std::vector<Decl*>::reverse_iterator I=TopLevelDecls.rbegin(),
@@ -102,7 +108,7 @@
(*I)->Destroy(*Context);
}
}

+#endif

It looks like that in your patch that Decls will not be freed in ~TranslationUnit since the “Destroy” method for each DeclGroupRef is not called. Did you decide to do this because of issues with ownership?

Yes.

If so, there needs to be a big FIXME comment here documenting the issue.

Yes.

Actually, I don’t think we should have a regression in functionality here (i.e., releasing Decls) unless it is absolutely necessary.

I tried to Destroy() the DeclGroup in both order, but both crashed on some test cases. I think there may be bugs in the DeclGroupOwningRef’s Destroy() method. I didn’t investigate that further because I want to keep this patch as small as possible.

While the code in ~TranslationUnit that bends over backwards to avoid deleting a Decl more than once is gross, it also works, and IMHO shouldn’t be removed until we have cleaned up the ownership issues in the AST.

May be similar code should be added to DeclGroupOwningRef::Destroy() to make it work.

New patch due to the changes in r61735.

dg.patch (32.6 KB)

Hello Zhongxing,

Thanks for looking into this. I'm looking at your latest patch, but I'm replying to your original message because I want to discuss a few of the issues surrounding decl ownership. It's an area that's in flux right now, and in some sense the DeclContext work I've been doing is clashing with DeclGroups.

This patch is the first step to solve the problem that no one is owning the
struct decl in the following code:

typedef struct s {} s_t;

The result that struct s { } isn't really owned here is because, at present, we have two entities in the AST that are trying to own the declarations in the translation unit, TranslationUnit and TranslationUnitDecl, and the one that currently owns the declarations in the translation unit doesn't know about "struct s { }".

With the typedef above, TranslationUnit's TopLevelDecls knows about only "s_t", not "struct s", because TranslationUnit::AddTopLevelDecl only gets called for "top-level" declarations.

TranslationUnitDecl, on the other hand, is a DeclContext that knows about all declarations in the scope of the translation unit, including both "struct s" and "s_t". (It has to know about all declarations, because it's used for name lookup).

My intent with DeclContexts is that a DeclContext should own all of the declarations within that context. So TranslationUnitDecl (which is a DeclContext) should own both "struct s" and "s_t". The problem, in our current state, is that TranslationUnit also tries to own top-level decls (currently, "s_t" but not "struct s") through its TopLevelDecls member, and DeclGroups try to own declarations in a declaration statement (e.g., "x" and "y" in "void f() { int x, y; }"), leading to this insanely gross hack in the serialization of DeclContexts (see DeclContext::EmitOutRec):

     bool Owned = ((*D)->getLexicalDeclContext() == this &&
                   DeclKind != Decl::TranslationUnit &&
                   !isFunctionOrMethod());

What's that's saying is, basically, DeclContext owns its decls except in the cases where we have DeclGroups. That's the clash between the two mechanisms that I mentioned above.

I was working toward a solution that would change DeclGroups and TranslationUnit so that they don't own any decls. Instead, DeclContext would handle all of the ownership issues. The benefit of this is that there are many places where we have DeclContexts already managing ownership of their decls (namespaces, classes, linkage specifications, and enumerations come to mind), and all of those places could benefit from having support for DeclGroups to describe the syntax the user wrote.

Trying to summarize: I think DeclContext should handle all of the ownership issues, DeclGroups should handle representing the syntax, and I'd like the two to work together.

Later a TypeSpecifier holding this struct decl will be added to DeclGroup as
discussed on the mailing list months before.

The root of changes is at Sema::FinalizeDeclaratorGroup. It returns
DeclGroupOwningRef instead of DeclTy*. A bunch of related methods now return
DeclGroupOwningRef instead of DeclTy*.

Top level declarations are DeclGroups instead of Decls. TranslationUnit
uses a vector of DeclGroupOwningRef to track all Decls. Most of the dtor of
TranslationUnit is disabled. The cleanup works should be done by the DeclGroup.

Ah, this is the most direct place where we clash. Your patch updates TranslationUnit's vector (TopLevelDecls) to store DeclGroups rather than Decls (which is good), but my plan for DeclContexts was to eliminate TranslationUnit::TopLevelDecls entirely: clients that want to iterate through the declarations in a translation unit would, instead, use TranslationUnitDecl's decls_begin()/decls_end().

My hunch is that what we want is to find a way to iterate through a DeclContext in a way that allows us to see the DeclGroups (or, at least, see the syntactic relationship between "struct s { }" and "s_t"). However, I'd like to hear your thoughts on these ownership and representation issues.

  - Doug

Hi Doug,

I’ll take a deep look at all the DeclContext stuff and mechanism in the following days. But my feeling is that DeclContext should be the ultimate solution to the ownership problem. I’ll wait for your progress in this area.

On the other hand, as you said, we still need a mechanism to recover the original syntax structure of the source (and more ideally, the layout of the source code). I think the DeclGroup could play that role with the following structure:
DeclGroup => TypeSpecifier Decls

Instead of owning the decls, the DeclGroup and TypeSpecifier could only refer to the decls with a pointer.

Also the TopLevelDecls in TranslationUnit probably could be kept to reflect the code layout?

  • Zhongxing

My hunch is that what we want is to find a way to iterate through a DeclContext in a way that allows us to see the DeclGroups (or, at least, see the syntactic relationship between “struct s { }” and “s_t”). However, I’d like to hear your thoughts on these ownership and representation issues.

  • Doug

Hi Doug,

I’ll take a deep look at all the DeclContext stuff and mechanism in the following days. But my feeling is that DeclContext should be the ultimate solution to the ownership problem. I’ll wait for your progress in this area.

I’ll try to get this done sooner rather than later. There is documentation on DeclContexts here:

http://clang.llvm.org/docs/InternalsManual.html#DeclContext

On the other hand, as you said, we still need a mechanism to recover the original syntax structure of the source (and more ideally, the layout of the source code). I think the DeclGroup could play that role with the following structure:
DeclGroup => TypeSpecifier Decls

Instead of owning the decls, the DeclGroup and TypeSpecifier could only refer to the decls with a pointer.

Yes, I agree.

Also the TopLevelDecls in TranslationUnit probably could be kept to reflect the code layout?

We shouldn’t need it. A DeclContext’s list of declarations (accessible via decls_begin/decls_end) contains all of the declarations in the order they occur within the source code, which is the same information that TopLevelDecls tries to capture now.

  • Doug

My hunch is that what we want is to find a way to iterate through a DeclContext in a way that allows us to see the DeclGroups (or, at least, see the syntactic relationship between “struct s { }” and “s_t”). However, I’d like to hear your thoughts on these ownership and representation issues.

  • Doug

Hi Doug,

I’ll take a deep look at all the DeclContext stuff and mechanism in the following days. But my feeling is that DeclContext should be the ultimate solution to the ownership problem. I’ll wait for your progress in this area.

I’ll try to get this done sooner rather than later. There is documentation on DeclContexts here:

http://clang.llvm.org/docs/InternalsManual.html#DeclContext

Thanks.

On the other hand, as you said, we still need a mechanism to recover the original syntax structure of the source (and more ideally, the layout of the source code). I think the DeclGroup could play that role with the following structure:
DeclGroup => TypeSpecifier Decls

Instead of owning the decls, the DeclGroup and TypeSpecifier could only refer to the decls with a pointer.

Yes, I agree.

Also the TopLevelDecls in TranslationUnit probably could be kept to reflect the code layout?

We shouldn’t need it. A DeclContext’s list of declarations (accessible via decls_begin/decls_end) contains all of the declarations in the order they occur within the source code, which is the same information that TopLevelDecls tries to capture now.

OK.

How do we want to proceed with DeclGroups? From my offline conversations my understanding is that we plan on going ahead with DeclGroups.

How do we want to proceed with DeclGroups? From my offline conversations my understanding is that we plan on going ahead with DeclGroups.

Yes. I think DeclContext needs to store DeclGroups (or, at least, provide a view so that the user sees DeclGroups), so that we’ll get the benefits of DeclGroups everywhere we can store declarations.

  • Doug