Source code cross reference tool based on libclang

Hi all,

We’re thinking about building a web-based source code cross reference tool based on libclang, which uses type information, call graphs, etc. So before we start writing this tool, I wanted to ask you guys if you know of any such tool which is already available. I have not been able to find one, but I’m sure that you guys should know about it if one exists.

Thanks!

Hi Ehsan,

Sounds cool!

I have a weekend hack that adds a plugin (instead of using libclang)
that dumps some type information to a sqlite db, and a tiny server
prototype that serves some of the contents of this db. The plugin runs
_during_ a normal build, so indexing happens in the same pass as
compilation and slows the build down by ~10%. It's very incomplete, but
maybe it's useful to you: https://github.com/nico/complete
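
For illustration only, here is a minimal sketch of what a sqlite-backed cross-reference store could look like. The table names, column names, and helper functions are all assumptions made up for this example; they are not the schema used by the plugin linked above.

```python
import sqlite3

# Hypothetical schema: one row per symbol, one row per place it appears.
SCHEMA = """
CREATE TABLE symbols (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,   -- fully qualified name, e.g. "ns::draw"
    kind TEXT NOT NULL,   -- "function", "class", "variable", ...
    type TEXT             -- spelled type, if known
);
CREATE TABLE refs (
    symbol_id     INTEGER NOT NULL REFERENCES symbols(id),
    file          TEXT    NOT NULL,
    line          INTEGER NOT NULL,
    is_definition INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX refs_by_symbol ON refs(symbol_id);
"""


def open_db(path=":memory:"):
    """Create a fresh database with the example schema."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_symbol(db, name, kind, type_=None):
    """Insert a symbol and return its row id."""
    cur = db.execute(
        "INSERT INTO symbols (name, kind, type) VALUES (?, ?, ?)",
        (name, kind, type_),
    )
    return cur.lastrowid


def add_ref(db, symbol_id, file, line, is_definition=False):
    """Record one occurrence of a symbol at file:line."""
    db.execute(
        "INSERT INTO refs (symbol_id, file, line, is_definition) "
        "VALUES (?, ?, ?, ?)",
        (symbol_id, file, line, int(is_definition)),
    )


def find_uses(db, name):
    """Return (file, line) for every non-defining occurrence of `name`."""
    rows = db.execute(
        "SELECT r.file, r.line FROM refs r "
        "JOIN symbols s ON s.id = r.symbol_id "
        "WHERE s.name = ? AND r.is_definition = 0 "
        "ORDER BY r.file, r.line",
        (name,),
    )
    return rows.fetchall()
```

A small web server would then only need to call something like `find_uses` per request; the indexing side (whatever walks the AST) is deliberately left out here.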

Nico

Hi Ehsan,

I have been working on Synopsis (http://synopsis.fresco.org), which started as a source code documentation tool (similar to doxygen), but with many more features. In particular, at some point we introduced an "SXR" mode (akin to LXR: http://lxr.linux.no). In contrast to LXR, however, Synopsis / SXR will use actual type and symbol information for the cross-referencing, so instead of just looking up via text matching, Synopsis supports C++ symbol lookup.

I have been working on a new version of Synopsis that uses libclang as its CPP/C/C++ parser, and hope to be able to announce a first version with that new frontend soon after an LLVM version is released that contains all the required bug fixes and feature additions to Clang that Synopsis depends on. (I'm right now working off LLVM trunk.)

I'd be happy to discuss details concerning what features you are interested in, to see how much of it is already supported, and how hard it is to implement the remainder.

Thanks,
         Stefan

PS: Unfortunately the Synopsis website is partly broken right now, since the old server was replaced and I haven't managed to set up and adjust all the services yet that it relies on.

Hi Stefan,

> I have been working on Synopsis (http://synopsis.fresco.org), which
> started as a source code documentation tool (similar to doxygen), but
> with many more features. In particular, at some point we introduced an
> “SXR” mode (akin to LXR: http://lxr.linux.no). In contrast to LXR,
> however, Synopsis / SXR will use actual type and symbol information for
> the cross-referencing, so instead of just looking up via text matching,
> Synopsis supports C++ symbol lookup.
>
> I have been working on a new version of Synopsis that uses libclang as
> its CPP/C/C++ parser, and hope to be able to announce a first version
> with that new frontend soon after an LLVM version is released that
> contains all the required bug fixes and feature additions to Clang that
> Synopsis depends on. (I’m right now working off LLVM trunk.)
>
> I’d be happy to discuss details concerning what features you are
> interested in, to see how much of it is already supported, and how hard
> it is to implement the remainder.

I had actually looked at Synopsis (not the libclang-based version, of course), but looking at the web site, it seemed to me that the project is not maintained any more. I’m happy to hear that is not the case. :-)

It would be really interesting for us to see your tool. It’s OK that you’re working off of LLVM/clang trunk, that’s where we are too! :-)

In terms of features, what I think we would need is a class library, a global function library, call graphs (list of callers, callees), type information (what’s the type of this variable), macro expansion information (what does the compiler see here?), macro instantiation sites (where in the code is this macro used?), inheritance information, and maybe some other things which I’m forgetting.

There’s this tool called dxr that some of the folks here at Mozilla have built on top of Dehydra/gcc <http://dxr.mozilla.org/>, but unfortunately it has a lot of rough edges, not enough documentation (or even people who can guide others through getting started), and it’s Mozilla specific. What we have in mind is replacing it with a tool which is better maintained and better suited to the needs of projects other than Mozilla as well.

I would love to contribute to SXR if we have the same general picture in our heads. :-)

> PS: Unfortunately the Synopsis website is partly broken right now, since
> the old server was replaced and I haven’t managed to set up and adjust
> all the services yet that it relies on.

Yes, this is very unfortunate. I think it would be a really good idea to add a note to the main page saying just that, so that others looking at it don’t get the impression that the project is dead. :-)

Cheers,

> I had actually looked at Synopsis (not the libclang-based version, of course), but looking at the web site, it seemed to me that the project is not maintained any more. I'm happy to hear that is not the case. :-)

I'll take a note to restore the server ASAP, to make it clear that the project is still alive. Sorry for that.

> It would be really interesting for us to see your tool. It's OK that you're working off of LLVM/clang trunk, that's where we are too! :-)

Fine. I'll take another note to check my work in and give you enough information to try it out. Let me follow up offline, unless anyone else here is interested in this.

> In terms of features, what I think we would need is a class library, a global function library, call graphs (list of callers, callees), type information (what's the type of this variable), macro expansion information (what does the compiler see here?), macro instantiation sites (where in the code is this macro used?), inheritance information, and maybe some other things which I'm forgetting.

I believe all of the above except call graphs are already supported. You can already ask where a symbol is used, and it will give you the source location. It won't generate a call graph, though, or tell you which symbols are used in a given scope. But that's a natural thing to add, and I'd be happy to see Synopsis grow support for this.
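
As a rough sketch of why this is a natural addition: once each call site is recorded as a (caller, callee) pair, both directions of the call graph fall out of a single pass. The pair format and the function names below are assumptions for illustration, not part of Synopsis.

```python
from collections import defaultdict


def build_call_graph(calls):
    """Build caller and callee maps from (caller, callee) pairs.

    `calls` holds one pair per call site; duplicates are harmless.
    """
    callees = defaultdict(set)  # function -> functions it calls
    callers = defaultdict(set)  # function -> functions that call it
    for caller, callee in calls:
        callees[caller].add(callee)
        callers[callee].add(caller)
    return callers, callees


def transitive_callers(callers, name):
    """Return every function that reaches `name` via one or more calls."""
    seen = set()
    stack = [name]
    while stack:
        current = stack.pop()
        # .get avoids creating empty entries in the defaultdict
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    return seen
```

For example, with call sites `[("main", "parse"), ("parse", "lex")]`, `transitive_callers(callers, "lex")` yields both `parse` and `main`. Virtual calls and calls through function pointers would need extra care that this sketch ignores.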

> There's this tool called dxr that some of the folks here at Mozilla have built on top of Dehydra/gcc <http://dxr.mozilla.org/>, but unfortunately it has a lot of rough edges, not enough documentation (or even people who can guide others through getting started), and it's Mozilla specific. What we have in mind is replacing it with a tool which is better maintained and better suited to the needs of projects other than Mozilla as well.
>
> I would love to contribute to SXR if we have the same general picture in our heads. :-)

Great. Let's talk a little more about that picture to see whether we can converge on something useful. What you are saying sounds exactly like the kind of tool I had been aiming for right from the start.

> > PS: Unfortunately the Synopsis website is partly broken right now, since
> > the old server was replaced and I haven't managed to set up and adjust
> > all the services yet that it relies on.
>
> Yes, this is very unfortunate. I think it would be a really good idea to add a note to the main page saying just that, so that others looking at it don't get the impression that the project is dead. :-)

OK. I'll do that if I can't fix it quickly.

I'll follow up in a private mail to send you some code, so you have something to look at while I'm fixing the website.

Thanks,
         Stefan

Stefan Seefeld writes:

[...]

> > It would be really interesting for us to see your tool. It's OK that
> > you're working off of LLVM/clang trunk, that's where we are too! :-)
>
> Fine. I'll take another note to check my work in and give you enough
> information to try it out. Let me follow up offline, unless anyone else
> here is interested in this.

I'm interested, too. (I must admit I'm not particularly interested in
web pages (LXR-type things), but that has value too.)

I'd have thought you checking in your code and giving a mention on
cfe-dev is perfectly reasonable.

[...]

Hi Ehsan, Stefan,

> We're thinking about building a web-based source code cross reference tool
> based on libclang, which uses type information, call graphs, etc. So before [...]

I am also thinking in that direction. Which DB do you mean? Nico dumps to sqlite; I am thinking about Sleepycat Berkeley DB XML. Actually, in my dreams, all internal representations in Clang and LLVM can be stored in and loaded from the DB (and besides visualization and queries, it would also be possible to research and develop transformation passes using XQuery before doing the real C++ implementation).

@Stefan: Which DB does Synopsys use?

Kind Regards,
Alek

Hi Alek,

Please note I'm talking about Synopsis, not Synopsys. The latter is an entirely different thing.
Synopsis right now isn't using a database at all. The SXR server simply loads its data from a Python pickle. That just means the tool hasn't yet been used with projects where size or lookup time was an issue. If that becomes a problem, we can certainly migrate to a real database backend with indexing capabilities.
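
To make the trade-off concrete, here is a minimal sketch of a pickle-backed index, assuming a simple mapping from symbol name to source locations. The class and method names are hypothetical and do not reflect how Synopsis actually lays out its data.

```python
import pickle


class PickleIndex:
    """In-memory cross-reference index serialized as one pickle blob."""

    def __init__(self):
        self.uses = {}  # symbol name -> list of (file, line)

    def add_use(self, name, file, line):
        self.uses.setdefault(name, []).append((file, line))

    def find_uses(self, name):
        return sorted(self.uses.get(name, []))

    def to_bytes(self):
        """Serialize the whole index; a server would write this to a file."""
        return pickle.dumps(self.uses, protocol=pickle.HIGHEST_PROTOCOL)

    @classmethod
    def from_bytes(cls, data):
        """Rebuild the index. Only unpickle data you produced yourself."""
        index = cls()
        index.uses = pickle.loads(data)
        return index
```

The whole index is read into memory at once, which is simple and fast for small projects but is exactly what stops scaling with size. Keeping lookups behind a method like `find_uses` would leave room to swap in a database later without touching the callers.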

Regards,
         Stefan

While I'm unfortunately still trying to get hold of a sysadmin to help me resurrect the Synopsis website / infrastructure, I did make quite a bit of progress on the Synopsis code, as far as integrating with libclang is concerned. The current state of this work can be accessed via http://synopsis.fresco.org/svn/synopsis/trunk/.

I'm developing on a Linux distribution. I haven't attempted to build on Windows for quite a while, and never worked on OSX. This is Free software, so patches as well as other help are certainly welcome.

If you want to discuss any related topic, please join me on irc://irc.oftc.net/synopsis or the mailing list at synopsis-devel@lists.fresco.org.

Thanks,
         Stefan

Interesting discussion.
I think a cross-reference feature is useful not only in the kind of tool you're going to create; it would also be a nice feature for an IDE.

asmwarrior
ollydbg from codeblocks' forum

> If you want to discuss any related topic, please join me on [...]

In the long term, I really believe that we can achieve great results once we manage to store Clang's internal representations in a powerful DB engine with an appropriate query language (as I mentioned before, XQuery seems the best candidate to me [*]).

But until then, let's use the current Synopsis/SXR: have you tried to apply your fresh Clang-based version to the LLVM trunk itself? If you give me some instructions, I can also start a public Synopsis/SXR service for the Clang/LLVM codebase, as a test bench.

Kind Regards,
Alek

[*] Because of its natural data model (a namespaced tree with attributes) and the existing well-optimized open source implementations. Over time, various query components (functions in XQuery) on top of the LLVM representations can be accumulated as reusable XQuery modules.
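
A toy sketch of the tree-with-attributes idea, with two loud assumptions: the XML element and attribute names are invented for this example (Clang does not emit this format), and Python's `xml.etree.ElementTree` with its limited XPath subset stands in for a real XQuery engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML rendering of a tiny class hierarchy.
AST_XML = """
<translation_unit file="shapes.cpp">
  <class name="Shape"/>
  <class name="Circle">
    <base name="Shape"/>
    <method name="area" type="double () const"/>
  </class>
  <class name="Square">
    <base name="Shape"/>
  </class>
</translation_unit>
"""


def derived_classes(root, base_name):
    """Names of classes that list `base_name` as a direct base."""
    result = []
    for cls in root.iter("class"):
        # Predicate on an attribute of a direct child element.
        if cls.find(f"base[@name='{base_name}']") is not None:
            result.append(cls.get("name"))
    return result


root = ET.fromstring(AST_XML)
```

Here `derived_classes(root, "Shape")` walks the tree and returns the classes with a matching `<base>` child. Note that the query matches on the spelled name only, so it says nothing about whether a generic query language can express real C++ lookup rules.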

P.S. Here is one specific query language, mentioned only as an illustration of the approach:

(unfortunately it is closed source, implemented as a compiler to SQL)

> > If you want to discuss any related topic, please join me on [...]
>
> In the long term, I really believe that we can achieve great results once we manage to store Clang's internal representations in a powerful DB engine with an appropriate query language (as I mentioned before, XQuery seems the best candidate to me [*]).

I'm not convinced at all. Isn't that just another flavor of the same old "XML solves all your problems" argument that was so popular ten years ago? How do you express domain-specific semantics, such as C++ symbol lookup, in a generic query language?

As I mentioned, there is no reason not to scale Synopsis up to using a different backend to store its IR in (such as a database), if that will indeed improve performance. Let's cross that bridge when we come to it...

> But until then, let's use the current Synopsis/SXR: have you tried to apply your fresh Clang-based version to the LLVM trunk itself? If you give me some instructions, I can also start a public Synopsis/SXR service for the Clang/LLVM codebase, as a test bench.

I'm afraid I'm not quite there yet, though it shouldn't be too long. I still need to figure out how to configure system header search paths, system macros, etc. Before clang, I used some heuristics based on installed system compilers (GCC, MSVC). I'm not sure yet how clang is to be used in that respect, i.e. whether it figures any of this out on its own, or whether I need to feed all of it in explicitly.
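
One possible heuristic, sketched under assumptions: GCC-compatible drivers, clang included, print their header search list to stderr when run with `-E -v`, delimited by the lines `#include <...> search starts here:` and `End of search list.`. The helper names below are made up, and whether libclang needs these paths passed explicitly is exactly the open question above.

```python
import subprocess


def parse_search_paths(verbose_output):
    """Extract <...> include directories from `-E -v` driver output."""
    paths = []
    in_list = False
    for line in verbose_output.splitlines():
        if line.startswith("#include <...> search starts here:"):
            in_list = True
        elif line.startswith("End of search list."):
            break
        elif in_list and line.startswith(" "):
            path = line.strip()
            # macOS drivers also list framework directories; skip those,
            # since they are not plain include directories.
            if not path.endswith("(framework directory)"):
                paths.append(path)
    return paths


def query_compiler(compiler="clang"):
    """Ask an installed driver for its C++ system include directories."""
    result = subprocess.run(
        [compiler, "-x", "c++", "-E", "-v", "-"],
        input="",
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_search_paths(result.stderr)
```

Predefined macros can be collected in a similar way from the output of `-dM -E`, which prints one `#define` line per macro.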

That being said, by all means come and play with the code and provide feedback. Nothing will help more to get things done!
I'll gladly help you get started...

Regards,
         Stefan