diagnostics database for multi-file build?

I’m interested in a way to collect unique errors/diagnostics over multiple files in a single build. Does such a thing exist? e.g. some kind of database that could be shared between multiple concurrent instances of clang which would allow me to review all unique errors across an entire code base, ideally in real time but perhaps once the build completes would be okay to start with too. Thanks.

Hi Samuel,

What do you mean by "unique"? Do you mean something like the following? (Clang writes diagnostics to standard error, hence the 2>&1.)

clang -Wall -pedantic -Wextra -c *.c 2>&1 |
  perl -pwle 's/^\S+?:(\d+:)?(\d+:)?\s*//' | sort | uniq -c

Hi Csaba, thanks for the quick reply.

I have a build system which builds multiple files concurrently.

Because of this, the diagnostics get interleaved and it is hard to understand the output.

Additionally, I’d like to machine-parse the diagnostics with a plugin for the Atom text editor so I can highlight lines, present a useful list of errors, provide navigation, etc.

What you’ve suggested looks good if you can invoke clang just once, or if you invoke it multiple times, collect all the output, and then merge it together. I was hoping there might be a simpler solution, e.g.

rm database.sqlite3
clang -c foo.c --diagnostic-database=database.sqlite3
clang -c bar.c --diagnostic-database=database.sqlite3
clang -c baz.c --diagnostic-database=database.sqlite3

Then, finally, I could inspect database.sqlite3 to see a list of all diagnostics. Ideally, you’d still get the colourized PTY output too - the database would be an additional option that coalesces unique errors over a large code base and gives a structured format for the diagnostics that could be processed by an editor or other tool.

Ideally this would be an open standard, e.g. this line, this character range, and could be used across multiple tools, e.g. gcc, python, ruby, etc.
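
As far as this thread establishes, no --diagnostic-database option exists, but the idea can be approximated outside the compiler. The following is a minimal sketch, assuming the compiler prints diagnostics in the usual "file:line:col: severity: message" form and that some wrapper script captures that text for each invocation. The names open_db and record and the table layout are hypothetical, chosen only for illustration. A UNIQUE constraint together with INSERT OR IGNORE is what collapses repeated diagnostics, for example a warning in a header that is included by many source files.

```python
import re
import sqlite3

# Matches the usual "file:line:col: severity: message" diagnostic line.
# Assumption: paths contain no ':' (so Windows drive letters are not handled).
DIAG_RE = re.compile(
    r"^(?P<file>[^:\n]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<severity>fatal error|error|warning|note): (?P<message>.*)$"
)


def open_db(path):
    """Open (or create) the shared diagnostics database (hypothetical schema)."""
    # The timeout lets concurrent wrapper processes wait for the write lock.
    db = sqlite3.connect(path, timeout=30)
    db.execute(
        """CREATE TABLE IF NOT EXISTS diagnostics (
               file TEXT, line INTEGER, col INTEGER,
               severity TEXT, message TEXT,
               UNIQUE (file, line, col, severity, message))"""
    )
    return db


def record(db, compiler_output):
    """Store every diagnostic found in compiler_output; return how many were parsed.

    Rows that are already present are silently skipped by INSERT OR IGNORE.
    """
    rows = [
        (m["file"], int(m["line"]), int(m["col"]), m["severity"], m["message"])
        for m in map(DIAG_RE.match, compiler_output.splitlines())
        if m
    ]
    with db:  # one transaction per compiler invocation
        db.executemany(
            "INSERT OR IGNORE INTO diagnostics VALUES (?, ?, ?, ?, ?)", rows
        )
    return len(rows)
```

A wrapper would call record(open_db("database.sqlite3"), captured_text) once per compiler invocation, and an editor plugin could then query the diagnostics table. Source snippets and caret lines do not match the pattern, so they are dropped; only the location, severity, and message are kept.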

Hi Samuel,
If your build system uses GNU make, the -O (--output-sync) flag in version 4 and above
might help: it groups the output of each parallel job instead of interleaving it.

How are you invoking clang? Ninja solves this by redirecting the standard output and standard error of each tool invocation to a pipe that it then collects, which allows it to present each tool's output separately. If you want to extract unique errors, then simply filtering the results on file and line number would probably be enough (though, in general, it's a bad idea, because the same line of code can produce different errors depending on what was included before it).
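
The pipe-per-invocation approach described above can be sketched as follows. This is only an illustration of the general technique, not Ninja's actual implementation; run_captured and build are hypothetical names, and the commands are whatever compiler invocations the build system would have run anyway. Each command's standard output and standard error go into one pipe, and the collected text is printed as a single block once that command finishes, so output from concurrent jobs never interleaves.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_captured(cmd):
    """Run one command, collecting stdout and stderr through a single pipe."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same pipe
        text=True,
    )
    return cmd, proc.returncode, proc.stdout


def build(commands, jobs=4):
    """Run commands concurrently; print each one's output as one block.

    Returns a list of (cmd, returncode, output) tuples in completion order.
    """
    results = []
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = [pool.submit(run_captured, cmd) for cmd in commands]
        for fut in as_completed(futures):
            cmd, code, output = fut.result()
            if output:
                # Only this thread prints, so blocks cannot interleave.
                print(f"--- {' '.join(cmd)} (exit {code})")
                print(output, end="")
            results.append((cmd, code, output))
    return results
```

For example, build([["clang", "-c", "foo.c"], ["clang", "-c", "bar.c"]]) would print the diagnostics for foo.c and bar.c as two separate blocks. The captured text in the returned tuples could then be filtered or de-duplicated before being handed to an editor.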