@jayfoad, thanks for drawing my attention to this!
I don’t have a strong opinion on whether it’s more maintainable to put this mode into Tablegen proper, or to keep it at arm’s length as a consumer of the JSON. I can see pros and cons either way, assuming this is intended to end up in-tree one way or the other.
(If it’s not, then the JSON approach has all the advantages, because it saves some poor person from having to maintain a downstream patch on the Tablegen C++ code forever.)
But I’d have no objection to putting a few more pieces of information into the JSON output, if they’re useful to somebody. Source locations in particular seem as if they’d be useful to other users too, because one of the uses of -dump-json is that it lets you write your own completely new Tablegen backends without having to compile them into the Tablegen binary itself (e.g. if you’re trying to do something that can never be upstreamed, or a “build one to throw away” level of rapid prototype). And Tablegen backends certainly have a legit need to report semantic errors in the input, and reporting them with a source location is more useful to the end user.
(The other use of -dump-json is for people doing auxiliary analysis on data that is also being consumed by one of the existing built-in backends, such as extracting a target’s list of instructions, or the list of clang options, and one or two specific facts about each one. For that use, error reporting isn’t so critical, because the existing backend that consumes the same data has surely checked its semantic consistency already. That’s why source locations aren’t already in the JSON output.)
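To make that second kind of use concrete, here is a minimal sketch in Python of pulling out every def of one class plus a single fact about each. It assumes the top-level layout described in the JSON backend documentation as I understand it: an !instanceof map from class name to def names, and one object per def. The sample data, the Size field, and the helper names defs_of_class and extract_field are all invented for illustration; a real script would load the output of llvm-tblgen -dump-json instead of the embedded string.

```python
import json

# Hypothetical, hand-written stand-in for real -dump-json output.
# The record names and the "Size" field are invented for this sketch.
SAMPLE = """
{
  "!tablegen_json_version": 1,
  "!instanceof": {
    "Instruction": ["ADD", "SUB"],
    "Register": ["R0"]
  },
  "ADD": {"!name": "ADD", "!anonymous": false,
          "!superclasses": ["Instruction"], "!fields": [], "Size": 4},
  "SUB": {"!name": "SUB", "!anonymous": false,
          "!superclasses": ["Instruction"], "!fields": [], "Size": 4},
  "R0":  {"!name": "R0", "!anonymous": false,
          "!superclasses": ["Register"], "!fields": []}
}
"""


def defs_of_class(dump, class_name):
    """Return {def name: record} for every def derived from class_name."""
    names = dump.get("!instanceof", {}).get(class_name, [])
    return {name: dump[name] for name in names}


def extract_field(dump, class_name, field):
    """Return {def name: field value}, using None where the field is absent."""
    records = defs_of_class(dump, class_name)
    return {name: rec.get(field) for name, rec in records.items()}


dump = json.loads(SAMPLE)
sizes = extract_field(dump, "Instruction", "Size")
for name, size in sorted(sizes.items()):
    print(f"{name}: Size={size}")
```

Note that the sketch does no validation of its own: it quietly returns None for a missing field, which is tolerable only because a built-in backend is assumed to have checked the same records already.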
One of the early drafts of -dump-json actually generated a lot more information than the final version does: it had basically anything you could find in -print-records, including all the partially specified parametric expressions in the class definitions. My reasoning was that that way I was sure that anything you were previously doing by fragile text-matching on the output of -print-records would be possible to do more reliably by consuming the JSON.
But code review suggested cutting down the data to something much less ambitious, partly because the full version would have been huge. So if we’re going to add more things to the JSON, we should keep it to only the things someone actually has a use for.
(Also, I’m not sure even the original draft of -dump-json would have included the information about immediate class ancestry, because I don’t think -print-records shows it either.)