[analyzer] Retrieving macro expansions in the plist output


I’m currently trying to implement a new node in the plist output that would show the expansion of a macro, like this:

void setToNull(int **vptr) {
  *vptr = nullptr;
}

void print(void*);

#define TO_NULL(x) \
  print(x);        \
  setToNull(x)

#define DOES_NOTHING(x) \
  {                     \
    int b;              \
    b = 5;              \
  }                     \
  print(x)

#define DEREF(x)   \
  DOES_NOTHING(x); \
  *x

void f() {
  int *a = new int(5);
  TO_NULL(&a);
  DEREF(a) = 5;
}

For this code, two PathDiagnosticMacroPieces should be generated, and a message like this should be displayed:
Expanding macro ‘TO_NULL’ to ‘print(&a); setToNull(&a)’
Expanding macro ‘DEREF’ to ‘{ int sajt; sajt = 5; } print(a); *a’
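
To make the expected messages concrete, here is a minimal, self-contained sketch of a toy expander. It is not how Clang's preprocessor works and uses no Clang API: `Macro`, `MacroTable`, `expand`, `describe` and `exampleMacros` are hypothetical names, the macro bodies are assumptions inferred from the messages above, and only single-parameter function-like macros are handled (no `#`, `##`, variadics, or recursion guard).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical toy model: a function-like macro with exactly one parameter.
struct Macro {
  std::string Param; // parameter name, e.g. "x"
  std::string Body;  // replacement text
};
using MacroTable = std::map<std::string, Macro>;

static bool isIdentChar(char C) {
  return std::isalnum(static_cast<unsigned char>(C)) || C == '_';
}

// Reads the identifier-like word starting at Text[I] and advances I past it.
static std::string readWord(const std::string &Text, std::size_t &I) {
  std::size_t Start = I;
  while (I < Text.size() && isIdentChar(Text[I]))
    ++I;
  return Text.substr(Start, I - Start);
}

// Replaces every whole-word occurrence of the parameter in the body with Arg.
static std::string substitute(const Macro &M, const std::string &Arg) {
  assert(!M.Param.empty() && "toy macros must name their parameter");
  std::string Out;
  for (std::size_t I = 0; I < M.Body.size();) {
    if (!isIdentChar(M.Body[I])) {
      Out += M.Body[I++];
      continue;
    }
    std::string Word = readWord(M.Body, I);
    Out += (Word == M.Param) ? Arg : Word;
  }
  return Out;
}

// Expands every NAME(arg) invocation in Text, including nested ones.
// There is no guard against self-referential macros in this sketch.
std::string expand(const std::string &Text, const MacroTable &Macros) {
  std::string Out;
  for (std::size_t I = 0; I < Text.size();) {
    if (!isIdentChar(Text[I])) {
      Out += Text[I++];
      continue;
    }
    std::string Word = readWord(Text, I);
    auto It = Macros.find(Word);
    if (It == Macros.end() || I == Text.size() || Text[I] != '(') {
      Out += Word; // not a macro invocation
      continue;
    }
    // Find the ')' that matches the '(' at Text[I].
    std::size_t End = I + 1;
    for (int Depth = 1; End < Text.size(); ++End) {
      if (Text[End] == '(')
        ++Depth;
      else if (Text[End] == ')' && --Depth == 0)
        break;
    }
    if (End == Text.size()) {
      Out += Word; // unbalanced parentheses: leave the text alone
      continue;
    }
    std::string Arg = expand(Text.substr(I + 1, End - I - 1), Macros);
    Out += expand(substitute(It->second, Arg), Macros);
    I = End + 1;
  }
  return Out;
}

// Formats a message in the style shown above (with ASCII quotes).
std::string describe(const std::string &Name, const std::string &Invocation,
                     const MacroTable &Macros) {
  return "Expanding macro '" + Name + "' to '" + expand(Invocation, Macros) +
         "'";
}

// Macro bodies assumed for illustration; they are inferred from the messages.
MacroTable exampleMacros() {
  return {
      {"TO_NULL", {"x", "print(x); setToNull(x)"}},
      {"DOES_NOTHING", {"x", "{ int b; b = 5; } print(x)"}},
      {"DEREF", {"x", "DOES_NOTHING(x); *x"}},
  };
}
```

Calling `describe("TO_NULL", "TO_NULL(&a)", exampleMacros())` yields a string in the shape of the first message; nested macros such as `DEREF` are expanded all the way down, which is the property the proposed plist node would need.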
I’ve made some progress on this issue, but here are the problems I faced.

Currently, HTML file generation supports macro expansions fairly well; however, the code that achieves this seems to be held together by sheer luck.
As I understand it, the entire file is re-lexed and preprocessed, during which macro expansions are added to the HTML file. I attempted to reuse that code without having to re-lex everything, but so far my efforts have had little success. I also found that any modification of this code quickly leads to instabilities, although this part of Clang is still somewhat of a mystery to me, so I’d concede that this may be my fault.

I also fear that since the HTML output is not as widely used as other outputs and may lack rigorous testing, it could be crash-prone.

Do you know of any way to obtain the expansion of a macro expression that doesn’t involve such abuse of the preprocessor? Something like this would be ideal:

SourceLocation Loc = /* PathDiagnosticMacroPiece location */;



I also used the HTMLRewrite code as a base for expanding macros. Maybe this helps as an additional data point for achieving what you want: lists.llvm.org/pipermail/cfe-dev/2018-July/058489.html

Unfortunately I didn’t receive much feedback and couldn’t solve the issue with macro argument expansion. That issue shouldn’t affect your use-case.

One thing I improved on was to use a new Preprocessor instead of reusing the existing one. This was necessary because I needed recursion. Even though the code was indeed fragile, there don’t seem to be any API misuses and in the end it was stable for me.


I guess nobody did this before because plists were supposed to be used by IDEs, and IDEs were supposed to have their own jump-to-definition functionality (in this case, macro definition).

I don’t know that much about macros and source locations, but i suspect it should indeed be possible to obtain the expanded text by traversing spelling and expansion locations, though probably sometimes scanning token-by-token is inevitable. To state the obvious, there are a lot of useful methods in the SourceManager, but it’s often unobvious how to combine them correctly :confused:

I just wanted to say that it would be really amazing if HTML output knew how to show the macros properly (I guess it might be less critical for plists, as Artem says).
This is definitely one of my top complaints about HTML reports; I’ve tried working on it in the past, but other work took priority.

Thanks for the great responses!

I’ve had quite a few conversations with my colleagues and I think I’ll take another direction with this, one that doesn’t involve plists. It’s a great point that macro expansions shouldn’t be handled by the analyzer, although jump-to-definition isn’t an ideal solution to my (our) problems, as macro definitions are often very hard to read because they contain references to various other macros.



George Karpenkov <ekarpenkov@apple.com> wrote (on Sat, Aug 18, 2018, 2:03):

I guess nobody did this before because plists were supposed to be used by IDEs, and IDEs were supposed to have their own jump-to-definition functionality (in this case, macro definition).

Is it correct that plists are only meant for IDEs? IDEs cannot always know which compilation action was used to analyze the TU. Especially if the analysis (or the compilation) is made for multiple targets (32-64 bits etc).

I’ve often been made aware of user complaints that the Static Analyzer found a false positive, but it was often virtually impossible to trace back what really happened: the cluster of macros was so hard to follow that not even jump-to-definition was a satisfactory solution.

I think it would be valuable to show the macro expansion as the Static Analyzer expands it. However, I’m a little uncertain as to how I’d implement this. There are two options that came to my mind:

  • Add macro expansion info into the plist output (behind an optional flag, and maybe a macros-as-events flag too). This would easily be the best solution for my needs, but seems to be hard to implement well. It also depends on other consumers of the plist output whether there’s an actual desire for this feature other than mine.

  • Develop a standalone clang tool that would create a new file with the macro expansions. This would be the least invasive solution for plist file generation, but also the least comfortable for my needs.

My main question is, would you be fine with adding a new macro node to the plist output?
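
For discussion purposes, here is a rough sketch of what such a node could look like when serialized. The key names (`line`, `col`, `name`, `expansion`) and the `MacroExpansionEntry` type are purely hypothetical and are not part of any existing plist schema; the only point is that the expansion text has to be XML-escaped, since expansions such as `print(&a)` contain characters that are special in XML.

```cpp
#include <string>

// Hypothetical record for one expanded macro; not an existing Clang type.
struct MacroExpansionEntry {
  unsigned Line;
  unsigned Col;
  std::string Name;
  std::string Expansion;
};

// Escapes the characters that are special in XML text content.
std::string escapeXML(const std::string &S) {
  std::string Out;
  for (char C : S) {
    switch (C) {
    case '&':
      Out += "&amp;";
      break;
    case '<':
      Out += "&lt;";
      break;
    case '>':
      Out += "&gt;";
      break;
    default:
      Out += C;
    }
  }
  return Out;
}

// Serializes one entry as a plist <dict>; all key names are assumptions.
std::string toPlistDict(const MacroExpansionEntry &E) {
  std::string Out = "<dict>\n";
  Out += " <key>line</key><integer>" + std::to_string(E.Line) + "</integer>\n";
  Out += " <key>col</key><integer>" + std::to_string(E.Col) + "</integer>\n";
  Out += " <key>name</key><string>" + escapeXML(E.Name) + "</string>\n";
  Out += " <key>expansion</key><string>" + escapeXML(E.Expansion) +
         "</string>\n";
  Out += "</dict>\n";
  return Out;
}
```

Because the entry is just an extra dictionary, consumers that do not know about it could simply ignore it, which is why an optional flag seems sufficient.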



Kristóf Umann <dkszelethus@gmail.com> wrote (on Mon, Aug 27, 2018, 17:11):

I mean, i understand that you’re trying to extend the format for a different but completely valid use case, and there’s nothing wrong about it, but i just shared my random guess on why was it not important to support, originally, in a historical perspective.

That sounds strange to me. IDEs are assumed to be responsible for both compilation and analysis (otherwise what does “I” stand for?), so they definitely do know this stuff. And it’s clear that in our case different build targets would produce different analysis results, so any IDE that integrates the analyzer would need to keep track of that.

Users should either send preprocessed files with run-lines (a-la what scan-build automatically dumps; possibly preprocessed with -frewrite-includes to preserve macros), or very clear instructions on how to build their code. Otherwise there are tons of other reasons why you would be unable to reproduce the bug. One of the most infuriating problems with reproducing bugs is hitting various complexity thresholds in the analyzer. Eg., contains an epic story of how in order to reproduce the bug i needed to hit a threshold with ±0.08% precision. It’s great that we have determinism, right?

I would probably not be using it in the foreseeable future, but i’m definitely not opposed to it. It seems that plists are quite reliable, i.e. adding more keys to their dictionaries usually doesn’t break existing stuff. Because George recently observed that generating plists might be a relatively time-consuming operation (not as slow as generating htmls but still noticeable in some cases), it might be good to keep it under the flag; i’m not fully sure that these observations are accurate, so if they aren’t confirmed, i’m fine with having these dumps without a flag.

Are you planning to dump all macro expansions, or only expansions around diagnostic pieces?

Are you planning to dump all macro expansions, or only expansions around diagnostic pieces?

Sorry for the late reply. It simply took this long to make a functioning prototype, so I didn’t know whether dumping all macro expansions or just the related ones would come down to a simple if branch or a completely different approach. Right now it looks like (well, it’s mostly already decided) only related macro expansions will be dumped.

I’m planning to upload a patch once I can prettify the current code in the coming days! :slight_smile:



Mm, actually you may want to use the middle-ground solution of referring to George's coverage dumps.

Can you elaborate? I’ve been trying to closely follow the patches and the overall discussion around the analyzer over the last couple of months, but I don’t really see how macro expansions could be acquired from that.

I mean, not macros but the information on which macros are of interest to the user.

You know which lines of code were executed, so you can figure out which macros the user may want to look into.

The downside is that it’s a bit counter-intuitive for the user, i.e. it’ll be unobvious why some macros are expanded and others are not.
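
The idea above can be sketched as a simple filter. This is only an illustration under assumed data shapes: `MacroUse` and `macrosOnExecutedLines` are hypothetical names, and the set of executed lines stands in for whatever the coverage dumps actually provide.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical record: a macro invocation and the source line it appears on.
struct MacroUse {
  std::string Name;
  unsigned Line;
};

// Keeps only the macro invocations that sit on lines the analyzer executed.
std::vector<MacroUse>
macrosOnExecutedLines(const std::vector<MacroUse> &Uses,
                      const std::set<unsigned> &ExecutedLines) {
  std::vector<MacroUse> Result;
  for (const MacroUse &U : Uses)
    if (ExecutedLines.count(U.Line) != 0)
      Result.push_back(U);
  return Result;
}
```

A macro on a line that the analyzer never visited is silently dropped, which is exactly the counter-intuitive behaviour mentioned above.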

That sounds good, but I think dumping the macro expansions as proposed would be a lot better for our needs as of now.
But then again, why not both after this one?