Improving -Wdocumentation


I recently started working on the clang compiler, beginning with
improvements to the -Wdocumentation option. I like this option a lot,
but unfortunately it produces many false positives. It also seems that
the code is no longer actively maintained. My main issue is the
detection of empty paragraphs: it reports false positives and also
misses some paragraphs that really are empty. So I have been working on
some patches to improve the empty paragraph detection.

I noticed the parsed comment is stored in the AST and the AST can be
written to XML and HTML files. I wonder whether this output is
considered a part of clang's interface and whether changes to this output
are allowed.

For example, the following code:
/** @pre @em foo of @ref bar "foo" is @b foobar */
void foo();

Results in the following AST:
`-FunctionDecl 0x55dd786dafc8 <, col:10> col:6 foo 'void ()'
  `-FullComment 0x55dd786db4c0 <line:186:4, col:49>
    |-ParagraphComment 0x55dd786db340 <col:4>
    | `-TextComment 0x55dd786db310 <col:4> Text=" "
    |-BlockCommandComment 0x55dd786db360 <col:5, col:20> Name="pre"
    | `-ParagraphComment 0x55dd786db440 <col:9, col:20>
    |   |-TextComment 0x55dd786db390 <col:9> Text=" "
    |   |-InlineCommandComment 0x55dd786db3e0 <col:10, col:12> Name="em" RenderEmphasized Arg[0]="foo"
    |   `-TextComment 0x55dd786db400 <col:17, col:20> Text=" of "
    `-VerbatimLineComment 0x55dd786db460 <col:21, col:49> Text=" bar "foo" is @b foobar "

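The text starting at @ref is not part of the paragraph and is not
'attached' to the @pre command. Instead, the rest of the comment is
parsed as a single line-based comment (the VerbatimLineComment above),
so the @b inside it is not recognised as a command.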

After my changes the AST becomes:
`-FunctionDecl 0x55f46f20f7e8 <, col:10> col:6 foo 'void ()'
  `-FullComment 0x55f46f20ff20 <line:187:4, col:48>
    |-ParagraphComment 0x55f46f20fc80 <col:4>
    | `-TextComment 0x55f46f20fc50 <col:4> Text=" "
    `-BlockCommandComment 0x55f46f20fcb0 <col:5, col:48> Name="pre"
      `-ParagraphComment 0x55f46f20fee0 <col:9, col:48>
        |-TextComment 0x55f46f20fce0 <col:9> Text=" "
        |-InlineCommandComment 0x55f46f20fd30 <col:10, col:12> Name="em" RenderEmphasized Arg[0]="foo"
        |-TextComment 0x55f46f20fd60 <col:17, col:20> Text=" of "
        |-ReferenceCommandComment 0x55f46f20fdd0 <col:21, col:24> Name="ref" Link="bar" Text="foo"
        |-TextComment 0x55f46f20fe00 <col:35, col:38> Text=" is "
        |-InlineCommandComment 0x55f46f20fe50 <col:39, col:40> Name="b" RenderBold Arg[0]="foobar"
        `-TextComment 0x55f46f20fe80 <col:48> Text=" "

The @ref is now represented by a new node kind, 'ReferenceCommandComment',
and is part of the paragraph. The @b is also recognised again. The new
result is closer to how Doxygen renders the paragraph. The change also
fixes false positives such as:

/** @pre @ref Init is called before using this function. */
void foo();

Currently the paragraph of @pre is considered empty, because everything
from @ref onward is split off from it. After the change it is no longer
considered empty.
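
To illustrate why the split causes the false positive, here is a minimal
sketch of an emptiness check over a toy model of the comment tree. The
types and functions (Node, Paragraph, isEmptyParagraph, beforeChange,
afterChange) are hypothetical and are not clang's classes or my actual
patch. It assumes a paragraph is empty when all of its children are
whitespace-only text, and that @ref takes "Init" as its link in the
example above.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Toy stand-ins for comment AST nodes; these are NOT clang's classes.
enum class Kind { Text, InlineCommand, Reference };

struct Node {
  Kind kind;
  std::string text; // Text payload, or the command's argument.
};

// A paragraph is modelled as a flat list of inline children.
using Paragraph = std::vector<Node>;

// True if the string contains only whitespace characters (or is empty).
bool isBlank(const std::string &s) {
  for (unsigned char c : s)
    if (!std::isspace(c))
      return false;
  return true;
}

// Assumed rule: a paragraph is empty when every child is a whitespace-only
// Text node. Any command node (such as @em or @ref) makes it non-empty.
bool isEmptyParagraph(const Paragraph &p) {
  for (const Node &n : p)
    if (n.kind != Kind::Text || !isBlank(n.text))
      return false;
  return true;
}

// Paragraph of "@pre @ref Init is called ..." as parsed today: the text
// from @ref onward is split off, leaving only the leading space.
Paragraph beforeChange() { return {{Kind::Text, " "}}; }

// The same paragraph once @ref is kept inside it as an inline node.
Paragraph afterChange() {
  return {{Kind::Text, " "},
          {Kind::Reference, "Init"},
          {Kind::Text, " is called before using this function. "}};
}

// Usage: the first paragraph is reported as empty, the second is not.
void checkModel() {
  assert(isEmptyParagraph(beforeChange()));
  assert(!isEmptyParagraph(afterChange()));
}
```

In this model the check itself is unchanged; only the shape of the tree
it sees differs, which is why the fix belongs in the parsing of @ref and
not in the emptiness test.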

The changes to the AST will result in changes to the XML and HTML output.
(I haven't started working on this part yet, so I have no examples.) Is
this a problem? If so, what would be the best way forward?

Mark de Wever