[PATCH] Documentation parsing: allow some commands to have multiple paragraphs attached


Fariborz and I would like to propose a change to our current comment parsing
model to allow multi-paragraph parameter and return value descriptions.

Please take a look and tell us what you think.


comment-parsing-allow-multiple-paragraph-args-v1.patch (78.9 KB)

Hi Dmitri,


Doxygen works differently than what you propose.
Doxygen has commands that have "top level" scope, like \param and \returns.
These commands:
- automatically end a brief description
- stop at the end of the paragraph (if the command has paragraph scope),
  or at the next command with top level scope, whichever comes first.
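
The scoping rule above can be sketched roughly as follows. This is only an illustration, not Doxygen's implementation: it assumes the leading "///" has already been stripped from each line, that an empty string marks a paragraph break, and that \param and \returns are the only top-level commands. The names Section, startsTopLevelCommand and splitParagraphScoped are made up for the example, and brief-description handling is left out.

```cpp
#include <string>
#include <vector>

// One top-level piece of a comment: the command that opened it
// ("" for plain description text) and the lines attached to it.
struct Section {
    std::string command;
    std::vector<std::string> lines;
};

// Hypothetical helper: does the line open with a command that has
// top-level scope in this sketch (\param or \returns)?
bool startsTopLevelCommand(const std::string &line) {
    return line.rfind("\\param", 0) == 0 || line.rfind("\\returns", 0) == 0;
}

// A section ends at the end of its paragraph or at the next top-level
// command, whichever comes first.
std::vector<Section> splitParagraphScoped(const std::vector<std::string> &lines) {
    std::vector<Section> sections;
    bool open = false;  // is the last section still accepting lines?
    for (const std::string &line : lines) {
        if (line.empty()) {  // paragraph break closes the current section
            open = false;
            continue;
        }
        if (startsTopLevelCommand(line)) {
            sections.push_back({line.substr(0, line.find(' ')), {line}});
            open = true;
        } else {
            if (!open) {  // text after a paragraph break is plain description
                sections.push_back({"", {}});
                open = true;
            }
            sections.back().lines.push_back(line);
        }
    }
    return sections;
}
```

Feeding it "\param x1 Aaa.", "Aaa. Aaa.", an empty line, "Bbb." and "\param x2 Ccc." gives three sections: the first \param with two lines, a plain description section holding "Bbb.", and the second \param.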

I've recently introduced \parblock .. \endparblock to deal with the
case where a user actually wants to write multiple paragraphs at a place
where a single paragraph is expected.


Case 1

/// \param x1 Aaa.
/// Aaa. Aaa.
///
/// Bbb.
/// \param x2 Ccc.
void doSomething(int x1, int x2);

In this case, the user most likely intended "Bbb" to be a second paragraph of
the \param x1. But our current parsing model only allows a single paragraph to
be attached to a block command, so we treat "Bbb" as a part of *function
description*. Because this is the first paragraph of the function description,
"Bbb" also becomes the brief description.

Doxygen treats Bbb as part of the detailed description, since the first \param has already
ended the brief description. Bbb is not part of the \param's documentation since
it is in the next paragraph. To get the behaviour you describe a user should write:

/// \param x1
/// \parblock
/// Aaa. Aaa. Aaa.
///
/// Bbb.
/// \endparblock
/// \param x2 Ccc.

Case 2

/// \returns
/// \li Foo, or
/// \li EnchancedFoo.
Foo *makeFoo();

In this case, \returns and \li are block commands, so \returns has an empty
paragraph, and \li points are separate from \returns and become a part of the
function description. Furthermore, "Foo, or" becomes the brief function
description.

Proposed change to parsing model

\param, \tparam and \returns commands consume paragraphs and block commands
until we hit a command that is only allowed to appear at the top level. For
example:

/// \returns Either:
/// \li Foo, or
/// \li EnchancedFoo.
/// \param isEnchanced Aaa.
Foo *makeFoo(bool isEnchanced);

Everything starting from "Either:" until "\param" -- one paragraph and two \li
commands -- is attached to the \returns command as child nodes. Because \param is a
top-level-only command, we stop attaching children to \returns and return to
top-level at that point.
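
A minimal sketch of this rule, under some loud assumptions: the "///" prefix is already stripped, an empty string stands for a paragraph break, and the top-level-only set is just \param, \tparam, \returns, \brief and \details. For brevity every top-level-only command collects children the same way here, which the actual patch may not do. CommandBlock, isTopLevelOnly and groupMultiParagraph are illustrative names, not part of the patch.

```cpp
#include <string>
#include <vector>

// A top-level command together with everything attached to it.
// command is "" for text that precedes any command.
struct CommandBlock {
    std::string command;
    std::vector<std::string> children;
};

// Assumed set of top-level-only commands for this sketch.
bool isTopLevelOnly(const std::string &line) {
    for (const char *name : {"\\param", "\\tparam", "\\returns", "\\brief", "\\details"}) {
        if (line.rfind(name, 0) == 0) return true;
    }
    return false;
}

// Keep attaching paragraphs and nested commands (such as \li) to the
// current block until the next top-level-only command shows up.
std::vector<CommandBlock> groupMultiParagraph(const std::vector<std::string> &lines) {
    std::vector<CommandBlock> blocks;
    for (const std::string &line : lines) {
        if (line.empty()) continue;  // a paragraph break does not close the block
        if (isTopLevelOnly(line)) {
            blocks.push_back({line.substr(0, line.find(' ')), {line}});
        } else {
            if (blocks.empty()) blocks.push_back({"", {}});
            blocks.back().children.push_back(line);
        }
    }
    return blocks;
}
```

For the makeFoo comment above this yields two blocks: \returns with three children (the "Either:" line and both \li lines) and \param with one.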

So far I have found that it makes sense to allow only \li, \arg (an alias
of \li), and \verbatim-like commands to be nested within other commands. All
other block commands are top-level-only.

Doxygen's \li command is not a command that has "top-level" scope, so it can indeed
be nested inside a \param or \returns.

The command does have paragraph scope, so it ends at the next paragraph.

/// \returns Either
/// \li First item
/// \li Second item
/// This text ends the list but not the returns section
/// Top level text continues outside of returns

Note that for automatic lists the indentation of the paragraph
determines the end of a list item:

/// A list:
/// - item 1
///   - sub item 1
///   - sub item 2
///     text of sub item 2 continues...
///   text of item 1 continues...
/// - item 2
///   More text for item 2.
/// Text after the list
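
One way to picture the indentation rule is the sketch below. It is only a model that agrees with the example, not Doxygen's actual algorithm. It assumes the "/// " prefix is stripped, that items are marked with "- ", and that a line belongs to the innermost open item whose marker sits strictly to the left of the line's first character. indentOf and listDepths are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Number of leading spaces on a line.
std::size_t indentOf(const std::string &line) {
    std::size_t n = line.find_first_not_of(' ');
    return n == std::string::npos ? line.size() : n;
}

// For each line, the list nesting depth it belongs to (0 = outside any list).
std::vector<int> listDepths(const std::vector<std::string> &lines) {
    std::vector<std::size_t> markerColumns;  // "-" columns of the open items
    std::vector<int> depths;
    for (const std::string &line : lines) {
        std::size_t col = indentOf(line);
        bool isItem = line.compare(col, 2, "- ") == 0;
        // Close every open item whose marker is not strictly left of this line.
        while (!markerColumns.empty() && markerColumns.back() >= col)
            markerColumns.pop_back();
        if (isItem) markerColumns.push_back(col);  // this line opens a new item
        depths.push_back(static_cast<int>(markerColumns.size()));
    }
    return depths;
}
```

With sub items indented by two spaces and their continuation text by four, the list example above maps to the depths 0, 1, 2, 2, 2, 1, 1, 1, 0.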

What comments will parse differently

Comments where the user placed the long description after parameter or return
value description will parse differently. For example:

/// \param x1 Aaa.
///
/// This function does...
void foo(int x1);

"This function does..." used to be the brief description; now it is the second
paragraph of the parameter description.

One can get the previous behavior again by using explicit \brief or \details
commands, depending on the intent:

/// \param x1 Aaa.
/// \brief This function does...

How Doxygen handles this

As far as I can see, Doxygen preserves the sequence of paragraphs in its output
and does not try to assign semantic meaning to paragraphs. Because of
this, Doxygen will not hit any issues regardless of whether it uses Clang's
original parsing model or this proposed model -- it does not make a difference
for the output that Doxygen produces.

I think I've explained that it does differ. It would make it harder for users
to write documentation that works well with clang and doxygen. So I hope it
is possible to make the implementation more in line with the way
doxygen processes comments. Let me know if I can help.



Thank you for your reply. I will work towards making Clang match
Doxygen more closely.