New discussion about adding doxygen comments to the LLVM intrinsics

Hi all,

I’d like to propose adding doxygen comments for intrinsics. I’m also proposing a specific format for these comments, although I welcome any opinions on how it could be improved.

There are several benefits to adding doxygen comments to the intrinsics headers:

(1) Both Microsoft Visual Studio tooltips and Xcode will be able to display intrinsics documentation.

(2) Online documentation for LLVM intrinsics headers can be easily generated and browsed at http://llvm.org/doxygen.

(3) It is possible to convert doxygen comments to other formats such as PDF or CHM. This is an important benefit for companies that need to deliver up-to-date intrinsics documents in printable formats. It also helps to reduce documentation support costs.

(4) Maintaining intrinsic documentation and code in one place makes it easier for developers to keep them in sync.
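
As a rough illustration of benefits (2) and (3), a Doxyfile fragment along the following lines could drive the HTML, PDF, and CHM output. The option names are standard Doxygen settings, but the INPUT path and the file pattern are assumptions about where the intrinsics headers live, not a tested configuration for the LLVM tree.

```
# Sketch only: the INPUT path and FILE_PATTERNS value are assumptions.
INPUT                  = lib/Headers
FILE_PATTERNS          = *intrin.h

# Most intrinsics are static inline functions, which doxygen skips by default.
EXTRACT_STATIC         = YES

# Browsable HTML output.
GENERATE_HTML          = YES

# LaTeX output, which can then be built into a PDF.
GENERATE_LATEX         = YES

# HTML Help project files; producing the final .chm additionally needs
# the Microsoft HTML Help compiler.
GENERATE_HTMLHELP      = YES
```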

Our documentation used to be kept in a closed format (CHM) and we decided to convert it into doxygen with the intention to keep it in the doxygen-annotated header files exclusively moving forward, as it would be much easier to maintain.

Converting hundreds of documented intrinsics manually would have been tedious and error-prone, so I have created a hack of a tool (DCG) that inserts doxygen comments into LLVM intrinsics header files. With the help of this tool we have added corresponding intrinsic comments (describing many of the SSE4a, AVX, BMI, SSE2, F16C, MMX, SSE3, POPCNT, PREFETCHW, SSE4.1, SSSE3, SSE intrinsics) to the LLVM header files. DCG takes a CHM file containing PS4 internal documentation as input, processes it, and matches the intrinsics from LLVM headers with the intrinsics from the documentation. The tool does all the necessary formatting to create doxygen comments that comply with the LLVM coding style and then inserts each generated comment in front of the corresponding intrinsic definition. Again, DCG was written as a one-time-use tool. It’s large, I’m not sure what its long-term value or use would be, and it would require a lot of work to get it ready to open source.

Below are format examples of the two main kinds of doxygen comments I’m proposing (the intrinsics are taken from ammintrin.h):

  • The first example contains a regular intrinsic definition.

  • The second example contains an intrinsic that is defined via a macro.

/// \brief Extracts the specified bits from the lower 64 bits of the 128-bit
///    integer vector operand.
///
/// \headerfile <x86intrin.h>
///
/// \code
/// This intrinsic corresponds to the \c EXTRQ instruction.
/// \endcode
///
/// \param __x
///    The value from which bits are extracted.
/// \param __y
///    Specifies the index of the least significant bit at [13:8],
///    and the length at [5:0].
/// \returns A 128-bit vector whose lower 64 bits contain the bits extracted
///    from the operand.
static inline __m128i __attribute__((always_inline, nodebug))
_mm_extract_si64(__m128i __x, __m128i __y)
{
  return (__m128i)__builtin_ia32_extrq((__v2di)__x, (__v16qi)__y);
}

/// \brief Extracts the specified bits from the lower 64 bits of the 128-bit
///    operand, using the length and bit index specified in the immediate
///    bytes.
///
/// \headerfile <x86intrin.h>
///
/// \code
/// __m128i _mm_extracti_si64(__m128i x, const int len, const int idx);
/// \endcode
///
/// \code
/// This intrinsic corresponds to the \c EXTRQ instruction.
/// \endcode
///
/// \param x
///    The value from which bits are extracted.
/// \param len
///    Specifies the length at [5:0].
/// \param idx
///    Specifies the index of the least significant bit at [5:0].
/// \returns The bits extracted from the operand.
#define _mm_extracti_si64(x, len, idx) \
  ((__m128i)__builtin_ia32_extrqi((__v2di)(__m128i)(x), \
                                  (char)(len), (char)(idx)))

The format is slightly different for these two cases. The second example contains an additional \code section describing the equivalent function prototype for the macro. Even though the intrinsic is implemented as a macro, it does have expectations about the parameter types and the return type, and if left undocumented, such information is not always obvious to the user. Please review the proposed format and the example comments above.
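
To make the point about macro prototypes concrete without depending on SSE4a hardware, here is a minimal portable sketch. EXTRACT_BITS_U64 is a hypothetical macro invented for this illustration; it is not part of any LLVM header and is only a simplified stand-in, not the exact semantics of the EXTRQ instruction. The macro body alone says nothing about the intended types, so the \code section carries the equivalent prototype.

```c
#include <stdint.h>

/// \brief Extracts a bit field from a 64-bit value.
///
/// \code
/// uint64_t EXTRACT_BITS_U64(uint64_t x, const int len, const int idx);
/// \endcode
///
/// \param x
///    The value from which bits are extracted.
/// \param len
///    The number of bits to extract, in the range 1 to 63.
/// \param idx
///    The index of the least significant bit to extract, in the range
///    0 to 63.
/// \returns The extracted bits, right-aligned in a 64-bit value.
/* Hypothetical example macro, not taken from ammintrin.h. */
#define EXTRACT_BITS_U64(x, len, idx) \
  ((((uint64_t)(x)) >> (idx)) & ((UINT64_C(1) << (len)) - 1))
```

A call such as EXTRACT_BITS_U64(value, 8, 4) reads eight bits starting at bit 4. Without the documented prototype, a reader would have to infer from the casts that x is treated as a 64-bit unsigned value and that len and idx are small integers.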

Thanks,

Katya

This sounds like a good idea to me.

To be clear, once the initial conversion happens, the doxygen comments
will be the canonical source. That is, DCG is to be used once,
right?

Can you upload the patch to Phabricator?

Hi Rafael,

Yes, after the initial conversion, the doxygen comments will be the canonical source. Afterwards, they will live their own lives in LLVM headers (i.e. they could get added, modified, etc).
DCG will be used only once (for the initial conversion).

I haven't seen any responses regarding the format of the comments for the intrinsics yet. However, I have spent quite some time discussing/changing the format with Dmitri Gribenko offline earlier (thanks, Dmitri) and he gave me his "informal" OK.

I will prepare a patch for the smallest of the intrinsics headers and send it out for review. Hopefully, I will get some feedback about the format of the comments afterwards.

Katya.

This seems pretty reasonable to me.