Update on clang-extract-api: Clang Support for API Information Generation in JSON

Hi All!

About a year ago I sent out an RFC proposing a new Clang tool for collecting and serializing API information from header files. Thank you all for your interest and for the great feedback and suggestions, both in the original email thread and in the Phabricator code reviews. Today I would like to share an update on the development status of clang-extract-api, along with a simple demo of the current workflow.

Current Status

We’ve implemented all of the core components of clang-extract-api in a new Clang library, ExtractAPI (check out clang/include/clang/ExtractAPI and clang/lib/ExtractAPI):

  • API : This component defines the representations of the API information collected. Individual declarations are captured by records derived from the base APIRecord struct. An APISet holds all the records for the product defined by the input header files.
  • Serialization : Serialization contains the APISerializer interface that can be implemented to serialize an APISet, as well as a SymbolGraphSerializer implementation to serialize in the Symbol Graph format, as proposed.
  • DeclarationFragments : This component defines the Declaration Fragments representation, which is an abstraction of a symbol’s declaration, with language-agnostic annotations about syntactic/semantic properties of the fragments.
  • Finally, ExtractAPIConsumer.cpp glues everything together: it defines the ExtractAPIAction frontend action, which hooks into the new driver option -extract-api, processes the input header files, and kicks off the ExtractAPIVisitor, which visits Decl nodes in the AST and collects API information.
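
To make the division of labor between these components a bit more concrete, here is a minimal, self-contained sketch of how they relate. It is not the actual ExtractAPI code: Record, FunctionRecord, RecordSet, Serializer, NameListSerializer, and collectDemoRecords are hypothetical stand-ins for APIRecord, a derived record, APISet, APISerializer, a concrete serializer, and the visitor, and the real classes carry considerably more information. The symbol names are borrowed from the demo headers shown later in this post.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the base record type (APIRecord in the post).
struct Record {
  enum class Kind { GlobalVariable, GlobalFunction, Struct, MacroDefinition };

  Record(Kind K, std::string Name) : RecordKind(K), Name(std::move(Name)) {}
  virtual ~Record() = default;

  Kind RecordKind;
  std::string Name;
};

// Derived records add details specific to one kind of declaration.
struct FunctionRecord : Record {
  FunctionRecord(std::string Name, std::vector<std::string> Params)
      : Record(Kind::GlobalFunction, std::move(Name)),
        Params(std::move(Params)) {}

  std::vector<std::string> Params;
};

// Hypothetical stand-in for APISet: owns every record of one product.
class RecordSet {
public:
  explicit RecordSet(std::string ProductName)
      : ProductName(std::move(ProductName)) {}

  void add(std::unique_ptr<Record> R) { Records.push_back(std::move(R)); }

  const std::string &product() const { return ProductName; }
  const std::vector<std::unique_ptr<Record>> &records() const {
    return Records;
  }

private:
  std::string ProductName;
  std::vector<std::unique_ptr<Record>> Records;
};

// Hypothetical stand-in for the serializer interface (APISerializer).
class Serializer {
public:
  virtual ~Serializer() = default;
  virtual std::string serialize(const RecordSet &Set) const = 0;
};

// Toy serializer that emits one line per symbol. The real implementation
// described in the post writes Symbol Graph JSON instead.
class NameListSerializer : public Serializer {
public:
  std::string serialize(const RecordSet &Set) const override {
    std::ostringstream OS;
    OS << "product: " << Set.product() << "\n";
    for (const auto &R : Set.records())
      OS << "symbol: " << R->Name << "\n";
    return OS.str();
  }
};

// Plays the role of the visitor. In the real tool the records come from
// Decl nodes in the AST; here they are hard-coded for illustration.
RecordSet collectDemoRecords() {
  RecordSet Set("Demo");
  Set.add(std::make_unique<Record>(Record::Kind::Struct, "Color"));
  Set.add(std::make_unique<Record>(Record::Kind::MacroDefinition, "RGB"));
  Set.add(std::make_unique<Record>(Record::Kind::GlobalVariable, "black"));
  Set.add(std::make_unique<FunctionRecord>(
      "addOpacity", std::vector<std::string>{"color", "opacity"}));
  return Set;
}
```

Keeping the serializer behind an interface is what would let another output format be added without touching the code that collects records; the sketch only illustrates that separation.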


Here is a simple demo of everything put together in action:

❯ tree
└── headers
    ├── anotherCoolAPI.h
    └── coolAPI.h

1 directory, 2 files

We provide some cool APIs in the two headers in the headers directory.

// coolAPI.h
#ifndef COOL_API_H
#define COOL_API_H

#include <stdint.h>

/// Defines 8-bit RGB+alpha colors.
typedef struct Color {
  uint8_t red;   ///< Red component.
  uint8_t green; ///< Green component.
  uint8_t blue;  ///< Blue component.
  uint8_t alpha; ///< Opacity component.
} Color;

#define RGB(r,g,b) (Color){ .red=r, .green=g, .blue=b, .alpha=255 }

#endif // COOL_API_H


// anotherCoolAPI.h

#include "coolAPI.h"

const Color black = RGB(0, 0, 0);

/// Add opacity to a given color.
/// - Parameters:
///   - color: The original color.
///   - opacity: The amount of opacity to be added.
void addOpacity(Color *color, uint8_t opacity);


Now, if we want to extract structural information about these APIs, we can invoke the extract-api driver with the following command line:

❯ clang -extract-api \
    -x c-header \
    headers/coolAPI.h \
    headers/anotherCoolAPI.h \
    -isysroot <SDK> \
    -Iheaders \
    --product-name=Demo \
    -o APIInfo.json

Clang will parse the two headers, visit the AST to collect information about the APIs, and finally write out the Symbol Graph output to APIInfo.json (attached: APIInfo.json.txt, 15.8 KB).

We’ve had great comments and reviews during the past year, and I’d like to thank you again for your interest and help. I’m looking forward to taking this tool further and making it better together as a community!


Would you expect clang -extract-api -x c++-header headers/cool_cxx_API.h ... to work too?

Unfortunately, C++ support is not there yet in extract-api. The command line you wrote is valid, but the AST consumer won’t visit C++-specific nodes, so information about constructs such as templates might be missing. The Symbol Graph serializer won’t be able to handle them either.
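
For illustration, here is the kind of header this limitation would affect. The contents are hypothetical; only the file name comes from the question. Which declarations actually get skipped is an assumption based on the description above and has not been checked against the tool.

```cpp
// cool_cxx_API.h (hypothetical contents, for illustration only)
#ifndef COOL_CXX_API_H
#define COOL_CXX_API_H

#include <cstdint>

// A plain struct, close to what the C demo header declares.
struct Color {
  std::uint8_t red;
  std::uint8_t green;
  std::uint8_t blue;
  std::uint8_t alpha;
};

// C++-specific: a function template, the kind of declaration the reply
// above says may be missing from the extracted output.
template <typename T>
T clampChannel(T value, T low, T high) {
  return value < low ? low : (value > high ? high : value);
}

// An inline function that depends on the template above.
inline Color withOpacity(Color color, int opacity) {
  color.alpha = static_cast<std::uint8_t>(clampChannel(opacity, 0, 255));
  return color;
}

#endif // COOL_CXX_API_H
```

A header like this is valid input for -x c++-header, so the compiler would not complain; the gap would only show up as missing or incomplete entries in the output.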