Some of you might have noticed that the Python SB API documentation on the website hasn’t been regenerated in more than a year. I pinged Andrei about fixing the broken generation script. While I was trying to recreate the doc generation setup on my machine, I noticed that our documentation is generated by epydoc, which is unmaintained and had its last release 12 years ago. It seems we also have some mocking setup for our native _lldb module in place that stopped working with Python 3.
While the setup we currently have will probably work for a bit longer (assuming no one upgrades the web server generating the API docs), I would propose we migrate to a newer documentation generator. That would not only make this whole setup a bit more future-proof, but we could probably also make the API docs more user-friendly while we’re at it.
From what I can see we have at least three alternative generators:
- pydoctor (epydoc’s maintained fork) - Example LLDB docs: https://teemperor.de/pub/pydoctor/index.html
- Doesn’t really change the user experience compared to epydoc.
- The website is rather verbose and you need to click around a lot to find anything.
- Horrible user experience when viewed on mobile.
- No search from what I can see.
- It seems we can’t filter out certain types we don’t care about (like SWIG-generated variables/wrappers etc.).
- It doesn’t include LLDB’s globals/enum values in the API (even when I manually document them in the source). This seems to be just a Python thing where opinions are split on how/if globals are supposed to be documented.
- Somehow ignores certain doc strings (I assume it fails to parse them because of the embedded code examples).
- sphinx (which is also generating the rest of the LLVM websites) - Example LLDB docs: https://teemperor.de/pub/sphinx/index.html
- The most flexible alternative, so we potentially could fix all the issues we have if we spend enough time implementing plugins.
- We already use sphinx for generating the website. We however don’t use its autodoc plugin for actually generating documentation from what I can see.
- The two plugins I tried for autogenerating our API are hard to modify for our needs (e.g. to implement filters for SWIG generated vars/wrappers).
- In general sphinx is much better if we were to hand-write dedicated Python documentation files, but I don’t think we want to do that.
- LLDB’s global variables are displayed but for some reason getting assigned the doc string of
- pdoc3 (dedicated Python API generator) - Example LLDB docs: https://teemperor.de/pub/pdoc.html
- Straightforward to modify pretty much every part of the documentation to our needs (the example is created with a slightly modified config):
- Dedicated docs for single-module APIs, so we don’t have all the awkward boilerplate text concerned with modules when we only have one ‘lldb’ module in our API.
- It only shows global variables that are documented. However, SWIG doesn’t seem to support generating documentation for globals (?). We can work around that by having a script assign all our globals/enum values a dummy doc string before generating the docs (that’s what I do in the example).
- Generates a single page with HTML anchors (might also be a good thing as you can now always Ctrl+F for identifiers and it’s much faster to generate than the others).
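The dummy doc string workaround for pdoc3 could look roughly like the sketch below. This is not the script used for the example page. It assumes the globals show up as simple top-level assignments in the generated Python source, and it relies on pdoc3 treating a string literal that directly follows an assignment (PEP 224 style) as that variable’s doc string. The function name `add_placeholder_docstrings`, the placeholder text, and the sample input are made up for illustration.

```python
import ast


def add_placeholder_docstrings(source, placeholder="Undocumented constant."):
    """Return `source` with a PEP 224-style doc string added after every
    public top-level `name = value` assignment that lacks one.

    Assumption: the placeholder text contains no triple quotes.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    body = tree.body
    insert_after = []  # 1-based line numbers that need a doc string after them

    for i, node in enumerate(body):
        # Only handle plain `name = value` statements at module level.
        if not (isinstance(node, ast.Assign)
                and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            continue
        # Leave private names alone.
        if node.targets[0].id.startswith("_"):
            continue
        # Skip assignments that are already followed by a string literal.
        nxt = body[i + 1] if i + 1 < len(body) else None
        documented = (isinstance(nxt, ast.Expr)
                      and isinstance(nxt.value, ast.Constant)
                      and isinstance(nxt.value.value, str))
        if not documented:
            insert_after.append(node.end_lineno)

    # Insert bottom-up so earlier line numbers stay valid.
    for lineno in sorted(insert_after, reverse=True):
        lines.insert(lineno, '"""%s"""' % placeholder)
    return "\n".join(lines) + "\n"


# Made-up stand-in for a few lines of generated module source.
sample = (
    "eStateInvalid = 0\n"
    "eStateRunning = 6\n"
    '"""Process is running."""\n'
    "_internal = 1\n"
)
patched = add_placeholder_docstrings(sample)
print(patched)
```

Running the function a second time on its own output changes nothing, so a step like this could be re-run safely before every doc build.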
I think we can all agree that this topic is great bikeshedding material, so this mail thread shall be the official RFC thread where everyone can voice their opinion about what our Python API docs should look like.
I’ll start and say that I think pdoc3 is the way to go. The generated web page feels great to use and it’s straightforward to add all the custom filters we need to get rid of SWIG-generated code. Also, the only bug we need to fix here has a simple workaround (assign our defines/enums/etc. dummy strings via some script).
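To give an idea of what such a filter for SWIG-generated code might boil down to, here is a minimal sketch of a name-based predicate. The patterns are assumptions based on names SWIG typically emits in its Python wrappers (`_swig_*` helpers, `*_swigregister` functions, the `this`/`thisown` attributes) and would need checking against our actual generated module. The helper names are hypothetical, and how the predicate gets hooked into the generator is left open.

```python
import re

# Assumed patterns for SWIG-generated names; not an exhaustive list.
_SWIG_NAME_PATTERNS = (
    re.compile(r"^_swig_"),            # helpers such as _swig_repr
    re.compile(r"_swigregister$"),     # per-class registration functions
    re.compile(r"^(this|thisown)$"),   # ownership attributes on proxy classes
)


def is_swig_generated(name):
    """Return True if `name` looks like SWIG boilerplate."""
    return any(pattern.search(name) for pattern in _SWIG_NAME_PATTERNS)


def filter_public_names(names):
    """Keep only the names that should show up in the API docs."""
    return [name for name in names if not is_swig_generated(name)]


# Made-up sample of names as they might appear in the generated module.
names = [
    "SBDebugger",
    "SBDebugger_swigregister",
    "_swig_repr",
    "thisown",
    "eStateInvalid",
]
kept = filter_public_names(names)
print(kept)
```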