[RFC] Moving to Sphinx for LLVM and friends documentation (with partial implementation (in both 10pt and 12pt font)).

Moving the LLVM Documentation to Sphinx

Hi Michael,

Awesome work!

I'm a strong supporter of using Sphinx. I've been using it for the LNT
docs (http://llvm.org/docs/lnt/) and it is quite nice to work with. +1
for migrating over.

- Daniel

I'll give another +1. It looks pretty decent and I particularly like the formatting.

-eric

What Do You Think?
------------------

I realize that changing the documentation format is non-trivial, but I believe
that the benefits are worth the effort. If we go forward with this I will finish
the first two points above and work to integrate doxygen and keep everything
running smoothly.

I strongly support this. I've been interested in moving the documentation over to a better system for a long time, but the only decent one I've known about was docbook, which is ... a bit more complicated than I hoped we'd need. :)

What's Left?
------------

#. Pick the paint color for the bike shed. I have done no work on this other
  than copying over the default ``sphinxdoc`` theme to the source dir as
  ``llvm-theme``. Creating Sphinx themes is actually rather easy, so anyone is
  welcome to take a stab at this if they wish.

I suggest you just dictate what you think the right thing is, asking for and ignoring or responding to feedback as you think is best. Don't try to make everyone happy. I agree with Vikram that there are some major formatting issues that should be fixed before this is rolled out.

#. Integrate building the docs into the autoconf and CMake build systems.

Yes, this seems important. Also, the web page needs to auto-update the docs in response to commits.

#. Finish moving over the docs.
#. Finish Sphinxifying them.

Can this be done incrementally? It would be great to start with one doc (like LangRef) and make it really really good and get the infrastructure right, then move on to other docs (which others might help out with).

-Chris

BoostBook is a simplification of DocBook, with many things specific
for C/C++ source. It can pull in parts of a source file so the doc
examples always show examples of code that is actually tested, it has
proper links to source code files, can integrate with doxygen with a
bit of work, can export to html, pdf, and a few other things... I
like Sphinx, a lot actually, but BoostBook is more designed for this
kind of project.

I'm sure that there are many other alternatives. Sphinx has the marked advantage over any alternatives because it has someone willing to make it happen :)

-Chris

That and BoostBook are all I had heard of either until Daniel popped
into the conversation and mentioned Sphinx.

Michael Spencer <bigcheesegs@gmail.com> writes:

[snip]

I'll need help integrating it into the build system.

I can help with the CMake build.

[snip]

OvermindDL1 <overminddl1@gmail.com> writes:

BoostBook is a simplification of DocBook, with many things specific
for C/C++ source. It can pull in parts of a source file so the doc
examples always show examples of code that is actually tested, it has
proper links to source code files, can integrate with doxygen with a
bit of work, can export to html, pdf, and a few other things... I
like Sphinx, a lot actually, but BoostBook is more designed for this
kind of project.

BoostBook/QuickBook is really cool but I found it nearly impossible to
use outside of the Boost tree. In other words, a few years ago, at
least, it was intimately tied to the Boost project. That may have
changed recently, I don't know.

I also really hate the way QuickBook chops up and represents pages. The
Table-of-Contents + prev/next links format is really terrible. It's
very difficult to navigate non-linearly.

The QuickBook language is really nice for specifying documentation so if
we can fix the output format and make it work outside of Boost I'd be
all for using it.

                            -Dave

The BoostBook tool-chain is a total nightmare to configure and maintain. It produces decent, standards-style documentation and has a lot of great features, but the setup and learning curve is far too steep for me to recommend its use for LLVM.

  - Original author of BoostBook

You are right that BoostBook is targeted directly for large C++
projects, and I seriously considered BoostBook for this project, but
ran into a few road blocks.

* It's tightly integrated into boost and makes quite a few assumptions
about that.

What assumptions are those?

* It requires boost-build, which is a rather complex dependency and
yet another full build system.

How does it require it? I have used it directly before, without
boost-build (even if you wanted to simplify its use by calling it
using boost-build, it is not like it is CMake; boost-build, once
compiled, is a single executable, it requires no other files).

* It just feels really complex all around.

Complex how so? It is 'interesting' to set up, but once set up it is
extremely simple. Some things are also far simpler in it than in
Sphinx, which may not support them at all. A simple example is pulling
part of a compilable source file straight into the documentation: how
would you do that with Sphinx? There are plenty of other examples too.

In the end I felt that there would be too much resistance to
boost-build and the complexity. I then discovered Sphinx and found
that it did everything I wanted and just felt clean and simple.

OvermindDL1 <overminddl1@gmail.com> writes:

BoostBook is a simplification of DocBook, with many things specific
for C/C++ source. It can pull in parts of a source file so the doc
examples always show examples of code that is actually tested, it has
proper links to source code files, can integrate with doxygen with a
bit of work, can export to html, pdf, and a few other things... I
like Sphinx, a lot actually, but BoostBook is more designed for this
kind of project.

BoostBook/QuickBook is really cool but I found it nearly impossible to
use outside of the Boost tree. In other words, a few years ago, at
least, it was intimately tied to the Boost project. That may have
changed recently, I don't know.

What of it is too tied to Boost? Once compiled it is completely standalone.

I also really hate the way QuickBook chops up and represents pages. The
Table-of-Contents + prev/next links format is really terrible. It's
very difficult to navigate non-linearly.

Actually that is an option when you define your documentation, a
simple single-line change can make it so it chops it up, or introduces
it as a single mass page properly linked together by anchors, and
there are many other patterns that can be done. The Boosty way is
just to chop it up, but if you notice, not all of the library's
documentation does that.

The QuickBook language is really nice for specifying documentation so if
we can fix the output format and make it work outside of Boost I'd be
all for using it.

What output format issues do you have? It is completely configurable,
a single file can define a specific look for all of the documentation.

I could, but it may take a little bit, time is something I do not have
currently, but as long as someone keeps poking/reminding me daily then
I shall.

I whipped up a quick example. It took about an hour in total, most of
it spent finding the parameters to set it up; once that is done,
writing the actual documentation is simple.

Look in the html subdirectory to see the generated html as separate
files, but it can also generate a single large html file, pdf, man
pages, docbook, whatever.

The source qbk files and Jamroot is included, but it would be a cinch
to have CMake build it instead. If you do:
  bjam -n html
then it will not build it, but it outputs all of the commands and
everything it does, which is surprisingly little, so yes, it could be
integrated with ease.

I ported the first two chapters of the C++ Kaleidoscope tutorial to
it. I used the standard boost-style setup, except as an example I
changed the section headers to be LLVM-doc style, was simple. The
boost standard DTD and XSL and such files can be edited to change
style as necessary as well, but CSS should work for the bulk. I did
not bother changing style of much, but to have autogenerated TOC at
the top of those would be a single option change, etc... Many things
can be done, and as you can see the documentation style of the qbk
files is simple, and C++ and such is auto color coded, although that
could be disabled if wanted.

To show how the documentation can stay in sync with the source, I
included the Kaleidoscope source in the example and I edited the
Chapter 2 source to add a few lines of markup to let it pull in the
source automatically so the documentation is always in sync with the
source as they are one and the same. I also showed how to link to the
cpp source instead of embedding it at the bottom of Chapter 2, but
that could be done as well if needed.
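
As a rough illustration of the idea above, here is a minimal Python sketch of pulling a marked region out of a compilable source file. It is not QuickBook's implementation; the `//[name` and `//]` marker lines, the `extract_snippets` helper, and the sample C++ text are assumptions made for this example.

```python
def extract_snippets(source):
    """Return {name: code} for regions between '//[name' and '//]' lines.

    The marker syntax is an assumption chosen for this sketch.
    """
    snippets = {}
    current_name = None  # name of the region being collected, if any
    collected = []
    for line in source.splitlines():
        stripped = line.strip()
        if current_name is None:
            # Outside a region: only an opening marker is interesting.
            if stripped.startswith("//["):
                current_name = stripped[3:].strip()
                collected = []
        elif stripped == "//]":
            # Closing marker: store the region and go back to scanning.
            snippets[current_name] = "\n".join(collected)
            current_name = None
        else:
            collected.append(line)
    return snippets


# Hypothetical source file content, kept inline so the sketch is runnable.
EXAMPLE_SOURCE = """\
#include <string>

//[token_enum
enum Token {
  tok_eof = -1,
  tok_def = -2,
};
//]

int main() { return 0; }
"""

snippets = extract_snippets(EXAMPLE_SOURCE)
print(snippets["token_enum"])
```

A documentation build could run something like this over each tutorial source file and paste `snippets["token_enum"]` into the generated page, so the code readers see is the code that was compiled.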

I have all the chapters in there, but the rest are empty, I just
included them to let all the links work.

If there is a specific example that you want demonstrated, just say.

To build the example yourself, just put the two qbk files and the
Jamroot in a directory, make sure you have boost's quickbook and
boostbook built, then just do:
  /path/to/bjam
and it will build the multi-part html by default, or of course do:
  /path/to/bjam -n
to see what it is doing instead, which basically involves just
creating a system-customized link to the dtd and xsl and such
directories in boostbook (if LLVM used this then you would include
those in LLVM itself, probably customized, and the manifest would not
need to be system-generated-specific each time, it would be static),
then calling quickbook, then calling xsltproc twice, quite simple, and
I have run the steps manually myself to confirm they work, and they
do.
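
The three steps described above (one quickbook call, then two xsltproc calls) can be sketched as a dry run in Python, similar in spirit to `bjam -n`. This is only a sketch: the stylesheet file names, the directory arguments, and the `doc_build_commands` helper are assumptions, and the authoritative command lines are whatever `bjam -n` prints on your system.

```python
from pathlib import Path


def doc_build_commands(qbk_file, boostbook_xsl_dir, docbook_xsl_dir, out_dir):
    """Return the command lines for a QuickBook -> HTML build without running them.

    All paths and stylesheet names are assumptions for illustration.
    """
    qbk = Path(qbk_file)
    out = Path(out_dir)
    boostbook_xml = out / (qbk.stem + ".xml")
    docbook_xml = out / (qbk.stem + ".docbook")
    return [
        # Step 1: QuickBook source -> BoostBook XML.
        ["quickbook", f"--output-file={boostbook_xml}", str(qbk)],
        # Step 2: BoostBook XML -> DocBook XML.
        ["xsltproc", "--xinclude", "-o", str(docbook_xml),
         str(Path(boostbook_xsl_dir) / "docbook.xsl"), str(boostbook_xml)],
        # Step 3: DocBook XML -> chunked HTML (trailing slash = output directory).
        ["xsltproc", "-o", f"{out}/html/",
         str(Path(docbook_xsl_dir) / "html" / "chunk.xsl"), str(docbook_xml)],
    ]


# Dry run: print the commands instead of executing them.
commands = doc_build_commands("llvm_doc.qbk", "boostbook/xsl", "docbook-xsl", "build")
for command in commands:
    print(" ".join(command))
```

Because the function only returns argument lists, a CMake or make rule could be written from the printed output, or the lists could be handed to `subprocess.run` once the real paths are known.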

Correction, even the zip by itself is too big, here is the 7z, if
someone wants a giant zip, I can host it somewhere...

llvm_doc.7z (28.6 KB)

OvermindDL1 <overminddl1@gmail.com> writes:

BoostBook/QuickBook is really cool but I found it nearly impossible to
use outside of the Boost tree. In other words, a few years ago, at
least, it was intimately tied to the Boost project. That may have
changed recently, I don't know.

What of it is too tied to Boost? Once compiled it is completely standalone.

It's been a while, but IIRC the doxygen integration relied on bjam
somehow, finding the tools perhaps? The BoostBook XML files also
hard-code some assumptions about the project layout, IIRC.

If building the tool requires Boost.Build that is a pretty heavy burden.

I also really hate the way QuickBook chops up and represents pages. The
Table-of-Contents + prev/next links format is really terrible. It's
very difficult to navigate non-linearly.

Actually that is an option when you define your documentation, a
simple single-line change can make it so it chops it up, or introduces
it as a single mass page properly linked together by anchors, and
there are many other patterns that can be done. The Boosty way is
just to chop it up, but if you notice, not all of the library's
documentation does that.

Well, that's good news! I know that various Boost libraries do it
different ways (compare MPL to Proto), but I've never found a scheme I
really liked. Of course, that's personal preference, but ease of
navigation does matter.

The QuickBook language is really nice for specifying documentation so if
we can fix the output format and make it work outside of Boost I'd be
all for using it.

What output format issues do you have? It is completely configurable,
a single file can define a specific look for all of the documentation.

If that's true and we can make it do what we want, I'm all for
QuickBook.

OvermindDL1 <overminddl1@gmail.com> writes:

Correction, even the zip by itself is too big, here is the 7z, if
someone wants a giant zip, I can host it somewhere...

Please do. 7z is not supported on Linux. I would love to take a look
at this, but I can't. :(

                            -Dave

Thanks for setting this up, I didn't even attempt to use boostbook. I
like the quickbook syntax.

I finally got around to trying this out on my windows machine. The
problem is that you have to have a full boost installation so that you
can compile quickbook and boostbook as there are no packaged binaries.
There are also 3 other packages you have to install.

Sphinx can include a file as source code. It is also extensible in
python by simply adding a python file and adding a reference to it in
the config file.

- Michael Spencer

Doesn't this work?
http://packages.debian.org/lenny/p7zip-full

Eugene

OvermindDL1 <overminddl1@gmail.com> writes:

BoostBook/QuickBook is really cool but I found it nearly impossible to
use outside of the Boost tree. In other words, a few years ago, at
least, it was intimately tied to the Boost project. That may have
changed recently, I don't know.

What of it is too tied to Boost? Once compiled it is completely standalone.

It's been a while, but IIRC the doxygen integration relied on bjam
somehow, finding the tools perhaps? The BoostBook XML files also
hard-code some assumptions about the project layout, IIRC.

If building the tool requires Boost.Build that is a pretty heavy burden.

None of them rely on bjam, bjam just knows the commands, but as stated
you can use:
  bjam -n
to see the exact commands it runs and everything it does, very easy to
call from CMake, and in fact the CMake build system port for Boost
does handle that already. So no, it does not at all require
Boost.Build.

I also really hate the way QuickBook chops up and represents pages. The
Table-of-Contents + prev/next links format is really terrible. It's
very difficult to navigate non-linearly.

Actually that is an option when you define your documentation, a
simple single-line change can make it so it chops it up, or introduces
it as a single mass page properly linked together by anchors, and
there are many other patterns that can be done. The Boosty way is
just to chop it up, but if you notice, not all of the library's
documentation does that.

Well, that's good news! I know that various Boost libraries do it
different ways (compare MPL to Proto), but I've never found a scheme I
really liked. Of course, that's personal preference, but ease of
navigation does matter.

As mentioned above, the BoostBook part does not compile anything, it
just downloads and sets up the BoostBook look and feel, which you can
easily change all as you wish.

The QuickBook language is really nice for specifying documentation so if
we can fix the output format and make it work outside of Boost I'd be
all for using it.

What output format issues do you have? It is completely configurable,
a single file can define a specific look for all of the documentation.

If that's true and we can make it do what we want, I'm all for
QuickBook.

It is all possible. As stated, it will require a touch of setup for
the CMake part, but there is already a CMake port to replace bjam for
building Boost that knows about that. I also know exactly what
commands bjam runs; I have run them myself and they are very
simple, just 3 application calls.

The BoostBook tool-chain is a total nightmare to configure and maintain. It produces decent, standards-style documentation and has a lot of great features, but the setup and learning curve is far too steep for me to recommend its use for LLVM.

       - Original author of BoostBook

It is difficult to set up, but once set up it never needs to be touched
again, and adding new pages is very simple; we should not make it sound
more difficult than it is.

But it should be possible for all of us to build LLVM's documentation
tools. I really don't want to see bjam code in LLVM *shudder*.

It would be a great contribution if someone could port the
BoostBook/QuickBook build to make and completely divorce it from Boost
proper. It's a tool a lot of projects could really use, IMHO.

QuickBook itself uses Boost.Spirit as the parser and the regex engine
inside Boost, so separating it from Boost 'would' be difficult, except
that there exists a nice little Boost tool called "bcp". What bcp can
do is pull the parts of Boost that a project requires out into a
specific location, even changing the boost namespace to something else
if you wish, to prevent collisions should someone use a different
version of Boost itself. As an example, if LLVM decided to use Boost,
say Boost.Spirit.Qi as the parser engine for tablegen instead of bison
or whatever, they could run bcp on the LLVM source and it would pull
out the Boost sources, headers, etc. that are necessary to compile it
(again, changing the boost namespace to something else if you want).

So, yes, you could divorce it from Boost, and the license fully allows it.

QuickBook is the only part that would need to be compiled, the
BoostBook part is not a program but the 'look and feel' of the
documentation, which you can put anywhere that you want, and of course
change to look like anything that you want.

OvermindDL1 <overminddl1@gmail.com> writes:

Correction, even the zip by itself is too big, here is the 7z, if
someone wants a giant zip, I can host it somewhere...

Please do. 7z is not supported on Linux. I would love to take a look
at this, but I can't. :(

7z is supported just fine on Linux, I use it all the time, and it
compresses text like that a heck of a lot better than zip (about a 4:1
ratio). But I will still put the zip up on my server, now at url:
  http://www.overminddl1.com/stuff/llvm_doc.zip

Thanks for setting this up, I didn't even attempt to use boostbook. I
like the quickbook syntax.

I finally got around to trying this out on my windows machine. The
problem is that you have to have a full boost installation so that you
can compile quickbook and boostbook as there are no packaged binaries.
There are also 3 other packages you have to install.

You need to compile quickbook, but as stated above that could easily
be put into LLVM itself so there are no dependencies to download with
regards to Boost. boostbook is not something you compile, but rather
the look and feel infrastructure for the xsltproc compiler (the only
dependency you would need to get, basically docbook), and you would
just copy that into the LLVM tree and edit it to suit to taste. The
only other dependency you would have is if you want to build PDFs.

Requiring people to get and install boost to work on docs is a non-starter.

What is wrong with sphinx?

-Chris