The main problem it was intended to solve was code completion speed in IDEs. Any given source file starts with a bunch of includes, typically (depending on one’s preference) first some library headers, then some internal project headers. Only then comes the actual code.
When you type an identifier in a Clang-powered IDE and request code completion, the IDE asks Clang to parse the file up to the completion point, and Clang returns a list of possible completions for the IDE to display. To be useful, this has to be fast. If you’ve developed with Visual Studio, I’m sure you’re familiar with how disruptive the delay in IntelliSense’s reaction can be. Once a project gets big and complicated, IntelliSense often becomes unusable simply because it takes several seconds to pop up.
If you have to wait for Clang to parse your entire source file, including all the headers, you’re going to wait just as long, especially for a complicated C++ project. The main way to speed this up is precompiled headers. Compile the headers first, then just load the binary format when you need to reparse the file. However, PCHs typically have to be configured. You take a set of headers that is common to your project and rarely changes (because rebuilding PCHs is slow) and tell the compiler to use it. But that still leaves the project-specific headers to be reparsed.
So Clang has another feature, called the precompiled preamble (PCP). Basically, Clang looks at a source file, decides where the include directives at the top of the file end (that leading part is the preamble), and automatically builds a PCH from it, which it then uses whenever it needs to reparse the file. (I think the C API exposes this through clang_reparseTranslationUnit.) Once the preamble is built (which should happen once, when the file is opened in the IDE), Clang only needs to reparse the actual source file, which usually takes less than a second, especially if the PCP is kept in memory.
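To make the idea of "where the preamble ends" concrete, here is a minimal sketch. It is a line-based toy, not Clang's actual logic (Clang works on lexed tokens), and the function name preamble_line_count is made up for illustration. It assumes that only blank lines, // comments, and preprocessor directives may appear in the preamble, and it ignores /* */ block comments entirely.

```python
def preamble_line_count(source: str) -> int:
    """Return how many leading lines form the preamble (toy approximation).

    Assumption: the preamble is the run of preprocessor directives at the top
    of the file, possibly interleaved with blank lines and // comments.
    """
    end = 0  # number of lines up to and including the last directive seen
    for index, line in enumerate(source.splitlines()):
        stripped = line.strip()
        if stripped.startswith("#"):
            end = index + 1   # a directive extends the preamble
        elif stripped == "" or stripped.startswith("//"):
            continue          # blank line or comment: keep scanning
        else:
            break             # first line of real code ends the preamble
    return end


# Hypothetical source file used only for illustration.
example = """\
// widget.cpp
#include <vector>
#include "widget.h"

int main() { return 0; }
"""

print(preamble_line_count(example))
```

Everything before the returned line count is what would be precompiled once; everything after it is what gets reparsed on each completion request.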
The downside of this approach is that the initial compilation of the preamble takes a long time when you open the file. It would be a lot faster if you had a PCH of all the third-party headers and just combined it with the file-specific part of the preamble into a PCP. And Clang used to be able to do that. You could use a PCH and it would load it completely (PCHs are usually loaded lazily by Clang), parse the new parts, and create a new PCH consisting of the combination. This is faster than reparsing everything, but it is still not fast enough (fully loading a PCH isn’t very fast and needs a lot of memory), and each resulting PCP is rather big (tens of megabytes if not more). That is a problem if you want to keep the PCPs for all your open files in memory for fast access (and I know enough programmers who keep dozens of files open).
Enter chained PCH. You take your big third-party library PCH as the primary. You create a diff for the rest of the preamble, which is usually nice and small (maybe a megabyte) and fast to create (because you only load the parts of the PCH that you need). In memory, you have one big block and a multitude of small blocks that reference it. You get fast loading, fast parsing, and all-around goodness.
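The "one big block plus many small blocks" layout can be sketched as a toy model. Nothing here reflects Clang's real data structures: the ChainedPch class, the dictionaries standing in for declaration tables, and the 40 MB and 1 MB figures are all assumptions chosen to match the rough sizes mentioned above.

```python
class ChainedPch:
    """Toy stand-in for a chained PCH: a small delta layered over a shared base."""

    def __init__(self, base: dict, delta: dict):
        self.base = base    # the big primary PCH, shared and never copied
        self.delta = delta  # declarations added by the rest of the preamble

    def lookup(self, name: str):
        # Check the small per-file delta first, then fall back to the base.
        if name in self.delta:
            return self.delta[name]
        return self.base.get(name)


def total_size_mb(open_files: int, base_mb: float, delta_mb: float,
                  chained: bool) -> float:
    """Rough memory estimate for keeping one PCP per open file."""
    if chained:
        # One shared base plus a small delta per file.
        return base_mb + open_files * delta_mb
    # Each combined PCP carries its own full copy of the base.
    return open_files * (base_mb + delta_mb)


# Hypothetical contents, for illustration only.
third_party = {"std::vector": "<vector>", "QString": "<QString>"}
widget_pcp = ChainedPch(third_party, {"Widget": "widget.h"})
gadget_pcp = ChainedPch(third_party, {"Gadget": "gadget.h"})

print(widget_pcp.lookup("std::vector"), widget_pcp.lookup("Widget"))
print(total_size_mb(30, 40, 1, chained=False), "MB without chaining")
print(total_size_mb(30, 40, 1, chained=True), "MB with chaining")
```

With these assumed numbers, 30 open files cost 1230 MB as standalone combined PCPs but only 70 MB when each one is a small delta chained to the shared primary.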
As a side effect, the work on chained PCH made the PCH system more flexible and was thus the first step towards the true module system being developed in Clang.