I agree with the comments that Eli and Chris made; the code duplication is something we want to avoid. Eli brought up an excellent point that key pieces of the driver should be factored out into a separate library, and I too have felt this way for some time. I think that even the logic for resolving the various preprocessor and compiler options (-I, -D, etc.) that are needed to instantiate a Preprocessor should be factored out of clang.cpp into a separate library.
I also agree with Chris's comments that separating the "distcc" driver from the regular clang driver is a good idea. That keeps the distcc implementation simpler and potentially allows it to be used with multiple compilers (not just clang). I myself was fine with integrating distcc support directly into the clang driver for a first pass, but because the distcc driver will not use all of the same functionality as the regular clang driver (and will obviously do a few things that the regular clang driver does not), the better long-term approach is to factor key components of the clang driver into libraries, make clang and distcc-clang separate executables, and simplify the logic for both.
One thing that hasn't emerged in this discussion is whether clang's distcc should interoperate with the traditional distcc implementation, or (a different but related issue) whether we should require that the compiler itself be clang. One advantage of a clang-based distcc, independent of using clang to perform the compilation, is that clang-distcc can do the source preprocessing itself without forking off a separate process (which is what the traditional distcc implementation does). This seems like a good step one: build a distcc client that just takes care of preprocessing in-process, and see what kind of speedup you get over forking and preprocessing. Ultimately we're interested in speed and scalability, and small steps like these help guide the design.
Interoperability with other compilers doesn't mean we should limit the design of clang-distcc. We can certainly implement special functionality when multiple compiler "workers" are based on clang (e.g., serializing ASTs, special caching, etc.).
I like the concept of the NetSession class, although the issue of interoperability with existing distcc implementations is something that is worth discussing. Chris is right that the system-specific APIs, such as the use of sockets, should not be in header files. A PIMPL approach, like what we use for FileManager, would probably work well (where the system-specific stuff only appears in the .cpp file).
As for the clang server, both pthreads and sockets are system-specific APIs. We'll want a design that keeps the threading model separate from the code that processes a unit of work. This will allow us to tailor the implementation to use the best parallel computing primitives that are available on a specific architecture.
I'm also a little confused by the overall design. It looks like a client (a 'clang' process) connects to a server, sends the preprocessed source to the server, waits for the server to chew on the file, gets the processed output from the server, and then writes the output to disk. It appears that the client attempts to connect to different servers in a serial fashion, and then picks the first available server. Is this how traditional distcc works? (I actually don't know.) It's a simple design, but it doesn't lend itself well to good load balancing or to reducing the latency of firing off compilation jobs (a bunch of connection attempts in serial fashion seems potentially disastrous for performance). This particular point isn't a criticism of your patch; what's there is fine to get things started. I'm not a distributed computing expert, but something akin to the Google MapReduce system (which has workers and controllers) seems more flexible for fault tolerance, load balancing, and so forth. This is certainly something worth discussing in a higher-level discussion of the overall design of the system.
A few comments inline.
Here is the final patch for clang to support network distributed compilation. (clang.patch file)
There is also the server part attached. (the tar.gz file)
Like the client, the server shouldn't have so much code copied from the Driver, and it certainly doesn't need to use all of the ASTConsumers in the regular Clang driver. General work (by anyone who is interested) on modularizing the driver will help make this much easier.
There are 3 new files added to the Driver directory. The first is PrintPreprocessedOutputBuffer.cpp, which is a modification of PrintPreprocessedOutput.cpp to support printing text to a std::ostream.
I'm not certain why a separate version of PrintPreprocessedOutput was necessary. iostreams are slow, and writing to sockets using the FILE* abstraction is perfectly acceptable (via fdopen()).
The other new files are NetSession.h and NetSession.cpp, which contain all of the networking code (a thin, portable networking layer).
There are also some changed files, mostly to support saving their output to a std::ostream. I've used that mechanism to pass the output of clang's ASTConsumers to another computer over the network.
There are 3 new options added to clang. The basic one is -distribute, which enables distributed compilation. The other two are -dist-preprocesslocally and -dist-serializelocally.
If the first one is enabled, clang sends a preprocessed file to clangserver (a process on another machine) to compile. In the second case, lexing and parsing are done locally, and the built and serialized AST is sent to clangserver.
You can play with this using -dist-preprocesslocally, since that path is working.
Overall, I think this is a good start! The next logical steps would be to look at both the overall design and the issues of code structure (addressing the comments on modularity, isolating various implementation details, etc.). Getting a few interesting performance timings would also be extremely useful to help shape some of those design decisions.
Incidentally, how well does the code work when the two processes (client and server) are actually on two different machines? Right now, the client always connects to "localhost". Getting performance timings when client and server are on the same machine versus different machines would also be interesting, to see how much things like network latency are a factor in the design. There may also be some correctness issues that are masked by having the client and server on the same machine.