Question on clang design - Why doesn't clang use process pooling?

Hello, list

I am reading the clang source code.
While reading it, I found that clang, acting as a driver, invokes
another clang as a worker process to convert an input file to an
output file.
This looks weird to me because the driver clang invokes the worker
clang so many times: once per input source file!
This means that compiling 1000 C source files would invoke clang
1000 + 1 times.

In server programming, this kind of work is handled by a process pool
or a thread pool.

Pooling creates processes or threads only a fixed number of times, say
10, and those 10 processes or threads handle all the tasks without the
further cost of starting new processes or creating new threads.
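
To make the pooling idea concrete, here is a minimal sketch in Python
(not clang code). It assumes a thread pool from the standard library,
and compile_one is a hypothetical stand-in that only records which
worker handled each task; it does not compile anything.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def compile_one(source_name):
    # Hypothetical stand-in for compiling one file: it does no real work
    # and only reports which worker thread picked up the task.
    return source_name, threading.get_ident()


def run_pool(num_tasks=1000, num_workers=10):
    # Made-up file names, one task per "source file".
    sources = [f"file{i}.c" for i in range(num_tasks)]
    # At most num_workers threads are created; they are reused for all tasks.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(compile_one, sources))
    worker_ids = {ident for _, ident in results}
    return len(results), len(worker_ids)


if __name__ == "__main__":
    tasks, workers = run_pool()
    print(f"{tasks} tasks handled by {workers} worker threads")
```

The number of distinct workers never exceeds num_workers, however many
tasks are submitted, which is the saving I have in mind.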

This is just my personal question, and I believe there are good
reasons behind each part of clang's current architecture.

Any answer is welcome ^^; I want to learn from you.

Thank you very much in advance as always.

Cheers!

Journeyer J. Joh

But if you want to run 1000 worker processes from one driver, it does need
to know which files to compile, in which order, and which options to use for
each file.

It shouldn't be difficult to integrate the clang driver into Ninja, but this
approach may not be appropriate for other build systems.

I'm a little confused. Are you suggesting that if you run "clang a.cpp
b.cpp -fsyntax-only" (just to ignore linking & other issues) there are
a total of 3 processes created? The driver and then two separate
frontend actions for a.cpp and b.cpp? That would seem surprising to
me, though I haven't looked at the driver logic carefully - I would
expect it to only create one frontend process for a given driver
execution. And if my model of this is correct, exactly what pooling
would you expect to help here?

Yes, that is correct. There will be one -cc1 process per input file.

The real question is whether the process startup overhead is a measurable part of compilation time for a realistic source file. IIRC, we looked at this a while ago, and the answer is "no, not measurable", so it wasn't worth changing.

  - Doug
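
For anyone who wants to check the startup-overhead question on their own machine, here is a rough sketch in Python. It is an assumption-laden stand-in, not the measurement referred to above: it times launching a trivial child process (the Python interpreter itself, chosen only because it is guaranteed to be available), and average_spawn_seconds is a hypothetical helper name.

```python
import subprocess
import sys
import time


def average_spawn_seconds(runs=5):
    # Stand-in command: start the current Python interpreter and exit at once.
    # This measures process launch cost for *this* program, not for clang.
    command = [sys.executable, "-c", "pass"]
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(command, check=True)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    print(f"average launch cost: {average_spawn_seconds():.4f} s per process")
```

Comparing this kind of per-process cost against the time taken to compile a realistic source file is what decides whether pooling would be worth it.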

Hi Konstantin Tokarev, David Blaikie and Douglas Gregor

David Blaikie, please refer to Douglas Gregor's message. He pointed
out exactly what I was wondering about.
Konstantin Tokarev, as for compatibility across many different
systems, which you mentioned, I think it wouldn't be that difficult,
because there are many middleware C++ libraries such as ACE, Boost, ....

And, Douglas Gregor, I understand now that many other people had
already considered what I was worried about.

Thank you very much for paying attention to my question.

Journeyer J. Joh