Can WriteBitcodeToFile be parallelized?

This function (understandably) takes quite a long time, because it has to go through each function in the module and write out its binary form.
But it could probably be parallelized if different threads wrote the functions out separately and the results were then merged.

Is this implemented or planned?

Yuri

I've never heard this proposed before. Are you interested in working on it?

Note that we require the output to be deterministic, so what exactly would
you be doing? Performing the write out to a memory buffer for each
function, then another thread which passes through these memory buffers
sequentially and issues the actual I/O calls?
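
To make the question concrete, here is a rough sketch of that scheme in plain C++11. It does not use any LLVM API: serializeFunction is a hypothetical stand-in for whatever writes one function, and functions are represented by their names only. One thread per function is an assumption made for brevity, not a recommendation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for "serialize one function"; not an LLVM API.
std::string serializeFunction(const std::string &name) {
  assert(!name.empty() && "this sketch expects named functions");
  return "<fn " + name + ">";
}

// Each worker writes only into its own pre-allocated slot, so no locking is
// needed. The final pass walks the slots in input order, so the result does
// not depend on how the threads were scheduled.
std::string writeAllDeterministic(const std::vector<std::string> &functions) {
  std::vector<std::string> buffers(functions.size());
  std::vector<std::thread> workers;
  workers.reserve(functions.size());
  for (std::size_t i = 0; i < functions.size(); ++i)
    workers.emplace_back([&functions, &buffers, i] {
      buffers[i] = serializeFunction(functions[i]);
    });
  for (std::thread &t : workers)
    t.join();

  // Sequential pass: this is where the actual I/O calls would be issued.
  std::string out;
  for (const std::string &buf : buffers)
    out += buf;
  return out;
}
```

A real version would presumably use a fixed-size pool of workers rather than one thread per function, and would stream each buffer to the output as soon as all earlier ones are done.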

I don't think there are currently even primitives in LLVM to do thread
management yet, but now that we've moved to C++11, <thread> may be unlocked?

I'm hopeful the same technique could end up being used for multi-threaded
per-function .s and .o emission.

Nick

> Note that we require the output to be deterministic, so what exactly would you be doing? Performing the write out to a memory buffer for each function, then another thread which passes through these memory buffers sequentially and issues the actual I/O calls?

Given N CPUs and the list of functions to be processed, one can split the list into N roughly equal sublists and process them separately as far as possible. Then, at the end, merge the resulting code into one block and resolve the remaining inter-sublist relocation items. The same goes for the DWARF part.

This can produce equivalent binaries if implemented correctly.
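
A minimal sketch of the split-and-merge idea, under heavy assumptions: emitFunction is a hypothetical encoder, functions are just names, and the only "relocation" modelled is turning each function's offset within its sublist into a final offset in the merged block. Real relocations and DWARF would need considerably more bookkeeping.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical per-function encoder; stands in for real code emission.
std::string emitFunction(const std::string &name) { return "[" + name + "]"; }

struct Merged {
  std::string blob;                 // merged output
  std::vector<std::size_t> offsets; // final offset of each function in blob
};

Merged emitInChunks(const std::vector<std::string> &fns, unsigned numThreads) {
  assert(numThreads > 0 && "need at least one worker");
  const std::size_t n = fns.size();
  const std::size_t chunks =
      std::min<std::size_t>(numThreads, std::max<std::size_t>(n, 1));
  std::vector<std::string> partial(chunks);            // one buffer per sublist
  std::vector<std::vector<std::size_t>> local(chunks); // sublist-relative offsets
  std::vector<std::thread> workers;
  for (std::size_t c = 0; c < chunks; ++c) {
    // Contiguous range [begin, end), so sublist order matches input order.
    const std::size_t begin = n * c / chunks;
    const std::size_t end = n * (c + 1) / chunks;
    workers.emplace_back([&, c, begin, end] {
      for (std::size_t i = begin; i < end; ++i) {
        local[c].push_back(partial[c].size());
        partial[c] += emitFunction(fns[i]);
      }
    });
  }
  for (std::thread &t : workers)
    t.join();

  // Merge in sublist order and rebase the offsets recorded by each worker.
  Merged m;
  for (std::size_t c = 0; c < chunks; ++c) {
    const std::size_t base = m.blob.size();
    for (std::size_t off : local[c])
      m.offsets.push_back(base + off);
    m.blob += partial[c];
  }
  return m;
}
```

Because the sublists are contiguous and merged in order, the merged block is the same for any thread count, which is the equivalence property claimed above.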

> I've never heard this proposed before. Are you interested in working on it?

Normally I would be interested, but I have no time at the moment.

Yuri

You should do some measurement and try to find the bottleneck. Without knowing the bottleneck it’s hard to know whether threading will even speed things up.
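
As a crude first step, something like the following would do. This is only a sketch in standard C++11: secondsTaken is a made-up helper, and the callable you pass is assumed to wrap whichever phase you suspect (the write itself, the I/O, and so on). A real profiler will tell you far more.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Returns the wall-clock seconds spent running `work` once.
double secondsTaken(const std::function<void()> &work) {
  assert(work && "callable must not be empty");
  const auto start = std::chrono::steady_clock::now();
  work();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

Timing each phase separately and comparing the totals shows whether the time goes into the per-function work, which threads could help with, or into the final sequential output, which they could not.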

– Sean Silva