Maintaining a consistent path across two clang instances

Hi cfe-dev,

I’m working on a distributed compiler project using clang (see here for more information). One issue I am coming across is that the “master” must have the same working directory as the client it is preprocessing for, so that relative paths resolve correctly. My first intuition was to send the output of getcwd on the slave to the master, and use chdir on the master to set the appropriate working directory, before invoking the preprocessor. However, this brings to mind two concerns:

  1. The cache in the FileManager might get confused (e.g., it might cache /u/mike/dir1/test/test.c while /u/mike/dir1 is the cwd, and attempt to access /u/mike/dir2/test/test.c while the cwd is /u/mike/dir2).

  2. Since setting the cwd would affect the whole process, it would not be possible to preprocess multiple files simultaneously with different threads.

Are these two concerns valid? If so, how would you recommend resolving them?

Additionally, I have two more questions:

  1. Is the preprocessor threadsafe? Specifically, I want to know if I can invoke multiple, completely disjoint preprocessors on different threads, without negative consequence.

  2. Is the FileManager threadsafe? Specifically, I want to know if I can invoke multiple preprocessors using the same FileManager without negative consequence. If not, would you imagine it to be better to have a single preprocessor with a very large cache, or multiple preprocessors running in parallel with separate, disjoint, but smaller caches?

Thanks!
Mike

Hi cfe-dev,

I’m working on a distributed compiler project using clang (see here for more information). One issue I am coming across is that the “master” must have the same working directory as the client it is preprocessing for, so that relative paths resolve correctly. My first intuition was to send the output of getcwd on the slave to the master, and use chdir on the master to set the appropriate working directory, before invoking the preprocessor. However, this brings to mind two concerns:

  1. The cache in the FileManager might get confused (e.g., it might cache /u/mike/dir1/test/test.c while /u/mike/dir1 is the cwd, and attempt to access /u/mike/dir2/test/test.c while the cwd is /u/mike/dir2).

  2. Since setting the cwd would affect the whole process, it would not be possible to preprocess multiple files simultaneously with different threads.

Are these two concerns valid? If so, how would you recommend resolving them?

The preprocessor should not depend on the actual current working directory; the working directory should be queried once (in the driver) and provided to clang -cc1 via a command-line parameter, which the header search will use as its working directory for searches.
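To illustrate the idea of carrying the working directory as data rather than calling chdir, here is a minimal sketch in plain C++17. It is not clang code: `resolveAgainst` is a hypothetical helper, and the paths are the ones from the question. It assumes POSIX-style absolute paths and only normalizes lexically, so symlinks are not followed.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper (not part of clang): resolve `path` against an
// explicitly supplied working directory instead of the process-wide cwd.
// Nothing here touches global state, so different callers can use
// different working directories at the same time.
fs::path resolveAgainst(const fs::path &workingDir, const fs::path &path) {
  // Assumption: the client sends an absolute directory (its getcwd output).
  assert(workingDir.is_absolute() && "working directory must be absolute");

  // Absolute paths do not depend on the working directory at all.
  if (path.is_absolute())
    return path.lexically_normal();

  // Relative paths are joined onto the supplied directory. The
  // normalization is purely lexical ("." and ".." are collapsed without
  // consulting the file system).
  return (workingDir / path).lexically_normal();
}
```

With this approach, test/test.c resolved against /u/mike/dir1 and against /u/mike/dir2 yields two distinct absolute paths, so a cache keyed on the resolved path cannot confuse the two.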

Additionally, I have two more questions:

  1. Is the preprocessor threadsafe?

No.

Specifically, I want to know if I can invoke multiple, completely disjoint preprocessors on different threads, without negative consequence.

You can use completely disjoint preprocessors concurrently.
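As a rough sketch of what "completely disjoint" means in practice, the example below gives every worker thread its own private state and shares nothing mutable between threads. It does not use clang's classes: `FileCache`, `Job`, and `runDisjoint` are hypothetical stand-ins for a per-preprocessor file cache, a preprocessing request, and the dispatch loop.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the state owned by one preprocessor instance.
struct FileCache {
  std::unordered_map<std::string, std::size_t> hits;
  void touch(const std::string &absPath) { ++hits[absPath]; }
};

// Hypothetical request: a client working directory plus the relative
// paths that would be looked up while preprocessing.
struct Job {
  std::string workingDir;
  std::vector<std::string> relativePaths;
};

// Runs each job on its own thread with its own cache and returns the
// number of distinct files each job touched.
std::vector<std::size_t> runDisjoint(const std::vector<Job> &jobs) {
  std::vector<std::size_t> distinct(jobs.size());
  std::vector<std::thread> workers;
  workers.reserve(jobs.size());

  for (std::size_t i = 0; i < jobs.size(); ++i) {
    workers.emplace_back([&jobs, &distinct, i] {
      FileCache cache; // owned by this thread only, never shared
      for (const std::string &rel : jobs[i].relativePaths)
        cache.touch(jobs[i].workingDir + "/" + rel);
      distinct[i] = cache.hits.size(); // each thread writes only its own slot
    });
  }

  for (std::thread &w : workers)
    w.join();
  return distinct;
}
```

The trade-off is the one raised in the question: no locking is needed, but a file used by several jobs is cached once per thread rather than once overall.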

  2. Is the FileManager threadsafe?

No.

Specifically, I want to know if I can invoke multiple preprocessors using the same FileManager without negative consequence.

No, FileManager isn’t prepared for this.

If not, would you imagine it to be better to have a single preprocessor with a very large cache, or multiple preprocessors running in parallel with separate, disjoint, but smaller caches?

I would guess that having only a single preprocessor would be best, but since the point of this project is to improve performance, you’ll have to measure it.

- Doug