Examples of using Clang's VFS layer

Is there any published paper, tech report, or talk about how virtual file systems are intended to be used in Clang?

As far as I understand it, there are currently a number of implementations in the code base, including one that gets a virtual->physical mapping from a JSON file. Assuming one wanted to take unmodified files from some server that caches the version control system, is the intention to point some of the mappings to a FUSE file system that handles caching and communication with the VCS?

Or are people writing their own implementation of the abstract file system that does whatever they need, and merging it with any new Clang release they’re interested in?

Thanks,

Maurizio

+benjamin

There are currently very few implementations of the VFS around and
virtually no docs, only some examples in the code base. For your
problem I see a couple of options:

- If you know what files will be used you can load them eagerly into
an InMemoryFileSystem. There are some examples of doing this in Clang.
- If you have a fuse implementation you can use RedirectingFileSystem
(the YAML thing). Eventually I want to decouple this from YAML but
it's not a high priority.
- You can still implement your own VFS. This doesn't tie you to fuse
and allows lazy loading. However it will require code changes when
Clang changes.
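
To make the third option more concrete, here is a minimal C++ sketch of a lazily loading file system. It is a toy model and not Clang's VFS interface: `ToyFileSystem`, `LazyCachingFileSystem`, and the `Fetcher` callback are hypothetical names, and the real abstract class has a larger surface (status queries, directory iteration, and so on). The sketch only assumes that some service can return file contents for a path.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Toy stand-in for a VFS interface; NOT Clang's actual file system API.
struct ToyFileSystem {
  virtual ~ToyFileSystem() = default;
  // Returns the file contents, or std::nullopt if the path does not exist.
  virtual std::optional<std::string> read(const std::string &path) = 0;
};

// Hypothetical callback that fetches a file from a remote service,
// for example a server that caches the version control system.
using Fetcher = std::function<std::optional<std::string>(const std::string &)>;

// Fetches files on first access and keeps them in memory afterwards.
class LazyCachingFileSystem : public ToyFileSystem {
public:
  explicit LazyCachingFileSystem(Fetcher fetch) : fetch_(std::move(fetch)) {
    assert(fetch_ && "a fetch callback is required");
  }

  std::optional<std::string> read(const std::string &path) override {
    auto it = cache_.find(path);
    if (it != cache_.end())
      return it->second; // served from the cache, no remote call
    std::optional<std::string> contents = fetch_(path);
    ++remoteFetches_; // counts every call to the remote service
    if (contents)
      cache_.emplace(path, *contents); // only successful fetches are cached
    return contents;
  }

  int remoteFetches() const { return remoteFetches_; }

private:
  Fetcher fetch_;
  std::map<std::string, std::string> cache_;
  int remoteFetches_ = 0;
};
```

The eager variant from the first option would instead fill the map up front, which is simpler but requires knowing the file set in advance.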

Combinations of the approaches are possible by using file system overlays.
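
As a rough illustration of what overlaying means, the following C++ sketch stacks toy file systems so that later layers shadow earlier ones. All class names here (`ToyFileSystem`, `ToyInMemoryFileSystem`, `ToyOverlayFileSystem`) are hypothetical and only loosely modelled on the idea behind the overlay in the code base; the lookup order is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a VFS interface; NOT Clang's actual file system API.
struct ToyFileSystem {
  virtual ~ToyFileSystem() = default;
  // Returns the file contents, or std::nullopt if the path does not exist.
  virtual std::optional<std::string> read(const std::string &path) = 0;
};

// Files that were loaded eagerly and live purely in memory.
class ToyInMemoryFileSystem : public ToyFileSystem {
public:
  void addFile(std::string path, std::string contents) {
    files_[std::move(path)] = std::move(contents);
  }
  std::optional<std::string> read(const std::string &path) override {
    auto it = files_.find(path);
    if (it == files_.end())
      return std::nullopt;
    return it->second;
  }

private:
  std::map<std::string, std::string> files_;
};

// Stack of file systems; the most recently pushed layer is consulted first.
class ToyOverlayFileSystem : public ToyFileSystem {
public:
  explicit ToyOverlayFileSystem(std::shared_ptr<ToyFileSystem> base) {
    pushOverlay(std::move(base));
  }
  void pushOverlay(std::shared_ptr<ToyFileSystem> layer) {
    assert(layer && "layer must not be null");
    layers_.push_back(std::move(layer));
  }
  std::optional<std::string> read(const std::string &path) override {
    // Walk from the top layer down and return the first hit.
    for (auto it = layers_.rbegin(); it != layers_.rend(); ++it)
      if (auto contents = (*it)->read(path))
        return contents;
    return std::nullopt;
  }

private:
  std::vector<std::shared_ptr<ToyFileSystem>> layers_;
};
```

In this model the base layer could be the real file system or a FUSE mount, with an in-memory layer pushed on top to supply or replace individual files.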

Thanks,

I did consider all these options (except for the first, because my reasoning was that once you’ve transferred a file you may as well persist it in a cache, and I was thinking of a disk-backed cache; this should help for include files, which are probably reused. But a memory FS, or a combination using unionFS, would also work).

I was more interested in what people had in mind when designing this feature. I’m quite far from actually being ready to implement anything in this area. For my application, this is very much tied to moving my organization’s build system to Bazel, and that’s a hugely difficult move given the size and convolution of our non-standard makefiles (we have our own “make” :frowning: ). Bazel is also making strides in remote execution, and I’m mainly trying to get a good idea of where things are going.

The advantage of a FUSE file system is that you can use compilers other than Clang. In our case we’ll have gcc and closed-source compilers such as icpc (Intel) and nvcc (NVIDIA).
I know Google has built many FUSE file systems and try to stay away as much as possible. The FUSE file system can very well be implemented using the same service that is used by a modified Clang VFS, so code can be shared.
But I don’t see how to do without at least a FUSE implementation. Of course this is a bit outside a Clang discussion and has more to do with a build system.

If you can say, which direction has Google taken or is it working on? Still using srcFS/objFS, or planning to replace them?

Thanks,

I’m not sure why you would need to use FUSE if you can just overlay in-memory. Generally, a dependency on FUSE is unfortunate, as it will not work across different platforms. FUSE will just act as if you have a full file system, which already works today, no VFS necessary.

If you can say, which direction has Google taken or is it working on? Still using srcFS/objFS, or planning to replace them?

While I can’t talk about our internal stuff much, I can say that our use case is to load all relevant files via potentially higher-latency connections and overlay them to have fully reproducible in-memory builds, for various use cases.