Recently I was trying to hook into clang's file management to do
custom pre-processing of input files, and found that VirtualFileSystem
(VFS) is pretty flexible and can be extended to support any weird
use-cases related to files: preprocessing, in-memory files, code
Unfortunately, it is not exposed by the clang driver, and you have to
patch clang to add your own custom VFS. So I added an option to the clang
cc1 driver that loads a given shared library, gets a VFS from it, and
pushes it on top of the real file system.
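To illustrate what "pushing a VFS on top of the real file system" means, here is a minimal, self-contained sketch of layered lookup. It does not use the real LLVM/Clang classes: `Layer`, `InMemoryLayer` and `LayeredFileSystem` are hypothetical names made up for this example, and the "real" file system is simulated with an in-memory map. LLVM's own VFS has an `OverlayFileSystem` class that provides this kind of layering.

```cpp
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified file system interface (NOT the LLVM VFS API).
struct Layer {
  virtual ~Layer() = default;
  // Returns the file contents, or std::nullopt if this layer has no such file.
  virtual std::optional<std::string> read(const std::string &Path) const = 0;
};

// A layer backed by an in-memory map; here it also stands in for the
// real file system so the example needs no files on disk.
struct InMemoryLayer : Layer {
  std::map<std::string, std::string> Files;

  std::optional<std::string> read(const std::string &Path) const override {
    auto It = Files.find(Path);
    if (It == Files.end())
      return std::nullopt;
    return It->second;
  }
};

// A stack of layers. Lookups start at the most recently pushed layer and
// fall through to the layers below it.
class LayeredFileSystem {
  std::vector<std::shared_ptr<Layer>> Layers; // back() is the topmost layer

public:
  explicit LayeredFileSystem(std::shared_ptr<Layer> Base) {
    Layers.push_back(std::move(Base));
  }

  void pushOverlay(std::shared_ptr<Layer> L) { Layers.push_back(std::move(L)); }

  std::optional<std::string> read(const std::string &Path) const {
    for (auto It = Layers.rbegin(); It != Layers.rend(); ++It)
      if (auto Contents = (*It)->read(Path))
        return Contents;
    return std::nullopt;
  }
};
```

With a base layer holding `a.h` and `b.h` and a pushed layer holding only `a.h`, a read of `a.h` is served by the pushed layer, while `b.h` falls through to the base.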
The patch is up for review: https://reviews.llvm.org/D47190
Any comments/suggestions are welcome.
That is an interesting idea, providing a VFS as a shared library. Can you please provide more details about what you are trying to accomplish with a custom VFS? It is hard to judge the merits of the approach without knowing the intended usage.
I'm trying to set up pre-processing of input files: an input source file
contains a region which is not valid C code, but an external tool can
consume it and replace it with meaningful code. This replacement may
have standard C preprocessor directives (e.g. #include), so the tool
should run before the C preprocessor and follow #include directives,
preprocessing #included files as well.
With VFS this is pretty easy to do: you just implement a
FileSystem::openFileForRead function which returns a file after
pre-processing, and Clang calls this function whenever it needs a file.
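As a rough sketch of that idea (not the code from the patch), the example below wraps an underlying file system and runs a transformation on every file as it is opened. Everything here is an assumption made for illustration: `SimpleFileSystem` is a simplified stand-in rather than the real VFS `FileSystem` class, the `@@GEN@@` marker and the `gen.h` header are invented, and `expandMarkers` plays the role of the external tool.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a VFS interface (NOT the LLVM API).
struct SimpleFileSystem {
  virtual ~SimpleFileSystem() = default;
  // Returns the file contents, or std::nullopt if the file does not exist.
  virtual std::optional<std::string>
  openFileForRead(const std::string &Path) = 0;
};

// Stand-in for the real file system: serves files from an in-memory map.
struct MapFileSystem : SimpleFileSystem {
  std::map<std::string, std::string> Files;

  std::optional<std::string>
  openFileForRead(const std::string &Path) override {
    auto It = Files.find(Path);
    if (It == Files.end())
      return std::nullopt;
    return It->second;
  }
};

// Wraps another file system and transforms every file it hands out, so the
// consumer only ever sees pre-processed contents.
class PreprocessingFileSystem : public SimpleFileSystem {
  std::shared_ptr<SimpleFileSystem> Underlying;
  std::function<std::string(const std::string &)> Transform;

public:
  PreprocessingFileSystem(std::shared_ptr<SimpleFileSystem> FS,
                          std::function<std::string(const std::string &)> T)
      : Underlying(std::move(FS)), Transform(std::move(T)) {
    assert(Underlying && "an underlying file system is required");
  }

  std::optional<std::string>
  openFileForRead(const std::string &Path) override {
    auto Raw = Underlying->openFileForRead(Path);
    if (!Raw)
      return std::nullopt;
    return Transform(*Raw);
  }
};

// Hypothetical external tool: replaces each "@@GEN@@" marker with an
// #include directive.
std::string expandMarkers(const std::string &Text) {
  const std::string Marker = "@@GEN@@";
  const std::string Replacement = "#include \"gen.h\"";
  std::string Out = Text;
  for (std::string::size_type Pos = Out.find(Marker);
       Pos != std::string::npos;
       Pos = Out.find(Marker, Pos + Replacement.size()))
    Out.replace(Pos, Marker.size(), Replacement);
  return Out;
}
```

Because the transformation runs inside `openFileForRead`, a file pulled in later through a generated #include would go through the same wrapper when it is opened, so nothing has to be pre-processed in advance.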
In general, VFS is similar to Filesystem in Userspace (FUSE): it allows
you to read files from memory instead of a real file system, but it is
more portable than FUSE.
Can you preprocess files with code-for-external-tool as a separate step during the build? And then the compiler treats #include as usual.
My main concern is that this feature seems to be pretty heavy-weight and I want to make sure there are enough use cases and they are compelling enough to maintain this feature in the future. I might sound too conservative and too negative, but that shouldn’t discourage you.
VFS is more portable between different operating systems but not between different tools. For example, if you write preprocessed files to the disk, you’ll get reasonable debugger support for free. With a pure in-memory VFS, that is more challenging.
> Can you preprocess files with code-for-external-tool as a separate
> step during the build? And then the compiler treats #include as usual.
The problem is that the external tool cannot pre-process all files in
advance, because it doesn't know what files will be #included. And I
cannot run the standard C preprocessor before the external tool, because
'raw' files are not suitable for it.
So an ideal solution would be if the tool and the standard C
preprocessor run one after another on each file.
Anyway, my use case is not the best example of how this option can be
used. I do not encourage anybody to write another preprocessor for C.
> My main concern is that this feature seems to be pretty heavy-weight
> and I want to make sure there are enough use cases and they are
> compelling enough to maintain this feature in the future.
On the other hand, we already have the -ivfsoverlay option to remap
files using a YAML config, which uses VFS too. It is more specific and
cannot be easily extended. With -ivfsoverlay-lib the same functionality
can be implemented as an external library.
So, this feature is obviously not something that will be used a lot, but
it may be useful for tooling. After all, there are many use cases for
FUSE, so some of them may be applicable to clang as well.
> For example, if you write preprocessed files to the disk, you’ll get
> reasonable debugger support for free. While with pure in-memory VFS,
> it is more challenging.