Locking files on Windows with clangd

Hi!

A short background: we at JetBrains are experimenting with using clangd with
CLion, and we’re very happy with the results so far. Unfortunately, we’ve hit
a problem when header files are locked on Windows, so they can’t be saved,
they break git rebase (leading to the loss of work), etc.

It happens when the file is opened using a memory mapped file. It’s fairly
easy to reproduce with clangd:

  1. Create a header big enough to pass the threshold in shouldUseMmap()
    (lib/Support/MemoryBuffer.cpp)
  2. Include it in a source file
  3. Keep this source file opened via clangd
  4. Now an attempt to modify this header will fail

The situation is especially unpleasant because the handle is locked not only
during the parse, but for all the time clangd holds the preamble containing
this header.

This issue is mitigated to an extent in libclang: non-system files are
considered “volatile” (see r160074, r334070), so memory mapped files are not
used for them. However, I feel like it’s not enough: you can easily have a
#pragma system_header in your codebase (IIUC it affects the mentioned
behavior), locking such file and i.e. losing user’s work during git rebase
is still unacceptable.

Also, I think the fact that the proper compiler has this behavior is also
unfortunate (forgotten build in the background might lead to the loss of data).

IIUC the main reason for having memory mapped files is to reduce the memory
footprint. However, for a regular translation unit, headers usually take several
megabytes, which IMO is tolerable, and is usually topped by other per-TU data
anyway.

Hence, what do you think about not using memory mapped files on Windows at all
for source files? Are there any implications that I’m not aware of?

The naive patch (obviously not for commit, just for the illustration):

— a/lib/Basic/SourceManager.cpp
+++ b/lib/Basic/SourceManager.cpp
@@ -109,7 +109,12 @@ llvm::MemoryBuffer *ContentCache::getBuffer(DiagnosticsEngine &Diag,
return Buffer.getPointer();
}

+#ifndef _WIN32
bool isVolatile = SM.userFilesAreVolatile() && !IsSystemFile;
+#else

  • bool isVolatile = true;
    +#endif

Hi Dmitry,

We’ve had a pull request with a similar fix at some point: https://reviews.llvm.org/D35200, but it was abandoned and never made it in.
@ivan.donchevskii@qt.io, did you manage to figure out a way to make this work on Windows? Any ideas on what’s the best way to fix this?

Hi!

Yes, we also still have that issue on Windows and have some of our own bug reports (https://bugreports.qt.io/browse/QTCREATORBUG-15449)

For some time we used the workaround from D35200 but abandoned it finally because of the complaints about high memory consumption.

There’s no way I know to fix it because AFAIK windows always locks mmapped files. But i can’t say that this behavior is broken because you are not supposed to mark file ‘system’ if you plan to modify it. What would I do if I had bugreport with ‘#pragma system_header’ - I would simply remove that line from the unsaved file content I pass to clang.

Regards,

Ivan

There’s no way I know to fix it because AFAIK windows always locks mmapped files.

That’s unfortunate, have you seen any pointers of this behavior in official docs?
I think I was able to modify the mmapped file when I was playing around with Windows APIs while looking at D35200, but I’ll need to rerun my experiments and make them reproducible.

But i can’t say that this behavior is broken because you are not supposed to mark file ‘system’ if you plan to modify it. What would I do if I had bugreport with ‘#pragma system_header’ - I would simply remove that line from the unsaved file content I pass to clang.

I’m not sure there’s a single definition of what a “system” header is, e.g. we pass -isystem when adding include paths to third_party code to suppress warnings in that code.

There are different ways to suppress warnings other than passing -isystem so we’re trying to make fewer -isystem paths (for example https://reviews.llvm.org/D48116 which i really like and am planing to update in the nearest future)

It’s not something we do in clangd, it’s something we do in our buildsystem for usual compilations as well. And the idea is not to avoid warnings from headers (we want those), but to avoid warnings from headers of third_party dependencies (those we usually don’t control, so we can’t fix them).

I see. In that case we need a new flag for clang, something like “-ithird_party” which does the same as -isystem but with some differences :slight_smile:

Hi, Ivan!

Thank you for the insights!

But i can't say that this behavior is broken because you are not supposed to mark file 'system' if you plan to modify it.

There are different ways to suppress warnings other than passing -isystem so we're trying to make fewer -isystem paths

Not sure I can agree with that:

- whether a directory is marked w/ `-isystem` or not is the user's/buildsystem decision, not the IDE's, so we can't say to the user "just don't mark directories you want to modify with `-isystem` - it looks like a poor way do deal with Windows locking issues.
- locking "proper system" files also seems suboptimal, as it e.g. would break package managers updates

What would I do if I had bugreport with '#pragma system_header' - I would simply remove that line from the unsaved file content I pass to clang.

It also applies to the transitively included headers - it would be really unfortunate to do the same for all of them

For some time we used the workaround from D35200 but abandoned it finally because of the complaints about high memory consumption.

Have you ever managed to pinpoint the problem? I would expect that the issue with D35200 would be keeping preambles in memory, not source files, but it would be great to have a conformation or a contradiction.

So far I haven’t found a way to use mmap and avoid locks on Windows at the same time (also after experiments with Windows API. though quite limited). That’s why we try to deal only with symptoms. Of course it’s not ideal and I will be happy if we get a better solution.