Help solve bug 17216 for clang-format on Windows

Hello all,
   I've been looking into why clang-format cannot write in-place for large files (> 16 kB) on Windows [1].
I've found the culprit, but need help actually fixing it; the file handling code is somewhat complex for a new-comer like me. :slight_smile:

The difference between small and large files is the returned value by shouldUseMmap in /llvm/lib/Support/MemoryBuffer.cpp. For small files, shouldUseMmap returns false and the already created normal MemoryBuffer is kept and all is fine. But for larger files, shouldUseMmap returns true and the already created normal MemoryBuffer is destroyed and a new memory mapped one is created. This becomes the OwningPtr<MemoryBuffer> Code memorybuffer in ClangFormat.cpp.
Now when the time comes to write the reformatted text in-place to the input file, llvm::raw_fd_ostream FileStream cannot be opened if the MemoryBuffer is a memory mapped one. Apparently Windows does not allow that. For small files, no Mmap buffer was created and all is fine.

I do not know nearly enough about the file handling internals to propose a good fix.
If shouldUseMmap is modified to always return false, clang-format works fine on large files. But that would be a very course "fix". I wanted to try releasing the Mmap interface to the file, right before writing back to it. But I was not able to figure out how to do that.

Thanks for the help,
     Johan

[1] http://llvm.org/bugs/show_bug.cgi?id=17216

Hello Johan,

I've attached a patch I'm using locally to reduce flicker in my IDE
while running clang-format using atomic file save.

I suspect it resolves you issue too, could you try it and report back?

Alp.

clang-format-atomic.patch (1.26 KB)

Hi Alp,
   The patch solves the problem, thanks!

One comment: the format function returns 'false' when no errors occurred. So I believe it should read
       if (Rewrite.overwriteChangedFiles()) {
         return false;
       }

Cheers,
   Johan

Hi Alp,
The patch solves the problem, thanks!

Glad to hear it!

One comment: the format function returns ‘false’ when no errors occurred. So I believe it should read
if (Rewrite.overwriteChangedFiles()) {
return false;
}

Disagree. My patch is fine :slight_smile:

It’s just that the return value of overwriteChangedFiles() was mis-documented until my fix last week (r193594) so that might have caused confusion.

But I’ll appreciate if you can confirm:

/// overwriteChangedFiles - Save all changed files to disk.
///
/// Returns true if any files were not saved successfully.
/// Outputs diagnostics via the source manager's diagnostic engine
/// in case of an error.
bool overwriteChangedFiles();
``
// Returns true on error.
static bool format(std::string FileName) {
...
if (Rewrite.overwriteChangedFiles())
return true;
...
}

To land this fix we’ll need a test case. I don’t want to check in a 16k+ source file into SVN so we’ll have to either generate a test input or use one of the existing large file inputs for that.

Alp.

Also I'd like to audit MemoryBuffer and try to make the interface
defensive against unportable usage, looks like other code paths may be
affected:

test/Index/crash-recovery-code-complete.c-// FIXME: Please investigate
abnormal path in MemoryBuffer.
test/Index/crash-recovery-code-complete.c:// XFAIL: mingw32,win32

Alp.

Hi Alp,
The patch solves the problem, thanks!

Glad to hear it!

One comment: the format function returns ‘false’ when no errors occurred. So I believe it should read
if (Rewrite.overwriteChangedFiles()) {
return false;
}

Disagree. My patch is fine :slight_smile:

It’s just that the return value of overwriteChangedFiles() was mis-documented until my fix last week (r193594) so that might have caused confusion.

But I’ll appreciate if you can confirm:

/// overwriteChangedFiles - Save all changed files to disk.
///
/// Returns true if any files were not saved successfully.
/// Outputs diagnostics via the source manager's diagnostic engine
/// in case of an error.
bool overwriteChangedFiles();
``
// Returns true on error.
static bool format(std::string FileName) {
...
if (Rewrite.overwriteChangedFiles())
return true;
...
}

Agreed :slight_smile:

To land this fix we’ll need a test case. I don’t want to check in a 16k+ source file into SVN so we’ll have to either generate a test input or use one of the existing large file inputs for that.

(I’m sorry, I’m not sure how to help with this. It’s been only 3 days since I’ve first seen the codebase, so… :slight_smile:
Adding a test case means adding to /llvm/tools/clang/test/Format ?
I’ll see if I can figure it out from the other tests in the codebase.

-Johan

Hi Alp,
  The patch solves the problem, thanks!

Glad to hear it!

One comment: the format function returns 'false' when no errors occurred.
So I believe it should read
      if (Rewrite.overwriteChangedFiles()) {
        return false;
      }

Disagree. My patch is fine :slight_smile:

It's just that the return value of overwriteChangedFiles() was
mis-documented until my fix last week (r193594) so that might have caused
confusion.

But I'll appreciate if you can confirm:

   /// overwriteChangedFiles - Save all changed files to disk.
   ///
   /// Returns true if any files were not saved successfully.
   /// Outputs diagnostics via the source manager's diagnostic engine
   /// in case of an error.
   bool overwriteChangedFiles();

// Returns true on error.
static bool format(std::string FileName) {
...
       if (Rewrite.overwriteChangedFiles())
         return true;
...
}

Agreed :slight_smile:

To land this fix we'll need a test case. I don't want to check in a 16k+
source file into SVN so we'll have to either generate a test input or use
one of the existing large file inputs for that.

(I'm sorry, I'm not sure how to help with this. It's been only 3 days
since I've first seen the codebase, so... :slight_smile:
Adding a test case means adding to /llvm/tools/clang/test/Format ?
I'll see if I can figure it out from the other tests in the codebase.

Yes, a test like you are describing needs to go in clang/test/Format (which
mostly does integration tests of the clang-format binary). There is also
clang/unittests/Format, but those test specific formatting aspects of stuff
in clang/lib/Format.

I think you should just generate a 16kB file (doesn't have to contain valid
C++ .. basically any 16k characters should do). Or just check in a 16kB
file, there are a lot of tests of that size ..

Cheers,
Daniel

-Johan

Hello Johan,

I've landed the fix by itself in r194250.

I've discovered that this is a problem that exists in various other bits
of LLVM and clang, and will be working to fix those uses as well.

You can see the commit message for the full background.

Daniel, let's start looking at the performance issues while we're at it,
can you review my RewriteBuffer optimisation posted to cfe-commits as a
starting point?

Alp.

Hi Alp,
   I'm sorry I did not follow up on making a test case.
Thanks again for the fix.

regards,
   Johan