MemoryBuffer/raw_ostream hybrid for linker?

Existing llvm code tends to use raw_ostream for writing files. But raw_ostream is not a good match for a linker for a couple of reasons:

1) When the linker creates an executable, the file needs the 'x' bit set. Currently raw_fd_ostream has no way to set that.

2) The Unix conformance suite actually has some test cases where the linker is run and the output file does exists but is not writable, or is not writable but is in a writable directory, or with funky umask values. raw_fd_ostream interface has no way to match those semantics.

3) On darwin we have found the linker performs better if it opens the output file, truncates it to the output size, then mmaps in the file, then writes directly into that memory buffer. This avoids the memory copy from the private buffer to the OS file system buffer in the write() syscall.

4) In the model we are using for lld, a streaming output interface is not optimal. Currently, lld copies chunks of code from the (read-only) input files, to a temporary buffer, then applies any fixups (relocations), then streams out that temporary buffer. If instead we had a big output buffer, the linker could copy the code chunks directly to the output buffer and apply the fixups there, avoiding an extra copy.

Is there an existing solution for these issues in llvm I've overlooked? I've searched the bug database and did not find any similar requests.

Should I propose a new llvm/Support/ class?

-Nick

Existing llvm code tends to use raw_ostream for writing files. But raw_ostream is not a good match for a linker for a couple of reasons:

1) When the linker creates an executable, the file needs the 'x' bit set. Currently raw_fd_ostream has no way to set that.

If this were the only problem, I'd suggest just generalizing raw_fd_ostream to support this use case. It would be straight-forward to do.

2) The Unix conformance suite actually has some test cases where the linker is run and the output file does exists but is not writable, or is not writable but is in a writable directory, or with funky umask values. raw_fd_ostream interface has no way to match those semantics.

If this were the only problem :), I would suggest using a new raw_ostream subclass, where you do custom stuff to get the file system stuff happening that you want, but reuse all the streaming API aspect of raw_ostream.

3) On darwin we have found the linker performs better if it opens the output file, truncates it to the output size, then mmaps in the file, then writes directly into that memory buffer. This avoids the memory copy from the private buffer to the OS file system buffer in the write() syscall.

This should also be possible with raw_ostream.

4) In the model we are using for lld, a streaming output interface is not optimal.

This is a show-stopper for raw_ostream. I really don't want raw_ostream to support seeking or other non-stream behavior. :slight_smile:

CIs there an existing solution for these issues in llvm I've overlooked? I've searched the bug database and did not find any similar requests.

Should I propose a new llvm/Support/ class?

Yes please. We have a variety of other places in the codebase that are using open/close/read etc directly (grep for #include's of unistd.h). Using these APIs is annoying because they are really low level (an expose nonsense like EINTR handling) and that windows make them annoying to use.

Having a better wrapper for doing real low-level file system stuff (including seeking) makes perfect sense for llvm/Support!

-Chris

For the reasons listed in my 03-May-2012 email, I am proposing a new llvm/Support class for using in writing binary files:

/// OutputBuffer - This interface provides simple way to create an in-memory
/// buffer which when done will be written to a file. During the lifetime of
/// these objects, the content or existence of the specified file is undefined.
/// That is, creating an OutputBuffer for a file may immediately remove the
/// file.
/// If the OutputBuffer is committed, the target file’s content will become
/// the buffer content at the time of the commit. If the OutputBuffer is not
/// committed, the file will be deleted in the OutputBuffer buffer destructor.
class OutputBuffer {
public:
enum Flags {
F_executable = 1, /// set the ‘x’ bit on the resulting file
};

/// Factory method to create an OutputBuffer object which manages a read/write
/// buffer of the specified size. When committed, the buffer will be written
/// to the file at the specified path.
static error_code createFile(StringRef filePath, Flags flags, size_t size,
OwningPtr &result);

/// Returns a pointer to the start of the buffer.
uint8_t *bufferStart();

/// Returns a pointer to the end of the buffer.
uint8_t *bufferEnd();

/// Returns size of the buffer.
size_t size();

/// Flushes the content of the buffer to its file and deallocates the
/// buffer. If commit() is not called before this object’s destructor
/// is called, the file is deleted in the destructor. The optional parameter
/// is used if it turns out you want the file size to be smaller than
/// initially requested.
void commit(int64_t newSmallerSize = -1);
};

The Flags will probable need to be extended over time to handle other clients needs.

For Unix/Darwin, my plan is to implement this by:

  1. delete the file
  2. create a new file with a random name in same directory
  3. truncate the file to the new size
  4. mmap() in the file r/w
  5. On commit, unmap the file, rename() to final name
  6. In destructor, if not committed, unmap, delete the randomly named file

I’ll leave the windows implementation empty and let someone with windows experience do the implementation.

Comments? Suggestions?

-Nick

FWIW, I’d be interested in working on the Windows implementation. I’ve been knee-deep in *nixes lately and wouldn’t mind the refresher. J

Gordon Keiser
Software Development Engineer

Arxan Technologies

w:+1.765.889.4756

gkeiser@arxan.com www.arxan.com

FWIW, I'd be interested in working on the Windows implementation. I've been knee-deep in *nixes lately and wouldn't mind the refresher. J

Cool!

Does my proposed interface make sense to implement on top of Windows APIs?

-Nick

I like it in general. The only comment I have is that it should match
the MemoryBuffer interface where possible.

  static error_code getFile(...);

  const char *getBufferStart() const;
  const char *getBufferEnd() const;
  size_t getBufferSize() const;
  StringRef getBuffer() const;
  const char *getBufferIdentifier() const;

- Michael Spencer

Hi,

Yes. It could be done fairly easily with memory mapped files, which would probably be the most efficient for this type of buffered access, using MapViewOfFile and FlushViewOfFile. I’d have to do some speed tests to be sure though. I’ll begin playing around with it soon (probably this weekend, work and all that) and try to determine whether a single streamed write on flush or a memory map ends up being faster.

Cheers,

Gordon

Except that MemoryBuffer is about reading which is "getting". OutputBuffer is about writing, so the "get" part seems wrong. That said, some of the methods could use "const". And there should be an accessor for getting the file path. If wanted to be C++11 crazy, we could rename the start/end methods to begin() and end(), but I doubt any will want to easily iterate the bytes in the buffer. My proposed update is then:

static error_code createFile(StringRef filePath, Flags flags, size_t size,
                                                  OwningPtr<OutputBuffer> &result);
  uint8_t *bufferStart() const;
  uint8_t *bufferEnd() const;
  size_t size() const;
  void commit(int64_t newSmallerSize = -1);
  StringRef path() const;

-Nick

For the reasons listed in my 03-May-2012 email, I am proposing a new
llvm/Support class for using in writing binary files:

/// OutputBuffer - This interface provides simple way to create an in-memory
/// buffer which when done will be written to a file. During the lifetime
of
/// these objects, the content or existence of the specified file is
undefined.
/// That is, creating an OutputBuffer for a file may immediately remove the
/// file.
/// If the OutputBuffer is committed, the target file's content will become
/// the buffer content at the time of the commit. If the OutputBuffer is
not
/// committed, the file will be deleted in the OutputBuffer
buffer destructor.
class OutputBuffer {
public:
enum Flags {
   F_executable = 1, /// set the 'x' bit on the resulting file
};

/// Factory method to create an OutputBuffer object which manages a
read/write
/// buffer of the specified size. When committed, the buffer will be
written
/// to the file at the specified path.
static error_code createFile(StringRef filePath, Flags flags, size_t
size,
                              OwningPtr<OutputBuffer> &result);

/// Returns a pointer to the start of the buffer.
uint8_t *bufferStart();

/// Returns a pointer to the end of the buffer.
uint8_t *bufferEnd();

/// Returns size of the buffer.
size_t size();

/// Flushes the content of the buffer to its file and deallocates the
/// buffer. If commit() is not called before this object's destructor
/// is called, the file is deleted in the destructor. The optional
parameter
/// is used if it turns out you want the file size to be smaller than
/// initially requested.
void commit(int64_t newSmallerSize = -1);
};

The Flags will probable need to be extended over time to handle other
clients needs.

For Unix/Darwin, my plan is to implement this by:
1) delete the file
2) create a new file with a random name in same directory
3) truncate the file to the new size
4) mmap() in the file r/w
5) On commit, unmap the file, rename() to final name
6) In destructor, if not committed, unmap, delete the randomly named file

I'll leave the windows implementation empty and let someone with windows
experience do the implementation.

Comments? Suggestions?

-Nick

Existing llvm code tends to use raw_ostream for writing files. But
raw_ostream is not a good match for a linker for a couple of reasons:

1) When the linker creates an executable, the file needs the 'x' bit set.
Currently raw_fd_ostream has no way to set that.

2) The Unix conformance suite actually has some test cases where the linker
is run and the output file does exists but is not writable, or is not
writable but is in a writable directory, or with funky umask values.
raw_fd_ostream interface has no way to match those semantics.

3) On darwin we have found the linker performs better if it opens the output
file, truncates it to the output size, then mmaps in the file, then writes
directly into that memory buffer. This avoids the memory copy from the
private buffer to the OS file system buffer in the write() syscall.

4) In the model we are using for lld, a streaming output interface is not
optimal. Currently, lld copies chunks of code from the (read-only) input
files, to a temporary buffer, then applies any fixups (relocations), then
streams out that temporary buffer. If instead we had a big output buffer,
the linker could copy the code chunks directly to the output buffer and
apply the fixups there, avoiding an extra copy.

_______________________________________________
LLVM Developers mailing list
LLVMdev@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

I like it in general. The only comment I have is that it should match
the MemoryBuffer interface where possible.

static error_code getFile(...);

const char *getBufferStart() const;
const char *getBufferEnd() const;
size_t getBufferSize() const;
StringRef getBuffer() const;
const char *getBufferIdentifier() const;

Except that MemoryBuffer is about reading which is "getting". OutputBuffer is about writing, so the "get" part seems wrong. That said, some of the methods could use "const". And there should be an accessor for getting the file path. If wanted to be C++11 crazy, we could rename the start/end methods to begin() and end(), but I doubt any will want to easily iterate the bytes in the buffer. My proposed update is then:

The "get" prefix is just LLVM nomenclature for accessor methods and isn't intended to convey information about the semantics of the object itself.

For the reasons listed in my 03-May-2012 email, I am proposing a new
llvm/Support class for using in writing binary files:

/// OutputBuffer - This interface provides simple way to create an in-memory
/// buffer which when done will be written to a file. During the lifetime
of
/// these objects, the content or existence of the specified file is
undefined.
/// That is, creating an OutputBuffer for a file may immediately remove the
/// file.
/// If the OutputBuffer is committed, the target file's content will become
/// the buffer content at the time of the commit. If the OutputBuffer is
not
/// committed, the file will be deleted in the OutputBuffer
buffer destructor.
class OutputBuffer {
public:
enum Flags {
  F_executable = 1, /// set the 'x' bit on the resulting file
};

/// Factory method to create an OutputBuffer object which manages a
read/write
/// buffer of the specified size. When committed, the buffer will be
written
/// to the file at the specified path.
static error_code createFile(StringRef filePath, Flags flags, size_t
size,
                             OwningPtr<OutputBuffer> &result);

/// Returns a pointer to the start of the buffer.
uint8_t *bufferStart();

/// Returns a pointer to the end of the buffer.
uint8_t *bufferEnd();

/// Returns size of the buffer.
size_t size();

/// Flushes the content of the buffer to its file and deallocates the
/// buffer. If commit() is not called before this object's destructor
/// is called, the file is deleted in the destructor. The optional
parameter
/// is used if it turns out you want the file size to be smaller than
/// initially requested.
void commit(int64_t newSmallerSize = -1);
};

The Flags will probable need to be extended over time to handle other
clients needs.

For Unix/Darwin, my plan is to implement this by:
1) delete the file
2) create a new file with a random name in same directory
3) truncate the file to the new size
4) mmap() in the file r/w
5) On commit, unmap the file, rename() to final name
6) In destructor, if not committed, unmap, delete the randomly named file

I'll leave the windows implementation empty and let someone with windows
experience do the implementation.

Comments? Suggestions?

-Nick

Existing llvm code tends to use raw_ostream for writing files. But
raw_ostream is not a good match for a linker for a couple of reasons:

1) When the linker creates an executable, the file needs the 'x' bit set.
Currently raw_fd_ostream has no way to set that.

2) The Unix conformance suite actually has some test cases where the linker
is run and the output file does exists but is not writable, or is not
writable but is in a writable directory, or with funky umask values.
raw_fd_ostream interface has no way to match those semantics.

3) On darwin we have found the linker performs better if it opens the output
file, truncates it to the output size, then mmaps in the file, then writes
directly into that memory buffer. This avoids the memory copy from the
private buffer to the OS file system buffer in the write() syscall.

4) In the model we are using for lld, a streaming output interface is not
optimal. Currently, lld copies chunks of code from the (read-only) input
files, to a temporary buffer, then applies any fixups (relocations), then
streams out that temporary buffer. If instead we had a big output buffer,
the linker could copy the code chunks directly to the output buffer and
apply the fixups there, avoiding an extra copy.

_______________________________________________
LLVM Developers mailing list
LLVMdev@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

I like it in general. The only comment I have is that it should match
the MemoryBuffer interface where possible.

static error_code getFile(...);

const char *getBufferStart() const;
const char *getBufferEnd() const;
size_t getBufferSize() const;
StringRef getBuffer() const;
const char *getBufferIdentifier() const;

Except that MemoryBuffer is about reading which is "getting". OutputBuffer is about writing, so the "get" part seems wrong. That said, some of the methods could use "const". And there should be an accessor for getting the file path. If wanted to be C++11 crazy, we could rename the start/end methods to begin() and end(), but I doubt any will want to easily iterate the bytes in the buffer. My proposed update is then:

The "get" prefix is just LLVM nomenclature for accessor methods and isn't intended to convey information about the semantics of the object itself.

static error_code createFile(StringRef filePath, Flags flags, size_t size,
                                                OwningPtr<OutputBuffer> &result);
uint8_t *bufferStart() const;
uint8_t *bufferEnd() const;
size_t size() const;
void commit(int64_t newSmallerSize = -1);

Also commit can fail, so it needs an error_code return.

- Michael Spencer

I now have an implementation of FileOutputBuffer (OutputBuffer was already taken). The patch supports the functionality listed below and I've tested that it works for lld.

FileOutputBuffer.patch (24.7 KB)

I now have an implementation of FileOutputBuffer (OutputBuffer was already taken). The patch supports the functionality listed below and I've tested that it works for lld.

To implement the FileOutputBuffer, I needed to add some more functions to llvm/Support/FileSystem.h, including:
is_writable_file()
is_executable_file()
set_file_executable()

For these parts I would like to follow
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3365.html#TOC
which is what the rest of the fs library is modeled after. So we would
use:

error_code permissions(const Twine &path, perms p);

If the name is confusing (it says nothing about modification) I'm fine
with modify_permissions or something similar.

unique_file_sized()

Instead of adding this I would much rather add resize_file and call it
after unique_file.

map_file_pages()
unmap_file_pages()

These are good.

I've implemented the unix side for all these, but have the windows side just stubbed out.

I've also added a test case (unittests/Support/FileOutputBufferTest.cpp) which exercises FileOutputBuffer.

-Nick

As a generally comment there is lots of trailing white space and tabs
and white space only changes.

Inline comments:

Index: include/llvm/Support/FileSystem.h

--- include/llvm/Support/FileSystem.h (revision 157010)
+++ include/llvm/Support/FileSystem.h (working copy)
@@ -426,12 +454,35 @@
/// @param result_path Set to the opened file's absolute path.
/// @param makeAbsolute If true and @model is not an absolute path, a temp
/// directory will be prepended.
+/// @param private_file If true file will have permissions set to only
+/// be accessible by the current user. If false, then the file
+/// permissions are set to be globally readable.
/// @results errc::success if result_{fd,path} have been successfully set,
/// otherwise a platform specific error_code.
error_code unique_file(const Twine &model, int &result_fd,
- SmallVectorImpl<char> &result_path,
- bool makeAbsolute = true, unsigned mode = 0600);
+ SmallVectorImpl<char> &result_path,
+ bool makeAbsolute = true,
+ bool private_file = true);

Incorrect whitespace only change and tabs.

Index: include/llvm/Support/FileOutputBuffer.h

--- include/llvm/Support/FileOutputBuffer.h (revision 0)
+++ include/llvm/Support/FileOutputBuffer.h (revision 0)
@@ -0,0 +1,97 @@
+//=== FileOutputBuffer.h - File Output Buffer -------------------*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// Utility for creating a in-memory buffer that will be written to a file.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_FILEOUTPUTBUFFER_H
+#define LLVM_SUPPORT_FILEOUTPUTBUFFER_H
+
+#include "llvm/Support/DataTypes.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/SmallString.h"
+
+namespace llvm {
+
+class error_code;
+template<class T> class OwningPtr;
+
+/// FileOutputBuffer - This interface provides simple way to create an in-memory
+/// buffer which will be written to a file. During the lifetime of these
+/// objects, the content or existence of the specified file is undefined. That
+/// is, creating an OutputBuffer for a file may immediately remove the file.
+/// If the FileOutputBuffer is committed, the target file's content will become
+/// the buffer content at the time of the commit. If the FileOutputBuffer is
+/// not committed, the file will be deleted in the FileOutputBuffer destructor.
+class FileOutputBuffer {
+public:
+
+ enum {
+ F_executable = 1 /// set the 'x' bit on the resulting file
+ };
+
+ /// Factory method to create an OutputBuffer object which manages a read/write
+ /// buffer of the specified size. When committed, the buffer will be written
+ /// to the file at the specified path.
+ static error_code create(StringRef filePath, size_t size,
+ OwningPtr<FileOutputBuffer> &result,
+ unsigned flags=0);

Arguments should start with an upper case letter. (same for all of these)

+ /// Returns a pointer to the start of the buffer.
+ uint8_t *getBufferStart() const {
+ return bufferStart;
+ }
+
+ /// Returns a pointer to the end of the buffer.
+ uint8_t *getBufferEnd() const {
+ return bufferEnd;
+ }
+
+ /// Returns size of the buffer.
+ size_t getBufferSize() const {
+ return bufferEnd - bufferStart;
+ }
+
+ /// Returns path where file will show up if buffer is committed.
+ StringRef getPath() const {
+ return finalPath;
+ }
+
+ /// Flushes the content of the buffer to its file and deallocates the
+ /// buffer. If commit() is not called before this object's destructor
+ /// is called, the file is deleted in the destructor. The optional parameter
+ /// is used if it turns out you want the file size to be smaller than
+ /// initially requested.
+ virtual error_code commit(int64_t newSmallerSize = -1);

Why is this virtual? I don't ever see a need to subclass FileOutputBuffer.

+ /// If this object was previously committed, the destructor just deletes
+ /// this object. If this object was not committed, the destructor
+ /// deallocates the buffer and the target file is never written.
+ virtual ~FileOutputBuffer();

Same.

+
+
+protected:
+ FileOutputBuffer(const FileOutputBuffer &); // DO NOT IMPLEMENT
+ FileOutputBuffer &operator=(const FileOutputBuffer &); // DO NOT IMPLEMENT
+ FileOutputBuffer(uint8_t *start, uint8_t *end,
+ StringRef path, StringRef tempPath);
+
+ uint8_t *bufferStart;
+ uint8_t *bufferEnd;
+ SmallString<128> finalPath;
+ SmallString<128> tempPath;

These should all start with an uppercase letter.

+};
+
+
+
+} // end namespace llvm
+
+#endif
Index: unittests/Support/FileOutputBufferTest.cpp

--- unittests/Support/FileOutputBufferTest.cpp (revision 0)
+++ unittests/Support/FileOutputBufferTest.cpp (revision 0)
@@ -0,0 +1,131 @@
+//===- llvm/unittest/Support/FileOutputBuffer.cpp - unit tests ------------===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/PathV2.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Support/FileOutputBuffer.h"
+#include "llvm/ADT/OwningPtr.h"
+
+#include "gtest/gtest.h"
+
+using namespace llvm;
+using namespace llvm::sys;
+
+#define ASSERT_NO_ERROR(x) \
+ if (error_code ASSERT_NO_ERROR_ec = x) { \
+ errs() << #x ": did not return errc::success.\n" \
+ << "error number: " << ASSERT_NO_ERROR_ec.value() << "\n" \
+ << "error message: " << ASSERT_NO_ERROR_ec.message() << "\n"; \
+ } else {}
+
+namespace {
+
+
+TEST(FileOutputBuffer, Test) {
+ // Create unique temporary directory for these tests
+ SmallString<128> TestDirectory;
+ {
+ int fd;
+ ASSERT_NO_ERROR(
+ fs::unique_file("FileOutputBuffer-test-%%-%%-%%-%%/dir", fd,
+ TestDirectory));
+ ::close(fd);
+ TestDirectory = path::parent_path(TestDirectory);
+ }
+
+ // TEST 1: Verify commit case.
+ SmallString<128> File1(TestDirectory);
+ File1.append("/file1");
+ {
+ OwningPtr<FileOutputBuffer> Buffer;
+ ASSERT_NO_ERROR(FileOutputBuffer::create(File1, 8192, Buffer));
+ // Start buffer with special header.
+ memcpy(Buffer->getBufferStart(), "AABBCCDDEEFFGGHHIIJJ", 20);
+ // Write to end of buffer to verify it is writable.
+ memcpy(Buffer->getBufferEnd() - 20, "AABBCCDDEEFFGGHHIIJJ", 20);
+ // Commit buffer.
+ ASSERT_NO_ERROR(Buffer->commit());
+ }
+ // Verify file exists and starts with special header.
+ bool MagicMatches = false;
+ ASSERT_NO_ERROR(fs::has_magic(Twine(File1), Twine("AABBCCDDEEFFGGHHIIJJ"),
+ MagicMatches));
+ EXPECT_TRUE(MagicMatches);
+ // Verify file is correct size.
+ uint64_t File1Size;
+ ASSERT_NO_ERROR(fs::file_size(Twine(File1), File1Size));
+ ASSERT_EQ(File1Size, 8192ULL);
+
+ // TEST 2: Verify abort case.
+ SmallString<128> File2(TestDirectory);
+ File2.append("/file2");
+ {
+ OwningPtr<FileOutputBuffer> Buffer2;
+ ASSERT_NO_ERROR(FileOutputBuffer::create(File2, 8192, Buffer2));
+ // Fill buffer with special header.
+ memcpy(Buffer2->getBufferStart(), "AABBCCDDEEFFGGHHIIJJ", 20);
+ // Do *not* commit buffer.
+ }
+ // Verify file does not exist (because buffer not commited).
+ bool Exists = false;
+ ASSERT_NO_ERROR(fs::exists(Twine(File2), Exists));
+ EXPECT_FALSE(Exists);
+
+
+ // TEST 3: Verify sizing down case.
+ SmallString<128> File3(TestDirectory);
+ File3.append("/file3");
+ {
+ OwningPtr<FileOutputBuffer> Buffer;
+ ASSERT_NO_ERROR(FileOutputBuffer::create(File3, 8192000, Buffer));
+ // Start buffer with special header.
+ memcpy(Buffer->getBufferStart(), "AABBCCDDEEFFGGHHIIJJ", 20);
+ // Write to end of buffer to verify it is writable.
+ memcpy(Buffer->getBufferEnd() - 20, "AABBCCDDEEFFGGHHIIJJ", 20);
+ // Commit buffer, but size down to smaller size
+ ASSERT_NO_ERROR(Buffer->commit(5000));
+ }
+ // Verify file exists and starts with special header.
+ bool MagicMatches3 = false;
+ ASSERT_NO_ERROR(fs::has_magic(Twine(File3), Twine("AABBCCDDEEFFGGHHIIJJ"),
+ MagicMatches3));
+ EXPECT_TRUE(MagicMatches3);
+ // Verify file is correct size.
+ uint64_t File3Size;
+ ASSERT_NO_ERROR(fs::file_size(Twine(File3), File3Size));
+ ASSERT_EQ(File3Size, 5000ULL);
+
+
+ // TEST 4: Verify file can be made executable.
+ SmallString<128> File4(TestDirectory);
+ File4.append("/file4");
+ {
+ OwningPtr<FileOutputBuffer> Buffer;
+ ASSERT_NO_ERROR(FileOutputBuffer::create(File4, 8192, Buffer,
+ FileOutputBuffer::F_executable));
+ // Start buffer with special header.
+ memcpy(Buffer->getBufferStart(), "AABBCCDDEEFFGGHHIIJJ", 20);
+ // Commit buffer.
+ ASSERT_NO_ERROR(Buffer->commit());
+ }
+ // Verify file exists and is executable.
+ bool IsExecutable;
+ ASSERT_NO_ERROR(fs::is_executable_file(Twine(File4), IsExecutable));
+ EXPECT_TRUE(IsExecutable);
+
+
+ // Clean up.
+ uint32_t RemovedCount;
+ ASSERT_NO_ERROR(fs::remove_all(TestDirectory.str(), RemovedCount));
+}
+
+
+} // anonymous namespace
Index: unittests/CMakeLists.txt

--- unittests/CMakeLists.txt (revision 157010)
+++ unittests/CMakeLists.txt (working copy)
@@ -165,6 +165,7 @@
   Support/CommandLineTest.cpp
   Support/ConstantRangeTest.cpp
   Support/EndianTest.cpp
+ Support/FileOutputBufferTest.cpp
   Support/LeakDetectorTest.cpp
   Support/MathExtrasTest.cpp
   Support/Path.cpp
Index: lib/Support/Unix/PathV2.inc

--- lib/Support/Unix/PathV2.inc (revision 157010)
+++ lib/Support/Unix/PathV2.inc (working copy)
@@ -24,6 +24,9 @@
#if HAVE_FCNTL_H
#include <fcntl.h>
#endif
+#ifdef HAVE_SYS_MMAN_H
+#include <sys/mman.h>
+#endif
#if HAVE_DIRENT_H
# include <dirent.h>
# define NAMLEN(dirent) strlen((dirent)->d_name)
@@ -52,6 +55,7 @@
# define PATH_MAX 4096
#endif

+
using namespace llvm;

namespace {
@@ -347,10 +351,19 @@
   return error_code::success();
}

-// Since this is most often used for temporary files, mode defaults to 0600.
+static mode_t current_umask() {
+ // There is no posix way to just get current umask. You
+ // have to set it to something else, and get the "old" value.
+ // So, get and reset to get the current umask value;
+ mode_t mask = ::umask(0);
+ ::umask(mask);
+ return mask;
+}
+
+
error_code unique_file(const Twine &model, int &result_fd,
- SmallVectorImpl<char> &result_path,
- bool makeAbsolute, unsigned mode) {
+ SmallVectorImpl<char> &result_path,
+ bool makeAbsolute, bool private_file) {

Incorrect whitespace change.

   SmallString<128> Model;
   model.toVector(Model);
   // Null terminate.
@@ -380,7 +393,8 @@

   // Try to open + create the file.
rety_open_create:
- int RandomFD = ::open(RandomPath.c_str(), O_RDWR | O_CREAT | O_EXCL, mode);
+ int perms = (private_file ? 0600 : 0666) & ~(current_umask());
+ int RandomFD = ::open(RandomPath.c_str(), O_RDWR | O_CREAT | O_EXCL, perms);
   if (RandomFD == -1) {
     // If the file existed, try again, otherwise, error.
     if (errno == errc::file_exists)
@@ -426,6 +440,28 @@
   return error_code::success();
}

+
+error_code unique_file_sized(const Twine &model, size_t size,
+ SmallVectorImpl<char> &result_path,
+ bool makeAbsolute) {
+ int fd;
+ if ( error_code ec = unique_file(model, fd, result_path, makeAbsolute, false) )
+ return ec;
+
+ if ( ::ftruncate(fd, size) == -1 ) {
+ int error = errno;
+ result_path.push_back('\0');

This is incorrect because result_path now actually includes a \0. I added a
function c_str(container) specifically for this case that adds a 0 then removes
it.

+ ::unlink(result_path.begin());
+ ::close(fd);
+ return error_code(error, system_category());
+ }
+
+ ::close(fd);
+
+ return error_code::success();
+}
+
+
error_code detail::directory_iterator_construct(detail::DirIterState &it,
                                                 StringRef path){
   SmallString<128> path_null(path);
@@ -496,6 +532,97 @@
   return error_code::success();
}

+error_code map_file_pages(const Twine &path, off_t file_offset, size_t size,
+ bool map_writable, void *&result) {
+ SmallString<128> path_storage;
+ StringRef name = path.toNullTerminatedStringRef(path_storage);
+ int oflags = map_writable ? O_RDWR : O_RDONLY;
+ int fd = ::open(name.begin(), oflags);
+ if ( fd == -1 )
+ return error_code(errno, system_category());
+ int flags = map_writable ? MAP_SHARED : MAP_PRIVATE;
+ int prot = map_writable ? (PROT_READ|PROT_WRITE) : PROT_READ;
+#ifdef MAP_FILE
+ flags |= MAP_FILE;
+#endif
+ result = ::mmap(0, size, prot, flags, fd, file_offset);
+ if (result == MAP_FAILED) {
+ ::close(fd);
+ return error_code(errno, system_category());
+ }
+
+ ::close(fd);

Could AutoFD work here?

+ return error_code::success();
+}
+
+error_code unmap_file_pages(void *base, size_t size) {
+ if ( ::munmap(base, size) == -1 )
+ return error_code(errno, system_category());
+
+ return error_code::success();
+}
+
+
+error_code is_writable_file(const Twine &path, bool &result) {
+ result = false;
+ SmallString<128> path_storage;
+ StringRef name = path.toNullTerminatedStringRef(path_storage);
+ if ( ::access(name.begin(), W_OK) == -1 ) {
+ int error = errno;
+ if ( error == EACCES )
+ result = false;
+ else
+ return error_code(error, system_category());
+ } else {
+ result = true;
+ }
+ return error_code::success();
+}
+
+
+error_code is_executable_file(const Twine &path, bool &result) {
+ result = false;
+ SmallString<128> path_storage;
+ StringRef name = path.toNullTerminatedStringRef(path_storage);
+ if ( ::access(name.begin(), X_OK) == -1 ) {
+ int error = errno;
+ if ( error == EACCES )
+ result = false;
+ else
+ return error_code(error, system_category());
+ } else {
+ result = true;
+ }
+ return error_code::success();
+}
+
+
+
+error_code set_file_executable(const Twine &path) {
+ SmallString<128> path_storage;
+ StringRef name = path.toNullTerminatedStringRef(path_storage);
+
+ struct stat buf;
+ if ( stat(name.begin(), &buf) == -1 )
+ return error_code(errno, system_category());
+
+ // Only add 'x' bit where 'r' bit is already set
+ mode_t perms = buf.st_mode;
+ if ( perms & S_IRUSR )
+ perms |= S_IXUSR;
+ if ( perms & S_IRGRP )
+ perms |= S_IXGRP;
+ if ( perms & S_IROTH )
+ perms |= S_IXOTH;
+
+ if ( ::chmod(name.begin(), perms) == -1)
+ return error_code(errno, system_category());
+
+ return error_code::success();
+}
+
+
+
} // end namespace fs
} // end namespace sys
} // end namespace llvm
Index: lib/Support/FileOutputBuffer.cpp

--- lib/Support/FileOutputBuffer.cpp (revision 0)
+++ lib/Support/FileOutputBuffer.cpp (revision 0)
@@ -0,0 +1,127 @@
+//===- FileOutputBuffer.cpp - File Output Buffer ----------------*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// Utility for creating a in-memory buffer that will be written to a file.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Support/FileOutputBuffer.h"
+#include "llvm/Support/system_error.h"
+#include "llvm/Support/FileSystem.h"
+
+#include "llvm/Support/raw_ostream.h"
+
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/OwningPtr.h"

These should be sorted alphabetically except for the main header.

+
+namespace llvm {
+
+
+FileOutputBuffer::FileOutputBuffer(uint8_t *start, uint8_t *end,
+ StringRef path, StringRef tmpPath)

Missing an alignment space.

+ : bufferStart(start), bufferEnd(end) {
+ finalPath.assign(path);
+ tempPath.assign(tmpPath);
+}
+
+
+FileOutputBuffer::~FileOutputBuffer() {
+ // If not already commited, delete buffer and remove temp file.
+ if ( bufferStart != NULL ) {
+ sys::fs::unmap_file_pages((void*)bufferStart, getBufferSize());
+ bool existed;
+ sys::fs::remove(Twine(tempPath), existed);
+ }
+}
+
+
+
+error_code FileOutputBuffer::create(StringRef filePath,
+ size_t size,
+ OwningPtr<FileOutputBuffer> &result,
+ unsigned flags) {

Alignment whitespace and capitalized first character.

+ // If file already exists, it must be a regular file (to be mappable).
+ Twine filePathTwine(filePath);

Twine should not be used like this. It stores references to temporaries. It will
work in this case because filePath is not a temporary and you haven't invoked
any operators. It's also uneeded. StringRef is convertable to Twine.

+ sys::fs::file_status stat;
+ bool writable;

This is only used in the case below, so it should be folded in there.

+ error_code ec = sys::fs::status(filePathTwine, stat);

stat is undefined if ec isn't success. ec will be success even in the case of
file_not_found.

+ switch ( stat.type() ) {

No space after ( and before ). Same lots of other places.

+ case sys::fs::file_type::file_not_found:
+ // If file does not exist, we'll create one.
+ break;
+ case sys::fs::file_type::regular_file:
+ // If file is not currently writable, error out.
+ ec = sys::fs::is_writable_file(filePathTwine, writable);
+ if ( !ec && !writable )
+ return make_error_code(errc::permission_denied);
+ break;
+ default:
+ if ( ec )
+ return ec;
+ else
+ return make_error_code(errc::operation_not_permitted);
+ }
+
+ // Delete target file.
+ bool existed;
+ if ( error_code ec = sys::fs::remove(filePath, existed) )
+ return ec;
+
+ // Create new file of specified size in same directory but with random name.
+ SmallString<128> modelStr;
+ modelStr.assign(filePath);
+ modelStr.append(".tmp%%%%%%%");
+ SmallString<128> tempFilePath;
+ if ( error_code ec = sys::fs::unique_file_sized(Twine(modelStr), size,
+ tempFilePath, false) ) {

This should be: Twine(filePath) + ".tmp%%%%%%%"

+ return ec;
+ }

Unneeded block.

+
+ // If requested, make the output file executable.
+ if ( flags & F_executable )

Tab.

+ sys::fs::set_file_executable(Twine(tempFilePath));
+
+ // Memory map new file.
+ void *base;
+ if ( error_code ec = sys::fs::map_file_pages(Twine(tempFilePath),
+ 0, size, true, base) ) {
+ return ec;
+ }
+
+ // Create FileOutputBuffer object to own mapped range.
+ uint8_t* start = (uint8_t*)base;

* should be right aligned and the case should be a c++ cast.

+ result.reset(new FileOutputBuffer(start, start+size, filePath, tempFilePath));

Spaces around binary operator +.

+
+ return error_code::success();
+}
+
+
+error_code FileOutputBuffer::commit(int64_t newSmallerSize) {
+ // Unmap buffer, letting OS flush dirty pages to file on disk.
+ if ( error_code ec = sys::fs::unmap_file_pages((void*)bufferStart,

c++ cast.

+ getBufferSize()) ) {
+ return ec;
+ }
+
+ // If requested, resize file as part of commit.
+ if ( newSmallerSize != -1 ) {
+ if ( error_code ec = sys::fs::resize_file(Twine(tempPath), newSmallerSize) )
+ return ec;
+ }
+
+ // Rename file to final name.
+ if ( error_code ec = sys::fs::rename(Twine(tempPath), Twine(finalPath)) )
+ return ec;
+
+ return error_code::success();
+}
+
+
+} // namespace
+
Index: lib/Support/CMakeLists.txt

--- lib/Support/CMakeLists.txt (revision 157010)
+++ lib/Support/CMakeLists.txt (working copy)
@@ -23,6 +23,7 @@
   Dwarf.cpp
   ErrorHandling.cpp
   FileUtilities.cpp
+ FileOutputBuffer.cpp
   FoldingSet.cpp
   FormattedStream.cpp
   GraphWriter.cpp
Index: lib/Support/Windows/PathV2.inc

--- lib/Support/Windows/PathV2.inc (revision 157010)
+++ lib/Support/Windows/PathV2.inc (working copy)
@@ -501,7 +501,7 @@
// it currently comes in as a UNIX mode.
error_code unique_file(const Twine &model, int &result_fd,
                        SmallVectorImpl<char> &result_path,
- bool makeAbsolute, unsigned mode) {
+ bool makeAbsolute, bool private_file) {
   // Use result_path as temp storage.
   result_path.set_size(0);
   StringRef m = model.toStringRef(result_path);
@@ -755,6 +755,43 @@
   return error_code::success();
}

+error_code map_file_pages(const Twine &path, off_t file_offset, size_t size,
+ bool map_writable, void *&result) {
+ assert(0 && "NOT IMPLEMENTED");
+}
+
+error_code unmap_file_pages(void *base, size_t size) {
+ assert(0 && "NOT IMPLEMENTED");
+}
+
+error_code is_writable_file(const Twine &path, bool &result) {
+#if 0 // verify code below before enabling:
+ SmallString<128> path_storage;
+ StringRef name = path.toNullTerminatedStringRef(path_storage);
+
+ DWORD attr = GetFileAttributes(name.begin());
+
+ if (attr == INVALID_FILE_ATTRIBUTES)
+ return windows_error(::GetLastError());
+
+ // Files are writable by default, unless marked READONLY
+ result = !(attr & FILE_ATTRIBUTE_READONLY);
+ #endif

This looks right, the only thing it's missing is symlink handling.

Actually I was wrong. The Windows and UNIX implementation disagree on
this point. I'm going to change it to match
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3365.html#status
error_code semantics.

I now have an implementation of FileOutputBuffer (OutputBuffer was already taken). The patch supports the functionality listed below and I've tested that it works for lld.

To implement the FileOutputBuffer, I needed to add some more functions to llvm/Support/FileSystem.h, including:
  is_writable_file()
  is_executable_file()
  set_file_executable()

For these parts I would like to follow
Filesystem TR2 Proposal
which is what the rest of the fs library is modeled after. So we would
use:

error_code permissions(const Twine &path, perms p);

If the name is confusing (it says nothing about modification) I'm fine
with modify_permissions or something similar.

Oh my! I thought you were trying to create an abstraction that could be mapped onto unix or windows. That spec is just a posix interface, but with different names. There is even a table that shows the mapping of all the bit values (e.g. group_read == 040 == S_IRGRP. Seems like it would be less complicated to just implement the posix call interface on windows, than to create all these new names that add no value…

  unique_file_sized()

Instead of adding this I would much rather add resize_file and call it
after unique_file.

The problem is that the existing unique_file() function creates the file and returns the file-descriptor. My level does not need that file-descriptor and there is no close() function available in this N3365 spec.

  map_file_pages()
  unmap_file_pages()

These are good.

But if you are really trying to model N3365, should these go elsewhere?

+ /// Flushes the content of the buffer to its file and deallocates the
+ /// buffer. If commit() is not called before this object's destructor
+ /// is called, the file is deleted in the destructor. The optional parameter
+ /// is used if it turns out you want the file size to be smaller than
+ /// initially requested.
+ virtual error_code commit(int64_t newSmallerSize = -1);

Why is this virtual? I don't ever see a need to subclass FileOutputBuffer.

+ /// If this object was previously committed, the destructor just deletes
+ /// this object. If this object was not committed, the destructor
+ /// deallocates the buffer and the target file is never written.
+ virtual ~FileOutputBuffer();

Same.

I was anticipating there might be alternate implementations - like MemoryBuffer. I'll remove the virtual keyword.

+error_code unique_file_sized(const Twine &model, size_t size,
+ SmallVectorImpl<char> &result_path,
+ bool makeAbsolute) {
+ int fd;
+ if ( error_code ec = unique_file(model, fd, result_path, makeAbsolute, false) )
+ return ec;
+
+ if ( ::ftruncate(fd, size) == -1 ) {
+ int error = errno;
+ result_path.push_back('\0');

This is incorrect because result_path now actually includes a \0. I added a
function c_str(container) specifically for this case that adds a 0 then removes
it.

c_str() is only available on SmallString<>, not on its parent SmallVectorImpl<char>.

+error_code map_file_pages(const Twine &path, off_t file_offset, size_t size,
+ bool map_writable, void *&result) {
+ SmallString<128> path_storage;
+ StringRef name = path.toNullTerminatedStringRef(path_storage);
+ int oflags = map_writable ? O_RDWR : O_RDONLY;
+ int fd = ::open(name.begin(), oflags);
+ if ( fd == -1 )
+ return error_code(errno, system_category());
+ int flags = map_writable ? MAP_SHARED : MAP_PRIVATE;
+ int prot = map_writable ? (PROT_READ|PROT_WRITE) : PROT_READ;
+#ifdef MAP_FILE
+ flags |= MAP_FILE;
+#endif
+ result = ::mmap(0, size, prot, flags, fd, file_offset);
+ if (result == MAP_FAILED) {
+ ::close(fd);
+ return error_code(errno, system_category());
+ }
+
+ ::close(fd);

Could AutoFD work here?

Yes!

+//===- FileOutputBuffer.cpp - File Output Buffer ----------------*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// Utility for creating a in-memory buffer that will be written to a file.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Support/FileOutputBuffer.h"
+#include "llvm/Support/system_error.h"
+#include "llvm/Support/FileSystem.h"
+
+#include "llvm/Support/raw_ostream.h"
+
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/OwningPtr.h"

These should be sorted alphabetically except for the main header.

Ok.

+ // If file already exists, it must be a regular file (to be mappable).
+ Twine filePathTwine(filePath);

Twine should not be used like this. It stores references to temporaries. It will
work in this case because filePath is not a temporary and you haven't invoked
any operators. It's also uneeded. StringRef is convertable to Twine.

+ sys::fs::file_status stat;
+ bool writable;

This is only used in the case below, so it should be folded in there.

Fixed.

+
+ // Create new file of specified size in same directory but with random name.
+ SmallString<128> modelStr;
+ modelStr.assign(filePath);
+ modelStr.append(".tmp%%%%%%%");
+ SmallString<128> tempFilePath;
+ if ( error_code ec = sys::fs::unique_file_sized(Twine(modelStr), size,
+ tempFilePath, false) ) {

This should be: Twine(filePath) + ".tmp%%%%%%%"

Yes, that will save some copying from happening.

+ sys::fs::set_file_executable(Twine(tempFilePath));
+
+ // Memory map new file.
+ void *base;
+ if ( error_code ec = sys::fs::map_file_pages(Twine(tempFilePath),
+ 0, size, true, base) ) {
+ return ec;
+ }
+
+ // Create FileOutputBuffer object to own mapped range.
+ uint8_t* start = (uint8_t*)base;

* should be right aligned and the case should be a c++ cast.

+
+ return error_code::success();
+}
+
+
+error_code FileOutputBuffer::commit(int64_t newSmallerSize) {
+ // Unmap buffer, letting OS flush dirty pages to file on disk.
+ if ( error_code ec = sys::fs::unmap_file_pages((void*)bufferStart,

c++ cast.

Fixed.

This is a proposed patch to enhance FileSystem.h to add functionality (getting and setting permission bits and mapping an unmapping files). This implementation follows the N3365 proposal regarding permission bits.

This functionality is needed for my next patch which will implement llvm/include/Support/FileOutputBuffer.h which is needed by lld.

FileSystem.patch (9.78 KB)

This is a proposed patch to enhance FileSystem.h to add functionality (getting and setting permission bits and mapping an unmapping files). This implementation follows the N3365 proposal regarding permission bits.

This functionality is needed for my next patch which will implement llvm/include/Support/FileOutputBuffer.h which is needed by lld.

-Nick

Index: include/llvm/Support/FileSystem.h

--- include/llvm/Support/FileSystem.h (revision 158678)
+++ include/llvm/Support/FileSystem.h (working copy)
@@ -94,6 +94,56 @@
   uint64_t available;
};

+
+enum perms
+{

No new line before {.

+ no_perms = 0,
+ owner_read = 0400,
+ owner_write = 0200,
+ owner_exe = 0100,
+ owner_all = owner_read | owner_write | owner_exe,
+ group_read = 040,
+ group_write = 020,
+ group_exe = 010,
+ group_all = group_read | group_write | group_exe,
+ others_read = 04,
+ others_write = 02,
+ others_exe = 01,
+ others_all = others_read | others_write | others_exe,
+ all_all = owner_all | group_all | others_all,
+ set_uid_on_exe = 04000,
+ set_gid_on_exe = 02000,
+ sticky_bit = 01000,
+ perms_mask = all_all | set_uid_on_exe | set_gid_on_exe | sticky_bit,
+ perms_not_known = 0xFFFF,
+ add_perms = 0x1000,
+ remove_perms = 0x2000,
+ symlink_perms = 0x4000
+};
+
+// Helper functions so that you can use & and | to manipulate perms bits:
+inline perms operator|(perms l , perms r) {
+ return static_cast<perms>(
+ static_cast<unsigned short>(l) | static_cast<unsigned short>(r));
+}
+inline perms operator&(perms l , perms r) {
+ return static_cast<perms>(
+ static_cast<unsigned short>(l) & static_cast<unsigned short>(r));
+}
+inline perms &operator|=(perms &l, perms r) {
+ l = l | r;
+ return l;
+}
+inline perms &operator&=(perms &l, perms r) {
+ l = l & r;
+ return l;
+}
+inline perms operator~(perms x) {
+ return static_cast<perms>(~static_cast<unsigned short>(x));
+}
+
+
+
/// file_status - Represents the result of a call to stat and friends. It has
/// a platform specific member to store the result.
class file_status
@@ -113,12 +163,19 @@
   friend bool equivalent(file_status A, file_status B);
   friend error_code status(const Twine &path, file_status &result);
   file_type Type;
+ perms Perms;
public:
- explicit file_status(file_type v=file_type::status_error)
- : Type(v) {}
+ explicit file_status(file_type v=file_type::status_error,
+ perms prms=perms_not_known)
+ : Type(v), Perms(prms) {}

+ // getters
   file_type type() const { return Type; }
+ perms permissions() const { return Perms; }
+
+ // setters
   void type(file_type v) { Type = v; }
+ void permissions(perms p) { Perms = p; }
};

/// file_magic - An "enum class" enumeration of file types based on magic (the first
@@ -395,6 +452,13 @@
/// platform specific error_code.
error_code status(const Twine &path, file_status &result);

+/// @brief Modifies permission bits on a file
+///
+/// @param path Input path.
+/// @results errc::success if permissions have been changed, otherwise a
+/// platform specific error_code.
+error_code permissions(const Twine &path, perms prms);
+
/// @brief Is status available?
///
/// @param path Input path.
@@ -513,6 +577,33 @@
error_code GetMainExecutable(const char *argv0, void *MainAddr,
                              SmallVectorImpl<char> &result);

+
+/// @brief Memory maps the contents of a file
+///
+/// @param path Path to file to map.
+/// @param file_offset Byte offset in file where mapping should begin.
+/// @param size_t Byte length of range of the file to map.
+/// @param map_writable If true, the file will be mapped in r/w such
+/// that changes to the the mapped buffer will be flushed back
+/// to the file. If false, the file will be mapped read-only
+/// and the buffer will be read-only.
+/// @param result Set to the start address of the mapped buffer.
+/// @results errc::success if result has been successfully set, otherwise a
+/// platform specific error_code.
+error_code map_file_pages(const Twine &path, off_t file_offset, size_t size,
+ bool map_writable, void *&result);
+
+
+/// @brief Memory unmaps the contents of a file
+///
+/// @param base Pointer to the start of the buffer.
+/// @param size Byte length of the range to unmmap.
+/// @results errc::success if result has been successfully set, otherwise a
+/// platform specific error_code.
+error_code unmap_file_pages(void *base, size_t size);
+
+
+

/// @}
/// @name Iterators
/// @{
Index: lib/Support/Unix/PathV2.inc

--- lib/Support/Unix/PathV2.inc (revision 158680)
+++ lib/Support/Unix/PathV2.inc (working copy)
@@ -24,6 +24,9 @@
#if HAVE_FCNTL_H
#include <fcntl.h>
#endif
+#ifdef HAVE_SYS_MMAN_H
+#include <sys/mman.h>
+#endif
#if HAVE_DIRENT_H
# include <dirent.h>
# define NAMLEN(dirent) strlen((dirent)->d_name)
@@ -325,20 +328,22 @@
     return ec;
   }

+ perms prms = static_cast<perms>(status.st_mode & perms_mask);
+
   if (S_ISDIR(status.st_mode))
- result = file_status(file_type::directory_file);
+ result = file_status(file_type::directory_file, prms);
   else if (S_ISREG(status.st_mode))
- result = file_status(file_type::regular_file);
+ result = file_status(file_type::regular_file, prms);
   else if (S_ISBLK(status.st_mode))
- result = file_status(file_type::block_file);
+ result = file_status(file_type::block_file, prms);
   else if (S_ISCHR(status.st_mode))
- result = file_status(file_type::character_file);
+ result = file_status(file_type::character_file, prms);
   else if (S_ISFIFO(status.st_mode))
- result = file_status(file_type::fifo_file);
+ result = file_status(file_type::fifo_file, prms);
   else if (S_ISSOCK(status.st_mode))
- result = file_status(file_type::socket_file);
+ result = file_status(file_type::socket_file, prms);
   else
- result = file_status(file_type::type_unknown);
+ result = file_status(file_type::type_unknown, prms);

   result.fs_st_dev = status.st_dev;
   result.fs_st_ino = status.st_ino;
@@ -346,6 +351,34 @@
   return error_code::success();
}

+// Modifies permissions on a file.
+error_code permissions(const Twine &path, perms prms) {
+ // Get current permissions
+ file_status info;
+ if (error_code ec = status(path, info)) {
+ return ec;
+ }
+
+ // Set updated permissions.
+ SmallString<128> path_storage;
+ StringRef p = path.toNullTerminatedStringRef(path_storage);
+ perms permsToSet;
+ if (prms & add_perms) {
+ permsToSet = (info.permissions() | prms) & perms_mask;
+ } else if (prms & remove_perms) {
+ permsToSet = (info.permissions() & ~prms) & perms_mask;
+ } else {
+ assert(0 && "neither add_perms or remove_perms is set");

If neither add nor remove are set, it should set all perms. If both
add and remove are set, it should llvm_unreachable.

+ }
+ if (::chmod(p.begin(), static_cast<mode_t>(permsToSet))) {
+ return error_code(errno, system_category());
+ }
+
+ return error_code::success();
+}
+
// Since this is most often used for temporary files, mode defaults to 0600.
error_code unique_file(const Twine &model, int &result_fd,
                        SmallVectorImpl<char> &result_path,
@@ -495,6 +528,36 @@
   return error_code::success();
}

+error_code map_file_pages(const Twine &path, off_t file_offset, size_t size,
+ bool map_writable, void *&result) {
+ SmallString<128> path_storage;
+ StringRef name = path.toNullTerminatedStringRef(path_storage);
+ int oflags = map_writable ? O_RDWR : O_RDONLY;
+ int ofd = ::open(name.begin(), oflags);
+ if ( ofd == -1 )
+ return error_code(errno, system_category());
+ AutoFD fd(ofd);
+ int flags = map_writable ? MAP_SHARED : MAP_PRIVATE;
+ int prot = map_writable ? (PROT_READ|PROT_WRITE) : PROT_READ;
+#ifdef MAP_FILE
+ flags |= MAP_FILE;
+#endif
+ result = ::mmap(0, size, prot, flags, fd, file_offset);
+ if (result == MAP_FAILED) {
+ return error_code(errno, system_category());
+ }
+
+ return error_code::success();
+}
+
+error_code unmap_file_pages(void *base, size_t size) {
+ if ( ::munmap(base, size) == -1 )
+ return error_code(errno, system_category());
+
+ return error_code::success();
+}
+
+
} // end namespace fs
} // end namespace sys
} // end namespace llvm
Index: lib/Support/Windows/PathV2.inc

--- lib/Support/Windows/PathV2.inc (revision 158678)
+++ lib/Support/Windows/PathV2.inc (working copy)
@@ -497,6 +497,41 @@
   return error_code::success();
}

+
+// Modifies permissions on a file.
+error_code permissions(const Twine &path, perms prms) {
+#if 0 // verify code below before enabling:
+ // If the permissions bits are not trying to modify
+ // "write" permissions, there is nothing to do.
+ if (!(prms & (owner_write|group_write|others_write)))
+ return error_code::success();
+
+ SmallString<128> path_storage;
+ SmallVector<wchar_t, 128> path_utf16;
+
+ if (error_code ec = UTF8ToUTF16(path.toStringRef(path_storage),
+ path_utf16))
+ return ec;
+
+ DWORD attributes = ::GetFileAttributesW(path_utf16.begin());
+
+ if (prms & add_perms) {
+ attributes &= ~FILE_ATTRIBUTE_READONLY;
+ }
+ else if (prms & remove_perms) {
+ attributes |= FILE_ATTRIBUTE_READONLY;
+ }
+ else {
+ assert(0 && "neither add_perms or remove_perms is set");
+ }
+
+ if ( ! ::SetFileAttributesW(path_utf16.begin(), attributes))
+ return windows_error(::GetLastError());
+#endif
+ return error_code::success();
+}
+
+
// FIXME: mode should be used here and default to user r/w only,
// it currently comes in as a UNIX mode.
error_code unique_file(const Twine &model, int &result_fd,
@@ -755,6 +790,17 @@
   return error_code::success();
}

+error_code map_file_pages(const Twine &path, off_t file_offset, size_t size,
+ bool map_writable, void *&result) {
+ assert(0 && "NOT IMPLEMENTED");
+}
+
+error_code unmap_file_pages(void *base, size_t size) {
+ assert(0 && "NOT IMPLEMENTED");
+}
+
+
+
} // end namespace fs
} // end namespace sys
} // end namespace llvm

It also needs unit tests. Once tests are added I can do the Windows
implementation.

- Michael Spencer