Win32 COFF Support

Hi guys,

I have attached my patch to support generating win32 COFF object files. I would have posted earlier, but my system drive crashed and I had to rebuild my system; Luckily, my source code was on a secondary drive. I think this would be a good beginning for ongoing support of the COFF object file format and was hoping for some feedback as to whether it was commit worthy or what was needed to be done to get it their.

Thanks,
Nathan

changes.patch (45.7 KB)

The line for commit-worthy is a round of reviews and some tests
showing that basic functionality works.

Minor comments:
1. Is it possible to allocate extra memory with coff::symbol instead
of using std::string and std::vector as class members? The extra
allocations can add up.

2. For the following bit:
+ reinterpret_cast <uint32_t &> (Section->Symbol->Aux [0]) =
Section->Header.SizeOfRawData;
+ reinterpret_cast <uint16_t &> (Section->Symbol->Aux [4]) =
Section->Header.NumberOfRelocations;
+ reinterpret_cast <uint16_t &> (Section->Symbol->Aux [6]) =
Section->Header.PointerToLineNumbers;

This isn't endian-safe; I'd suggest expanding it to 8 assignments.
Same issues with the write_le functions. We need endian safety not
only so that cross-compiling, for example, from PPC Linux to MingW
works, but also so that tests work cross-platform.

3. For the following bit:
+ typedef std::list <symbol*> symbols;
+ typedef std::list <section*> sections;

Avoid std::list; see LLVM Programmer’s Manual — LLVM 16.0.0git documentation .

4. For the following bit:
+ typedef StringMap <coff::symbol *> name_symbol_map;
+ typedef StringMap <coff::section *> name_section_map;

Unused typedefs?

5. For the following bit (and a few others in the same function):
+ for (i = Sections.begin(), j = Asm.begin();
+ (i != Sections.end()) && (j != Asm.end());
+ i++, j++) {

http://llvm.org//docs/CodingStandards.html#ll_end

6.
+namespace llvm { MCObjectWriter * createWinCOFFObjectWriter
(raw_ostream & OS); }

If you need a function to be visible across files, use a header;
shortcuts like this make the code more difficult to read.

-Eli

Thanks for the feedback.

The line for commit-worthy is a round of reviews and some tests
showing that basic functionality works.

Regarding tests, I would like to have some form of automated regression testing, but am not sure how to go about doing so. My current testing involves running through my compilers regression test (compiles links and executes a number of test sources) and running a couple of .ll files and manually inspecting the results of Microsoft’s dumpbin utility.

Minor comments:

  1. Is it possible to allocate extra memory with coff::symbol instead
    of using std::string and std::vector as class members? The extra
    allocations can add up.

Yes, I believe ShortString & ShortVector are the appropriate alternative for these types of cases.

  1. For the following bit:
  • reinterpret_cast <uint32_t &> (Section->Symbol->Aux [0]) =
    Section->Header.SizeOfRawData;
  • reinterpret_cast <uint16_t &> (Section->Symbol->Aux [4]) =
    Section->Header.NumberOfRelocations;
  • reinterpret_cast <uint16_t &> (Section->Symbol->Aux [6]) =
    Section->Header.PointerToLineNumbers;

This isn’t endian-safe; I’d suggest expanding it to 8 assignments.
Same issues with the write_le functions. We need endian safety not
only so that cross-compiling, for example, from PPC Linux to MingW
works, but also so that tests work cross-platform.

OK, I had put the write_le functions so that I could address this later. I should have used them, or something similar for the auxiliary symbols. I will rework the aux symbol stuff & the write_le functions to be cross platform safe.

  1. For the following bit:
  • typedef std::list <symbol*> symbols;
  • typedef std::list <section*> sections;

Avoid std::list; see http://llvm.org//docs/ProgrammersManual.html#dss_list .

Good catch, I would normally not do that. I must have had a list of objects, then changed it to pointers to objects and didn’t rethink what type of container I was using.

  1. For the following bit:
  • typedef StringMap <coff::symbol *> name_symbol_map;
  • typedef StringMap <coff::section *> name_section_map;

Unused typedefs?

Left over from previous version of the implementation, will remove.

  1. For the following bit (and a few others in the same function):
  • for (i = Sections.begin(), j = Asm.begin();
  • (i != Sections.end()) && (j != Asm.end());
  • i++, j++) {

http://llvm.org//docs/CodingStandards.html#ll_end

I will recode those statements.

+namespace llvm { MCObjectWriter * createWinCOFFObjectWriter
(raw_ostream & OS); }

If you need a function to be visible across files, use a header;
shortcuts like this make the code more difficult to read.

On this one, it didn’t seem appropriate to include WinCOFFObjectWriters header into X86AsmBackend, it also seems to be overkill to add a header to contain a single function. I would like some more feedback on what the sanctioned course of action should be here.

  • Nathan

Hi Nathan,

According to http://llvm.org//docs/CodingStandards.html#hl_dontinclude you seem to be right on this one. If this isn’t what the team leaders meant, they shouldn’t have written it as it is.

–Sam

Hi all,

Here is the latest revision of my Win32 COFF Support patch. This version includes changes based on Eli’s feedback, and an implementation of weak reference symbols (COFF weak externals). If there is nothing major wrong with it, I would like to commit it.

I can understand the lack of automated regression testing being one of those reasons. If so, can I have some guidance on how to approach that. Aaron had suggested using a object file dumping utility to generate a text representation and then diffing that against the expected output. I think that is a good idea but don’t know where to put this utility, and I have no experience with the automated testing system currently in use.

Thanks,
Nathan

changes.patch (48.5 KB)

Essentially, the idea is to write something like
test/Scripts/macho-dump (a primitive object dumper), call it from
tests to get a textual representation, and check that you get what you
expect. Look at the tests in test/MC/MachO for example tests;
hopefully, it's relatively intuitive.

-Eli

Nathan Jeffords <blunted2night@gmail.com> writes:

Here is the latest revision of my Win32 COFF Support patch. This version
includes changes based on Eli's feedback, and an implementation of weak
reference symbols (COFF weak externals). If there is nothing major wrong
with it, I would like to commit it.

I can understand the lack of automated regression testing being one of those
reasons.

[snip]

I hope that the lack of regression tests will not constitute a blocker
for the patch. Not having the feature is worse than having the feature
with some occassional regression.

For those that are interested, I have attached the latest version of my Win32 COFF support patch. There is no new functionality, I just fixed some compiler errors due to recent changes in the MC library.

  • Nathan

changes.patch (48.1 KB)

Nathan Jeffords <blunted2night@gmail.com> writes:

For those that are interested, I have attached the latest version of my
Win32 COFF support patch. There is no new functionality, I just fixed some
compiler errors due to recent changes in the MC library.

Thanks Nathan.

Any objections for not accepting the patch into LLVM? If none, I'll
apply it after the weekend.

Nathan,

Sorry about this but…

On Cygwin with GCC-4.2.4 :-

In file included from /home/ang/svn/llvm/lib/MC/WinCOFFObjectWriter.cpp:32:
/home/ang/svn/llvm/lib/MC/WinCoff.h:68: error: comma at end of enumerator list

Otherwise builds okay.

I’ll run some tests on it…

Aaron

Okay C++ tests are fine.

llvm-mc should automatically be supported now as well to now :slight_smile:

Nice, will have to test as a backend to Clang as well now.

Cannot test llvm-gcc until inline assembly is supported for COFF though.

Aaron

There should minimally be a bare-bones test to make sure that basic
COFF writer functionality doesn't crash. Besides that, I don't see
any issues.

-Eli

One more review comment:

+ bool isVirtualSection(const MCSection &Section) const {
+// const MCSectionCOFF &SE = static_cast<const MCSectionCOFF&>(Section);
+// return SE.getType() == MCSectionCOFF::SHT_NOBITS;

I will fix the comment, and the GCC warning. As far a COMDAT’s go, at one point, they where being generated for weak & common symbols. I will have a look again to see if I broke something there.

I have a nearly complete python script to dump the contents of a COFF object file and am looking into to how to integrate that into the test system. Hopefully I can have it running within a day or too.

I can do the commit myself, Chris gave me commit access a couple of weeks ago. I was just trying to address the test case request from Eli.

Thanks for the feedback,

  • Nathan

p.s. I attached my dump script if anyone is interested (it will be reformatted before it is committed)

coff-dump.py (14.1 KB)

I was looking into writing this test for MC based on the MachO tests,
but llvm-mc cannot read .s files as COFF because it simply lacks
support for it. It tries to read everything as MachO.

I've attached a patch for the test anyway. This includes Nathan's
dump-coff.py script in the proper location for the test. I generated
the .s file with "echo int main(){} | clang -O3 -S -x c -o - -". And
then I generated the CHECK lines by running "echo int main(){} | clang
-O3 -c -x c -o temp.o - && dump-coff temp.o".

- Michael Spencer

basic-coff-tests.patch (17.9 KB)

This is cool, I was looking into something like this, but hit a little bit of a wall, and then got sidetracked on another project. I was going to use llc to generate COFF object files as opposed to clang. Seems to me llc would give better control over what was generated and would also not require clang to be in the tree.

Is there any documentation on how to run these tests from windows?

  • Nathan

I just tried to actually run the tests in llvm/test (as opposed to
llvm/tools/clang/test) and none of them work on windows. They all fail
due to windows using \ for directories, which the test runner seems to
want to use as an escape sequence.

What I don't get is how clang tests run fine on windows, but not llvm
tests. And why I have not seen this problem posted anywhere.

- Michael Spencer

I poked around, and after applying the attached patch, some tests run
just fine. (test's with grep and spaces are broken as are some
others).

Here's how I run the test I wrote:

C:\Users\Michael\Projects\llvm>"C:\Python26\python.exe" C:/Users/Michael/Proj
ects/llvm/utils/lit/lit.py --param llvm_site_config=C:/Users/Michael/Projects
/llvm-build/VS9/test/lit.site.cfg --param llvm_unit_site_config=C:/Users/Mich
ael/Projects/llvm-build/VS9/test/Unit/lit.site.cfg -v C:/Users/Michael/Pro
jects/llvm-build/VS9/test/MC/COFF/basic-coff.s

- Michael Spencer

test.patch (19 KB)

This is cool, I was looking into something like this, but hit a little bit
of a wall, and then got sidetracked on another project. I was going to use
llc to generate COFF object files as opposed to clang. Seems to me llc would
give better control over what was generated and would also not require clang
to be in the tree.
Is there any documentation on how to run these tests from windows?

I just tried to actually run the tests in llvm/test (as opposed to
llvm/tools/clang/test) and none of them work on windows. They all fail
due to windows using \ for directories, which the test runner seems to
want to use as an escape sequence.

What I don't get is how clang tests run fine on windows, but not llvm
tests. And why I have not seen this problem posted anywhere.

- Michael Spencer

I poked around, and after applying the attached patch, some tests run
just fine. (test's with grep and spaces are broken as are some
others).

The lit specific portion of your patch isn't right. LLVM tests are
written in a Tcl style. What we need to do is find a way to make lit
agnostic to the Windows paths. This probably comes down to normalizing
the %s etc subtitutions to Unix style, then translating the paths to
Windows style when we actually execute commands.

- Daniel

This is cool, I was looking into something like this, but hit a little bit
of a wall, and then got sidetracked on another project. I was going to use
llc to generate COFF object files as opposed to clang. Seems to me llc would
give better control over what was generated and would also not require clang
to be in the tree.
Is there any documentation on how to run these tests from windows?

I just tried to actually run the tests in llvm/test (as opposed to
llvm/tools/clang/test) and none of them work on windows. They all fail
due to windows using \ for directories, which the test runner seems to
want to use as an escape sequence.

What I don't get is how clang tests run fine on windows, but not llvm
tests. And why I have not seen this problem posted anywhere.

No one has made it work yet, it's that simple.

- Daniel