MSIL backend

Hello, Everyone.

We've just committed a new backend for LLVM: MSIL. The author of the
backend is Roman Samoilov from Codedgers Inc. (roman@codedgers.com). The
backend itself is very similar to the C backend (and was actually based
on it). Note that it's a pure LLVM-to-MSIL translator, so no additional
checks etc. are performed.

The backend is usable in general, but still lacks some important features:

1. There is no way to tell the backend "import this function from that
DLL file". Moreover, there is no equivalent of import libraries for
MSIL code. So, in the future the backend will include a "linking" pass
that resolves external DLL references. We'll probably use the nice set
of .def files from the public domain w32-api package (from the Mingw32
folks).

2. Variable argument functions are currently unsupported.

3. There is no support for dllimported variables (e.g. stdin, stdout,
stderr symbols in MS runtime).

4. Also, the backend completely lacks any testsuite.
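
As a rough illustration of the "linking" pass idea from item 1, here is a minimal sketch of building a symbol-to-DLL index from the text of a .def file. The function name `index_def_file` is hypothetical, and the parsing is an assumption that only handles a simplified subset of the format (a `LIBRARY` line, an `EXPORTS` line, `;` comments, and an optional `@N` suffix on export names). It is not what the backend does.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: record which DLL exports each symbol named in the
// text of a .def file. Only a simplified subset of the format is handled.
void index_def_file(const std::string& text,
                    std::map<std::string, std::string>& symbol_to_dll) {
    std::istringstream in(text);
    std::string line;
    std::string dll;          // name taken from the most recent LIBRARY line
    bool in_exports = false;  // true once an EXPORTS line has been seen

    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string first;
        if (!(words >> first) || first[0] == ';') {
            continue;  // blank line or comment
        }
        if (first == "LIBRARY") {
            words >> dll;
            in_exports = false;
        } else if (first == "EXPORTS") {
            in_exports = true;
        } else if (in_exports) {
            // Drop a trailing "@N" suffix, if any, so "Sleep@4" is
            // indexed as "Sleep".
            symbol_to_dll[first.substr(0, first.find('@'))] = dll;
        }
    }
}
```

A pass along these lines could look up each undefined external in the resulting map to decide which DLL to name in the generated MSIL.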

There are some small glitches here and there, but they are not as
important as the ones listed above.

All of the listed problems will be fixed in the near future. Any
feedback, comments and questions are welcome.

Anton Korobeynikov wrote:

Hello, Everyone.

We've just committed a new backend for LLVM: MSIL. The author of the
backend is Roman Samoilov from Codedgers Inc. (roman@codedgers.com). The
backend itself is very similar to the C backend (and was actually based
on it). Note that it's a pure LLVM-to-MSIL translator, so no additional
checks etc. are performed.
  
I'm confused. A MSIL front end I can understand, but a back end? How will it be used? The GCC-based front ends that come with LLVM generate bytecodes that have dependencies on the GCC runtime, which is not going to be present in a .NET environment.

The backend is usable in general, but still lacks some important features:

1. There is no way to tell the backend "import this function from that
DLL file". Moreover, there is no equivalent of import libraries for
MSIL code. So, in the future the backend will include a "linking" pass
that resolves external DLL references. We'll probably use the nice set
of .def files from the public domain w32-api package (from the Mingw32
folks).
  
Some people have the official Microsoft files :slight_smile:

As it's apparent the developer does not have Visual Studio, how do you assemble and run the MSIL code? Do you use the utilities present in the .NET framework at all? (and, yes, they are free and frequently preloaded on new PCs, and Windows Update can install them on the rest.) That you would need to rely on Mingw32 implies you do not.

Who said the input has to come through the GCC front-ends? Perhaps this is for Jolt -> .NET? :slight_smile:

-Chris

Chris Lattner wrote:

I'm confused. A MSIL front end I can understand, but a back end? How
will it be used? The GCC-based front ends that come with LLVM generate
bytecodes that have dependencies on the GCC runtime, which is not going
to be present in a .NET environment.

Who said the input has to come through the GCC front-ends? Perhaps this is for Jolt -> .NET? :slight_smile:

-Chris

Then the problem is the converse: the .NET runtime won't be available if one of the other back ends is used. It will be very hard for a front end to support both MSIL and the other back ends.

Yes, it is possible to write C++ that can be compiled to MSIL and use the .NET runtime, but only by using Microsoft's Managed C++ extensions (which basically provide C# semantics via ugly C++ syntax). It's safe to say llvm-gcc doesn't support these extensions :slight_smile:

Hello, Jeff.

I'm confused. A MSIL front end I can understand, but a back end? How
will it be used? The GCC-based front ends that come with LLVM generate
bytecodes that have dependencies on the GCC runtime, which is not going
to be present in a .NET environment.

Well, it's an LLVM-to-MSIL translator. So, if the source uses some
unsupported code... The same situation has existed for ages with
llvm-gcc and the CBackend when the system compiler is gcc 3.4.x: the
generated C code contains gcc 4.x-specific builtins. I don't see
anything wrong with this. LLVM wasn't designed to be completely portable.

Some people have the official Microsoft files :slight_smile:

We cannot use them :slight_smile:

As it's apparent the developer does not have Visual Studio, how do you
assemble and run the MSIL code? Do you use the utilities present in the
.NET framework at all?

Yes. Just the MSIL "assembler". There will be some "How to use the MSIL
backend" documentation in the near future.

That you would need to rely on Mingw32 implies you do not.

There won't be any files from external sources, just clear instructions
on "how to get and use them". I think a few .def files won't hurt
anyone.

Chris Lattner wrote:

I'm confused. A MSIL front end I can understand, but a back end? How
will it be used? The GCC-based front ends that come with LLVM generate
bytecodes that have dependencies on the GCC runtime, which is not going
to be present in a .NET environment.

Who said the input has to come through the GCC front-ends? Perhaps this is for Jolt -> .NET? :slight_smile:

-Chris

The problem is worse than I thought.

MSIL, and .NET in general, defines a specific object model. This object model is explicitly part of MSIL semantics. LLVM is at a lower level; it does not have an object model. To do a virtual call, LLVM instructions must be generated to load a function pointer from a vtable and dereference it. But MSIL is at a higher level, where one simply uses the callvirt instruction to do a virtual call and no vtable is supplied or even present. There's no obvious way to reconstruct the higher level object semantics from LLVM IR, and sure enough the new MSIL back end never generates a callvirt instruction. In other words, it is incapable of using the .NET framework library or anything else relying on virtual method calls.

What am I missing?
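
To make the contrast concrete, here is a minimal sketch of what a virtual call looks like once it has been lowered to the level LLVM works at: load the vtable pointer from the object, load a function pointer from a slot, and call through it. The `Shape` layout and all names here are hypothetical, chosen only for illustration. MSIL's callvirt expresses the same dispatch as a single instruction with no visible table.

```cpp
#include <cassert>

// Hypothetical object layout: the first field points to a table of
// function pointers, as a front end might emit for a class with one
// virtual method.
struct Shape;

struct ShapeVTable {
    int (*area)(const Shape*);
};

struct Shape {
    const ShapeVTable* vtable;
    int w;
    int h;
};

static int rect_area(const Shape* s) { return s->w * s->h; }
static int tri_area(const Shape* s) { return s->w * s->h / 2; }

static const ShapeVTable kRectVTable = {rect_area};
static const ShapeVTable kTriVTable = {tri_area};

// The "virtual call" spelled out as the separate steps a low-level IR sees.
int call_area(const Shape* s) {
    assert(s != nullptr);
    const ShapeVTable* vt = s->vtable;   // load the vtable pointer
    int (*fn)(const Shape*) = vt->area;  // load the function pointer
    return fn(s);                        // indirect call through it
}
```

Nothing in `call_area` says "this is a method of class Shape"; that information exists only in the source language, which is why recovering it from the lowered form is hard.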

Hello, Jeff.

and dereference it. But MSIL is at a higher level, where one simply
uses the callvirt instruction to do a virtual call and no vtable is
supplied or even present.

You're right. Suppose we have some FE for MSIL which just generates
LLVM's "call" instruction with some predefined CC that means "this is a
virtual call". The backend can then emit a normal callvirt instruction
in this case. I hope many high-level things can be avoided using such
tricks.

There's no obvious way to reconstruct the higher level object semantics
from LLVM IR, and sure enough the new MSIL back end never generates a
callvirt instruction. In other words, it is incapable of using the
.NET framework library or anything else relying on virtual method calls.

Well, I could be wrong. But what if we (possibly) have some FE that
tries to preserve such semantics using LLVM mechanisms?

Unlike Java, MSIL supports 'unsafe' operations, which include fully general pointer arithmetic, casting, etc. In this mode, you don't need to use its support for object models at all.

-Chris
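
As a small illustration of the kind of operation meant here, this sketch reads a struct field using nothing but a base address, a byte offset and a typed load, with no object model involved. The `Point` type and the `load_i32` helper are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical plain-data record; no classes or methods are needed.
struct Point {
    std::int32_t x;
    std::int32_t y;
};

// Read a 32-bit integer located `offset` bytes past `base`.
// This is ordinary address arithmetic followed by a load.
std::int32_t load_i32(const void* base, std::size_t offset) {
    assert(base != nullptr);
    std::int32_t value;
    std::memcpy(&value,
                static_cast<const unsigned char*>(base) + offset,
                sizeof value);
    return value;
}
```

For a `Point p`, calling `load_i32(&p, offsetof(Point, y))` returns the value stored in `p.y`.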

Chris Lattner wrote:

Anton Korobeynikov wrote:

Hello, Jeff.

  
I'm confused. A MSIL front end I can understand, but a back end? How
will it be used? The GCC-based front ends that come with LLVM generate
bytecodes that have dependencies on the GCC runtime, which is not going
to be present in a .NET environment.
    
Well, it's an LLVM-to-MSIL translator. So, if the source uses some
unsupported code... The same situation has existed for ages with
llvm-gcc and the CBackend when the system compiler is gcc 3.4.x: the
generated C code contains gcc 4.x-specific builtins. I don't see
anything wrong with this. LLVM wasn't designed to be completely portable.
  

There is no existing front end that can support MSIL. Do you plan on writing one? And do you plan on it not supporting the other back ends?

Some people have the official Microsoft files :)
    
We cannot use them :)

  
As it's apparent the developer does not have Visual Studio, how do you
assemble and run the MSIL code? Do you use the utilities present in the
.NET framework at all?
    
Yes. Just the MSIL "assembler". There will be some "How to use the MSIL
backend" documentation in the near future.

  
That you would need to rely on Mingw32 implies you do not.
    
There won't be any files from external sources, just clear instructions
on "how to get and use them". I think a few .def files won't hurt
anyone.
  

I still don’t get what this is for. You shouldn’t need any win32 def files. It is extremely unusual for a program running in the .NET environment to need direct access to the win32 API. Most of the win32 API has been wrapped in .NET framework classes (though without callvirt you can’t use them).

Anton Korobeynikov wrote:

Hello, Jeff.

  
and dereference it. But MSIL is at a higher level, where one simply
uses the callvirt instruction to do a virtual call and no vtable is
supplied or even present.
    
You're right. Suppose we have some FE for MSIL which just generates
LLVM's "call" instruction with some predefined CC that means "this is a
virtual call". The backend can then emit a normal callvirt instruction
in this case. I hope many high-level things can be avoided using such
tricks.
  

This trick is too clever for your own good. It will wreck LLVM’s inter-procedural analysis, because it won’t know the call is virtual. LLVM may make an invalid optimization due to the faulty analysis (like inlining the target). Trying to pass through MSIL object semantics in this fashion isn’t likely to work for this and similar reasons.
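
The inlining hazard can be sketched in plain C++. The class and function names below are hypothetical. `as_direct_call` stands in for what an optimizer believes when it sees an ordinary call to a known function (and so might inline), while `as_virtual_call` stands in for what callvirt would actually do at run time.

```cpp
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const { return "generic"; }
};

struct Dog : Animal {
    std::string sound() const override { return "woof"; }
};

// A call that names its target statically. The qualified name suppresses
// dynamic dispatch, so the body of Animal::sound could safely be inlined.
std::string as_direct_call(const Animal& a) { return a.Animal::sound(); }

// A genuinely virtual call: the target depends on the run-time type.
std::string as_virtual_call(const Animal& a) { return a.sound(); }
```

Given a `Dog`, the two functions return different strings ("generic" versus "woof"). An analysis that treats the second kind of call as the first and inlines the named target would therefore change program behavior.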