Building VMKit

Hi,

I’m trying to build VMKit from SVN, and I’m getting a bunch of errors that all seem to be related to the TRACER macro not getting defined:

llvm[3]: Compiling Assembly.cpp for Release+Asserts build
In file included from Assembly.cpp:15:
Assembly.h:140: error: variable or field ‘TRACER’ declared void
In file included from Assembly.cpp:19:
N3.h:109: error: variable or field ‘TRACER’ declared void
In file included from Assembly.cpp:21:
VMClass.h:56: error: variable or field ‘TRACER’ declared void
VMClass.h:140: error: variable or field ‘TRACER’ declared void
VMClass.h:165: error: variable or field ‘TRACER’ declared void

Assembly.cpp:1929: instantiated from here
LockedMap.h:101: error: ‘class n3::Assembly’ has no member named ‘tracer’
LockedMap.h: In member function ‘void n3::LockedMap<Key, Container, Compare, Upcall>::tracer() [with Key = unsigned int, Container = n3::VMMethod, Compare = std::less, Upcall = n3::Assembly]’:

and the relevant lines:

class VMClass : public VMCommonClass {
public:
virtual void print(mvm::PrintBuffer* buf) const;
virtual void TRACER;

The only “#define TRACER” that grep found is in lib/Mvm/BoehmGC/MvmGC.h:
#define TRACER tracer()

Is there something wrong with my configuration?

Thanks,
Joshua
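
For readers following the error above, here is a minimal sketch of how a macro-based member declaration like this behaves. Only the `#define TRACER tracer()` line comes from the VMKit source quoted in the message; the `Traced` class and the `callTracer` helper are hypothetical names used purely for illustration. With the macro visible, `virtual void TRACER;` expands to `virtual void tracer();`. Without it, the compiler presumably reads the line as a variable named TRACER of type void, which would match both the "declared void" errors and the later "no member named ‘tracer’" error.

```cpp
#include <cassert>

// Quoted from lib/Mvm/BoehmGC/MvmGC.h in the message above.
// It must be visible before any header that uses TRACER.
#define TRACER tracer()

// Hypothetical class, for illustration only (not a VMKit class).
class Traced {
public:
  virtual ~Traced() {}
  // Expands to: virtual void tracer();
  virtual void TRACER;
  int traced = 0;
};

// Out-of-line definition; expands to: void Traced::tracer() { ... }
void Traced::TRACER { ++traced; }

// Hypothetical helper: calls the member through its expanded name.
int callTracer(Traced& t) {
  int before = t.traced;
  t.tracer();
  assert(t.traced == before + 1);
  return t.traced;
}
```

Removing the `#define` line from this sketch should reproduce the same class of error as in the build log.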

I'm trying to build VMKit from SVN, and I'm getting a bunch of errors
that all seem to be related to the TRACER macro not getting defined:

...

Is there something wrong with my configuration?

Hi Joshua, some details like what platform you are on (darwin, linux,
windows), what compiler you are using, how you configured LLVM etc
would be helpful.

Ciao,

Duncan.

Sure:

I’m on 64-bit Ubuntu Linux 10.04 with gcc 4.4.3. I followed the instructions on http://vmkit.llvm.org/get_started.html, as near as I can tell.

I configured llvm with the default configuration:
./configure

I configured vmkit with:
./configure --with-llvmsrc=/home/jowarner/code/llvm/ --with-llvmobj=/home/jowarner/code/llvm/ --with-gnu-classpath-glibj=/usr/share/classpath/glibj.zip --with-pnet-local-prefix=/home/jowarner/code/pnet-0.8.0/ --with-pnetlib=/home/jowarner/code/pnetlib-0.8.0/

Thanks,
Joshua

Hi Joshua,

The .Net implementation in VMKit hasn’t been updated for a long while. Fortunately for you, the Java implementation has :). Remove all pnet references from your configure command and it should compile fine:

./configure --with-llvmsrc=/home/jowarner/code/llvm/ --with-llvmobj=/home/jowarner/code/llvm/ --with-gnu-classpath-glibj=/usr/share/classpath/glibj.zip

Nicolas

Forgot to send to the mailing list…

Hi Joshua,

$ j3 Hello
j3: JavaClass.cpp:480: j3::JavaObject* j3::Class::doNew(j3::Jnjvm*):
Assertion `(this->isInitializing() ||
classLoader->getCompiler()->isStaticCompiling()) && "Uninitialized class
when allocating."' failed.
Aborted

Regarding j3 on 64-bit: it should work now that we've found the cause
of the crash, in both Debug and Release builds (the 32-bit version has
been working all along).

But your case is strange: the crash we saw didn't print such messages.
Have you checked out the latest version of VMKit from SVN?
Also, to get j3 running recompile classpath with
-fno-omit-frame-pointer (or take my patch from here:
http://lists.cs.uiuc.edu/pipermail/vmkit-commits/attachments/20100719/35754a6f/attachment.bin
and apply it:
$ cd classpath-0.97.2
$ patch ./configure ./classpath_configure64.patch
)

That's the current status of j3.

Regards,
Minas

Hi Minas,

I tried recompiling Classpath with -fno-omit-frame-pointer, and now, instead of printing an error message, j3 just segfaults in
“j3::JnjvmClassLoader::loadClassFromAsciiz(char const*, bool, bool) ()”

I ran llcj under strace and found that it is not even opening the input or output files, but is otherwise running normally.

Updating to the latest SVN version (revision 108831) didn’t change anything (I was only a few days out of date).

I’m not sure where to go from here. Does this fit with any of the known problems under 64-bit linux?

Thanks,
Joshua

Hi Joshua,

If you can get a running 32-bit system, I’d suggest you do so, as you’ll get up to speed right away. I can’t test VMKit on a 64-bit machine, and I am aware that there are some compilation/execution problems. Besides, the current GCs of VMKit do not work on 64-bit (neither MMTk nor GCMmap2).

Nicolas

Hi Nicolas,

Thanks for all your help, but if 64-bit systems are still a big problem, perhaps the VMKit AOT compiler is not the best solution to my problem. I’d like to be able to support the major (if not all) platforms that the Avian JVM supports - x86 & x86_64 linux & windows, powerpc darwin and ARM.

Regards,
Joshua

Hi Joshua,

What plans did you have for GC? No GC at all, or does Avian JVM have its own GC (and if so, is it precise)?

If you’re not planning on using VMKit’s GCs, then 64-bit systems should not be a big problem: the only problem that we have now is compiling GNU Classpath, and most probably Avian JVM has its own version of the class libraries?

Also, note that platform support will be strongly dependent on LLVM support.

Nicolas

Hi Nicolas,

I plan on using the Avian GC (which is a precise, generational collector). Eventually, I’d like to fully integrate all of the runtime services Avian provides - even integrating the existing Avian JIT compiler, to allow for partially-AOT builds.

Avian does indeed have its own class library, but I would be very surprised if VMKit could compile with them - they are sufficiently conformant for many applications, but far from a complete implementation. Avian also supports using GNU Classpath, and I was planning on targeting this feature.

As I understand it, LLVM itself can compile code for most of the mentioned platforms, and ARM support is coming along nicely.

If VMKit is to serve as the compiler, I have a number of requirements that need to be at least fairly easy to add (or already present):

  • Custom lowering (platform independent) of operations like virtual and interface calls. Currently, Avian uses the lower bits of the virtual method table to store various data, so loading the virtual table requires masking out those bits. Avian also uses a fairly unique implementation of hashing for interface method lookup (unique in the implementation, not the algorithm). In this case, it may be easier to modify Avian to fit with VMKit, instead of vice versa.

  • Access to stack maps at GC safe points

  • Access to unwind tables

  • Configurable object layout - capability to configure the number of words in the object header, etc. Again, it may be easier to modify Avian.

  • Able to output object files (preferably) or assembly for the target platform, with appropriate global symbols for functions, etc. to be linked as a boot image.

Do you know if VMKit already has these features? If not, would it be fairly straightforward to add them in such a way as to also benefit VMKit (I don’t want to maintain a custom branch)?

Joshua

Hi Nicolas,

I plan on using the Avian GC (which is a precise, generational collector).

OK - Great!

Eventually, I’d like to fully integrate all of the runtime services Avian provides - even integrating the existing Avian JIT compiler, to allow for partially-AOT builds.

Avian does indeed have its own class library, but I would be very surprised if VMKit could compile with them - they are sufficiently conformant for many applications, but far from a complete implementation.

VMKit does not need the libraries to be complete, just the VM interface to be fulfilled (e.g. methods in Class.java).

Avian also supports using GNU Classpath, and I was planning on targeting this feature.

OK

As I understand it, LLVM itself can compile code for most of the mentioned platforms, and ARM support is coming along nicely.

Yes indeed. I don’t know what the status of Windows support is, but at least for the mentioned archs/OSes, LLVM is a great choice.

If VMKit is to serve as the compiler, I have a number of requirements that need to be at least fairly easy to add (or already present):

  • Custom lowering (platform independent) of operations like virtual and interface calls. Currently, Avian uses the lower bits of the virtual method table to store various data, so loading the virtual table requires masking out those bits. Avian also uses a fairly unique implementation of hashing for interface method lookup (unique in the implementation, not the algorithm). In this case, it may be easier to modify Avian to fit with VMKit, instead of vice versa.

Sure, that’s one major strength of LLVM: we could decide on a runtime function (CallVirtualMethod) that will get lowered depending on the underlying VM. I don’t see any difficulties in accomplishing this.

  • Access to stack maps at GC safe points

LLVM/VMKit already has it: VMKit uses the OCaml GC in LLVM to generate stack maps in the executable.

  • Access to unwind tables

I believe that’s for Java exceptions? The current non-optimal implementation of Java exceptions in VMKit is to check after each call site if an exception has been raised. In older versions, VMKit used C++ exceptions and Dwarf table, but that proved to be very inefficient in a mixed JIT/AOT environment (the libgcc implementation of JIT exceptions is not optimized).

LLVM is capable of generating dwarf/unwind tables at compile-time (and also in a JIT environment), so you should get access to these tables in a standard fashion. The only caveat is that you need to change VMKit and the way it compiles exception handlers and exception checks.
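
As a rough illustration of the "check after each call site" scheme described above, here is a minimal sketch. All names in it (`JavaThread`, `pendingException`, `divide`, `divideOrDefault`) are hypothetical stand-ins and not VMKit's actual data structures; it only shows the general shape of explicit exception checks as opposed to table-driven unwinding.

```cpp
#include <cassert>

// Hypothetical per-thread state: a pending exception is recorded here
// instead of being propagated by unwinding the stack.
struct JavaThread {
  const char* pendingException = nullptr;
};

// A callee "throws" by recording the exception and returning normally.
int divide(JavaThread& t, int a, int b) {
  assert(t.pendingException == nullptr);  // nothing pending on entry
  if (b == 0) {
    t.pendingException = "java/lang/ArithmeticException";
    return 0;  // dummy value; the caller must check the flag
  }
  return a / b;
}

// A caller compiled with explicit checks: after the call site, test the
// flag and branch to the handler if an exception was raised.
int divideOrDefault(JavaThread& t, int a, int b, int fallback) {
  int result = divide(t, a, b);
  if (t.pendingException != nullptr) {  // check emitted after the call
    t.pendingException = nullptr;       // handler: catch and clear
    return fallback;
  }
  return result;
}
```

The cost of this approach is one test and branch per call site on the normal path, which is the trade-off against unwind tables that the message above alludes to.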

  • Configurable object layout - capability to configure the number of words in the object header, etc. Again, it may be easier to modify Avian.

This hasn’t been tested at a large scale (the GCs that VMKit supports only need a 2-word header), but you should be able to define what an object header is. The generated AOT Java code does make some assumptions about the header layout for object synchronization, but we can decide on a method that VMKit and Avian could lower to different implementations.

  • Able to output object files (preferably) or assembly for the target platform, with appropriate global symbols for functions, etc. to be linked as a boot image.

VMKit already does that, thanks to LLVM code generators.

Do you know if VMKit already has these features? If not, would it be fairly straightforward to add them in such a way as to also benefit VMKit (I don’t want to maintain a custom branch)?

I think if you decide to use VMKit, we can easily agree on how to share the same code base, with VMKit and Avian JVM each having their own lowering pass.

Nicolas

Hi Nicolas,

I plan on using the Avian GC (which is a precise, generational collector).

OK - Great!

Eventually, I’d like to fully integrate all of the runtime services Avian provides - even integrating the existing Avian JIT compiler, to allow for partially-AOT builds.

Avian does indeed have its own class library, but I would be very surprised if VMKit could compile with them - they are sufficiently conformant for many applications, but far from a complete implementation.

VMKit does not need the libraries to be complete, just the VM interface to be fulfilled (e.g. methods in Class.java).

Avian also supports using GNU Classpath, and I was planning on targeting this feature.

OK

As I understand it, LLVM itself can compile code for most of the mentioned platforms, and ARM support is coming along nicely.

Yes indeed. I don’t know what the status of Windows support is, but at least for the mentioned archs/OSes, LLVM is a great choice.

If VMKit is to serve as the compiler, I have a number of requirements that need to be at least fairly easy to add (or already present):

  • Custom lowering (platform independent) of operations like virtual and interface calls. Currently, Avian uses the lower bits of the virtual method table to store various data, so loading the virtual table requires masking out those bits. Avian also uses a fairly unique implementation of hashing for interface method lookup (unique in the implementation, not the algorithm). In this case, it may be easier to modify Avian to fit with VMKit, instead of vice versa.

Sure, that’s one major strength of LLVM: we could decide on a runtime function (CallVirtualMethod) that will get lowered depending on the underlying VM. I don’t see any difficulties in accomplishing this.

Is it common practice to emit function calls that are expected to be lowered by a later pass? I know LLVM uses this kind of thing with intrinsics (llvm.gcroot, for instance), but a pass lowering calls to specific functions seems very… messy.

What about something like a --emit-unlowered-llvm option on llcj that just spits out the LLVM IR before running this lowering pass?

  • Access to stack maps at GC safe points

LLVM/VMKit already has it: VMKit uses the OCaml GC in LLVM to generate stack maps in the executable.

  • Access to unwind tables

I believe that’s for Java exceptions? The current non-optimal implementation of Java exceptions in VMKit is to check after each call site if an exception has been raised. In older versions, VMKit used C++ exceptions and Dwarf table, but that proved to be very inefficient in a mixed JIT/AOT environment (the libgcc implementation of JIT exceptions is not optimized).

LLVM is capable of generating dwarf/unwind tables at compile-time (and also in a JIT environment), so you should get access to these tables in a standard fashion. The only caveat is that you need to change VMKit and the way it compiles exception handlers and exception checks.

Could this be another case for emitting calls in the IR that are lowered by a later pass?

  • Configurable object layout - capability to configure the number of words in the object header, etc. Again, it may be easier to modify Avian.

This hasn’t been tested at a large scale (the GCs that VMKit supports only need a 2-word header), but you should be able to define what an object header is. The generated AOT Java code does make some assumptions about the header layout for object synchronization, but we can decide on a method that VMKit and Avian could lower to different implementations.

Avian only needs a 1-word header - it uses the lower bits of the vtable pointer for GC and hashing.
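
To make the 1-word header idea concrete, here is a minimal sketch of a header that packs flag bits into the low bits of an aligned vtable pointer. The specific bit assignments (`kGcMarkBit`, `kHashedBit`), the two-bit mask, and the `VTable` type are assumptions for illustration; the thread only states that the lower bits are used for GC and hashing, not how.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical vtable type; alignment guarantees the low bits are zero.
struct alignas(8) VTable {
  int methodCount;
};

// Assumed bit assignments, for illustration only.
constexpr std::uintptr_t kTagMask   = 0x3;  // low two bits hold flags
constexpr std::uintptr_t kGcMarkBit = 0x1;
constexpr std::uintptr_t kHashedBit = 0x2;

// The whole object header: vtable pointer and flags share one word.
struct ObjectHeader {
  std::uintptr_t word;
};

inline ObjectHeader makeHeader(const VTable* vt) {
  std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(vt);
  assert((raw & kTagMask) == 0 && "vtable must be aligned");
  return ObjectHeader{raw};
}

// Loading the vtable requires masking out the flag bits first.
inline const VTable* vtableOf(const ObjectHeader& h) {
  return reinterpret_cast<const VTable*>(h.word & ~kTagMask);
}

inline void setMark(ObjectHeader& h) { h.word |= kGcMarkBit; }
inline bool isMarked(const ObjectHeader& h) {
  return (h.word & kGcMarkBit) != 0;
}
inline void setHashed(ObjectHeader& h) { h.word |= kHashedBit; }
inline bool isHashed(const ObjectHeader& h) {
  return (h.word & kHashedBit) != 0;
}
```

The mask in `vtableOf` is the extra step that any compiler targeting such a layout would have to emit before every virtual dispatch.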

  • Able to output object files (preferably) or assembly for the target platform, with appropriate global symbols for functions, etc. to be linked as a boot image.

VMKit already does that, thanks to LLVM code generators.

I was more concerned with whether this is what the llcj driver does.

Do you know if VMKit already has these features? If not, would it be fairly straightforward to add them in such a way as to also benefit VMKit (I don’t want to maintain a custom branch)?

I think if you decide to use VMKit, we can easily agree on how to share the same code base, with VMKit and Avian JVM each having their own lowering pass.

Nicolas

On the VMKit side, I’d be worried about the implications of adding extra steps in the compilation process, particularly if their only purpose is supporting another VM. Personally, if I were a VMKit developer, such a feature would be high on my list of things to remove, unless such a feature turned out to be useful in other ways. Is this a feature that could genuinely improve VMKit, or do you see this as something that would only end up being used with Avian?

It sounds as if the big sticking point is support for 64-bit linux. It wouldn’t be all that bad to require cross-compiling all AOT Avian builds from a linux machine (not worrying about building on windows), but I don’t think I could justify supporting only 32-bit linux as a build platform. 32 bit systems are on their way out.

Just to be clear, I do fully intend on using llvm - the only question is what I use to compile class files down to llvm IR (or directly to native object files).

Joshua

Sure, that’s one major strength of LLVM: we could decide on a runtime function (CallVirtualMethod) that will get lowered depending on the underlying VM. I don’t see any difficulties in accomplishing this.

Is it common practice to emit function calls that are expected to be lowered by a later pass? I know LLVM uses this kind of thing with intrinsics (llvm.gcroot, for instance), but a pass lowering calls to specific functions seems very… messy.

It is not lowering calls to specific functions but emitting a virtual call: we could decide to have a runtime call (i.e. one that will get lowered) to get the virtual table of the object and put the indirect call instruction in place (i.e. it will not get lowered). Or a single runtime call that will get lowered to both getting the virtual table and doing the indirect call.
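
The first option above (a replaceable "get the virtual table" step followed by a fixed indirect call) can be sketched in plain C++ by analogy. This is not LLVM pass code: the `PlainLowering` and `MaskedLowering` policies and the `callVirtual` template are hypothetical, and the two-bit mask is an assumption standing in for whatever a VM like Avian would actually need.

```cpp
#include <cassert>
#include <cstdint>

using Method = int (*)(void* self);

// Hypothetical object with a one-word header.
struct Object {
  std::uintptr_t header;
};

// Lowering for a VM whose header word is the vtable pointer itself.
struct PlainLowering {
  static Method* getVirtualTable(const Object* o) {
    return reinterpret_cast<Method*>(o->header);
  }
};

// Lowering for a VM that keeps flags in the low bits (assumed: two bits).
struct MaskedLowering {
  static Method* getVirtualTable(const Object* o) {
    return reinterpret_cast<Method*>(o->header & ~std::uintptr_t{0x3});
  }
};

// What the front end would emit: the table lookup is the part that gets
// lowered per VM, the indirect call through the slot stays the same.
template <class Lowering>
int callVirtual(Object* o, unsigned slot) {
  Method* table = Lowering::getVirtualTable(o);
  assert(table != nullptr);
  return table[slot](o);
}
```

In this analogy, choosing the template argument plays the role of choosing which lowering pass to run over the emitted IR.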

What about something like a --emit-unlowered-llvm option on llcj that just spits out the LLVM IR before running this lowering pass?

Yes, that’s what I had in mind. llcj is just a driver here for VMKit, the real executable is vmjc, to which you can give a number of passes, including your pass that will lower your virtual call.

LLVM is capable of generating dwarf/unwind tables at compile-time (and also in a JIT environment), so you should get access to these tables in a standard fashion. The only caveat is that you need to change VMKit and the way it compiles exception handlers and exception checks.

Could this be another case for emitting calls in the IR that are lowered by a later pass?

This is much more difficult, but we could try to have something like that. On the list of difficulties: where is the exception handler IR located? Who references it? How do we prevent dead code elimination from removing it?

Avian only needs a 1-word header - it uses the lower bits of the vtable pointer for GC and hashing.

Cool. And on a synchronized block, does it push the header onto the stack? Or is Avian not multi-threaded?

  • Able to output object files (preferably) or assembly for the target platform, with appropriate global symbols for functions, etc. to be linked as a boot image.

VMKit already does that, thanks to LLVM code generators.

I was more concerned with whether this is what the llcj driver does.

Yes, llcj can generate assembly files, dynamic libraries, object files, executables, etc.

On the VMKit side, I’d be worried about the implications of adding extra steps in the compilation process, particularly if their only purpose is supporting another VM.

VMKit is framework-driven, so we won’t mind extra steps in the compilation process, especially if it ends up demonstrating the modularity of VMKit. And I would love to be able to specify on the command line what kind of object header I want, what kind of exceptions, how to implement virtual calls and optimize them, etc. All of these would be implemented as LLVM passes.

Personally, if I were a VMKit developer, such a feature would be high on my list of things to remove, unless such a feature turned out to be useful in other ways.

Bear in mind that VMKit is research- and framework-driven. For run-time execution of VMKit, performance is always important, so we have to be careful (but adding LLVM passes does not cost anything). For ahead of time compilation, we should aim at having something as generic as possible.

Is this a feature that could genuinely improve VMKit, or do you see this as something that would only end up being used with Avian?

It will definitely improve VMKit. It may end up only being used with Avian, but at least it opens more opportunities.

It sounds as if the big sticking point is support for 64-bit linux. It wouldn’t be all that bad to require cross-compiling all AOT Avian builds from a linux machine (not worrying about building on windows), but I don’t think I could justify supporting only 32-bit linux as a build platform. 32 bit systems are on their way out.

64-bit linux is close to being fully supported. There is little code in VMKit (except the GC) that is arch-dependent.

Just to be clear, I do fully intend on using llvm - the only question is what I use to compile class files down to llvm IR (or directly to native object files).

Using the vmjc tool in VMKit and LLVM lowering passes seems like a nice option to me.

Nicolas

Hi Minas,

I tried recompiling Classpath with -fno-omit-frame-pointer, and now, instead
of printing an error message, j3 just segfaults in
"j3::JnjvmClassLoader::loadClassFromAsciiz(char const*, bool, bool) ()"

Could you please run it under gdb like
$ gdb --args ./j3 HelloWorld
and post the backtrace from the crash here?

I ran llcj under strace and found that it is not even opening the input or
output files, but is otherwise running normally.

Let me make sure I understand:
Have you gone through these steps:
http://vmkit.llvm.org/use_aot.html - Java Ahead of Time (AOT) Compilation ?

If so, how did the process of compiling glibj.zip into native code end?
I couldn't get through it on my 64-bit machine, so I'd be interested to hear if you get there.

Updating to the latest SVN version (revision 108831) didn't change anything
(I was only a few days out of date).

I'm not sure where to go from here. Does this fit with any of the known
problems under 64-bit linux?

Not known yet :)

Thanks,
Minas

Hi Minas,

I tried recompiling Classpath with -fno-omit-frame-pointer, and now, instead
of printing an error message, j3 just segfaults in
“j3::JnjvmClassLoader::loadClassFromAsciiz(char const*, bool, bool) ()”

Could you please run it under gdb like
$ gdb --args ./j3 HelloWorld
and post the backtrace from the crash here?

Below is the output of the gdb session:

$ gdb --args ./j3 HelloWorld
GNU gdb (GDB) 7.1-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type “show copying”
and “show warranty” for details.
This GDB was configured as “x86_64-linux-gnu”.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>…
Reading symbols from /home/jowarner/code/vmkit/Release+Asserts/bin/j3…(no debugging symbols found)…done.
(gdb) r
Starting program: /home/jowarner/code/vmkit/Release+Asserts/bin/j3 HelloWorld
[Thread debugging using libthread_db enabled]
[New Thread 0x1200ff700 (LWP 16322)]
[New Thread 0x1201ff700 (LWP 16323)]
[New Thread 0x1202ff700 (LWP 16324)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x1200ff700 (LWP 16322)]
0x000000000052d725 in j3::JnjvmClassLoader::loadClassFromAsciiz(char const*, bool, bool) ()
(gdb) bt
#0 0x000000000052d725 in j3::JnjvmClassLoader::loadClassFromAsciiz(char const*, bool, bool) ()
#1 0x0000000000558633 in FindClass(_Jv_JNIEnv*, char const*) ()
#2 0x00007ffff60d531d in JNI_OnLoad () from /usr/lib/classpath/libjavanio.so
#3 0x000000000055d558 in callOnLoad ()
#4 0x0000000000569630 in Java_java_lang_VMRuntime_nativeLoad ()
#5 0x00007ffff7f07491 in ?? ()
#6 0x0000000000000000 in ?? ()
(gdb) list
1 events.c: No such file or directory.
in events.c
(gdb) f 1
#1 0x0000000000558633 in FindClass(_Jv_JNIEnv*, char const*) ()
(gdb) list
1 in events.c
(gdb)

I ran llcj under strace and found that it is not even opening the input or
output files, but is otherwise running normally.

Let me make sure I understand:
Have you gone through these steps:
http://vmkit.llvm.org/use_aot.html - Java Ahead of Time (AOT) Compilation ?

I followed those instructions - no different result.

If so, how did the process of compiling glibj.zip into native code end?
I couldn’t get through it on my 64-bit machine, so I’d be interested to hear if you get there.

I’m not sure what you mean. There were no errors in the build, and both llcj and vmjc seem otherwise functional (they both run and print usage information just fine).

Hi Nicolas,

I had a discussion with the lead developer behind Avian, and we decided that for the sake of simplicity, we would use a custom-written translator based on the Java ASM library to do the class-to-llvm translation.

Perhaps in the future we could reconsider using VMKit - but for now, it seems just a bit too involved.

Thanks for all your help!

Regards,
Joshua