cross compiling with the C backend

For my master's thesis, I am trying to cross compile programs for the
PSP (PlayStation Portable) with LLVM and llvm-gcc.

This is what I do:

(1) compile a program and the libraries it uses (libpng etc.) with llvm-gcc
(2) link the bitcode files with llvm-ld into one file
(3) run "llc -march=c" on the result
(4) compile the resulting C source with the PSP toolchain

It seems to work with a very simple program (Pi_Calc, calculates Pi
and prints it), but it fails when I try it with another program that
is a bit more advanced: the PSP hangs when I run the program.

Am I using the right approach here? I have compiled llvmgcc-4.2-2.2 on
my machine as a native compiler (i686-pc-linux-gnu). Does llvm-gcc
have an impact on the portability of the resulting LLVM bitcode? And
is the C code generated by the C backend really portable? I use the
"-nostdinc" option for llvm-gcc and specify the include paths of the
PSP SDK instead. But I still get warnings with llvm-gcc that do not
appear with psp-gcc ("passing argument 2 of 'xyz' discards qualifiers
from pointer target type"). And the output of the C back end generates
lots of warnings as well, or is that expected?

What could solve this problem? Do I have to configure llvm-gcc as a
cross compiler? MIPS might be a better candidate than the 'i686' from
the native configuration. Or could it be simply a bug in llvm-gcc or
LLVM?

Another option could be to add real support for the PSP to LLVM,
though I'm not sure if I'll be able to accomplish that.

Regards,

-Kevin André

> What could solve this problem? Do I have to configure llvm-gcc as a
> cross compiler? MIPS might be a better candidate than the 'i686' from
> the native configuration. Or could it be simply a bug in llvm-gcc or
> LLVM?

Btw, cross-compiling llvm-gcc for MIPS won't work, since the support in
llvm-gcc isn't ready yet!

> Another option could be to add real support for the PSP to LLVM,
> though I'm not sure if I'll be able to accomplish that.

That would be nice, any improvements to the Mips backend are welcome! =)

Kevin André wrote:

> For my master's thesis, I am trying to cross compile programs for the
> PSP (PlayStation Portable) with LLVM and llvm-gcc.
>
> This is what I do:
>
> (1) compile a program and the libraries it uses (libpng etc.) with llvm-gcc
> (2) link the bitcode files with llvm-ld into one file
> (3) run "llc -march=c" on the result
> (4) compile the resulting C source with the PSP toolchain
>
> It seems to work with a very simple program (Pi_Calc, calculates Pi
> and prints it), but it fails when I try it with another program that
> is a bit more advanced: the PSP hangs when I run the program.
>
> Am I using the right approach here? I have compiled llvmgcc-4.2-2.2 on
> my machine as a native compiler (i686-pc-linux-gnu). Does llvm-gcc
> have an impact on the portability of the resulting LLVM bitcode? And
> is the C code generated by the C backend really portable? I use the
> "-nostdinc" option for llvm-gcc and specify the include paths of the
> PSP SDK instead. But I still get warnings with llvm-gcc that do not
> appear with psp-gcc ("passing argument 2 of 'xyz' discards qualifiers
> from pointer target type"). And the output of the C back end generates
> lots of warnings as well, or is that expected?

No, it's not portable like that. Here's what goes on.

C is portable in the sense that it gives you enough tools to inspect your environment. So an 'int' might be any number of chars (which themselves may be any number of bits >= 8), but that's okay because C provides 'sizeof(int)'. If you have code like "char *x = malloc(sizeof(int));" you're going to get different LLVM bitcode depending on what platform your llvm-gcc is set to, because the front end replaces 'sizeof(int)' with a constant for that platform.
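To make the sizeof point concrete, here is a minimal sketch. The names INT_BYTES and alloc_int_storage are made up for illustration. It only shows that sizeof is a compile-time constant: the enumerator would not compile otherwise, so the front end has to commit to a number before any LLVM backend gets involved.

```c
#include <stdlib.h>

/* sizeof yields an integer constant expression, so it is legal as an
 * enumerator value. The front end has to pick the number while compiling. */
enum { INT_BYTES = sizeof(int) };

/* Same allocation as in the text above. By the time this reaches the
 * optimizer and backends, the argument is already a literal chosen for
 * whatever target the front end was configured for. */
char *alloc_int_storage(void)
{
    char *x = malloc(sizeof(int));
    return x;
}
```

If the front end's target and the final target disagree about such a constant, nothing later in the pipeline can notice or repair it.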

LLVM is portable in the sense that the bitcode will behave the same way on every platform. So if the above code becomes "%x_addr = malloc i32" then you get a 32-bit integer regardless of the abilities of the underlying system.

Finally, the C backend's output isn't portable in the sense that it uses GCC extensions to get the correct output. Which is fine so long as you're compiling its output with GCC.

Building llvm-gcc as a cross-compiler would help, if there's a platform you can select that matches the PSP more closely. Besides that, the warning you gave as an example is probably because llvm-gcc 4.2 is a newer version of gcc than psp-gcc.

Nick

> No, it's not portable like that. Here's what goes on.
>
> C is portable in the sense that it gives you enough tools to inspect
> your environment. So an 'int' might be any number of chars (which
> themselves may be any number of bits >= 8), but that's okay because C
> provides 'sizeof(int)'. If you have code like "char *x =
> malloc(sizeof(int));" you're going to get different LLVM bitcode
> depending on what platform your llvm-gcc is set to.
>
> LLVM is portable in the sense that the bitcode will behave the same way
> on every platform. So if the above code becomes "%x_addr = malloc i32"
> then you get a 32-bit integer regardless of the abilities of the
> underlying system.

This is what I thought as well, but I remember reading something that
said you could use the C backend for architectures that do not have a
'real' code generator for LLVM yet.

> Finally, the C backend's output isn't portable in the sense that it uses
> GCC extensions to get the correct output. Which is fine so long as
> you're compiling its output with GCC.

... which is the case here.

> Building llvm-gcc as a cross-compiler would help, if there's a platform
> you can select that matches the PSP more closely.

I did some testing and it seems that my native gcc already is similar
enough. I compiled the following statements:

  /* needs <stdio.h>, <limits.h> and the header that declares htonl */
  /* sizeof yields a size_t, so cast it to match the %i conversion */
  printf(" sizeof(char) = %i\n", (int)sizeof(char));
  printf(" sizeof(char*) = %i\n", (int)sizeof(char*));
  printf(" sizeof(void*) = %i\n", (int)sizeof(void*));
  printf(" sizeof(int) = %i\n", (int)sizeof(int));
  printf(" sizeof(unsigned) = %i\n", (int)sizeof(unsigned));
  printf(" sizeof(short) = %i\n", (int)sizeof(short));
  printf(" sizeof(float) = %i\n", (int)sizeof(float));
  printf(" sizeof(double) = %i\n", (int)sizeof(double));
  printf(" endianness = %s\n", htonl(123) == 123 ? "big" : "little");
  printf(" CHAR_BIT = %i\n", CHAR_BIT);
  printf(" CHAR_MIN = %i\n", CHAR_MIN);
  printf(" CHAR_MAX = %i\n", CHAR_MAX);
  printf(" INT_MIN = %i\n", INT_MIN);
  printf(" INT_MAX = %i\n", INT_MAX);

with both psp-gcc and my gcc and they print exactly the same result:

      sizeof(char) = 1
     sizeof(char*) = 4
     sizeof(void*) = 4
       sizeof(int) = 4
  sizeof(unsigned) = 4
     sizeof(short) = 2
     sizeof(float) = 4
    sizeof(double) = 8
        endianness = little
          CHAR_BIT = 8
          CHAR_MIN = -128
          CHAR_MAX = 127
           INT_MIN = -2147483648
           INT_MAX = 2147483647

This explains why the Pi_Calc program does work when compiled with the
build process I described in my original message. So the difference is
more subtle; maybe a difference in the layout of structs or something.
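A follow-up probe for the struct layout suspicion could look like the sketch below. The struct and function names are hypothetical, and nothing in this thread confirms that layout is the actual cause. It prints the facts the test above does not cover: wider integer types, alignment inside structs, and the size of jmp_buf, which is defined by each platform's C library. On i686 Linux, double members are normally aligned to 4 bytes inside structs, which many other ABIs do not share, so that is one value worth comparing.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical struct mixing member sizes so that padding becomes visible. */
struct probe {
    char      c;
    double    d;
    short     s;
    long long ll;
};

/* The offset of 'member' after a single char equals its alignment in a struct. */
struct align_double    { char pad; double    member; };
struct align_long_long { char pad; long long member; };

/* Print layout facts; the casts to unsigned long match the %lu conversion. */
void print_layout_probe(void)
{
    printf("sizeof(long)         = %lu\n", (unsigned long)sizeof(long));
    printf("sizeof(long long)    = %lu\n", (unsigned long)sizeof(long long));
    printf("sizeof(long double)  = %lu\n", (unsigned long)sizeof(long double));
    printf("sizeof(jmp_buf)      = %lu\n", (unsigned long)sizeof(jmp_buf));
    printf("align of double      = %lu\n",
           (unsigned long)offsetof(struct align_double, member));
    printf("align of long long   = %lu\n",
           (unsigned long)offsetof(struct align_long_long, member));
    printf("sizeof(struct probe) = %lu\n", (unsigned long)sizeof(struct probe));
    printf("offsetof(probe, d)   = %lu\n",
           (unsigned long)offsetof(struct probe, d));
    printf("offsetof(probe, s)   = %lu\n",
           (unsigned long)offsetof(struct probe, s));
    printf("offsetof(probe, ll)  = %lu\n",
           (unsigned long)offsetof(struct probe, ll));
}
```

Call print_layout_probe() from the program's entry point, build it once with the native gcc and once with psp-gcc, and compare the two outputs line by line. Any line that differs marks a value that llvm-gcc bakes into the bitcode using the host's answer.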

> Besides that, the
> warning you gave as an example is probably because llvm-gcc 4.2 is a
> newer version of gcc than psp-gcc.

Yup, my psp-gcc is still 4.1.0. And I get these kinds of warnings when
compiling the output of the C backend with psp-gcc:

llvmoutput.c:734: warning: conflicting types for built-in function 'malloc'
llvmoutput.c:1332: warning: return type of 'main' is not 'int'
llvmoutput.c: In function 'loadImage':
llvmoutput.c:2939: warning: pointer targets in passing argument 1 of 'setjmp' differ in signedness
(...)
llvmoutput.c: In function 'png_default_error':
llvmoutput.c:17976: warning: pointer targets in passing argument 1 of 'longjmp' differ in signedness
llvmoutput.c: In function 'png_crc_finish':
llvmoutput.c:18722: warning: 'llvm_cbe_i9_0_reg2mem_1__PHI_TEMPORARY' may be used uninitialized in this function
(...)
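The setjmp/longjmp warnings suggest one thing that can be tested in isolation: the jump buffer is declared by the C library headers of whatever platform the front end sees, so a buffer laid out for one platform and used by another platform's setjmp could misbehave. Whether that is what hangs the PSP is not established here. The sketch below is a small smoke test with made-up names (error_env, fail_with, run_jump_test) that mimics libpng-style error handling and can be pushed through the same four build steps.

```c
#include <setjmp.h>

static jmp_buf error_env;        /* storage whose layout is platform-specific */
static volatile int last_error;  /* carries the error code across the jump */

/* Stand-in for a library error handler: record the code and jump back. */
static void fail_with(int code)
{
    last_error = code;
    longjmp(error_env, 1);
}

/* Returns the code delivered through longjmp, or -1 if the jump failed. */
int run_jump_test(int code)
{
    last_error = 0;
    if (setjmp(error_env) != 0) {
        return last_error;  /* arrived here via longjmp */
    }
    fail_with(code);
    return -1;  /* only reached if longjmp returned, which it must not */
}
```

If run_jump_test(7) returns 7 when built natively but hangs or returns something else after the llvm-gcc, llvm-ld, llc and psp-gcc round trip, the jump buffer is a likely culprit. If it passes, the problem is probably elsewhere.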

Maybe I should pass "-nostdlib" to llvm-gcc as well. I'll also check
whether using "-O1" instead of "-O5" makes a difference.

Thanks,

Kevin André

> What could solve this problem? Do I have to configure llvm-gcc as a
> cross compiler? MIPS might be a better candidate than the 'i686' from
> the native configuration. Or could it be simply a bug in llvm-gcc or
> LLVM?

> Btw, cross-compiling llvm-gcc for MIPS won't work, since the support in
> llvm-gcc isn't ready yet!

What is still missing then? I am going to have to modify llvm-gcc as I
don't have other options. Could you tell me where to start? And can I
use llvm-gcc from the 2.2 release or should I use a more recent (svn)
version?

> Another option could be to add real support for the PSP to LLVM,
> though I'm not sure if I'll be able to accomplish that.

> That would be nice, any improvements to the Mips backend are welcome! =)

I would be happy to help. The only MIPS machine I have is the PSP though.

Regards,
Kevin André