Convert C++ to C. What is 0x0p+0 ?

Hi:

I'm interested in using llvm to convert C++ code to C code.
I used the following command to do this:

% llvm-g++ -c foo.cpp -o - | llc -march=c -o foo.cbe.c

In the resulting file foo.cbe.c there are many occurrences of '0x0p+0'.
What is it used for? Here's a code snippet from the file foo.cbe.c

  if ((ltmp_126_2 > 0x0p+0)) {
    goto ltmp_363_19;
  } else {
    goto ltmp_364_19;
  }

llvm-gcc is able to compile foo.cbe.c, but I need to use another C
compiler which gives a syntax error message for not recognizing
the expression '0x0p+0'.

Thank you for your assistance.

Napi

Hi Napi,

> Hi:
>
> I'm interested in using llvm to convert C++ code to C code.
> I used the following command to do this:
>
> % llvm-g++ -c foo.cpp -o - | llc -march=c -o foo.cbe.c

Yup, that'll do it. Although you might want to do a little optimization
otherwise you're going to get a lot of C code on output. Try passing -O2
to llvm-g++.

> In the resulting file foo.cbe.c there are many occurrences of '0x0p+0'.
> What is it used for? Here's a code snippet from the file foo.cbe.c
>
>   if ((ltmp_126_2 > 0x0p+0)) {
>     goto ltmp_363_19;
>   } else {
>     goto ltmp_364_19;
>   }
>
> llvm-gcc is able to compile foo.cbe.c, but I need to use another C
> compiler which gives a syntax error message for not recognizing
> the expression '0x0p+0'.

Get a new C compiler :-)

The syntax in question is a C99 feature: a hexadecimal floating-point
literal. The C Backend prints it with the %a conversion specifier for
printf. Because the digits are hexadecimal, the literal spells out the
binary value exactly, whereas a short decimal literal often cannot. The
C Backend needs this to ensure that the floating-point value it has in
mind is *exactly* preserved through the conversion to C source and back
again by your C compiler. In this case 0x0p+0 is simply 0.0.
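
To make the exactness point concrete, here is a minimal C99 sketch. It is not CBE code, and the helper names survives_hex and survives_fixed are made up for illustration. Each helper prints a double as text, parses the text back with strtod, and reports whether the original value came back unchanged.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: print v with "%a" (C99 hex float), parse it back,
   and report whether the value survived unchanged. */
int survives_hex(double v)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%a", v);
    return strtod(buf, NULL) == v;
}

/* Same round trip through "%f", which keeps only six decimal places. */
int survives_fixed(double v)
{
    char buf[512];
    snprintf(buf, sizeof buf, "%f", v);
    return strtod(buf, NULL) == v;
}
```

For a value such as 1.0 / 3.0, survives_hex returns 1 while survives_fixed returns 0, because the text "0.333333" parses to a different double.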

> Thank you for your assistance.

Welcome to LLVM.

Reid.

Hi Reid:

Thank you for your email. I need to use this C compiler that only
supports ANSI C 1989. What is the equivalent of '0x0p+0' in C89 ?
Is there any way around this?

Thanks.

Napi

$ cat t.cpp
#include <iostream>

int main(int argc, char** argv)
{
   std::cout << "0x0p+0 == " << 0x0p+0 << "\n";
}
$ ./t
0x0p+0 == 0

I suppose you could always hack the CBE to have it produce traditional floating point numbers (like 0.0 or whatever) using "%f" instead of "%a". However, you might have problems with precision during comparisons. I.e., if you have something like "if (a == 37.927)", it may not work, because "%f" keeps only six decimal places and the constant read back by the C compiler may differ from the original value.
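
If hex literals are off the table, a decimal literal can still round-trip provided it carries enough digits: 17 significant digits are sufficient for an IEEE 754 double. The sketch below assumes IEEE doubles and a correctly rounding C library on the machine that generates the C source. The name format_c89_literal is made up, and this is not how the CBE itself does it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write v into buf as a decimal literal with 17
   significant digits, which a C89 compiler can parse. Returns the value
   read back from the text so the caller can compare it with v.
   Infinities and NaNs are left as printed; they have no C89 literal. */
double format_c89_literal(double v, char *buf, size_t size)
{
    snprintf(buf, size, "%.17g", v);
    /* "%.17g" prints 37.0 as "37"; add ".0" so the text stays a double. */
    if (strpbrk(buf, ".eEn") == NULL && strlen(buf) + 3 <= size)
        strcat(buf, ".0");
    return strtod(buf, NULL);
}
```

Called with 37.927 and a 64-byte buffer, the returned value compares equal to 37.927, so a comparison like "if (a == 37.927)" would keep its meaning.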

-bw

The "hack" has already been implemented. As I mentioned in my last
email, all you need to do is configure LLVM with --disable-cbe-printf-a
and it will avoid using the %a conversion token and instead use ftostr.

Reid.

Hi:

I've been able to compile the attached "helloworld.c" file converted
from "helloworld.cpp".

My question is how does one usually use __main() and CODE_FOR_MAIN()
in tying up with the rest of the code?

Attached here are the original "helloworld.cpp" and "helloworld.c"
files.

Thanks.

Napi


Hi Napi,

> Hi:
>
> I've been able to compile the attached "helloworld.c" file converted
> from "helloworld.cpp".

Great.

> My question is how does one usually use __main() and CODE_FOR_MAIN()
> in tying up with the rest of the code?

I'm not quite sure what you're asking. CODE_FOR_MAIN is defined in your
helloworld.c file as:

#define CODE_FOR_MAIN() /* Any target-specific code for main()*/
#if defined(__GNUC__) && !defined(__llvm__)
#if defined(i386) || defined(__i386__) || defined(__i386) || defined(__x86_64__)
#undef CODE_FOR_MAIN
#define CODE_FOR_MAIN() \
  {short F;__asm__ ("fnstcw %0" : "=m" (*&F)); \
  F=(F&~0x300)|0x200;__asm__("fldcw %0"::"m"(*&F));}
#endif
#endif

As noted in the comment, this is for target-specific code needed at the
start of main. It looks like your target needs a few assembly
instructions there.
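
For context on what those instructions do: fnstcw stores the x87 FPU control word, fldcw loads it back, and the C statement in between rewrites bits 8 and 9, the precision-control field, to select 53-bit (double) precision. The bit arithmetic can be sketched in plain C without inline assembly. The name set_double_precision is made up, and 0x037F (the usual x87 default control word) is only sample input.

```c
/* Hypothetical helper mirroring F=(F&~0x300)|0x200 from CODE_FOR_MAIN(). */
unsigned short set_double_precision(unsigned short cw)
{
    /* Clear the precision-control field (bits 8-9), then set it to binary 10. */
    return (unsigned short)((cw & ~0x300) | 0x200);
}
```

Passing 0x037F yields 0x027F. Note that the #if guard in the generated file only enables the assembly under GCC on x86; with any other compiler CODE_FOR_MAIN() stays empty.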

As for the __main function, it's a gcc library call required by the
compiler for program startup. The details vary but the call is needed.
Amongst other things it will probably run your C++ static
constructors.

Reid.

BTW,

Emil Mikulic suggested that this be implemented with a command line
switch so that whenever the CBE is invoked you can tell it to avoid
printf-a. That's a pretty good idea and I'll probably flip the
implementation of this to use a command line switch, but I'm not sure
when I'll get to that. Patches welcome.

Reid.

The function _Z4CONTv() in helloworld.c never gets called from main().
This function is supposed to be the C version of CONT() in
helloworld.cpp. Where should I insert the code in helloworld.c to call
_Z4CONTv()?

Thanks.

Napi

...

> As for the __main function, it's a gcc library call required by the
> compiler for program startup. The details vary but the call is needed.
> Amongst other things it will probably initialize your C++ static
> constructors.

Hi Reid:

I'm not using gcc for this purpose but another C compiler called AMPC.
It compiles C code into Java Bytecode. What I'm missing is the C++ to
JVM portion which I'm trying to use LLVM for converting C++ to C then
pass it through AMPC to get the Java Bytecode.

One question: will the C code produced by llc still call C++
functions/methods? It would be good if only C library functions
are called, since I already have the standard C library compiled by AMPC
in the bytecode format.

Thanks.

Napi

Hi Napi,

...
> As for the __main function, it's a gcc library call required by the
> compiler for program startup. The details vary but the call is needed.
> Amongst other things it will probably initialize your C++ static
> constructors.

> Hi Reid:
>
> I'm not using gcc for this purpose but another C compiler called AMPC.
> It compiles C code into Java Bytecode. What I'm missing is the C++ to
> JVM portion which I'm trying to use LLVM for converting C++ to C then
> pass it through AMPC to get the Java Bytecode.

Okay, cool :-)

> One question: will the C code produced by llc still call C++
> functions/methods?

Yes.

> It would be good if only C library functions
> are called, since I already have the standard C library compiled by AMPC
> in the bytecode format.

If you use the stdc++ library in your source code then you'll need to
compile that with AMPC as well (after siphoning it through llvm).

> Thanks.

You're welcome.

Reid.

Hi Reid:

I hate to be a pain. But when I ran the program helloworld.class
after compiling it with AMPC I got the following message:

Exception in thread "main" java.lang.NoClassDefFoundError:
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
        at helloworld._Z4CONTv(Unknown Source)
        at helloworld.__main(Unknown Source)
        at helloworld._$C_main(Unknown Source)
        at helloworld._$pre_C_main(Unknown Source)
        at helloworld.main(Unknown Source)

The function _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc()
was generated by llc after translating it from the C++ statement
std::cout.

How do you suggest I handle this sort of thing properly?
The appropriate files are attached.

Thanks.

Napi


After converting a piece of C++ code to C one of the functions that are
generated is this:
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc

Where is it defined and where can I find the source for it? I need the
source to compile it with a C compiler (AMPC) that will convert the C
code to Java Bytecode. If the above function is in C++ then I need to
convert it to C first.

Here's the code segment that uses the function:

int main(void) {
  struct l_struct_2E_std_3A__3A_basic_ostream_3C_char_2C_std_3A__3A_char_traits_3C_char_3E__20__3E_ *ltmp_2_2;

  CODE_FOR_MAIN();
  /*tail*/ __main();
  ltmp_2_2 = /*tail*/ _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc((&_ZSt4cout), (&(_2E_str_1[0])));
  return 0;
}

Thanks for any tips.

Napi

Mohd-Hanafiah Abdullah wrote:

> After converting a piece of C++ code to C one of the functions that are
> generated is this:
> _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc

This is a method/function from the standard C++ library. You can link it in at the bytecode level with:

llvm-g++ -o output.bc <yourfile.bc> -lstdc++

You might also be able to do:

llvm-g++ -o output.bc <yourfile.bc> -lsupc++

... if you're only doing minimal C++ work.

One caveat: you will still have references to external C library functions (fopen(), open(), etc) that will not exist in the C output from the llc command. Can your C to Java Bytecode compiler handle calls to these functions?

-- John T.

> After converting a piece of C++ code to C one of the functions that are
> generated is this:
> _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc

This is defined in the C++ standard library. You can get the demangled name like so:
$ c++filt _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)

> Where is it defined and where can I find the source for it? I need the
> source to compile it with a C compiler (AMPC) that will convert the C
> code to Java Bytecode. If the above function is in C++ then I need to
> convert it to C first.

It comes with llvm-gcc4 in libstdc++. You'll need to compile it to bytecode by modifying the makefile though.

-Chris

Yes, AMPC handles calls to these functions. It supports ANSI C 1989.
The only functions in the standard C library that are not supported yet
are raise(), signal(), longjmp(), and setjmp(). Others work fine.

The purpose of my using LLVM is to convert C++ to C, then compile it
using AMPC and link it with the standard C library that has already been
compiled for the JVM also with AMPC. So, now I guess I will have to
also compile the C++ standard library to the JVM in order to support
functions like:
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc().

Cheers.

Napi

Thanks. Do I need to deal with the long names like
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc() or is there an
option to use the shorter version?

Cheers.

Napi

Mohd-Hanafiah Abdullah wrote:

>>> After converting a piece of C++ code to C one of the functions that are
>>> generated is this:
>>> _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
>>
>> This is defined in the C++ standard library. You can get the demangled
>> name like so:
>> $ c++filt _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
>> std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
>>
>>> Where is it defined and where can I find the source for it? I need the
>>> source to compile it with a C compiler (AMPC) that will convert the C
>>> code to Java Bytecode. If the above function is in C++ then I need to
>>> convert it to C first.
>>
>> It comes with llvm-gcc4 in libstdc++. You'll need to compile it to
>> bytecode by modifying the makefile though.
>
> Thanks. Do I need to deal with the long names like
> _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc() or is there an
> option to use the shorter version?

I don't think you will need to deal with any names. The C++ standard
library has already been compiled to LLVM bytecode (it is part of the
llvm-gcc/llvm-g++ distribution). If you use "llvm-g++ -lstdc++" it
should link in whatever libstdc++ functions are needed by your program;
they will get translated to C code along with the rest of your program
when you use llc.

-- John T.

Note that that only works with llvm-gcc3. With llvm-gcc4 you need to compile libstdc++ to bytecode explicitly.

-Chris

Could I use llvm-gcc3 with LLVM version 1.8?

Thanks.

Napi