Patch: Make ccc generate native object files

Here's a patch that makes ccc generate native object files instead of bitcode files. This makes it work better with the native toolchains, which should make testing random apps easier. I'll commit this unless anyone has a good reason not to or finds any bugs in the code :)

Anders

[Attachment: textmate stdin KgqJts.txt (3.09 KB)]

Anders Carlsson wrote:

Here's a patch that makes ccc generate native object files instead of
bitcode files. This makes it work better with the native toolchains,
which should make testing random apps easier. I'll commit this unless
anyone has a good reason not to or finds any bugs in the code :)

Actually, generating bitcode is very useful for me, and I'd like to keep
that possibility. Maybe there could be a flag to ccc to tell it to
generate bitcode or native (and default to native)?
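As a rough illustration of the flag idea, here is a minimal Python sketch of how a driver could pick its compile steps. The function name `compile_steps` and the `emit_bitcode` switch are hypothetical, and the command lines use present-day `clang`, `llc`, and `as` spellings as placeholders; none of this is taken from the actual patch.

```python
def compile_steps(source, output, emit_bitcode=False):
    """Return the list of command lines (argv lists) for one source file.

    Hypothetical helper: native output is the default, bitcode is opt-in.
    """
    if emit_bitcode:
        # Stop after the frontend and keep the LLVM bitcode as the output.
        return [["clang", "-emit-llvm", "-c", source, "-o", output]]

    # Native path: frontend -> bitcode -> assembly -> object file.
    bc = output + ".bc"   # intermediate bitcode file
    asm = output + ".s"   # intermediate assembly file
    return [
        ["clang", "-emit-llvm", "-c", source, "-o", bc],
        ["llc", bc, "-o", asm],
        ["as", asm, "-o", output],
    ]
```

Calling `compile_steps("a.c", "a.o")` yields the three-step native pipeline, while passing `emit_bitcode=True` yields a single frontend step, so replacing the frontend command with something else would still work in either mode.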

For example I can use 'ccc' to call 'llvm-gcc -O4' to generate bitcode
and 'llvm-ld -native' to do linking and LTO.
I am unable to do this without 'ccc', because my system linker doesn't
recognize the bitcode files generated by -O4.

With ccc it was simply a matter of replacing clang with 'llvm-gcc -O4 -c' ;)

Also I think keeping the possibility to generate bitcode is useful for
clang too, if we want LTO.

Thanks,
--Edwin

I'm not sure about the "perform some necessary optimizations" part of the patch. It seems somewhat premature to overload clang testing with optimization of the code it produces. Can we have -O{0,1,2,3,4,s} map to the corresponding optimization levels so it's more user-configurable, and people can use -O0 if they want to keep "opt" from running at all?
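To make the -O mapping concrete, here is a small Python sketch. The table `OPT_FLAGS` and the function `opt_command` are hypothetical names, the sketch assumes `opt` accepts `-O1`, `-O2`, `-O3`, and `-Os` (as current versions do), and treating `-O4` as `-O3` at this stage is my own assumption, on the theory that LTO would be handled at link time.

```python
# Driver flag -> flag passed to opt. None means "do not run opt at all".
# Mapping -O4 to -O3 here is an assumption, not something the patch does.
OPT_FLAGS = {
    "-O0": None,
    "-O1": "-O1",
    "-O2": "-O2",
    "-O3": "-O3",
    "-O4": "-O3",
    "-Os": "-Os",
}

def opt_command(driver_args, in_bc, out_bc, default="-O0"):
    """Return the opt command line for the given driver args, or None."""
    level = default
    for arg in driver_args:
        if arg in OPT_FLAGS:
            level = arg  # the last -O flag wins, as with gcc
    flag = OPT_FLAGS[level]
    if flag is None:
        return None  # -O0: skip the opt stage entirely
    return ["opt", flag, in_bc, "-o", out_bc]
```

With this shape, `-O0` skips the optimizer entirely and everything else is passed through, so frontend testing is not tied to whatever the optimizer does.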

One approach that would probably make everyone happy is to teach "llc" how to output the original bitcode stream in its own section/segment in the native assembly stream, and teach the LLVM libraries how to extract this encapsulated bitcode data for Mach-O, ELF, etc., in the event that the input file is not a bitcode file directly. Maybe I'm not the first person to think of this and LLVM supports it already (in which case awesome!). At the expense of the added native object file size, users could run either their native "nm" or "llvm-nm", for example, and get useful information. They could either use their native linker/compiler driver to link the object files, or use llvm-ld.
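To show roughly what the encapsulation could look like on the assembly side, here is a Python sketch that renders a bitcode buffer as assembler directives. The function name `bitcode_section_asm` is hypothetical, the Mach-O section name `__LLVM,__llvm` is just a suggested placeholder, and the ELF section name `.llvm` is purely an assumption for illustration; this is not something llc does.

```python
def bitcode_section_asm(bitcode: bytes, macho: bool = True, per_line: int = 12) -> str:
    """Render raw bitcode bytes as a dedicated section of .byte directives."""
    if macho:
        # Mach-O assembler syntax: segment name, then section name.
        lines = ["\t.section\t__LLVM,__llvm"]
    else:
        # ELF (GNU as) syntax; the ".llvm" name is a hypothetical choice.
        lines = ['\t.section\t.llvm,"",@progbits']
    for i in range(0, len(bitcode), per_line):
        chunk = bitcode[i:i + per_line]
        lines.append("\t.byte\t" + ", ".join(f"0x{b:02x}" for b in chunk))
    return "\n".join(lines) + "\n"
```

The returned text would be appended to the native assembly stream, so the assembler carries the bitcode along into the object file without interpreting it.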

People testing frontend features only would probably use "gcc" or "ccc" (which would invoke gcc) to link the hybrid object files against other native objects and native libraries. People wanting to try LTO could use "llvm-ld" or "ccc -O4" (which would invoke llvm-ld) to link the hybrid objects (and mixing in native objects would probably result in undefined symbols).

Shantonu

It's pretty easy to get the .bc file encoded into the .s file in some magic section. However, it would be harder to teach all the LLVM tools about native .o files, etc.

-Chris

I'm not suggesting they do.

I'm suggesting that llvm/lib/Bitcode/BitcodeReader.cpp know that if the first few bytes don't look like a bitcode file, it should try to analyze it as a Mach-O file and read enough of the load commands to find a __LLVM,__llvm section, and then create a view into the memory buffer of just that data. Same for ELF.

All llvm tools layered on top of BitcodeReader would look at the LLVM data for these hybrid objects. Native tools would look at the native parts. The native linker would aggregate the __LLVM,__llvm section into some useless blob, probably, but that's OK for these purposes.
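Here is a rough Python sketch of the lookup described above, just to pin down the idea; the real reader is C++, and the function name `find_bitcode` is hypothetical. It assumes a little-endian 64-bit Mach-O file and the `__LLVM,__llvm` section name, and it leaves out ELF, 32-bit and fat Mach-O files, and any handling of truncated input (`struct` would raise an error there).

```python
import struct

BITCODE_MAGIC = b"BC\xc0\xde"   # first four bytes of a raw bitcode file
MH_MAGIC_64 = 0xFEEDFACF        # 64-bit Mach-O header magic
LC_SEGMENT_64 = 0x19            # load command describing a 64-bit segment

def find_bitcode(buf: bytes, segname=b"__LLVM", sectname=b"__llvm"):
    """Return (offset, size) of the bitcode inside buf, or None if not found."""
    if buf[:4] == BITCODE_MAGIC:
        return (0, len(buf))    # already a plain bitcode file
    if len(buf) < 32 or struct.unpack_from("<I", buf, 0)[0] != MH_MAGIC_64:
        return None             # neither bitcode nor a format handled here
    ncmds = struct.unpack_from("<I", buf, 16)[0]
    off = 32                    # size of the 64-bit Mach-O header
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", buf, off)
        if cmd == LC_SEGMENT_64:
            nsects = struct.unpack_from("<I", buf, off + 64)[0]
            sect = off + 72     # section headers follow the segment command
            for _ in range(nsects):
                name, seg = struct.unpack_from("<16s16s", buf, sect)
                if name.rstrip(b"\0") == sectname and seg.rstrip(b"\0") == segname:
                    size = struct.unpack_from("<Q", buf, sect + 40)[0]
                    offset = struct.unpack_from("<I", buf, sect + 48)[0]
                    return (offset, size)
                sect += 80      # size of one 64-bit section header
        off += cmdsize
    return None
```

A caller would then read `buf[offset:offset + size]` as the bitcode, which corresponds to the "view into the memory buffer" mentioned above.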

Shantonu

Right, I'm pretty sure I understand what you're suggesting. This does require LLVM to know enough of the native .o formats (Mach-O, ELF) to be able to find the bitcode section and its extents, that's all I'm saying :). If someone was interested in adding support for this to LLVM, I wouldn't oppose it at all, but it should be in a different file than BitcodeReader.cpp. We want clients that don't need this functionality to be able to link in the bitcode reader without linking in extra code. Since the bitcode reader is an archive, putting the code in a separate .cpp file is sufficient for this.

-Chris