Static code generation - is it gone from LLVM 2.7?

Hi,

I just realized that the ability to generate static object code (e.g. ELF
without using the JIT) is no longer available in 2.7 (at least in the
release_27 branch).

For example
  > llc -filetype=obj whatever.bc
doesn't work in a Linux environment anymore (well, it wasn't fully
implemented before, but it worked for simple bytecode in 2.6).

I used to generate code by creating a TargetMachine and a
FunctionPassManager, then calling TargetMachine::addPassesToEmitFile,
then adding my own CodeEmitter/CodeWriter (exactly like llc does). I
have to say I always hated the code emitter interface, but at least it
worked for me.

Now LLVMTargetMachine::addPassesToEmitFile has changed. It adds its own
code emitter, and it's always MachOCodeEmitter, which of course I don't
need.

Is there a new way to create non-JIT object code in LLVM 2.7?

Nope, sorry. This will hopefully be coming in 2.8. Mainline LLVM can already do Mach-O quite robustly for x86-32 and x86-64.

-Chris

Chris Lattner wrote:

What exactly is expected to be coming? Will it be the same way MachO is
currently implemented but with some flexibility to supply my own class
to do actual object output? Or just a return of old ObjectCodeEmitter?

We're integrating a full assembler into the compiler. I'm not sure what you mean by "flexibility to supply my own class to do actual object output", but you should be able to implement your own container format, right now even. :)

That's great. Any samples, docs?

No docs, you can look at the macho emitter to see how it works.

If it's llvm-mc you are talking about, what is the current implementation
status? I mean, on what targets and/or input data is it known to work?

Two different things here:

1) llc -filetype=obj
2) llvm-mc: this provides a standalone assembler (among other things)

llc -filetype=obj does not support inline assembly yet, but other than that it is believed to be 100% correct on darwin-i386 and very nearly correct on darwin-x86_64.

llvm-mc has parsers for X86 32/64 that are reasonably solid, but it only accepts the syntax that the compiler produces. For example, it will currently reject x86 instructions that don't have a b/w/l/q suffix etc. This clearly needs to be fixed :) Other than that, it works as well as llc -filetype=obj.

The new method of emitting object code is OK for me. But it is still
experimental, isn't it?

Yes.

-Chris

Thank you for the answers!

Now there is a way to implement what I'd like to. But it would be MUCH
better if LLVMTargetMachine::addPassesToEmitFile could take an arbitrary
MCStreamer as input. Without such a feature, when compiling bytecode
(i.e. emulating llc -filetype=obj behaviour), I have to emit a .s file
first, parse it back, and feed it to a custom MCStreamer. That'll
hopefully work, but it's ugly.

What are you trying to do? I don't see why you'd have to do that.

-Chris

Ok, I'm trying to compile LLVM bytecode into some native object code
format. Something very close to

llc -filetype=obj foo.bc

To do so I mimic llc's behaviour: create a TargetMachine, create a
FunctionPassManager, call TargetMachine::addPassesToEmitFile(), add my
own MyObjectCodeWriter pass and run the passes.

But since LLVM 2.7, TargetMachine::addPassesToEmitFile (as implemented in
the LLVMTargetMachine child class) adds its own final pass (AsmPrinter)
paired with either AsmStreamer, MachOStreamer or NullStreamer. I cannot
pass my own descendant of the MCStreamer class. Now we've got a
predefined set of final passes instead of a free choice.

AFAIK, to get an object code file I have to feed the emitted code through
my own MyMCStreamer class. The only sane option I see is to take the
assembly output from AsmStreamer (a .s file), then feed it to AsmParser,
passing MyMCStreamer to AsmParser's constructor.

Thus we've got
  instructions -> AsmPrinter -> AsmParser -> MyMCStreamer
while ideally it should be only
  instructions -> MyMCStreamer

Maybe I've missed something and this could be done much more easily. So
I'm looking forward to advice.
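
To make the two pipelines above concrete, here is a minimal toy sketch in plain C++. It does not use the LLVM API at all: ToyInst, ToyStreamer, ToyAsmStreamer, ToyObjectStreamer and parseAsm are hypothetical stand-ins invented for illustration, and the "object format" is just a list of strings. It only shows that the detour through text yields the same records as streaming directly, at the cost of an extra print and parse step.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine instruction; not an LLVM type.
struct ToyInst {
    std::string opcode;
    std::string operands;
};

// Hypothetical streamer interface, loosely modelled on the role
// MCStreamer plays in the discussion.
class ToyStreamer {
public:
    virtual ~ToyStreamer() = default;
    virtual void emitInstruction(const ToyInst &inst) = 0;
};

// Writes textual assembly, one instruction per line.
class ToyAsmStreamer : public ToyStreamer {
public:
    explicit ToyAsmStreamer(std::ostream &os) : os_(os) {}
    void emitInstruction(const ToyInst &inst) override {
        os_ << inst.opcode;
        if (!inst.operands.empty()) os_ << ' ' << inst.operands;
        os_ << '\n';
    }
private:
    std::ostream &os_;
};

// A custom backend: records each instruction as "opcode|operands".
class ToyObjectStreamer : public ToyStreamer {
public:
    void emitInstruction(const ToyInst &inst) override {
        records.push_back(inst.opcode + "|" + inst.operands);
    }
    std::vector<std::string> records;
};

// Re-parses the printed text and forwards it to another streamer.
void parseAsm(std::istream &in, ToyStreamer &out) {
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        const std::string::size_type space = line.find(' ');
        ToyInst inst;
        inst.opcode = line.substr(0, space);
        if (space != std::string::npos) inst.operands = line.substr(space + 1);
        out.emitInstruction(inst);
    }
}

// instructions -> printer -> parser -> custom streamer
std::vector<std::string> emitViaText(const std::vector<ToyInst> &insts) {
    std::stringstream text;
    ToyAsmStreamer printer(text);
    for (const ToyInst &inst : insts) printer.emitInstruction(inst);
    ToyObjectStreamer object;
    parseAsm(text, object);
    return object.records;
}

// instructions -> custom streamer
std::vector<std::string> emitDirect(const std::vector<ToyInst> &insts) {
    ToyObjectStreamer object;
    for (const ToyInst &inst : insts) object.emitInstruction(inst);
    return object.records;
}

// Sanity check: both routes must produce identical records.
void checkPathsAgree(const std::vector<ToyInst> &insts) {
    assert(emitViaText(insts) == emitDirect(insts));
}
```

In this toy the round trip is lossless only because the printer and parser agree on a trivial syntax; a real assembler round trip has more ways to diverge, which is part of why the direct route is preferable.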

Long story short: I'd like to do something like this

llc -filetype=obj foo.bc

But due to the API changes I've got to do

llc -filetype=asm foo.bc -o - | llvm-mc -assemble -filetype=obj

which will work, but the former is clearly better.

But since LLVM 2.7, TargetMachine::addPassesToEmitFile (as implemented in
the LLVMTargetMachine child class) adds its own final pass (AsmPrinter)
paired with either AsmStreamer, MachOStreamer or NullStreamer. I cannot
pass my own descendant of the MCStreamer class. Now we've got a
predefined set of final passes instead of a free choice.

LLVM 2.7 doesn't have final support for this. The idea is that we'd add ELF and PECOFFStreamers as well, or parameterize it a different way.

AFAIK, to get an object code file I have to feed the emitted code through
my own MyMCStreamer class. The only sane option I see is to take the
assembly output from AsmStreamer (a .s file), then feed it to AsmParser,
passing MyMCStreamer to AsmParser's constructor.

You're going to have to hack up the code generator to do this in 2.7.

-Chris

Obviously the API hasn't settled yet. My suggestion is to add "MCStreamer
*Streamer" as an additional argument to addPassesToEmitFile() and a
FileType of CGFT_Custom. The problem of passing the MCContext to the
streamer after its construction requires some work, but I don't think
it's a hard one.

Anyway, a predefined set of streamers, whether it includes
ELF/COFF/whatever or not, is a bad idea.
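
The shape of that suggestion can be sketched with a toy model in plain C++. Nothing here is the LLVM API: ToyFileType, ToyStreamer, ToyAsmStreamer, ToySizeStreamer and ToyCodeGen are hypothetical names made up for illustration, and the MCContext question is left out entirely. The sketch only shows a file-type enum with a custom value plus an optional caller-supplied streamer, assuming the caller keeps ownership of that streamer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical names throughout; none of these are LLVM types.
enum class ToyFileType { Assembly, Custom };

class ToyStreamer {
public:
    virtual ~ToyStreamer() = default;
    virtual void emit(const std::string &inst) = 0;
};

// Built-in choice: collects assembly text, one instruction per line.
class ToyAsmStreamer : public ToyStreamer {
public:
    void emit(const std::string &inst) override { text += inst + "\n"; }
    std::string text;
};

// Caller-supplied choice: just counts the characters it is given.
class ToySizeStreamer : public ToyStreamer {
public:
    void emit(const std::string &inst) override { bytes += inst.size(); }
    std::size_t bytes = 0;
};

class ToyCodeGen {
public:
    // Selects the final sink. Returns false if the request cannot be
    // satisfied (Custom requested without a streamer).
    bool addPassesToEmit(ToyFileType type, ToyStreamer *custom = nullptr) {
        switch (type) {
        case ToyFileType::Assembly:
            sink_ = &asm_;
            return true;
        case ToyFileType::Custom:
            if (custom == nullptr) return false;
            sink_ = custom;  // not owned; the caller keeps it alive
            return true;
        }
        return false;
    }

    // Feeds every instruction to whichever sink was selected.
    void run(const std::vector<std::string> &insts) {
        assert(sink_ != nullptr && "call addPassesToEmit first");
        for (const std::string &inst : insts) sink_->emit(inst);
    }

    const std::string &asmText() const { return asm_.text; }

private:
    ToyAsmStreamer asm_;
    ToyStreamer *sink_ = nullptr;
};
```

With this shape the built-in streamers remain available through the enum, while any new container format can be plugged in without touching the code generator itself.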

Thanks for answers again!