Need help with code generation

Lorenzo_Laneve · March 19, 2016, 12:03am

I wrote my compiler and it now generates LLVM IR modules. Now I’d like to go ahead and produce object files and then an executable, just like clang does.

What should I use to create the object files? And then how do I call ld? (Not llvm-ld; I want my compiler to work like Clang, and I read that Clang doesn’t use llvm-ld.)

Bruce_Hoult · March 19, 2016, 10:48am

If you’ve created a .bc or a .ll file then the simplest thing is to just give it to clang exactly the same as you would for a .c file. Clang will just Do The Right Thing with it.

If you don’t want to link, then pass flags such as -c to clang as usual.

e.g.

---- hello.ll ----

declare i32 @puts(i8*)
@str = constant [12 x i8] c"Hello World\00"

define i32 @main() {
  %1 = call i32 @puts(i8* getelementptr inbounds ([12 x i8], [12 x i8]* @str, i64 0, i64 0))
  ret i32 0
}

Lorenzo_Laneve · March 19, 2016, 12:31pm

I’d like to make my compiler independent, just like Clang. Doesn’t Clang call llc and then the system’s ld by itself? I don’t want my compiler to depend on any other program.
I guess there is a class in the LLVM library that generates object files based on the system’s triple and data layout, and then I call the system’s ld?

James_Molloy3 · March 19, 2016, 8:51pm

Hi Lorenzo,

Clang doesn’t call llc; LLVM is compiled into Clang. Clang does call the system linker though.

Making your compiler generate object code is very simple. Making it fix up that object code and execute it in memory (JIT style) is also simple. Linking it properly and creating a fixed-up ELF file is less simple. For that, you need to compile to object (using addPassesToEmitFile() - see llc.cpp) then invoke a linker. Getting that command line right can be quite difficult.

Rafael, this would be a good use case for LLD as a library. I heard that this is an explicit non-goal, which really surprised me. Is that indeed the case?

Cheers,

James

mats_petersson · March 19, 2016, 8:58pm

If you plan on calling C runtime library functions, you probably want to do what I did:

Cheat, and make a libruntime.a (with C functions to do stuff your compiler can’t do natively) and then link that using clang or gcc.

https://github.com/Leporacanthicus/lacsap/blob/master/binary.cpp#L124
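For readers who want to see the general shape of that approach, here is a minimal sketch in C++ for a POSIX system. It is not taken from the linked file. The helper names `make_link_command` and `run_command`, the driver name `cc`, and the file names are all assumptions made for illustration. The idea is to let the C compiler driver work out the real linker command line (startup files, libc, library paths) instead of calling ld directly.

```cpp
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Build the argument list for a link step driven by the C compiler driver.
// The driver name, object path, runtime archive, and output name are
// illustrative placeholders supplied by the caller.
std::vector<std::string> make_link_command(const std::string &driver,
                                           const std::string &object,
                                           const std::string &runtime_archive,
                                           const std::string &output) {
    return {driver, object, runtime_archive, "-o", output};
}

// Run the command and wait for it to finish.
// Returns the child's exit code (127 if the program could not be executed),
// or -1 if fork/waitpid failed or the child was killed by a signal.
int run_command(const std::vector<std::string> &args) {
    if (args.empty()) return -1;

    // execvp wants a null-terminated array of char pointers.
    std::vector<char *> argv;
    for (const std::string &a : args)
        argv.push_back(const_cast<char *>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);  // only reached if exec failed
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A compiler would call something like `run_command(make_link_command("cc", "prog.o", "libruntime.a", "prog"))` after writing the object file, and treat any non-zero result as a link failure.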

At some point, I plan to replace my runtime library with native Pascal code, at which point I will be able to generate the ELF binary straight from my compiler without the runtime library linking in the C runtime library, but that’s not happening anytime real soon. Getting the compiler to compile v5 of Wirth’s original Pascal compiler is higher on the list…

Lorenzo_Laneve · March 19, 2016, 10:15pm

@james
Yeah for code generation I figured out that clang doesn’t actually use llc, and I already started reading its code to see how it works.

Bruce_Hoult · March 19, 2016, 10:51pm

Yes, you shouldn’t have any trouble just declaring and using C functions such as fopen, fclose, puts, fputs, fputc.

You’re likely to find that putc is a macro not a function, in which case you won’t be able to use that.

Depending on whether it’s important, if you’re running on a Unix-like system then you could save quite a bit of size in your binary by using open(2), close(2), read(2), write(2) directly, as they’re not any harder to use. But the C standard library is available in more places.

Lorenzo_Laneve · March 19, 2016, 11:01pm

You’re right. Well, it’s just like fputc(x, stdout).
One last thing: are 4 calls to fputc as fast as one call to fputs with a 4-char string? Or might fputs be faster?

Rui_Ueyama · March 21, 2016, 7:01pm

> Rafael, this would be a good use case for LLD as a library. I heard that
> this is an explicit non-goal, which really surprised me. Is that indeed
> the case?

You can use LLD as a library.

https://github.com/llvm-mirror/lld/blob/master/docs/NewLLD.rst#the-elf-linker-as-a-library

The ELF, COFF and Wasm Linkers
==============================

The ELF Linker as a Library
---------------------------

You can embed LLD in your program by linking against it and calling the linker's
entry point function lld::elf::link.

The current policy is that it is your responsibility to give trustworthy object
files. The function is guaranteed to return as long as you do not pass corrupted
or malicious object files. A corrupted file could cause a fatal error or SEGV.
That being said, you don't need to worry too much about it if you create object
files in the usual way and give them to the linker. It is naturally expected to
work, or otherwise it's a linker's bug.

Design
======

We will describe the design of the linkers in the rest of the document.


James_Molloy3 · March 21, 2016, 7:04pm

> A corrupted file could cause a fatal error or SEGV.

Uhhh, that’s not particularly useful.

Rui_Ueyama · March 21, 2016, 7:07pm

> A corrupted file could cause a fatal error or SEGV.

Uhhh, that's not particularly useful.

"Corrupted" means really corrupted, like the ELF header being broken. Is this
really the case?

James_Molloy3 · March 21, 2016, 7:10pm

Well sure, it’s unlikely, but how many consumers can make that sort of guarantee? And if a consumer can’t guarantee the integrity of the ELF file they have no choice but not to use LLD, or to fork before using it.

Rafael_Avila_de_Espi · March 21, 2016, 7:14pm

Correct.

Cheers,
Rafael

Rui_Ueyama · March 21, 2016, 7:15pm

We had a long discussion recently and the decision was made so that we can go ahead. It is not a good idea to discuss that again. At least it is too soon.

I’d recommend using lld’s link() function if the input is guaranteed to be consistent (such as the output of clang). Otherwise, please use fork.
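The fork approach could look roughly like the following sketch, written in C++ for a POSIX system. The names `LinkResult` and `run_isolated` are hypothetical, and `link_job` is a placeholder for whatever wraps the in-process linker call; the actual arguments of lld's entry point are deliberately not shown here. The point is only that a crash or exit inside the child cannot take down the calling process.

```cpp
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class LinkResult { Ok, Failed, Crashed };

// Run `link_job` in a child process so that a crash or a call to exit()
// inside it cannot terminate the caller. `link_job` is a hypothetical
// stand-in for the code that invokes the linker; it returns true on success.
LinkResult run_isolated(const std::function<bool()> &link_job) {
    pid_t pid = fork();
    if (pid < 0) return LinkResult::Failed;
    if (pid == 0) {
        // Child: report success or failure through the exit code.
        _exit(link_job() ? 0 : 1);
    }

    // Parent: wait for the child and classify how it ended.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return LinkResult::Failed;
    if (WIFSIGNALED(status)) return LinkResult::Crashed;  // e.g. SIGSEGV
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) return LinkResult::Ok;
    return LinkResult::Failed;
}
```

The caller can then report `Crashed` as an internal linker error and `Failed` as an ordinary link error, while its own state stays intact in both cases.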

James_Molloy3 · March 21, 2016, 7:16pm

> Correct

Out of interest, how does LLD itself handle error reporting when invoked from the command line, and how does it avoid segfaulting in that case?

Cheers,

James

Rui_Ueyama · March 21, 2016, 7:19pm

> Correct

Out of interest, how does LLD itself handle error reporting when invoked
from the command line, and how does it avoid segfaulting in that case?

It generally reports an error and exits, or in rare circumstances it just
segfaults.

James_Molloy3 · March 21, 2016, 7:21pm

If it can exit, why can’t it longjmp back to a library consumer at least?

Rui_Ueyama · March 21, 2016, 7:23pm

We do not enable exceptions, and longjmp is not safe. Also, if it can segfault for some pathetic input, “it longjmps in most cases” doesn’t help people who want a 100% guarantee like you.

James_Molloy3 · March 21, 2016, 7:25pm

> Also, if it can segfault for some pathetic input

Surely that’s a bug though, not seriously designed behaviour?

Rui_Ueyama · March 21, 2016, 7:27pm

> Also, if it can segfault for some pathetic input

Surely that's a bug though, not seriously designed behaviour?

No. That is a design choice.
