llvm backend tutorial

Hi,

I am writing an LLVM backend tutorial as I study and implement an LLVM backend myself. It is available here:

http://jonathan2251.github.com/lbd/index.html

It includes 10,000 lines of source code for:

1. A step-by-step guide to creating an LLVM backend for Cpu0, a CPU originally designed for teaching system programming in school.
2. An ELF linker for Cpu0, extended from lld.
3. elf2hex, extended from llvm-objdump.
4. Cpu0 Verilog source code.

With this code, readers can run the code generated by the Cpu0 LLVM backend compiler, linker, and elf2hex, and see how it runs on their own computer.
PDF and EPUB versions are also available on the website. It is a tutorial for LLVM backend developers, but not for experts.
It can also serve as material for those who have textbook knowledge of compilers and computer architecture and would like to know how to extend the LLVM
toolchain to support a new CPU.

Jonathan

Hi Jonathan,

Thanks for the enormous effort in making this tutorial.

I was reading through the material yesterday, and I am able to clearly follow the examples.

Are you planning to keep this updated with LLVM revisions?

Thanks

Shankar Easwaran

Yes, I will.

Jonathan

I was wondering if this shouldn’t somehow find its way into the official LLVM documentation? It certainly seems to qualify to become official documentation in my eyes. Nearly any LLVM backend writer out there should be able to benefit from reading about your experiences, I’d think.

I know it is not as generic and abstract as what the LLVM dev list seems to prefer, but I personally find that the more concrete and based on actual experience a document is, the better the reader’s ability to understand what’s going on.

The only thing is that you might not want to go through the process of a peer review. That will likely add much work to what you have already accomplished.

– Mikael

Mikael,

Yes, I agree that the LLVM documentation stays at a high level and does not explain how to translate IR into backend code, since that is bound to a specific CPU's instructions. That is the reason I am contributing this document back to LLVM.

Over the past several months, readers have sent me corrections and questions, and I try to answer them in my limited time.
I would appreciate reviews of this book from other programmers, but I live in Taiwan, where we speak Chinese. So please write to me in English that is as simple and easy to understand as possible.

One other question, unrelated to LLVM:
I am a Christian. Are you connected to https://www.lyngvig.org?

Jonathan

Hi Jonathan,

After reading/skimming through the official LLVM backend documents, I actually tried following your steps to write a new backend, but how to write td files still remains unclear. The details are not well explained, though I know most of them can be found in other documents or have already been documented somewhere in the LLVM source code or td files.

For a beginner with no experience like me, it is really hard to extract the fundamental structure from existing backends, e.g., what is necessary for an early stage and what is the refined result after years of development.

For example, everything went well with Cpu0RegisterInfo.td, with only a little struggle. But with Cpu0InstrInfo.td, questions start to come up: Why does simm16 inherit from Operand? What are PatLeaf and ComplexPattern? What is isReMaterializable? And so on. Every line of description and every occurrence of a new keyword or concept can confuse a beginner, who needs to find enough information elsewhere to follow this tutorial. The tutorial seems to say that you have to write these 10 files, completely and without error, before continuing to the next step. And this, how to start from scratch, is the most frustrating thing, at least for me.

If this is meant for beginners, I would say that a brief description or a link to these new concepts would be helpful.

Thanks,
Shang-Yi

2013/12/6 Mikael Lyngvig <mikael@lyngvig.org>

I was wondering if this shouldn't somehow find its way into the official
LLVM documentation?

I have been providing guidance to Jonathan and his collaborators from very
early in this project to make sure that this is an option.

-- Sean Silva

Shang-Yi,

Regarding your questions about InstrInfo.td for beginners: I remember one reader asking me about td files before. I told him that there are many things in td that I do not know. The purpose of the td files is to drive the DAG translation from IR to machine instructions, which is compiler-textbook material, as my book indicates here:
http://jonathan2251.github.io/lbd/backendstructure.html#dag-directed-acyclic-graph
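
To make the idea of DAG translation concrete, here is a minimal sketch in Python of pattern-based instruction selection over a tiny expression tree. It is only an illustration under stated assumptions: the `Node` class, the `select` function, the temporary register names, and the `add`/`addiu` mnemonics are made up for this example and are not taken from the Cpu0 sources. In LLVM the patterns are written in the .td files and the matcher is generated by TableGen rather than written by hand like this. The `fits_simm16` check plays the role of a predicate that restricts an immediate operand to a signed 16-bit value.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """One node of a toy selection tree (hypothetical, not an LLVM class)."""
    op: str                            # "add", "const" or "reg"
    operands: Tuple["Node", ...] = ()
    value: Optional[object] = None     # constant value or register name


def fits_simm16(n: int) -> bool:
    """True if n is representable as a signed 16-bit immediate."""
    return -(1 << 15) <= n < (1 << 15)


def select(node: Node, out: List[str], fresh) -> str:
    """Append instructions for `node` to `out`; return the result register."""
    if node.op == "reg":
        return node.value
    if node.op == "const":
        if not fits_simm16(node.value):
            raise ValueError("constant does not fit in simm16")
        dst = f"$t{next(fresh)}"
        out.append(f"addiu {dst}, $zero, {node.value}")
        return dst
    if node.op == "add":
        lhs, rhs = node.operands
        # Try the more specific pattern first: (add x, simm16) -> one addiu.
        if rhs.op == "const" and fits_simm16(rhs.value):
            src = select(lhs, out, fresh)
            dst = f"$t{next(fresh)}"
            out.append(f"addiu {dst}, {src}, {rhs.value}")
            return dst
        # General pattern: (add x, y) -> register-register add.
        a = select(lhs, out, fresh)
        b = select(rhs, out, fresh)
        dst = f"$t{next(fresh)}"
        out.append(f"add {dst}, {a}, {b}")
        return dst
    raise ValueError(f"no pattern for {node.op}")


def compile_expr(root: Node) -> List[str]:
    """Select instructions for a whole expression, operands before users."""
    out: List[str] = []
    select(root, out, count())
    return out


# (a + b) + 5
expr = Node("add", (
    Node("add", (Node("reg", value="$a"), Node("reg", value="$b"))),
    Node("const", value=5),
))
program = compile_expr(expr)
print("\n".join(program))
```

The sketch walks a tree, so it never has to deal with a node that has several users, which a real selection DAG does. The point is only the shape of the work: each pattern names an operator, constrains its operands, and maps the match to one machine instruction.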

There is much other material to learn, including the translation of many LLVM IR instructions to machine instructions, branch/loop handling, function calls, assembly/object printing, the disassembler, AsmParser, the ELF format, the ELF linker, llvm-objdump, … (I don't know whether you are staying with td or moving on to the other chapters.)

When programming on top of an existing software structure, I try to get an overview of the structure first and skip the details. (Maybe you consider a specific kind of td node important and part of the big picture because it appears in your InstrInfo.td.)

Regards

Jonathan
