Proposal for GSoC project - Compile programs with LLVM compiler

Hello everybody,

I’m planning to compile a large codebase (Ubuntu) with the LLVM compiler as my GSoC 2008 project. My proposal is given below. I would be really grateful for any feedback on improving it.

  1. Project title

Compile programs with LLVM compiler

  2. Abstract

LLVM [1] is the Low Level Virtual Machine. It consists of two main modules: the LLVM suite, which contains all the tools, libraries and header files needed to use the low level virtual machine, and the GCC front end, which contains a version of GCC that compiles C and C++ code into LLVM bitcode. There is a third, optional piece called llvm-test. It is a suite of programs with a testing harness that can be used to further test LLVM’s functionality and performance.

The main objective of this project is to improve the LLVM testsuite [2]. The basic approach is to compile a large code base that has not yet been compiled with LLVM and to convert its build system to be compatible with the build system of the LLVM testsuite.

In this project Ubuntu is used as the code base, and initially it will be built using LLVM. The build will then be tested, and bug reports will be filed for any issues encountered, to help track down problems.

The development progress can be tracked from here [3].

The complete project proposal is available here [4].

  3. Deliverables

· Updated build system

· Bugs reported to the bug tracker

· Document (associated work)

· Improved test programs

· A guide for future continuation of this project or similar projects

  4. Benefits to the LLVM community

The following will be available to the LLVM community on successful completion of the project.

· An improved testsuite

· A guide (document) for carrying out similar projects in the future

  5. Overview

This section describes the work I plan to do in this project.

The main task is to build the entire Ubuntu distribution with LLVM. This will add new testcases and benchmarks for LLVM. A large testsuite is very important because it covers a wide range of programs and enables us to spot and improve any problematic areas in the compiler.

The procedure followed in this project is discussed below.

Initially the entire Ubuntu distribution will be compiled with LLVM. If the compilation fails, the set of projects that build successfully with LLVM will be selected, and bugs with reduced testcases will be filed for the projects that fail. If the compilation succeeds, the build system will be converted to be compatible with the LLVM Programs testsuite. The result can be checked in to SVN so that the automated tester can use it to track the progress of the compiler.

The next step is to build an image from the projects that built successfully with LLVM, merging in gcc-built versions of the projects that failed with LLVM. This image is then booted, and any problems encountered during this process will be tracked down. This process will be repeated until the entire system is successfully built with LLVM and boots as expected.

When testing the code, it can be run with a variety of optimizations, and with all the back-ends: CBE, llc, and lli.
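
To make the size of that test matrix concrete, here is a minimal Python sketch. The back-end names come from the paragraph above; the optimization levels are only an assumed example set, and build_matrix and failing_configs are illustrative helper names, not part of the LLVM testsuite.

```python
from itertools import product

# Back-ends named in the proposal.
BACKENDS = ["CBE", "llc", "lli"]
# Assumption: an example set of optimization levels, chosen for illustration.
OPT_LEVELS = ["-O0", "-O2", "-O3"]


def build_matrix(backends=BACKENDS, opt_levels=OPT_LEVELS):
    """Return every (back-end, optimization level) pair a program is run under."""
    return list(product(backends, opt_levels))


def failing_configs(results):
    """Given a {(back-end, opt level): passed} mapping, list the failing pairs."""
    return sorted(cfg for cfg, passed in results.items() if not passed)
```

With three back-ends and three optimization levels, each program is exercised under nine configurations, so a failure can be narrowed down to a specific back-end and optimization level before a bug is filed.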

  6. Project plan

The basic steps in this project are listed below.

  1. Compile the Ubuntu distribution with gcc and LLVM

  2. If the compilation succeeds, convert the build system to be compatible with the LLVM Programs testsuite

  3. If it fails,

a. File bugs

b. Compile project by project with LLVM and select the projects that compile successfully.

c. Compile the projects that fail with LLVM using gcc instead

  4. Build an image from the projects successfully built with LLVM, merging in the projects built with gcc in 3.c

  5. Try to boot the image from step 4 and track down the problems.

  6. Continue the process until step 2 works.
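
The failure branch of the plan above (file bugs, keep what builds with LLVM, fall back to gcc for the rest, then assemble an image) is essentially a triage over packages. A rough Python sketch follows. It assumes a hypothetical callable builds_with_llvm(pkg) that reports whether a package built; Triage, triage and image_contents are illustrative names, not existing tools.

```python
from dataclasses import dataclass, field


@dataclass
class Triage:
    llvm_ok: list = field(default_factory=list)       # built with LLVM
    gcc_fallback: list = field(default_factory=list)  # to be built with gcc instead
    bugs_to_file: list = field(default_factory=list)  # need a reduced testcase and a bug


def triage(packages, builds_with_llvm):
    """Split packages by whether they build with LLVM.

    builds_with_llvm is a hypothetical callable supplied by the caller.
    """
    result = Triage()
    for pkg in packages:
        if builds_with_llvm(pkg):
            result.llvm_ok.append(pkg)
        else:
            result.bugs_to_file.append(pkg)
            result.gcc_fallback.append(pkg)
    return result


def image_contents(result):
    """Map each package to the compiler whose build goes into the image."""
    contents = {pkg: "llvm" for pkg in result.llvm_ok}
    contents.update({pkg: "gcc" for pkg in result.gcc_fallback})
    return contents
```

Repeating the triage after each round of compiler fixes should shrink gcc_fallback, and the process ends once it is empty and the image boots.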

Based on these steps, I can break the project down into three major phases.

1st Phase

Compile the Ubuntu distribution with gcc and LLVM. This is the first step and it will require about 3 weeks.

Estimated completion: 20th May 2008

2nd Phase

If the compilation fails, follow steps 2, 3, 4 and 5. These steps are followed iteratively, depending on the output. The first pass through them takes a lot of time, so this part of the project will consume most of the schedule if the 1st step fails.

Deliverables: A report of bugs

Estimated completion: 14th July 2008

3rd Phase

This is the final phase. This involves converting the build system to be compatible with the LLVM Programs testsuite.

Deliverables: a report of bugs, a document (associated work), improved test programs, an updated build system, and a guide to support anyone who needs to continue this project

Estimated completion: 20th August 2008

  7. Biography

I am a final-year student in the Department of Computer Science and Engineering, University of Moratuwa, Sri Lanka [5]. This is the first time I am getting involved in an open source project, but I am highly impressed by the open source project model and plan to get involved in more projects in the future.

I have a special interest in compiler technology, and I have experience in the C, C++ and Java programming languages. I have developed a lexical analyzer for JavaScript in C [6] and a JavaScript pretty printer in Python [7]. I find this project interesting because it requires knowledge of compiler technology, and I will continue to work in this area to make further contributions to the LLVM community in the future.

For further information you can refer to my resume [8].

References

[1] - http://llvm.org/

[2] - http://llvm.org/docs/TestingGuide.html

[3] - http://llvmcompiler.blogspot.com/

[4] - http://llvmcompiler.blogspot.com/2008/03/proposal-for-summer-of-code-2008.html

[5] - http://www.cse.mrt.ac.lk/

[6] - http://paba50.googlepages.com/LexicalAnalizer.zip

[7] - http://paba50.googlepages.com/PrettyPrinter.zip

[8] - http://paba50.googlepages.com/Paba-resume.pdf

Kumaripaba Miyurusara Atukorala wrote:

hello everybody,

I'm planning to compile a large codebase (Ubuntu) with LLVM compiler
as my GSoc project 2008. I've given the my proposal below. I'll be
really great full if you can provide your feedback on improving this
proposal.

As Anton has said, Gentoo might be a better choice for this. But Ubuntu
should be ok too, if you like it better ...

1. Project title

Compile programs with LLVM compiler

2. Abstract

LLVM[1] is a Low Level Virtual Machine. It basically consists of two
modules. They are the LLVM suite which consists of all the tools,
libraries and header files needed to use the low level virtual machine
and the GCC front end. The GCC front end contains a version of GCC
that compiles C and C++ code in to LLVM bitcode. There is a third,
optional piece called llvm-test. It is a suite of programs with a
testing harness that can be used to further test LLVM's functionality
and performance.

The main objective of this project is to improve the LLVM
testsuite[2]. The basic approach followed in achieving this objective
is compiling a large code base that has not yet been compiled with
LLVM and converts the build system to be compatible with build system
of the LLVM testsuite.

In this project Ubuntu is used as the code base and initially it will
be built using LLVM. Then the build system is tested and files bug
reports for any issues that might hit and help track down problems.

The development progress can be tracked from here [3].

The complete project proposal is available here [4].

3. Deliverables

· Updated build system

· Bugs reported to the bug tracker

· Document (associated work)

· Improved test programs

· A guide for future continuation of this project or similar
projects

4. Benefits to the LLVM community

The following will be available to the LLVM community on successful
completion of the project.

· An improved testsuite

· A guide (document) to follow similar projects in future

5. Overview

This section contains my work in this project.

The main task is to build the entire Ubuntu distribution in LLVM. This
will add new testcases and benchmarks for the use of LLVM. A large
testsuite is very important since it provides a lot of coverage of
programs and enables us to spot and improve any problematic areas in
the compiler.

Ubuntu has 13000+ packages, even if you count only those in main, there
are 3084 source packages.
Handling all of them is beyond what a single person can achieve during
the summer.

I would say handle 'Essential: yes'/'Build-Essential: yes' packages as a
first step (25, resp. 22 packages). Then if time permits try to handle
as much from the remaining packages as possible.

The procedure followed in this project is discussed below.

Initially the entire Ubuntu distribution will be compiled with the
LLVM. If the compilation fails then the set of projects that
successfully build with LLVM will be selected. Bugs are filed for the
projects that fail with reduced testcases. If the compilation
succeeds the build system will be converted to be compatible with the
LLVM Programs testsuite. This can be checked in to SVN and the
automated tester can use it to track progress of the compiler.

Is the goal only building packages? Or does it include testing built
packages as well?
I would suggest at least running testsuites for packages that have them.
Also what architecture are you building on? x86 32-bit?

The next step is to build an image from projects that are successfully
built with LLVM and merge projects built with gcc for those who failed
in LLVM. Then this image is booted and the problems that encounter
during this process will be tracked down. This process will be
repeated until the entire system is successfully built with LLVM and
booted as expected.

When testing the code, it can be run with a variety of optimizations,
and with all the back-ends: CBE, llc, and lli.

6. Project plan

The basic steps in this project are listed below.

1. Compile the Ubuntu distribution in gcc and LLVM

2. If the compilation succeeds convert the build system to LLVM
Program testsuite compatible mode

3. If fails,

a. File bugs

b. Compile project by project in LLVM and select the projects
that successfully compiles in LLVM.

c. For the projects that fail in LLVM compile them in gcc

4. Build an image from projects successfully built with LLVM and
merge in projects built with gcc in 3.c

5. Try to boot the image from step 4 and track the problems.

6. Continue the process until step 2 works.

According to these steps I can break down this project in to three
major phases

1st Phase

Compile Ubuntu distribution in gcc and LLVM. This is the first most
steps and it will require about 3 weeks.

Well, Ubuntu is already built by gcc, I think you could focus on
building with LLVM. Building with gcc is useful when you want to track
down problems.

Best regards,
--Edwin

Hi,
After reading all the useful comments added by Török, I decided to make the following changes to my proposal (my points are numbered 1, 2, 3).

Kumaripaba Miyurusara Atukorala wrote:

hello everybody,

I’m planning to compile a large codebase (Ubuntu) with LLVM compiler
as my GSoc project 2008. I’ve given the my proposal below. I’ll be
really great full if you can provide your feedback on improving this
proposal.

As Anton has said, Gentoo might be a better choice for this. But Ubuntu
should be ok too, if you like it better …

  1. Project title

Compile programs with LLVM compiler

  2. Abstract

LLVM[1] is a Low Level Virtual Machine. It basically consists of two
modules. They are the LLVM suite which consists of all the tools,
libraries and header files needed to use the low level virtual machine
and the GCC front end. The GCC front end contains a version of GCC
that compiles C and C++ code in to LLVM bitcode. There is a third,
optional piece called llvm-test. It is a suite of programs with a
testing harness that can be used to further test LLVM’s functionality
and performance.

The main objective of this project is to improve the LLVM
testsuite[2]. The basic approach followed in achieving this objective
is compiling a large code base that has not yet been compiled with
LLVM and converts the build system to be compatible with build system
of the LLVM testsuite.

In this project Ubuntu is used as the code base and initially it will
be built using LLVM. Then the build system is tested and files bug
reports for any issues that might hit and help track down problems.

The development progress can be tracked from here [3].

The complete project proposal is available here [4].

  3. Deliverables

· Updated build system

· Bugs reported to the bug tracker

· Document (associated work)

· Improved test programs

· A guide for future continuation of this project or similar
projects

  4. Benefits to the LLVM community

The following will be available to the LLVM community on successful
completion of the project.

· An improved testsuite

· A guide (document) to follow similar projects in future

  5. Overview

This section contains my work in this project.

The main task is to build the entire Ubuntu distribution in LLVM. This
will add new testcases and benchmarks for the use of LLVM. A large
testsuite is very important since it provides a lot of coverage of
programs and enables us to spot and improve any problematic areas in
the compiler.

Ubuntu has 13000+ packages, even if you count only those in main, there
are 3084 source packages.
Handling all of them is beyond what a single person can achieve during
the summer.

I would say handle ‘Essential: yes’/‘Build-Essential: yes’ packages as a
first step (25, resp. 22 packages). Then if time permits try to handle
as much from the remaining packages as possible.

  1. Since I am more comfortable with Ubuntu than with Gentoo, I’ll proceed with Ubuntu. But since handling every package is too much work, I’d first handle the Build-Essential packages. Then, if time permits, I will move on to the rest of the packages.
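
One way to pick out that first batch, as a minimal sketch: scan a Packages index for stanzas marked 'Essential: yes' or 'Build-Essential: yes', the two fields mentioned above. The sketch assumes the usual blank-line-separated "Field: value" stanza layout, and the function name first_batch is purely illustrative.

```python
def first_batch(packages_text):
    """Return names of packages marked Essential or Build-Essential.

    Assumes stanzas are separated by blank lines and fields look like
    "Field: value"; indented continuation lines are skipped.
    """
    names = []
    for stanza in packages_text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") or ":" not in line:
                continue  # continuation of a multi-line field
            key, _, value = line.partition(":")
            fields[key.strip().lower()] = value.strip()
        marked = (fields.get("essential") == "yes"
                  or fields.get("build-essential") == "yes")
        if marked and "package" in fields:
            names.append(fields["package"])
    return names
```

The resulting list could then be fed to the build step first, leaving the remaining packages for later.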

The procedure followed in this project is discussed below.

Initially the entire Ubuntu distribution will be compiled with the
LLVM. If the compilation fails then the set of projects that
successfully build with LLVM will be selected. Bugs are filed for the
projects that fail with reduced testcases. If the compilation
succeeds the build system will be converted to be compatible with the
LLVM Programs testsuite. This can be checked in to SVN and the
automated tester can use it to track progress of the compiler.

Is the goal only building packages? Or does it include testing built
packages as well?
I would suggest at least running testsuites for packages that have them.
Also what architecture are you building on? x86 32-bit?

  2. The goal is both to build the packages and to test the built packages. If my proposal doesn’t make that clear, I will rephrase this paragraph.
    Yes, I will be building on the x86 32-bit architecture. I’ll add that to my proposal.

The next step is to build an image from projects that are successfully
built with LLVM and merge projects built with gcc for those who failed
in LLVM. Then this image is booted and the problems that encounter
during this process will be tracked down. This process will be
repeated until the entire system is successfully built with LLVM and
booted as expected.

When testing the code, it can be run with a variety of optimizations,
and with all the back-ends: CBE, llc, and lli.

  6. Project plan

The basic steps in this project are listed below.

  1. Compile the Ubuntu distribution in gcc and LLVM

  2. If the compilation succeeds convert the build system to LLVM
    Program testsuite compatible mode

  3. If fails,

a. File bugs

b. Compile project by project in LLVM and select the projects
that successfully compiles in LLVM.

c. For the projects that fail in LLVM compile them in gcc

  4. Build an image from projects successfully built with LLVM and
    merge in projects built with gcc in 3.c

  5. Try to boot the image from step 4 and track the problems.

  6. Continue the process until step 2 works.

According to these steps I can break down this project in to three
major phases

1st Phase

Compile Ubuntu distribution in gcc and LLVM. This is the first most
steps and it will require about 3 weeks.

Well, Ubuntu is already built by gcc, I think you could focus on
building with LLVM. Building with gcc is useful when you want to track
down problems.

  3. Ok, I will focus on building with LLVM.

Best regards,
--Edwin

Thank you,
Kumaripaba