GSoc 2009 (Bad Subject in the previous email)

Dear all

I am a PhD student of Computer Scince at Simon Fraser University (http://www.cs.sfu.ca) interested in applying to GSoC. My PhD is focused on theoretical computer science, but since Sep. 2008 I have started working on Software projects again. Currently I am working in COSTAR lab (http://costar.sfu.ca/) on a high performance regular expression engine based on Parallel bit streams technology. A considerable part of this project is optimal register allocation and I have got familiar with the literature during my current project. Before my PhD I have worked on various projects including distributed firewall and short message service center. These projects requried C++ and C(kernel level) programming in Linux.

I am interested in the following open projects of llvm.

1- Implementing interprocedural register allocation. This is in the same line with what I have been doing recently.

The other projects below are also quite interesting for me:

2- Adding support for Type Based Alias Analysis

3- Improving handling of memcpy/memset.

4- Implementing a loop dependency analysis infrastructure.

Best Regards
Ehsan Amiri

PS. Sorry for the wrong subject in the previous email

Hi Ehsan,

All of the projects you have listed are quite interesting. If I were to advocate for one, it would be #2. I think the scope of work is perfect for GSoc.

I’d encourage send out a more concrete proposal when you’re ready.

Thanks,

Evan

Hi Evan

Thanks for the email. I had a look at gcc implementation of TBAA and I think three main steps in implementation of TBAA for LLVM will be this:

April 20 ~ May 23: I will read gcc implementation in depth and play around with LLVM code.

May 23 ~ July 6: Implementation and test of a simple version of TBAA that does not work with all aggregate types. I think part of the coding required for aggregate type can be done in this period but it will not be ready for midterm evaluation. On the other hand this period will be my first experience with LLVM code and so I will spend some time on learning LLVM-specific details.

July 6 ~ August 10: Finishing implementaiton and testing for aggregate types and preparing any doumentaion needed.

In general gcc implementation of TBAA will be the base of my work, but I will read it in depth before starting the implementation, so that we can decide on any changes required before the start of coding.

Please let me know if this sounds reasonable.
Thanks
Ehsan

I would encourage you to have more concrete milestones for each phase of the project. What you are suggesting is very open ended and vague. You should try to break up the algorithm into incremental stages and deadlines for each part.

-Tanya

Hi Evan

Thanks for the email. I had a look at gcc implementation of TBAA and I think three main steps in implementation of TBAA for LLVM will be this:

April 20 ~ May 23: I will read gcc implementation in depth and play around with LLVM code.

May 23 ~ July 6: Implementation and test of a simple version of TBAA that does not work with all aggregate types. I think part of the coding required for aggregate type can be done in this period but it will not be ready for midterm evaluation. On the other hand this period will be my first experience with LLVM code and so I will spend some time on learning LLVM-specific details.

July 6 ~ August 10: Finishing implementaiton and testing for aggregate types and preparing any doumentaion needed.

In general gcc implementation of TBAA will be the base of my work, but I will read it in depth before starting the implementation, so that we can decide on any changes required before the start of coding.

I am not sure if modeling gcc’s implementation of TBAA is a good idea. Could someone who knows more about this topic chime in?

Evan

Hi Ehsan,

In addition to lacking a concrete set of milestones on your deliverables, your current thoughts are missing a discussion of how you plan to represent this information in LLVM. Please familiarize yourself more with this area so that you can make a proposal that is concrete enough to be implementable.

Here is some related information to read up on:
http://nondot.org/sabre/LLVMNotes/EmbeddedMetadata.txt

-Chris

Hi Ehsan, my understanding is that gcc is informed by
language front-ends about types S and T for which an S*
can alias a T*. This information amounts to a graph
with types as nodes and an arrow from S to T if an S*
can alias a T*. Probably C unions result in such edges,
and I know the Ada front-end produces all kinds of info
like this.

This information would somehow need to be communicated
to LLVM, so represented in the LLVM IR somehow. What is
the best way to do this?

Note that since multiple gcc types map to the same LLVM type,
the graph would need to be collapsed in order to be in terms
of LLVM types, and this might remove a lot of important
information. How to evaluate whether LLVM TBAA would actually
be effective in practice?

Finally LLVM alias analysis would need to be enhanced to
exploit the knowledge that an S* cannot alias a T* if there
is no path from S to T in the graph.

Anyway, that's how I always imagined LLVM TBAA would work
on a meta-level. Since I actually don't know anything much
about TBAA you should take this with a large pinch of salt!

Ciao,

Duncan.