[GSoC] Community feedback and interest in Distributed lit testing project from 2021 idea list.

Hello community/Mentor,

I am Soham Dixit, an undergraduate student from India. I was
looking through the list of 2021 GSoC projects for the LLVM compiler
infrastructure, and I found the ‘Distributed lit testing’ project [1]
interesting.

So far I have built LLVM with Clang enabled as a subproject, run the
make check* targets, and looked at how the llvm-lit based tests
run.

AFAIU, running the llvm-lit based tests is a bottleneck (mostly because
of the time they take and how often these regression tests are run),
even on powerful machines, which I assume the LLVM buildbots are.
From what I have read, the equivalent problem for building LLVM was
addressed by making the build process distributed, so that it can take
advantage of other nodes in a cluster.
IIUC, the aim of this project is to follow a similar approach to the
one used for the build problem: distribute the llvm-lit based tests,
from inside the llvm-lit framework, across the nodes in a cluster,
collect the pass and fail results from each node, consolidate them,
and present them to the developer.

I am writing this email to get more detailed information about this
project and to understand the real use cases where it could be
practically used.

What should I start looking into in order to write a good, detailed
proposal?
I have no experience with compilers, but I am interested in getting involved.

Thanks and regards,
Soham

[1] https://llvm.org/OpenProjects.html#llvm_distributing_lit

Hi Soham,

I’m the person who has volunteered to do the GSoC mentoring for this project, should a suitable person apply for it. I replied to an earlier email on this topic from a different prospective student; you can find my response in the reply to https://lists.llvm.org/pipermail/llvm-dev/2021-February/148741.html. That response contains a number of key points about the project and hopefully answers some of your questions too. Other people on this list also commented on the topic in another email thread starting here: https://lists.llvm.org/pipermail/llvm-dev/2021-March/149178.html.

For a concrete use-case, take my company. My 12 core machine takes somewhere between 20 and 40 minutes to run the entire set of LLVM tests, depending on what else my machine is doing, and what build configuration I am running. As we should run these tests prior to committing our changes to the LLVM main branch, this is a major bottleneck to landing any changes. We have about 100 computers in the office, and we run an internal distributed build system, which allows users to distribute their compilations across many machines, so that rather than having access to just 12 cores, they have access to 800+ cores potentially, making use of idle resources and allowing for far more compilations to be done and the result to be produced faster. If the lit tests could be passed in some way to these machines (perhaps in batches), the tests would run faster, since you’d have up to 1200 tests running simultaneously, rather than 12. Of course, other companies will have other distribution systems, so any solution should be general purpose, and allow individual companies to easily configure things for their own needs.
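
As a rough illustration of the batching idea above (not a design for the project), here is a minimal Python sketch that splits a list of test names into batches, runs each batch through a stand-in worker, and merges the per-test results. The names make_batches, run_batch, run_distributed and summarize are all hypothetical, and run_batch only fakes execution locally using threads; a real solution would hand each batch to whatever distribution system is in use. lit itself is written in Python and, as far as I know, already has --num-shards and --run-shard options for running one slice of the test set, which may be a useful starting point to study.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def make_batches(tests, num_batches):
    """Split tests round-robin into at most num_batches non-empty batches."""
    batches = [tests[i::num_batches] for i in range(num_batches)]
    return [b for b in batches if b]


def run_batch(batch):
    """Hypothetical stand-in for running one batch on a remote machine.

    A real version would pass the batch to the distribution system and
    parse the results it sends back. Here a test "fails" only if its
    name contains "fail", so the sketch stays self-contained.
    """
    return {t: ("FAIL" if "fail" in t else "PASS") for t in batch}


def run_distributed(tests, num_workers):
    """Run the batches concurrently and merge per-test results into one dict."""
    results = {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for batch_result in pool.map(run_batch, make_batches(tests, num_workers)):
            results.update(batch_result)
    return results


def summarize(results):
    """Count how many tests ended in each state (PASS, FAIL, ...)."""
    return Counter(results.values())
```

Calling run_distributed(test_names, num_workers) and then summarize on the result gives a single consolidated pass/fail count, which is the part the developer ultimately cares about; the hard parts of the real project (shipping the build artifacts to other machines, handling machines that drop out, staying configurable for different distribution systems) are deliberately left out.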

A good proposal will show some evidence of an understanding of how lit runs its tests, how distribution of a large problem like this might be solved, and some evidence of research into existing distribution systems. It would also outline how the student plans to tackle the problem, with a rough breakdown of individual stages and a timeline of how long they estimate each stage to last. Evidence of any LLVM contributions the student has already made by the time the proposals have been reviewed will also go a long way towards helping an application stand out.

I hope that answers your questions. Feel free to ask more!

James