We are using ML-guided compiler optimizations (“MLGO”) for register allocation eviction and inlining for size in real-life deployments. The ML models are trained with reinforcement learning algorithms. Expanding to more performance areas is currently impeded by the poor prediction quality of our performance estimation models. Improving those models is critical to the effectiveness of the reinforcement learning training algorithms, and therefore to applying MLGO systematically to more optimizations.
Expected outcomes: better modeling of the execution environment by including additional runtime/profiling information, such as additional PMU data, LLC miss probabilities, or branch mispredictions. This involves:

1. building a data collection pipeline that covers the additional runtime information (a sketch of this step follows below),
2. modifying the ML models to allow processing this data, and
3. modifying the training and inference process for the models to make use of this data.
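For step (1), here is a minimal sketch, assuming a Linux host with `perf` available, of how additional PMU counters could be collected for a benchmark binary. This is not the actual MLGO pipeline; the event list, the `collect_pmu_features` helper, and the feature layout are illustrative assumptions.

```python
import subprocess

# Hypothetical set of extra PMU events to collect (illustrative, not MLGO's).
PMU_EVENTS = ["LLC-load-misses", "branch-misses", "instructions", "cycles"]

def collect_pmu_features(binary, args=()):
    """Run `binary` under `perf stat` and return its raw counter values."""
    cmd = ["perf", "stat", "-x", ",", "-e", ",".join(PMU_EVENTS), binary, *args]
    # perf writes its CSV-formatted counters to stderr.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    features = {}
    for line in result.stderr.splitlines():
        fields = line.split(",")
        # CSV layout: value, unit, event name, ...
        if len(fields) >= 3 and fields[2] in PMU_EVENTS:
            value = fields[0]
            # Values like "<not counted>"/"<not supported>" become None.
            features[fields[2]] = None if value.startswith("<") else int(value)
    return features

if __name__ == "__main__":
    print(collect_pmu_features("/bin/ls", ("-l",)))
```

The raw counts would presumably be normalized (e.g., LLC misses per thousand instructions) before being attached to the training corpus.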
Today, the models rely on almost pure static analysis: they see the instructions, but they make one-size-fits-all assumptions about the execution environment and the runtime behavior of the code. The goal of this project is to move from static analysis toward more dynamic models that better represent code the way it actually executes.
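As a rough illustration of steps (2) and (3), the models could be widened to accept the dynamic counters alongside the existing static features. The following TensorFlow sketch is an assumption-laden toy, not MLGO's actual model architecture; the layer sizes, input names, and action count are made up.

```python
import tensorflow as tf

def build_policy(num_static_features, num_runtime_features, num_actions):
    # Static, instruction-derived features (what the models already see).
    static_in = tf.keras.Input(shape=(num_static_features,), name="static")
    # New dynamic features derived from PMU counters / profiles.
    runtime_in = tf.keras.Input(shape=(num_runtime_features,), name="runtime")
    # Concatenating the two lets the policy condition its decisions on
    # observed runtime behavior instead of one-size-fits-all assumptions.
    x = tf.keras.layers.Concatenate()([static_in, runtime_in])
    x = tf.keras.layers.Dense(64, activation="relu")(x)
    logits = tf.keras.layers.Dense(num_actions)(x)
    return tf.keras.Model(inputs=[static_in, runtime_in], outputs=logits)

policy = build_policy(num_static_features=32, num_runtime_features=4,
                      num_actions=2)
policy.summary()
```

Keeping the runtime counters as a separate named input would also make it easy to zero them out or substitute defaults at inference time when no profile is available.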
I came across this project and was instantly drawn to it. I would like to participate. Could you please add me to the Slack channel?
I am working as a research engineer at an AV startup, which has given me extensive ML experience, especially with model optimization, deployment, and benchmarking in C++. I have worked a lot with Nvidia Jetson SoCs and am aiming to go deeper into the field of optimization by learning about compilers.
I’m an undergraduate student with a strong foundational understanding of deep RL, and I have also spent the past few weeks familiarizing myself with compilers, LLVM, and the MLGO project.
Some resources I have gone through are the MLGO paper, this Google blog post, previous years’ work, and the documentation in the Git repository. I am currently working through the training phase of the inlining demo and would like to know how I can get closer to contributing.
I’m a 3rd-year undergrad in computer science and math (mostly interested in computational math and probability, not that this is especially relevant). I’m looking at GSoC for this summer and have been hoping to learn more about compilers. Having tinkered with LLVM out of curiosity and for some personal projects after my compilers course, having previously done applied ML work elsewhere, and being a fan of several projects built on LLVM, I felt this would be a great outlet and a chance to learn more about optimizations. I noticed you’ve linked some readings; I’ll check those out over the next few days as I have time.
I should probably note my programming experience, following the others. My top three languages are, maybe predictably, Python, C++, and Julia, in that order, though I’ve spent time with several others over the years (e.g., at one point I had a decent project using the now seemingly abandoned Haskell bindings llvm-haskell, but I haven’t been able to get stack to build it again without some unportable black magic since I stopped working on it at the end of last summer; I have since gotten it halfway working with the official C++ API). I’m also decently familiar with PyTorch, TF, and XGBoost, in descending order.
It’s definitely best to reach me at my school email, which I also already have a Slack account under (the primary email on my GitHub is an old personal one); I’d love to learn more about the project regardless of the outcome: firstname.lastname@example.org
Done. The best thing is to come up with a specific proposal, which is necessary anyway as part of the GSoC contributor submission process. We can help flesh it out on Slack before you submit it.
Prospective GSoC 2023 contributors: if you have proposals related to this topic or to MLGO in general and want to discuss them or get feedback before submitting, tomorrow’s monthly MLGO meeting will be dedicated to that (an “open hours” kind of thing). Details here.