The Problem: llvm-opt-viewer is currently a passive log dump. It interleaves optimization remarks with source code but fails to aggregate data or provide actionable insights. Developers must manually grep for missed optimizations, which is workable for small codebases but leaves the tool far behind proprietary offerings like Intel Advisor or AMD AOCC reports on large codebases.
I propose replacing the static HTML generator with llvm-opt-workbench: a React-based SPA (Single Page Application) that acts as an active performance analysis tool. This project will implement features derived from industry-standard compilers (ICC, NVCC, AOCC) to democratize performance tuning for LLVM users.
Below is a mockup of the "Loop Heatmap", "Code Insights", and "Logic Diff" views.
1. Loop-Centric Analysis (The Intel View)
Current viewers are file-centric. Implement a Loop Heatmap that groups remarks by Loop ID (from metadata).
Feature: Aggregate stats per loop: Trip Count, Vector Width, and Interleave Count.
Differentiation: Allows sorting by "Compute Intensity" vs. "Memory Bound" (derived from Analysis remarks), moving beyond simple file scrolling.
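To make the grouping step concrete, here is a minimal Python sketch of the per-loop aggregation, assuming the YAML remarks have already been parsed into dicts. Since remark records carry no explicit loop ID field, the (file, line) pair of the DebugLoc is used as a stand-in key here; the `VectorizationFactor` and `InterleaveCount` arg names follow the shape loop-vectorize remarks emit, but the grouping key itself is an assumption of this sketch:

```python
from collections import defaultdict

def aggregate_loops(remarks):
    """Group vectorizer remarks by DebugLoc (a stand-in for a loop ID)
    and collect per-loop stats from their Args entries."""
    loops = defaultdict(lambda: {"width": None, "interleave": None, "remarks": 0})
    for r in remarks:
        loc = r.get("DebugLoc", {})
        key = (loc.get("File"), loc.get("Line"))  # hypothetical loop key
        stats = loops[key]
        stats["remarks"] += 1
        # Args is a list of single-key dicts in LLVM's remark YAML.
        for arg in r.get("Args", []):
            if "VectorizationFactor" in arg:
                stats["width"] = int(arg["VectorizationFactor"])
            if "InterleaveCount" in arg:
                stats["interleave"] = int(arg["InterleaveCount"])
    return dict(loops)

# Two remarks attributed to the same source line collapse into one row:
remarks = [
    {"Pass": "loop-vectorize", "DebugLoc": {"File": "a.c", "Line": 12},
     "Args": [{"VectorizationFactor": "4"}, {"InterleaveCount": "2"}]},
    {"Pass": "loop-vectorize", "DebugLoc": {"File": "a.c", "Line": 12},
     "Args": [{"String": "vectorized loop"}]},
]
print(aggregate_loops(remarks)[("a.c", 12)])
# → {'width': 4, 'interleave': 2, 'remarks': 2}
```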
2. Automated Bottleneck Detection (The AOCC View)
Build a heuristics engine that parses the raw YAML Args field to flag specific hardware bottlenecks:
Memory Divergence: Flag "Gather/Scatter" operations (indicative of non-contiguous memory access).
Register Pressure: Highlight remarks indicating "Spills" or excessive stack usage (critical for GPU/embedded targets).
Actionability: Map failure codes to specific pragmas (e.g., dep-distance failure → suggest #pragma clang loop distribute(enable)).
Inlining Misses: Filter by cost > threshold failures and suggest __attribute__((always_inline)) for high-hotness functions.
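As a rough illustration of the heuristics engine, the sketch below scans one remark's flattened Args values for trigger substrings. The trigger words and the suggestions attached to them are placeholders for this sketch, not the final rule set, and the sample remark shapes are hypothetical:

```python
def flag_bottlenecks(remark):
    """Toy heuristics pass over one parsed remark dict: join every Args
    value into a lowercase string and match illustrative triggers."""
    text = " ".join(
        str(v) for arg in remark.get("Args", []) for v in arg.values()
    ).lower()
    findings = []
    if "gather" in text or "scatter" in text:
        findings.append(("memory-divergence",
                         "non-contiguous access; consider restructuring data layout"))
    if "spill" in text:
        findings.append(("register-pressure",
                         "consider smaller unroll factors or fewer live values"))
    if remark.get("Pass") == "inline" and "cost" in text:
        findings.append(("inlining-miss",
                         "consider __attribute__((always_inline)) if the function is hot"))
    return findings

# Hypothetical missed-vectorization remark mentioning gather/scatter:
miss = {"Pass": "loop-vectorize", "Name": "MissedDetails",
        "Args": [{"String": "the cost-model indicates that gather/scatter is needed"}]}
print([tag for tag, _ in flag_bottlenecks(miss)])  # → ['memory-divergence']
```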
3. Logic-Aware Diffing
Extend opt-diff to support Function-Symbol Diffing. Instead of line-number comparisons, which break easily, map the remarks to function symbols. This enables accurate "Before/After" performance regression tracking across commits, even when file line numbers shift.
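A minimal sketch of the symbol-keyed diff, assuming parsed remark dicts with Function, Pass, and Name fields (as in the YAML records). Multiple distinct remarks at the same symbol would need further disambiguation, which this sketch glosses over:

```python
def remark_key(r):
    # Key on function symbol + pass + remark name, deliberately ignoring
    # DebugLoc line numbers so shifted lines don't look like regressions.
    return (r.get("Function"), r.get("Pass"), r.get("Name"))

def symbol_diff(before, after):
    """Return remark keys that appeared or disappeared between two builds."""
    b = {remark_key(r) for r in before}
    a = {remark_key(r) for r in after}
    return {"gained": a - b, "lost": b - a}

before = [{"Function": "foo", "Pass": "loop-vectorize", "Name": "MissedDetails"}]
after  = [{"Function": "foo", "Pass": "loop-vectorize", "Name": "Vectorized"}]
d = symbol_diff(before, after)
print(d["gained"])  # → {('foo', 'loop-vectorize', 'Vectorized')}
```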
4. Performance Architecture
Backend: C++-based parser (replacing the slow Python YAML parser) to generate a single compressed JSON blob.
Frontend: React + Virtualized Lists to handle multi-GB reports without DOM lag.
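One way to read the "single compressed JSON blob" idea is a string-interned relational layout; here is a hypothetical Python sketch of the shape (the real backend would be C++, and the field names are illustrative):

```python
def dedup(remarks):
    """Intern repeated strings (pass names, file paths) into one table and
    store rows as indices into it, shrinking the blob before compression."""
    strings, index = [], {}

    def intern(s):
        if s not in index:
            index[s] = len(strings)
            strings.append(s)
        return index[s]

    rows = [
        {"pass": intern(r["Pass"]),
         "file": intern(r["DebugLoc"]["File"]),
         "line": r["DebugLoc"]["Line"]}
        for r in remarks
    ]
    return {"strings": strings, "rows": rows}

remarks = [
    {"Pass": "loop-vectorize", "DebugLoc": {"File": "a.c", "Line": 12}},
    {"Pass": "loop-vectorize", "DebugLoc": {"File": "a.c", "Line": 40}},
]
blob = dedup(remarks)
print(blob["strings"])  # → ['loop-vectorize', 'a.c']
```

Repeated paths and pass names dominate large remark files, so interning them once pays off before any general-purpose compression is applied.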
I request feedback on the heuristic engine's scope for this initiative. @anemet @OfekShilon
I need to explicitly note that no projects have been proposed for LLVM yet, so you'd need to wait until the student proposal period opens to see which projects are available (assuming LLVM participates this year).
Strictly speaking that's not accurate, as the opt-viewer/OptView2 scripts produce an index HTML page which lists all remarks. OptView2 additionally enables filtering the collected remarks, which was very useful to me.
Sounds like a great potential addition to the llvm ecosystem.
If the need to prioritize arises (which it inevitably does), I'd submit that relatively few projects are able to use PGO builds - so heatmap features might prove less valuable than others.
Sorry for the late reply, and kudos! Hope this project materializes.
You are right regarding OptView2, it's actually a big inspiration here. I phrased the grep part poorly, I meant to highlight that while static HTML reports (like OptView2's) are a huge step up from raw logs, they still lack the interactivity of a proper debugging workflow. My goal is to enable things like instant filtering and logic-aware diffing without needing to regenerate static files for every change.
Regarding PGO, that is a very fair point. I realize that relying solely on Block Frequency would limit the heatmap's utility for non-PGO builds.
Beyond the PGO constraints, are there specific architectural limitations or friction points you encounter with the current opt-viewer scripts, or any improvement areas you believe should be prioritized?
I see you mention that loading all the data into the DOM will run out of memory. Is there a point where just loading the remark data into JS objects would also OOM the tab, and is that amount of data realistically achievable for a big C++ project?
One project in a similar, but not quite the same, direction (which hasn't really gotten anywhere near the polish required for a production tool) would be: Clang Optimization Viewer - Visual Studio Marketplace
I was somewhat involved in the development of this plugin, and one of the issues we ran into was YAML parsing being incredibly slow (which you note as well). Instead of a full YAML parser we hacked together a parser for the particular subset of YAML used by the optimization record, which was faster. Nowadays @tobias-stadler has added support for emitting the optimization record in the bitstream format however, which should parse even faster. You can see him present his work here, if you haven't already: https://www.youtube.com/watch?v=i7O62-2qxpU
YAML is still a reasonable starting point to get up and running with a working prototype quickly, of course.
I think your roadmap is ambitious: imho (without having worked on anything optimisation related in quite a while, so take that for what it's worth) this would be a successful GSoC project even if you ended up only having time to finish phase 1. Phase 2 is a very welcome addition as well though!
On another note: do you have mentors confirmed for this project? 2 are required IIRC. I think this is a great initiative, and it would be a shame if it ended up not happening because of a lack of mentors.
Thank you for the detailed feedback. To answer your question: for massive codebases (like compiling LLVM with full LTO), the sheer volume of remarks is enough to hit the V8 heap limit. The deduplicated relational JSON structure pushes that limit back, but there's still a physical ceiling, so for extremely large codebases the backend serves the data in chunks and the frontend only holds what it needs.
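As a sketch of the chunked-serving idea, assuming a sorted, deduplicated remark list on the backend (the endpoint shape and window size are hypothetical):

```python
def serve_chunk(remarks, offset, limit=500):
    """Hypothetical paging endpoint logic: the frontend requests windows on
    demand instead of ever materializing the full remark list in the tab."""
    window = remarks[offset:offset + limit]
    return {"total": len(remarks), "offset": offset, "items": window}

page = serve_chunk(list(range(1200)), offset=500, limit=500)
print(page["total"], len(page["items"]), page["items"][0])  # → 1200 500 500
```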
I think using the prototype Python bindings for libRemarks to read bitstream natively makes sense. Iâll update the proposal to make that the plan for the backend.
I also really appreciate the reality check on the timeline. I will adjust the proposal to clearly box Phase 1 as the core GSoC deliverable, and frame Phase 2 as stretch goals.
Since there aren't any confirmed mentors yet for this project, I'd really appreciate it if you could tag mentors who'd be interested in this project.
For this to move forward as an official GSoC project, we need to lock in at least two mentors. I would be really grateful if anyone interested in the compiler diagnostics side could step in to mentor this.
Tagging all the relevant people here: @hnrklssn @tobias-stadler @OfekShilon.
Since time is of the essence right now, please let me know if we can make this happen so I can get the final submission sorted on my end.
Thank you all again for all the support and guidance so far!
It's come to my attention that there was a project last year that did something somewhat similar: Google Summer of Code
It hasn't been merged yet, but I think it would be good if we could build upon that tool if possible. Could you take a look, and see if you could incorporate that in your proposal?
Hi @kamini08, thanks for sharing this, the proposal looks quite interesting.
For context, last year's LLVM Advisor work was designed as a lightweight, offline-first tool for offloading workflows, so the implementation intentionally avoids a heavy dependency stack. The current prototype already has two main pieces: a compiler wrapper to collect artifacts/remarks during the build, and a local visualization tool served with a minimal Python server plus HTML/CSS/JavaScript.
So this could likely be extended rather than replaced. Some of the directions we had already considered were better support for large projects, cross translation-unit dependencies, and richer aggregation/analysis on top of the collected data.
Your proposal seems compatible with that direction, especially if it can build on the existing collection pipeline and visualization infrastructure.
Yes @androm3da, I think it could be a good direction.
llvm-advisor is intended as a unified infrastructure to collect and visualize compilation data, so llvm-mca could make sense as an additional analysis layer, especially to complement optimization remarks with lower-level performance information.
I would probably see that as a follow-up extension, though, rather than part of the initial core, since llvm-mca is a somewhat different level of analysis and we should keep the first scope manageable.
I've rewritten the proposal to build directly upon the existing llvm-advisor infrastructure. The focus is now entirely on scaling the backend for massive LTO workloads and injecting active diagnostics into the existing Code Explorer UI, maintaining the lightweight, offline-first policy for the tool.
I'd like to get feedback on whether this updated architecture aligns better with the tool's vision, and whether there are any other improvements I should focus on.
I am still actively looking for mentors for this project. If any of you would be open to mentoring me, please let me know. @hnrklssn @miguelcsx @jdoerfert @kevinsala
Just a quick follow-up on this, as the GSoC submission deadline is coming up in under 48 hours.
I plan to submit the final proposal tomorrow and am still actively looking for a mentor if the updated scope looks good.