[Coverage] Support a hierarchical directory structure in generated coverage html reports

Description of the project:
Clang supports source-based coverage, which shows which lines of code are covered by the executed tests [1]. It uses the llvm-profdata [2] and llvm-cov [3] tools to generate coverage reports. llvm-cov currently generates a single top-level index HTML file. For example, a single top-level code coverage report [4] for the LLVM repo is published on a coverage bot. Top-level indexing causes rendering scalability issues in large projects, such as Fuchsia [5]. The goal of this project is to generate a hierarchical directory structure in the generated coverage HTML reports that matches the source directory structure and solves these scalability issues. Chromium uses its own post-processing tools to show a per-directory hierarchical structure for coverage results [6]. Similarly, Lcov, which is a graphical front-end for Gcov [7], provides a one-level directory structure to display coverage results [8].

Resources:
[1] Source-based Code Coverage — Clang 17.0.0git documentation
[2] llvm-profdata - Profile data tool — LLVM 17.0.0git documentation
[3] llvm-cov - emit coverage information — LLVM 17.0.0git documentation
[4] https://lab.llvm.org/coverage/coverage-reports/index.html
[5] https://fuchsia.dev
[6] https://analysis.chromium.org/coverage/p/chromium
[7] Gcov (Using the GNU Compiler Collection (GCC))
[8] LCOV - llvm-toolchain.info
[9] Support per-directory index files for HTML coverage report · Issue #54711 · llvm/llvm-project · GitHub

Expected result: Implement support for a hierarchical directory structure in generated coverage HTML reports and demonstrate the feature in the LLVM repo code coverage reports.

Project size: Either medium or large.

Difficulty: Medium

Confirmed Mentor: Gulfem Savrun Yeniceri, Petr Hosek

Could someone please explain which step described in Code coverage in Chromium is the extra post-processing that Chromium does to show a per-directory hierarchical structure in the coverage results?

Directory coverage for Chromium is done here:
https://source.chromium.org/chromium/chromium/tools/build/+/main:recipes/recipe_modules/code_coverage/resources/generate_coverage_metadata.py;l=468
Similarly, directory coverage for Fuchsia is done here: https://cs.opensource.google/fuchsia/fuchsia/+/main:tools/debug/covargs/report.go;l=315
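
As a rough illustration of what such a post-processing step involves, here is a minimal Python sketch. It is not the actual Chromium or Fuchsia code; the function name `rollup_directories` and the `(covered, total)` input shape are assumptions made for the example. It rolls per-file line counts up into every ancestor directory:

```python
from collections import defaultdict
from pathlib import PurePosixPath


def rollup_directories(file_lines):
    """Sum per-file (covered, total) line counts into each ancestor directory.

    `file_lines` maps a relative source path to a (covered, total) tuple
    (a hypothetical input shape, not an llvm-cov data structure).
    Returns a dict mapping directory path ("." is the root) to [covered, total].
    """
    totals = defaultdict(lambda: [0, 0])
    for path, (covered, total) in file_lines.items():
        # For a relative path, .parents yields every ancestor, ending with ".".
        for parent in PurePosixPath(path).parents:
            totals[str(parent)][0] += covered
            totals[str(parent)][1] += total
    return dict(totals)
```

For example, calling it with `{"lib/a/x.cpp": (3, 4), "lib/b/y.cpp": (1, 6)}` gives `lib` a combined count of 4 covered lines out of 10.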

Code coverage in Chromium only explains the high-level overview of the coverage pipeline.

Thanks for the details! So currently there are no directory structure details in PROFDATA? Or are they already there, but llvm-cov has not been coded to process them?

The directory structure needs to be added to llvm-cov’s coverage rendering mechanism.
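
To make the idea concrete, here is a small Python sketch of the grouping a hierarchical renderer would need. It is only an illustration under my own assumptions, not llvm-cov code, and `build_directory_indexes` is a hypothetical name. Each directory gets its own index that lists only its immediate children, so no single page has to list every file in the project:

```python
from pathlib import PurePosixPath


def build_directory_indexes(source_paths):
    """Map each directory to the sorted names of its immediate children.

    Subdirectories are listed with a trailing "/" so that a renderer could
    link them to their own index page instead of a file report.
    """
    indexes = {}
    for path in source_paths:
        p = PurePosixPath(path)
        child = p.name  # the file itself is a child of its parent directory
        for parent in p.parents:
            indexes.setdefault(str(parent), set()).add(child)
            child = parent.name + "/"  # the next level up lists this directory
    return {d: sorted(children) for d, children in indexes.items()}
```

With this layout the root index (key `"."`) contains only top-level entries, and each subdirectory index stays proportional to the size of that directory.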

Hello, allow me to ask two questions:

  1. Do we need to preserve the ability to generate the current flat structure, or should we just replace it with the new hierarchical version?
  2. Do we need to implement the hierarchical structure for text format too?

@gulfemsavrun Hi, I finished a primitive version with minimal code changes. I think it satisfies the requirements above. Here is a small output example. It's a coverage report for one of my homework repositories. Forgive me for not having had the time to test it on a real large project. If you need a coverage report for Fuchsia, please let me know.

This patch aims to meet the need with minimal effort and surely can be improved in many ways. I am looking forward to your feedback.

@gulfemsavrun Hi, this is my draft proposal: https://yhgu2000.github.io/llvm-cov-example/gsoc.pdf. Will this be OK? Let me know what I can improve!

@yhgu2000 Thank you for your proposal. I will get back to you after reviewing it.

@yhgu2000 Your proposal looks good, and thank you for working on a small prototype to familiarize yourself with the code before submitting the proposal. Please go ahead and start the application process if you are interested.

Thank you!

Just a reminder that the GSoC contributor application deadline is April 4 - 18:00 UTC.

Thank you! I have already submitted my proposal on March 29.

I think that preserving the flat layout as an option would be useful, since for smaller projects that layout may be preferable to the hierarchical one.

It might be useful for consistency but it is not necessary.

Congratulations @yhgu2000, this project got accepted in GSoC 2023.
Welcome to the LLVM community! @petrhosek and I are very excited about this project and are looking forward to working with you!