Hey folks, some updates:
First, big thanks to @MaskRay @RKSimon and @lenary for reviewing most of the commits; that’s greatly appreciated.
The following graph shows the number of lines output by the preprocessor when building libLLVM.so.
On the x axis: the number of preprocessor output lines; on the y axis: commits, starting with 07486395d2d05c9c567994456774cafdcc1611d0.
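For anyone who wants to reproduce this kind of measurement, here is a minimal sketch. It is not the script used for the graph; it assumes a `compile_commands.json` compilation database (as CMake can export) and a GCC/Clang-style driver where `-E` prints the preprocessed source on stdout. The helper names `to_preprocess_command` and `count_preprocessed_lines` are made up for illustration, and the count includes whatever linemarkers the compiler emits.

```python
import json
import shlex
import subprocess


def to_preprocess_command(args):
    """Turn a compile command into a preprocess-only one: drop -c and -o <file>, add -E."""
    out = []
    skip_next = False
    for arg in args:
        if skip_next:
            skip_next = False
            continue
        if arg == "-o":
            skip_next = True  # also drop the output file name that follows
            continue
        if arg == "-c":
            continue
        out.append(arg)
    return out + ["-E"]


def count_preprocessed_lines(compile_commands_path):
    """Sum the preprocessor output line counts over every entry of the database."""
    with open(compile_commands_path) as f:
        entries = json.load(f)
    total = 0
    for entry in entries:
        # An entry carries either an "arguments" list or a "command" string.
        args = entry.get("arguments") or shlex.split(entry["command"])
        result = subprocess.run(
            to_preprocess_command(args),
            cwd=entry["directory"],
            capture_output=True,
            check=True,
        )
        total += result.stdout.count(b"\n")
    return total
```

Running `count_preprocessed_lines` on the same build directory at two different commits and subtracting the totals gives a per-commit delta like the ones listed below. The sketch only handles `-o` as a separate argument, which is an assumption about how the build system writes its commands.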
Since the big drops and big increases are the most interesting, here is a dump of the major changes, with the drops first (largest first), followed by the increases:
- Cleanup LLVMMC headers · llvm/llvm-project@ef736a1 · GitHub -2772831
- Cleanup codegen includes · llvm/llvm-project@7f230fe · GitHub -1297142
- Cleanup LLVMObject headers · llvm/llvm-project@e72c195 · GitHub -955336
- https://github.com/llvm/llvm-project/commit/ed98c1b37661b0795a5e34517941485f0f0688d1 -850610
- https://github.com/llvm/llvm-project/commit/fbbc41f8dd233f1f0655852ca3e9634b1bb90cf0 -638828
- https://github.com/llvm/llvm-project/commit/71c3a5519dbcd609fb64560ac7fdfe8db149b905 -382703
- https://github.com/llvm/llvm-project/commit/a5bbc6ef99bbc7fcf321326df2889e063ed77004 -319914
- https://github.com/llvm/llvm-project/commit/a494ae43bef09c8d0f6a6a98e92f3a89758247d5 -314478
- https://github.com/llvm/llvm-project/commit/59630917d6cc7c4a273f617f92bf6190ee2992e1 -277671
- https://github.com/llvm/llvm-project/commit/fc97efa40962978b9c92688ea2f2236386aec945 -235210
- https://github.com/llvm/llvm-project/commit/290e482342826ee4c65bd6d2aece25736d3f0c7b -185009
- https://github.com/llvm/llvm-project/commit/19bdf44d850884a13a8708ccf1260fb7f4ef4eb3 -179617
- https://github.com/llvm/llvm-project/commit/85355a560a33897453df2ef959e255ee725eebce -179354
- https://github.com/llvm/llvm-project/commit/0f73fb18ca333e38cdb9ffa701a8db026c56041d -179352
- https://github.com/llvm/llvm-project/commit/b380a31de084a540cfa38b72e609b25ea0569bb7 -151791
- https://github.com/llvm/llvm-project/commit/c7eb84634519e6497be42f5fe323f9a04ed67127 -141980
- https://github.com/llvm/llvm-project/commit/65588a0776aedab35cac0f1e2b312c2731f1afa2 -139084
- https://github.com/llvm/llvm-project/commit/06943537d9eef401dd408a79cd70a2d2f3c084df -138676
- https://github.com/llvm/llvm-project/commit/8de8731591fef829fb92549883e24ffa4ad381d7 -123806
- https://github.com/llvm/llvm-project/commit/eb4c8608115c1c9af0fc8cb5b1e9f2bc960014ef -104052
- https://github.com/llvm/llvm-project/commit/0a4184909a8c4861142acec0f59a4a3373f39b09 +179620
- https://github.com/llvm/llvm-project/commit/21bce9007ae818f95863dca928c1488d982e5383 +187704
- https://github.com/llvm/llvm-project/commit/290e5722e83e9c7480d64c049a14b74e30b6af4a +138861
- https://github.com/llvm/llvm-project/commit/2aed07e96c7a4f777e854fec619842c4289f8fbd +1693782
- https://github.com/llvm/llvm-project/commit/30e612ebdfb0f243eb63d93487790a53c26ae873 +139084
- https://github.com/llvm/llvm-project/commit/43c2348c5b926df6bdbc5b70efaa35ecdefe12d5 +179352
- https://github.com/llvm/llvm-project/commit/5f62156762d45f53fa70446c718813f9f9a099e5 +121301
- https://github.com/llvm/llvm-project/commit/61835d19a848ecd3530d9b86deb6b15f336ae6d6 +227260
- https://github.com/llvm/llvm-project/commit/807ba7aace188ada83ddb4477265728e97346af1 +179617
- https://github.com/llvm/llvm-project/commit/8bcbfb50e8ea24998f9adf2f50b1f63b499299ed +123806
- https://github.com/llvm/llvm-project/commit/a278250b0f85949d4f98e641786e5eb2b540c6b0 +1294010
- https://github.com/llvm/llvm-project/commit/bd3a1de683f80d174ea9c97000db3ec3276bc022 +151774
- https://github.com/llvm/llvm-project/commit/c31014322c0b5ae596da129cbb844fb2198b4ef4 +139054
- https://github.com/llvm/llvm-project/commit/de54e4ab78ef09b60f870e8df6f8a87e56d6bd94 +179352
- https://github.com/llvm/llvm-project/commit/f75da0c8e65cf1b09012a8b62cd7f3e9a646bbc9 +1750280
- https://github.com/llvm/llvm-project/commit/f927021410691bc2612cfb635b1d9cf9b94977e6 +151774
Nothing too surprising there: header cleanup removes lines and new features add lines.
It’s also interesting to see the scale of the changes: I’ve invested tons of hours in the process, leading to more than 6M lines of preprocessed code being removed. But libLLVM requires ~238M lines of preprocessed code to build, so that’s roughly 2.5% of the total, which is largely negligible in terms of impact on the whole code base.
I still think this is useful, even just considering the (assumed) lower coupling that results from the cleanup.
So what’s next? I still need to finish the cleanup for libLLVM, and I’m probably not going further down that slope.
I would also love to improve the IncludeCleaner step from clangd so that it outputs fewer false positives on the LLVM codebase (note to self: track those bugs somewhere…), so that we could use it in the CI. I’ve submitted a first step to be able to use that engine as a stand-alone tool here: D121593 [clangd][WIP] Provide clang-include-cleaner.
Alternatively, I wrote a Python script that runs IWYU twice, once before and once after a patch, and diffs the output to detect added dependencies. Would someone be interested in taking that script and plugging it into the CI?
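To give an idea of the before/after approach, here is a rough sketch of the diffing half. This is not the actual script; it assumes IWYU's usual text report, where each file gets a "should add these lines:" and a "should remove these lines:" section followed by "The full include-list for ...". The function names `parse_iwyu` and `new_suggestions` are hypothetical.

```python
import re

# Header of a suggestion section, e.g. "foo.cc should add these lines:".
SECTION_RE = re.compile(r"^(?P<file>.+) should (?P<kind>add|remove) these lines:$")


def parse_iwyu(report):
    """Collect (file, kind, line) suggestions from an IWYU text report."""
    suggestions = set()
    current = None  # (file, kind) of the section being read, if any
    for line in report.splitlines():
        match = SECTION_RE.match(line)
        if match:
            current = (match["file"], match["kind"])
        elif not line.strip() or line.startswith("The full include-list for"):
            current = None  # a blank line or the summary section ends the list
        elif current:
            # Drop the leading "- " of removals and any trailing "// ..." comment.
            text = line.lstrip("- ").split("//")[0].strip()
            suggestions.add((*current, text))
    return suggestions


def new_suggestions(before, after):
    """Return the suggestions present after the patch but not before it."""
    return sorted(parse_iwyu(after) - parse_iwyu(before))
```

Feeding it the IWYU output captured on the base commit and on the patched tree yields only the suggestions the patch introduced, which is what a CI check would want to flag without being drowned in pre-existing noise.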
That’s probably too long for a summary, but hey, that’s how it is.
PS: I’ve been diligently breaking the CI with the cleanup commits. I do test all projects on my setup, but not in a multi-platform way, and neither under EXPENSIVE_CHECKS nor under NDEBUG. My future self is working on checking pre-commit CI more diligently.