clang-rename performance oddness

Hi,

[ Resent this time without attaching a screenshot. ]

I've recently tried to use clang-rename on a larger codebase
(LibreOffice's core.git repo [1]), and while the tool works correctly, it's
surprisingly slow.

I've tried 3 actions:

1) Compiling a (somewhat random selected) file,
sw/source/uibase/uiview/pview.cxx with clang++: it takes about 4 seconds
on my machine.

2) Using LibreOffice's own simple rename tool [2] to rename 11 members
of a class referenced in that translation unit: also about 4 seconds.

3) Doing the same with clang-rename: 76 seconds.

So I expected that clang-rename does something stupid, but after
building the LLVM codebase with -DCMAKE_BUILD_TYPE=RelWithDebInfo and
used callgrind to see where time is spent, it seems that the time is
mostly spent in clang::ParseAST().

(One more thing I did not check is if -DLLVM_ENABLE_ASSERTIONS=ON is the
source of that slowdown; but I doubt such a 19x slowdown is due to
assertions being enabled.)

I see no obvious silly in any of its callers, like calling it again and
again inside a for() cycle or something like that. Any idea what might
cause the 4 vs 76 seconds?

I guess the problem is not specific to the above source file, but in
case it's interesting, happy to write up and share precise, reproducible
steps to trigger the problem.

I'm not sure what's a good way to share a callgrind profile, but I've
uploaded a kcachegrind screenshot[3]; what is a bit surprising to me is
that the time (it seems) is spent in a single clang function that's
called only once by clang-rename, and I guess the same function is also
called by clang++.

Any ideas how to debug such a problem?

Thanks,

Miklos

[1] git clone git://anongit.freedesktop.org/libreoffice/core
[2] https://cgit.freedesktop.org/libreoffice/contrib/dev-tools/tree/clang/rename.cxx
[3] https://people.freedesktop.org/~vmiklos/2016/Screenshot_20160816_215756.png

I doubt that most of your time is being spent in ParseAST() itself. ParseAST has lots and lots of callbacks into code that does the "real" work. I would expect most of the time spent to be somewhere under ParseAST.

If you are on Linux, I would recommend using 'perf' to collect your profile, and FlameGraph (GitHub - brendangregg/FlameGraph: Stack trace visualizer) to visualize it. Make sure you build with -fno-omit-frame-pointer, as otherwise perf will likely not have useful information.

Having assertions on can cause more overhead than you think. When I last did profiling, I noticed that the allocation behavior changed a lot with assertions turned on. The clang / llvm allocators would start poisoning and verifying empty memory during alloc / free.

Hi Miklos,

I had similar discussion with Manuel couple weeks ago.

This is what I have:

Debug build

kbobyrev@kbobyrev:~$ time ~/Documents/dev/build/Debug/llvm/bin/clang-rename -offset=9298 -new-name=Options -i ~/Documents/dev/src/llvm/tools/clang/tools/extra/clang-rename/tool/ClangRename.cpp

real 1m11.103s
user 1m10.534s
sys 0m0.548s

RelWithDebInfo build

kbobyrev@kbobyrev:~$ time ~/Documents/dev/build/RelWithDebInfo/llvm/bin/clang-rename -offset=9298 -new-name=Options -i ~/Documents/dev/src/llvm/tools/clang/tools/extra/clang-rename/tool/ClangRename.cpp

real 0m7.830s
user 0m5.476s
sys 0m2.318

Release build

kbobyrev@kbobyrev:~$ time ~/Documents/dev/build/Release/llvm/bin/clang-rename -offset=9298 -new-name=Options -i ~/Documents/dev/src/llvm/tools/clang/tools/extra/clang-rename/tool/ClangRename.cpp

real 0m4.553s
user 0m4.364s
sys 0m0.184s

What Manuel pointed out, unless you are building in Release mode Clang AST Parser is really really slow. Thus, we only saw performance problems in AST parsing, clang-rename routine seemed OK for me at that point.

Are you sure you have 60+ seconds in RelWithDebInfo mode, not in Debug mode? If you are, please send me some additional info about how to reproduce the problem.

Hi,

I doubt that most of your time is being spent in ParseAST() itself.
ParseAST has lots and lots of callbacks into code that does the "real" work.
I would expect most of the time spent to be somewhere under ParseAST.

Yes, I guess so; though I could not identify a single hotspot in the
called methods.

If you are on Linux, I would recommend using 'perf' to collect your profile,
and FlameGraph (GitHub - brendangregg/FlameGraph: Stack trace visualizer) to visualize it.
Make sure you build with -fno-omit-frame-pointer, as otherwise perf will
likely not have useful information.

Ah great, that allows me to share the recorded profile, unlike valgrind:

https://people.freedesktop.org/~vmiklos/2016/clang-rename.svg

Having assertions on can cause more overhead than you think. When I last
did profiling, I noticed that the allocation behavior changed a lot with
assertions turned on. The clang / llvm allocators would start poisoning and
verifying empty memory during alloc / free.

Hmm, indeed. Disabling assertions helps a bit, now clang-rename is down
from ~75 secs to 60 secs. But still a large ~60 vs 5 seconds gap.

Regards,

Miklos

Hi Kirill,

The profile you shared is showing 12% of the samples coming from the swapper. That suggests that something (likely Firefox and/or plasmashell) are taking up a lot of memory and causing general performance problems.

During the run of clang-rename, you may want to look at top and see how much memory is being used by clang-rename (as well as other processes). Maybe that's just an artifact of the perf run though. I'm not used to seeing a flame graph with such deep stacks.

In looking at the top level code, I do see one thing that I don't particularly care for...

Each symbol to be renamed causes a new frontend action to be taken. That eventually means a full re-parsing of the AST. I would hope / expect that I could gain significant performance by passing multiple renamings in one clang-rename invocation, but that doesn't look to be the case.

So the problem isn't that ParseAST() is slow, it's that it's being called too many times.

Hi Ben,