Is it currently possible to use Profile-Guided Optimization (PGO) when compiling an application for the WebAssembly target with Clang? In rustc I ran into this limitation (link), and I am wondering whether Clang supports this scenario. If it does, I think rustc could take some inspiration from Clang's implementation.
The motivation for PGO support on WASM is simple: to speed up WebAssembly applications.
If anyone has already tried to implement this scenario, could you please share your experience?
@davidxl, I am mentioning you here since you know a lot about PGO in Clang.
I am not aware of any PGO support for WASM. To make it work, the minimum that needs to be done is to implement the profile runtime for WASM, including implementing __llvm_profile_register_function to register profile counters (note that this is not needed in a native Linux environment, because there the counters are grouped by section and are contiguous). Profile dumping support also needs to be added.
I suspect that the instrumentation profile lowering also needs to be changed, as profile counters may need to be allocated differently.
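To make the registration step more concrete, here is a minimal sketch in C of a runtime that cannot rely on contiguous counter sections and instead has each function's counters registered explicitly at startup. All names (`prof_record`, `prof_register`, `prof_total`) are hypothetical and chosen only for illustration. This is not LLVM's runtime code and does not reflect the real signature of __llvm_profile_register_function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-function record: a name plus its own counter array.
   The counter arrays of different functions need not be adjacent in memory. */
typedef struct {
    const char *name;
    uint64_t *counters;
    size_t num_counters;
} prof_record;

#define PROF_MAX_RECORDS 64

/* Registry that replaces "walk one contiguous section" with an explicit list. */
static prof_record *prof_records[PROF_MAX_RECORDS];
static size_t prof_num_records = 0;

/* Meant to be called once per instrumented function, e.g. at startup.
   Returns 0 on success, -1 if the fixed-size table is full. */
int prof_register(prof_record *rec) {
    assert(rec != NULL && rec->counters != NULL);
    if (prof_num_records == PROF_MAX_RECORDS)
        return -1;
    prof_records[prof_num_records++] = rec;
    return 0;
}

/* A dumping routine would walk the registry in the same way; this sketch
   only sums the counters instead of writing a profile file. */
uint64_t prof_total(void) {
    uint64_t total = 0;
    for (size_t i = 0; i < prof_num_records; i++)
        for (size_t j = 0; j < prof_records[i]->num_counters; j++)
            total += prof_records[i]->counters[j];
    return total;
}
```

With contiguous sections, the walk in `prof_total` would be a single loop between a section's start and end addresses, which is why no registration call is needed in that case.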
On the profile-use side, there are also missing pieces. When compiling high-level source to WASM code, the inliner and the value-profile transformations can benefit from the profile. For WASM-to-machine-code JIT compilation, WASM itself needs a way to record the profile data (branch probabilities, entry counts, value profile data).
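As a small illustration of what "branch probability" means here, the sketch below derives it from two raw edge counters. The function name `branch_taken_probability` and the 0.5 fallback for unexecuted branches are assumptions made for this example, not something LLVM or any WASM proposal prescribes.

```c
#include <stdint.h>

/* Hypothetical helper: probability that a branch is taken, computed from
   the execution counts of its two outgoing edges. With no samples at all
   it falls back to 0.5, i.e. "no information". Overflow of the sum is
   ignored in this sketch. */
double branch_taken_probability(uint64_t taken, uint64_t not_taken) {
    uint64_t total = taken + not_taken;
    if (total == 0)
        return 0.5;
    return (double)taken / (double)total;
}
```

A toolchain could attach a value like this to a branch so that an engine compiling WASM to machine code can use it, without the engine ever seeing the raw counters.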
Since I didn't find an issue about PGO support for the WASM target in the LLVM issue tracker, should I create one? At least it would make it visible to other people that PGO does not currently work with the WASM target (documenting this limitation somewhere in Clang's PGO documentation would also be helpful for potential users).
By the way, are there other Clang-supported targets without instrumentation-based PGO support?
Hi @zamazan4ik,
Currently we don't have support for PGO in WebAssembly, but this is something we are interested in working on this year. Since WebAssembly is a VM architecture, this effort is larger than just enabling the LLVM PGO infrastructure; there are a couple of standards proposals in flight (see here and here) to pass various types of profile-related information from toolchains to wasm engines. Also, some kinds of optimizations don't work as well (or work differently) in wasm compared to hardware architectures (e.g. code layout optimizations, since the code ends up getting recompiled to machine code). Having said that, this doesn't mean we can't also just get LLVM's PGO support working on its own. If there are specific use cases that would benefit from this, knowing about them would help us prioritize it; and of course, if you or anyone else wanted to work on it, we would be happy to help with guidance and reviews.