Some questions about profile-guided function ordering via temporal profiling, such as the binary size regression

Recently, I updated our ld64 linker implementation to use the upstream BalancedPartitioning algorithm and re-ran the test.

Here is the result:

Some explanation:

  1. Reordered functions: 2.2M cold functions (those whose total call count is 0 in my local profile)
  2. llvm-bpc: our ld64 using the LLVM BalancedPartitioning code
  3. ld-bpc: uses our own bpc algorithm
  4. Symbol count / branch island count: taken from the ld64 linkmap file so that the numbers are accurate

From the result, nearly all of the binary size increase comes from the additional branch islands.

Tip: I also used a diff tool to compare the base linkmap with the reordered one; the symbols that differ are all branch island functions.
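For anyone who wants to repeat the linkmap comparison without a generic diff tool, here is a minimal Python sketch. It rests on two assumptions that are not confirmed above: that symbol rows in the linkmap have the form `address<TAB>size<TAB>[file index] name`, and that branch island symbols carry `.island` in their names. The function names (`summarize_symbols`, `diff_linkmaps`) are made up for this illustration.

```python
import re
from collections import Counter

# Assumption: a symbol row looks like
#   0x100004000<TAB>0x00000010<TAB>[  3] _name
SYMBOL_ROW = re.compile(r"^0x[0-9A-Fa-f]+\s+0x([0-9A-Fa-f]+)\s+\[\s*\d+\]\s+(.+)$")


def is_branch_island(name):
    # Assumption: island symbols contain ".island" in their name.
    return ".island" in name


def summarize_symbols(linkmap_lines):
    """Return (name -> occurrence count, name -> total bytes) for the symbols table."""
    counts, sizes = Counter(), Counter()
    in_symbols = False
    for raw in linkmap_lines:
        line = raw.rstrip("\n")
        if line.startswith("# Symbols:"):
            in_symbols = True
            continue
        if line.startswith("# Dead Stripped Symbols:"):
            in_symbols = False
            continue
        if not in_symbols:
            continue
        match = SYMBOL_ROW.match(line)
        if match:
            name = match.group(2)
            counts[name] += 1
            sizes[name] += int(match.group(1), 16)
    return counts, sizes


def diff_linkmaps(base_lines, new_lines):
    """Count symbols added in the new linkmap, split into islands and everything else."""
    base_counts, base_sizes = summarize_symbols(base_lines)
    new_counts, new_sizes = summarize_symbols(new_lines)
    added = new_counts - base_counts  # Counter subtraction keeps positive counts only
    island_added = sum(n for name, n in added.items() if is_branch_island(name))
    island_bytes = (
        sum(s for name, s in new_sizes.items() if is_branch_island(name))
        - sum(s for name, s in base_sizes.items() if is_branch_island(name))
    )
    return {
        "island_symbols_added": island_added,
        "other_symbols_added": sum(added.values()) - island_added,
        "island_bytes_added": island_bytes,
    }
```

Feeding it the lines of the base linkmap and the reordered linkmap (for example via `open(path).readlines()`) should show whether `other_symbols_added` is really zero and how many bytes the new islands account for.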

Section comparison

Following your suggestion, I also used objdump -h to compare the binaries themselves. Here is the result. Ignore the VM address changes; only the sizes matter.
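The same per-section comparison can be scripted. The sketch below assumes the `objdump -h` output has data rows of the form `index name size ...` with the size in hexadecimal, and it ignores every other column (including the VM address). The helper names are hypothetical.

```python
def parse_section_sizes(objdump_text):
    """Map section name -> size in bytes from `objdump -h` style output."""
    sizes = {}
    for line in objdump_text.splitlines():
        fields = line.split()
        # Data rows start with a decimal section index; header rows do not.
        if len(fields) >= 3 and fields[0].isdigit():
            try:
                size = int(fields[2], 16)
            except ValueError:
                continue
            # The same section name can appear in more than one segment, so accumulate.
            sizes[fields[1]] = sizes.get(fields[1], 0) + size
    return sizes


def size_deltas(base_text, new_text):
    """Return ({section: new - base} for sections that changed, total delta in bytes)."""
    base = parse_section_sizes(base_text)
    new = parse_section_sizes(new_text)
    deltas = {}
    for name in sorted(set(base) | set(new)):
        delta = new.get(name, 0) - base.get(name, 0)
        if delta != 0:
            deltas[name] = delta
    return deltas, sum(deltas.values())
```

Running `size_deltas` on the saved `objdump -h` output of the base and reordered binaries gives the per-section growth and its total, which can then be set against the file size difference.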

Another question: the total Mach-O section size difference is 1.71MB, but the binary size difference is 2.05MB, so about 340KB is unaccounted for and I don't know where it comes from. Is there a recommended practice for comparing the size layout of Mach-O binaries?
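One possible way to look for the missing bytes is to compare, per segment, the size the segment occupies in the file with the sum of its section sizes. A segment without sections, such as `__LINKEDIT`, and any padding inside a segment do not show up in a per-section listing. Whether that explains the ~340KB here is not confirmed; this is only a sketch. It assumes a thin (non-fat), little-endian, 64-bit Mach-O file, and `segment_layout` is a hypothetical helper name.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
LC_SEGMENT_64 = 0x19
# Section types that occupy no bytes in the file:
# S_ZEROFILL, S_GB_ZEROFILL, S_THREAD_LOCAL_ZEROFILL
ZEROFILL_TYPES = {0x1, 0xC, 0x12}


def segment_layout(data):
    """Return [(segment name, file size, bytes covered by file-backed sections)]."""
    magic, _, _, _, ncmds, _, _, _ = struct.unpack_from("<8I", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("expected a thin little-endian 64-bit Mach-O")
    offset = 32  # sizeof(mach_header_64)
    rows = []
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", data, offset)
        if cmd == LC_SEGMENT_64:
            # segment_command_64 after cmd/cmdsize: segname[16], vmaddr, vmsize,
            # fileoff, filesize, maxprot, initprot, nsects, flags
            fields = struct.unpack_from("<16s4Q4I", data, offset + 8)
            segname = fields[0].rstrip(b"\0").decode()
            filesize, nsects = fields[4], fields[7]
            covered = 0
            for i in range(nsects):
                sect = offset + 72 + 80 * i  # section_64 records are 80 bytes each
                (size,) = struct.unpack_from("<Q", data, sect + 40)
                (flags,) = struct.unpack_from("<I", data, sect + 64)
                if (flags & 0xFF) not in ZEROFILL_TYPES:
                    covered += size
            rows.append((segname, filesize, covered))
        offset += cmdsize
    return rows
```

For each row, `filesize - covered` is the part of the segment that a per-section comparison cannot see. Running this on both binaries and diffing the rows would show which segment the extra bytes belong to.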