My apologies for not fully understanding your point earlier. Even so, Intel’s module splitting scheme cannot be adapted to our requirements. It is designed specifically to split the host and device compilation pipelines, a use case very different from ours, so we cannot repurpose the implementation for our workloads.
A concise comparison of all candidate solutions we’ve assessed to date is provided below. Having thoroughly balanced their respective merits and limitations, we maintain that the newly proposed splitting logic serves an indispensable purpose and is not superfluous.
| Split Method | Core Splitting Mechanism | Limitations for Our Use Case |
|---|---|---|
| AMDGPU | Split based on call graph | 1. The splitting pass picks GPU kernel functions as split root nodes, which is a GPU-kernel-specific design and cannot generalize to our workloads. 2. No dedicated handling for ifunc symbols is implemented. |
| Intel | Category-based splitting | 1. Root nodes are determined by kernel invocations or the presence of the `sycl-module-id` attribute, tailored for heterogeneous host-device compilation. 2. ifunc and symbol alias scenarios are not supported. |
| Julia | Split by connected components | Fine-grained module partitioning is not achievable with this approach. |
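To illustrate the Julia row, here is a minimal sketch of connected-component partitioning over a symbol reference graph. It is not Julia's actual implementation; the function name `connected_components` and the sample symbols (`f1`..`f4`, `util`, `isolated`) are hypothetical and chosen only for illustration. It shows that once many functions reference a shared helper, they collapse into a single component that this method cannot divide further.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Group symbols into connected components of an undirected reference graph."""
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Any reference between two symbols forces them into the same partition.
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    groups = defaultdict(list)
    for n in nodes:
        groups[find(n)].append(n)
    # Largest component first, members sorted for stable output.
    return sorted((sorted(g) for g in groups.values()), key=len, reverse=True)


# Hypothetical module: four functions all call one shared helper.
nodes = ["f1", "f2", "f3", "f4", "util", "isolated"]
edges = [("f1", "util"), ("f2", "util"), ("f3", "util"), ("f4", "util")]

parts = connected_components(nodes, edges)
print(parts)
```

In this toy input, five of the six symbols end up in one partition, so the work cannot be balanced across parallel codegen threads no matter how many partitions are requested.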
Please feel free to review the analysis above and offer any comments. We also encourage you to inspect our code for further verification: "[ThinLTO][Split] Split module for parallel compilation in backend (1/N)" by mmjjpp, Pull Request #198702 in llvm/llvm-project. As all existing splitting approaches fail to fit our use case, we hope you can approve our new splitting design.
In practice, Steps 2–6 are statically hardcoded in user build scripts and are not practical to modify. A variable number of split objects would break this ThinLTO pipeline entirely. This motivates our merging step after codegen: it preserves a single consistent output artifact for Steps 3–5, so no script adjustments are needed.
If we instead split modules during compilation, full transparency to user build flows would be far harder to achieve.