Size of ISelLowering.cpp files

It seems a lot of ISelLowering.cpp files are huge:

17698 ./llvm/lib/Target/AMDGPU/SIISelLowering.cpp
22145 ./llvm/lib/Target/ARM/ARMISelLowering.cpp
24713 ./llvm/lib/Target/RISCV/RISCVISelLowering.cpp
30221 ./llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
62393 ./llvm/lib/Target/X86/X86ISelLowering.cpp
11156 ./llvm/lib/Target/SystemZ/SystemZISelLowering.cpp

I was curious if it has ever been considered to split these files into smaller chunks similar to how InstCombine files are split up? In our downstream backend, we have a similar large file and was wondering if there are any issues that might prevent such a split.

1 Like

I’m generally in favor of splitting it. Though I think it’ll eventually fall onto each target on how exactly they want to split it. For instance, in RISC-V LowerOperation and friends account for a great portion of the lines so it might be a good target to split, but I’m not sure if that’s also the case for other targets.

P.S. IIRC there was a time X86ISelLowering.cpp was so big it crashed some of the IDEs.

LOL I’d just created [META][X86] Remove unnecessary x86 code from DAG/X86ISelLowering · Issue #143088 · llvm/llvm-project · GitHub to focus work on getting X86ISelLowering.cpp down.

What I’ve found is there’s a great deal in there that doesn’t actually need to be in DAG at all, as we have better passes it should be (or already is) handled.

As well as raw LOC, file build times is also a major concern, with some targets having insane number of generated isel patterns that get included in multiple files.

3 Likes

X86 already has a partial split, X86ISelLoweringCall.cpp

Yes, and I was going to suggest that this is a good pattern to follow in the rest of the targets. The ISel handling for calls is always complicated, regardless of target, and really doesn’t need to be part of the general “select everything” API that TargetLowering otherwise implements. I think we copied it in part from GlobalISel

Thanks. Apart from splitting call lowering, is there any advantage/desire to split out other parts? For example, for X86, all the SSE/AVX ISel could be split into its own file. I am assuming each target might have such splits possible across feature sets (or some other dimension). I don’t work directly on any of the upstream backends, but I had similar questions about downstream code we have and that lead to this thread about upstream code as well.

Anyone aware of the motivation to split InstCombine into several files? I see that it was done in 2010 (for ex, split PHI node stuff out to InstCombinePHI.cpp · llvm/llvm-project@de1fede but a search of llvm dev mailing did not show any discussion about this.

AFAICT, its mostly better code organization to avoid too large files. A secondary effect may be that incremental builds are a bit faster as only one of the split-up files gets recompiled as opposed a large monolithic file. Even a full build might be faster because the split parts get compiled in parallel. And these same motivations might apply to ISel lowering file splits.

@clattner It seems the split for InstCombine happened as a series of commits here: The llvm-commits The Week Of Monday 28 December 2009 Archive by thread. Can you please share if it was mostly to avoid too large .cpp files or something more? Thanks

A large portion of the SSE/AVX code on X86 is in static functions called by LowerOperation and PerformDAGCombine. In order to move them to another file, you either need to move all of LowerOperation/PerformDAGCombine or make those functions non-static class members. Making them non-static member functions requires a bit more work than just moving the functions which is largely why we didn’t already do it.

So would it make sense to try to split all of LowerOperation and/or all of PerformDAGCombine out into separate files, instead of splitting by AVX/SSE/other features?

Incidentally I find it strange that the file name X86ISelLowering does not match the class name X86TargetLowering.