At least one proprietary linker put a lot of effort into deduplicating and rewriting debug information. This took up the majority of the link time despite serious engineering time on performance optimisation. For example, some sections were written from scratch by the linker because that proved faster than parsing the input. Teaching LLD to dedup DWARF should be expected to dramatically slow it down (when enabled, ideally not when disabled).
Is a more incremental approach viable? In particular, are there IR passes that fold debug strings etc that could be deployed before feeding everything into a linker?