RFC: Reducing metadata in LLVM tests


The LLVM test suite grows every day, so there is a constant need to reduce new tests as well as existing ones. There are tools designed to do that, e.g. [0] and [1]. When an IR (or MIR) test contains debug info, the LLVM DI metadata makes the test noticeably longer. Not all of this metadata is necessary for the test, so reviewers frequently ask patch submitters to reduce the DI metadata in a test case, recommending tricks such as those described in [2]. [2] is a small proposal for a utility tool that, once fully implemented, should save us some time both when preparing patches and when doing code reviews. Any thoughts on this?

[0] https://llvm.org/docs/CommandGuide/bugpoint.html
[1] https://blog.regehr.org/archives/2109
[2] https://github.com/djolertrk/llvm-metadataburn

Best regards,

Hi Djordje,

I think something like this would be super useful. Can you explain how it differs from the metadata reduction in bugpoint and to what degree the two share (or could share) code?

– adrian

Hi Adrian,

I’m not opposed to adding something like this to the llvm-reduce utility (ReduceMetadata); I think that would be the better approach.
@Jeremy Morse has mentioned that someone from Sony is working on something like this. I am happy with that; I just want to avoid redundant work.

Best regards,

Hi Folks,

I’m implementing DI metadata support for llvm-reduce as part of my MSc project, as @Morse, Jeremy mentioned.



Hi Nabeel,

Good luck with that! We are looking forward to seeing this implemented!


My main concern would be that heavily auto-reduced LLVM debug info IR
metadata might produce "strange" examples: valid according to the
verifier, but so quirky that it's hard to know whether they're meant
to be supported or whether we should/might eventually deem them
invalid. Such examples may be hard to understand while still not being
short enough to be hand-craftable, in which case I'm not sure how much
of an improvement over the current situation they would be.

So I think that, at least for the first few automatically reduced (or
even significantly hand-reduced) test cases with debug info metadata,
we'll want to look pretty carefully at them to discuss whether they
improve or harm test case maintainability.