Utility to generate elaborated assembly/IR tests

Debug-info-related directories contain many Clang-generated assembly files and LLVM IR files (and some precanned relocatable files which can/should be replaced). In general, the assembly/IR files benefit from being cleaned up to remove unnecessary details. However, for tests requiring elaborate assembly/IR files where cleanup is less practical (e.g., a large amount of debug information output from Clang), the current practice is to include the C/C++ source file and the generation instructions as comments, e.g.

; Input generated from the following C code using
; clang -g -O2 -S -emit-llvm

; int e(int);
; inline int f(int a) { return e(a); }
; int g(int a) { return f(a); }
; 
; namespace n {
; int v;
; }

This is inconvenient when regeneration is needed. (The author who introduced the test might have performed non-trivial clean-up steps.) I propose a utility script, llvm/utils/update_test_body.py, to make regeneration easier ([utils] Add script to generate elaborated IR and assembly tests, llvm/llvm-project Pull Request #89026).

## For a test with a single assembly file, .endif can be used.
## Other tests need split-file.

    # RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o a.o
    # RUN: ... | FileCheck %s
    # CHECK: hello
    .ifdef GEN
    #--- a.cc
    int va;
    #--- gen
    clang --target=x86_64-linux -S -g a.cc -o -
    .endif
    # content generated by the script 'gen'

The script will prepare extra files with split-file, invoke gen, and then rewrite the part after .endif with its stdout.

If you want to generate extra files, you can print #--- separators and utilize split-file in RUN lines.
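The flow described above (write out the embedded parts, invoke gen, replace everything after .endif with its stdout) can be sketched roughly as follows. This is a simplified illustration, not the actual update_test_body.py: it parses the `#---` separators itself instead of calling split-file, and the names `split_parts`, `regenerate`, and `run_gen_with_sh` are hypothetical.

```python
import os
import re
import subprocess
import tempfile


def split_parts(text):
    """Split text on '#--- name' (or ';--- name') separator lines.

    Simplified stand-in for split-file; the real tool has more rules.
    """
    parts = {}
    name = None
    for line in text.splitlines(keepends=True):
        m = re.match(r"^[#;]--- (\S+)\s*$", line)
        if m:
            name = m.group(1)
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {k: "".join(v) for k, v in parts.items()}


def run_gen_with_sh(workdir):
    # Assumption: a POSIX sh is available and 'gen' is a shell script.
    result = subprocess.run(
        ["sh", "gen"], cwd=workdir, check=True, capture_output=True, text=True
    )
    return result.stdout


def regenerate(test_text, run_gen=run_gen_with_sh):
    """Return test_text with everything after the first '.endif' line
    replaced by the output of the embedded 'gen' part."""
    head, sep, _stale = test_text.partition(".endif\n")
    if not sep:
        raise ValueError("no .endif line found")
    # Only the region between '.ifdef GEN' and '.endif' holds the parts.
    embedded = head.partition(".ifdef GEN\n")[2]
    parts = split_parts(embedded)
    if "gen" not in parts:
        raise ValueError("no 'gen' part found")
    with tempfile.TemporaryDirectory() as workdir:
        for name, body in parts.items():
            with open(os.path.join(workdir, name), "w") as f:
                f.write(body)
        output = run_gen(workdir)
    return head + sep + output
```

The `run_gen` parameter exists only so the sketch can be exercised without a shell or Clang installed; by default it runs `sh gen` in the temporary directory.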


I think this is a great idea. I have often encouraged people producing debug-info tests that originated from higher-level code to include that code as comments, because I’ve had to regenerate that kind of test. Automating this will be handy.

Because the checks are not generated, it does not carry the same risks that the other update-check scripts do.

Replacing the precanned object files is a separate issue but also worth pursuing. The LLVM Security Group just talked about this yesterday. There are a lot of these files, some of which are “broken” objects intended to exercise error-handling paths, but many probably just predate the habit of using asm or yaml to generate the objects.

Thanks for the support!

This is a good idea, but I think an important part of this project is to add logic to minimize the LLVM IR that comes out of Clang. Reviewers regularly ask people manually adding debug info tests to remove irrelevant target-specific attributes from the test to make it more readable. For source location tests, we can adopt the convention to always test with -g1 to exclude irrelevant type and variable info.

Agreed. This utility could use help from an IR cleanup script to better automate the creation and update of IR debug tests.

My immediate use case is for lld/test/ELF/debug-names*.s assembly tests I contributed in https://github.com/llvm/llvm-project/pull/86508 .

I picked .ifdef GEN ... .endif for assembly because even in the absence of this script, a user can easily regenerate a test using just split-file.

For IR tests, ideally we should have a multi-line comment marker as well; otherwise users have to manually remove the leading ; from each line.

Thanks to @jh7370 for the review on Pull Request #89026 ([utils] Add script to generate elaborated assembly tests).
I hope that there will be a second pair of eyes reviewing the patch :)

Can I get another person to examine the patch? Thanks.

Thanks to ayermolo and dwblaikie, who have reviewed the patch (Pull Request #89026, [utils] Add script to generate elaborated IR and assembly tests) as well.

The documentation will become available at https://llvm.org/docs/TestingGuide.html#elaborated-tests after the patch lands.