[RFC] Exhaustive bitcode compatibility tests for IR features

From the discussion of bitcode backward compatibility on the list, it seems we lack a systematic way to test every existing IR feature. It is useful to keep a test that exercises all the IR features for the current trunk, and we can freeze that test in the form of bitcode for backward compatibility testing in the future. I am proposing to implement such a test, which should try to accomplish the following:
1. Try to keep it in one file so it is easy to freeze and move to the next version.
2. Try to exercise and verify as many features as possible, which should include all the globals, instructions, metadata and intrinsics (and more).
3. The test should be easily maintainable. It should be easy to fix when broken and easy to update when the assembly syntax changes.
I am going to implement such a test as a lengthy LLVM assembly file, in the form of the attachment (which only covers global variables so far). It is going to be long, but someone has to do it first; future updates should be much simpler. In the test, I started with a default global variable and enumerated all the possible attributes by changing them one by one. I tried to keep each variable declaration as simple as possible so that it won’t be affected by simple assembly-level changes (like changing the parsing order of some attributes), since this is supposed to be a bitcode compatibility test, not an assembly test. I tried to make the tests as thorough as possible while avoiding large duplication. For example, I will test the Linkage attribute on both GlobalVariable and Function, but probably not enumerate all the types I want to test there. I will keep the tests for Types in a separate section, since it is going to be huge and is orthogonal to the tests of globals.
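A sketch of what such an enumeration could look like (the global names and CHECK lines below are made up for illustration; they are not taken from the attachment):

```llvm
; Hypothetical excerpt: start from a default global variable and vary one
; attribute at a time (linkage here). Names are illustrative only.
@default.var = global i32 0
; CHECK: @default.var = global i32 0
@private.var = private global i32 0
; CHECK: @private.var = private global i32 0
@internal.var = internal global i32 0
; CHECK: @internal.var = internal global i32 0
@weak.var = weak global i32 0
; CHECK: @weak.var = weak global i32 0
```

Each declaration changes exactly one attribute relative to the default, so an assembly-level change to one attribute only touches the lines that exercise it.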
When making a release or some big change to the IR, we can freeze the test by generating bitcode, changing the RUN line so it runs llvm-dis directly, and modifying the CHECKs corresponding to the change. Then we can move on with a new version of the bitcode tests. This will add some more work for people who would like to make changes to the IR (which might be one more reason to discourage them from breaking compatibility). I will make sure to update the docs on changing the IR after I add this test.

Currently, there are individual bitcode tests in LLVM which are created when the IR or intrinsics get changed. This exhaustive test shouldn’t overlap with the existing ones, since it focuses on keeping a working, up-to-date version of the IR tests. Both approaches to bitcode testing can co-exist. For example, for small updates, we can add specific test cases like the current ones to test auto-upgrade, while updating the exhaustive bitcode test to incorporate the new changes. When making big upgrades and major releases, we can freeze the exhaustive test for future checks.

For the actual test cases, I think it should be trivial for globals, instructions, and types (correct me if I am wrong), but intrinsics can be very tricky. I am not sure how much compatibility is guaranteed for intrinsics, but they cannot be checked simply by running llvm-as then llvm-dis. Intrinsics, as far as I know, are encoded like normal functions, globals or metadata. My current plan is to write a separate tool that checks whether the intrinsics are actually supported by the IR or the backend. Intrinsic functions might be the easiest, since the supported ones should all be declared in Intrinsics*.td and can be checked by calling getIntrinsicID() after reading the bitcode. Intrinsics encoded as globals (llvm.used) or metadata (llvm.loop) can be trickier. Maybe another .td file with hardcoded intrinsics for these cases should be added just for testing purposes (we can add a new API for it later so that we don’t need to do string compares to recognize these intrinsics). After we have such a tool to test intrinsics (which could be merged into llvm-dis to save a RUN command and execution time), the attached test would just need to be updated like the following (checking llvm.global_ctors, for example):
; RUN: verify-intrinsics %s.bc | FileCheck -check-prefix=CHECK-INT %s

%0 = type { i32, void ()*, i8* }
@llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]
; CHECK: @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]
; CHECK-INT: @llvm.global_ctors int_global_ctors

Let me know if there is a better proposal.

Steven

llvm-3.6.ll (3.71 KB)


Hi Steven,

thanks for taking care of this. I think bitcode compatibility is important, and I believe the kind of test you propose takes a reasonable approach.

Cheers,
Tobias

Thanks Tobias. I will start writing more tests and the utility to check intrinsics.

So the proposal is that during development new features are added to
test/Features/compatibility.ll (or some other name). When 3.6 is
released, we will

* assemble the file with llvm-as-3.6.
* Check in the .bc file as test/Features/Input/compatibility-3.6.bc
* Copy test/Features/compatibility.ll to
test/Features/compatibility-3.6.ll and change it to run llvm-dis
directly on the 3.6 bitcode.

And then when 4.1 is released we have a discussion on what we want to
drop from the old .bc files.

Correct? If so, sounds reasonable to me.


Correct. This is exactly what I mean.

This sounds like a good plan. Initially, the tests will be very repetitious, but over time as we make changes they will diverge. For example, before swapping the order of ‘alias’ and linkage in the IL, we would’ve had:

@a = alias weak i8* @target
; CHECK: @a = alias weak i8* @target

This would’ve been copied to compatibility-3.N.ll for the release preceding the change, and the change would have had to update the CHECK to the new llvm-dis syntax:

@a = alias weak i8* @target
; CHECK: @a = weak alias i8* @target

So we need the duplication because we expect llvm-dis to slowly diverge from the input that we would have fed to llvm-as-3.N.

Well, this essentially means that when someone needs to change the assembly syntax in llvm-3.9, he has to go back and edit all the compatibility tests for 3.N? Sounds like a bit of extra work, though I am not totally against it.
Another option: since this is a bitcode test and the assembly really doesn’t matter, we can do the tests for changes like the alias one in a single file, like the following:

; Check for 3.5 compatibility
; RUN: llvm-dis %s.3.5.bc | FileCheck -check-prefix=llvm-3.5 %s
; Check for current IR
; RUN: llvm-as < %s | llvm-dis | FileCheck %s

@a = weak alias i8* @target
; llvm-3.5: @a = alias weak i8* @target
; CHECK: @a = weak alias i8* @target

We can avoid duplicating the entire file for small changes that don’t actually affect the bitcode.

Steven

No, the idea is that in test/Features/compatibility-3.6.ll we will have

; The file test/Features/Inputs/compatibility-3.6.bc was created by
passing this file to llvm-as-3.6.

and the test itself would just use llvm-dis and FileCheck, not llvm-as.

In other words, the file compatibility-X.Y.ll contains text that can be
assembled by llvm-as from the X.Y release, not current trunk. We would
keep the text mostly for documentation and if someone ever wants to
verify that Inputs/compatibility-X.Y.bc is correct.
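Concretely, the header of the frozen file might look something like this (a sketch only; the exact RUN-line spelling and paths are assumptions, not anything checked in):

```llvm
; The file test/Features/Inputs/compatibility-3.6.bc was created by
; passing this file to llvm-as-3.6.  Note there is no llvm-as in the
; RUN line; we only disassemble the frozen bitcode and check its output.
; RUN: llvm-dis < %p/Inputs/compatibility-3.6.bc | FileCheck %s
```

The assembly body below such a header is kept for documentation and for regenerating the .bc with the matching release, never assembled by trunk.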

Cheers,
Rafael

Yes, we don’t need to edit the assembly in the file, but we do need to modify the CHECK lines to reflect the output of the current llvm-dis. I was talking about updating the CHECKs in all the previous versions. Does that make sense?

Yes, the CHECK lines would have to be updated, but that seems like a
pretty small annoyance.

Changing the assembly format means updating all tests that produce it,
including test/Transforms and clang's tests, so this would be a drop
in the ocean.

Cheers,
Rafael

Makes sense. Thanks, Rafael, for the feedback. I will proceed to finish the test.

From the discussion of bitcode backward compatibility on the list, it
seems we lack a systematic way to test every existing IR feature.

After reading the rest of the proposal, what you are suggesting doesn't
sound like a "systematic way to test every existing IR feature". You are
missing a key part which is to ensure that the tests you produce actually
are exhaustive. Otherwise, you are just hoping, which is not an improvement
over the current state of affairs.

It seems like it would be much better to essentially put
http://llvm.org/docs/BitCodeFormat.html (and/or all other relevant
information about what constitutes a valid bitcode module) in a
machine-readable format (e.g. in TableGen, or YAML) and use that to
generate a bitcode file that exercises all the features of interest. It
would also be used to generate a verifier which checks that the bitcode
conforms to that schema: this allows automatically catching when changes
are introduced that require updating the "schema".

-- Sean Silva


I agree that my proposed test is not truly exhaustive, but a truly exhaustive one would be too large to check in and pass around. Even trying to generate tests on the fly would require two versions of LLVM for the compatibility test.
More importantly, the bitcode compatibility test is about more than just encoding and decoding the bitcode format; it needs to preserve the same meaning across versions. Besides the ability to generate exhaustive tests, there needs to be a mechanism to tell whether one set of IR is equivalent to another.
I thought about auto-generating tests (even using doxygen to keep up with IR changes), but I cannot come up with a simple and elegant solution. If you have a workflow in mind, please suggest it.


I think you misunderstand what I'm saying. Your original proposal was for
you to laboriously generate a file by hand. I'm asking: why not generate it
programmatically and also get other benefits, like being able to generate a
verifier?

-- Sean Silva

I’ve committed a set of precisely this kind of test earlier this year, based off LLVM 3.2.
They’re in test/Bitcode (e.g. test/Bitcode/global-variables.3.2.ll); the commits are 197340, 197873, 202262 and 202647 if you want the whole list.

Does that cover what you want, more or less?
The coverage isn't complete, but there are a few more tests that I still have lying around that I unfortunately never got around to committing; I'll do that later this week.

Thanks Michael. I noticed your tests, which are a good place to start. My attached test file is just an example for myself to test how to cleanly lay out all the tests in one file (if that is even possible).
I think I mainly want to improve two things. One is a systematic way to test bitcode from a certain LLVM version; that is the intention behind merging all the tests into one file.
The other is to test the compatibility of all intrinsics, which is currently missing.
If you have more tests that you want to commit, please do so, because they can help me a lot!

Hi Rafael,

  I have a quick question for you. First of all, I am not very familiar with this code, so....

Before this change:

[llvm] r212349 - Implement LTOModule on top of IRObjectFile
http://comments.gmane.org/gmane.comp.compilers.llvm.cvs/195450

LTOModule::parseSymbols used to explicitly add global aliases as defined symbols:

  // add aliases
  for (const auto &Alias : IRFile->aliases())
    addDefinedDataSymbol(Alias);

After your patch I do not seem to see explicit handling of aliases... I would naively expect something like this:

for (auto &Sym : IRFile->symbols()) {
  .....
  if (isa<GlobalVariable>(GV)) {
    addDefinedDataSymbol(Sym);
    continue;
  }

  if (isa<GlobalAlias>(GV)) {
    addDefinedDataSymbol(Sym);
    continue;
  }
}

Is this an oversight or a design decision? If it is the latter, when/where should aliases be handled?
Thanks a lot.

Sergei

It looks almost like that:

if (F) {
  addDefinedFunctionSymbol(Sym);
  continue;
}

if (isa<GlobalVariable>(GV)) {
  addDefinedDataSymbol(Sym);
  continue;
}

assert(isa<GlobalAlias>(GV));
addDefinedDataSymbol(Sym);

Cheers,
Rafael

Sorry for sending the mail again. Including LLVMdev this time.

Hi

I have had a chance to work more on this topic. I have been experimenting with the idea of auto-generating all the IR tests, because of the sheer number of intrinsics and their variations that exist in LLVM.
My attempt is to write a C++ bitcode generator that iterates through all the TableGen and enum structures in LLVM to generate all the bitcode using the LLVM APIs. This should be an easy way to generate a bitcode test that stays up to date on top of trunk. I have a first working version that takes care of intrinsics, globals (and most of their features) and some instructions. It also generates all the CHECKs at the same time. However, there are quite a few problems with this approach. First of all, the information encoded in the TableGen files for intrinsics is not precise enough to generate good tests, and some of it is even wrong (but not exposed; I will bring up the details at the end of the email). Second, the part I finished is the “easy” part, in which lots of tests can be generated in batch. The rest of the test generation will require more coding and more careful planning. So before I spend a lot of time writing up more features, I would like to have some feedback from the list.

I am not sure how many people support writing such tool and have it in the trunk. It uses LLVM API which is also subject to change (but also a pain to change). The benefit is to have a IR tests that is up-to-date and we can easily acquire a bitcode test from any point of the history. When we make a release, we can simply run this tool to have a frozen version of Bitcode in the repo. It is also good in testing the robustness of the API which allows my to patch few bugs along the way. If people like to see this tool, I can start a review soon for the part I finished and try to get it in the trunk. I might need some help to finish the rest, because all the recent changes that involves bitcode. Otherwise, I will finish my bitcode test file by hand written the rest of the tests, combine them with my current output and what Michael contributed earlier.

For people who are interested in why the intrinsics’ TableGen information is imprecise and inaccurate, the main reason is that we simply ignore any type specified with LLVMAnyPointerType. LLVMAnyPointerType essentially accepts all types and encodes the type name into the function name. I found that the original intention of adding LLVMAnyPointerType was to specify a pointer to a certain type in any address space, but it seems the pointee type is never actually checked. What makes it worse is that, since the verifier is not checking the type, many intrinsics actually have the wrong type definition. For example, int_aarch_neon_st2 is defined as (llvm_anyvector_ty, LLVMMatchType<0>, LLVMAnyPointerType<LLVMMatchType<0>>), but in reality it has a type like (v8i8, v8i8, i8*) instead of (v8i8, v8i8, v8i8*). I wrote up a patch myself to check all types in LLVMAnyPointerType, and many regression tests failed. I cannot find a way to fix all the test case failures without breaking bitcode compatibility. For now I generate all of the variations, since they are “valid” bitcode for the current version.
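To illustrate the mismatch described above, here is a hypothetical declaration (the intrinsic name @llvm.example.st2 is made up for illustration; real NEON intrinsic names are mangled differently):

```llvm
; Per the .td definition, the pointer operand should match operand 0,
; i.e. be <8 x i8>*:
;   declare void @llvm.example.st2(<8 x i8>, <8 x i8>, <8 x i8>*)
; but in reality the intrinsic is declared and used with a plain i8*:
declare void @llvm.example.st2(<8 x i8>, <8 x i8>, i8*)
```

Because the verifier never compares the pointee type against the .td signature, both forms pass verification today, which is why tightening the check breaks existing regression tests.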

Steven

It sounds like the Android RenderScript guys have the most in-the-trenches experience with bitcode incompatibilities. Stephen Hines (CC’d), what sorts of incompatibilities have you guys seen during the 3.x timeline? Would Steven Wu’s proposal catch the sorts of incompatibilities that you guys have seen?

– Sean Silva