Proposal for an ABI testsuite for clang

Hi,

At the SN Systems division of Sony, we have developed an IA64 ABI test-suite for clang/llvm based on the lit framework. We would like to submit this to the LLVM community. The test-suite currently supports clang for both LP64/x86-64 and ILP32/x86 targets, and other targets can be added. The tests perform target-side execution and work with both cross and native targets.

Please find attached a pdf document that describes the design of this test-suite. In summary, the test-suite covers:
· Struct layout rules, including bit-fields,
· Object layout, base classes, vtables, VTTs, construction vtables,
· Name mangling,
· Contents of typeinfo variables,
· Array cookies,
· and a few other items.
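
For readers unfamiliar with this kind of test, here is a minimal sketch of what checks in the first two categories might look like. It is not taken from the proposed suite: the struct names and helper functions are invented for illustration, and the expected values are written in terms of `alignof`/`sizeof` so that they should hold on common GCC/Clang targets rather than for one specific data model.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical test subjects, invented for this sketch.
struct Padded {
    char c;  // offset 0
    int i;   // placed at the next offset that satisfies alignof(int)
};

struct Flags {
    unsigned a : 3;
    unsigned b : 5;  // expected to share one unsigned storage unit with `a`
};

struct Dynamic {
    virtual ~Dynamic() {}  // makes the class dynamic, so it carries a vptr
    int x;
};

// Layout facts that a test could compare against values derived from the ABI text.
inline std::size_t padded_i_offset() { return offsetof(Padded, i); }
inline std::size_t flags_size() { return sizeof(Flags); }
inline std::size_t dynamic_size() { return sizeof(Dynamic); }

// Run-time form of the checks; a failing assert aborts the test program.
inline void check_layout() {
    assert(padded_i_offset() == alignof(int));
    assert(flags_size() == sizeof(unsigned));
    // Room for at least a vptr plus the int member.
    assert(dynamic_size() >= sizeof(void*) + sizeof(int));
}
```

A real suite would presumably hard-code per-target constants (for example, separate values for LP64 and ILP32) instead of deriving them as done here.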

Please consider accepting this test-suite as an llvm project. We look forward to addressing any comments and questions you have on this topic.

Best regards,

Sunil Srivastava

SN Systems / Sony Computer Entertainment

Design-of-ABI-Testsuite-7-2.pdf (454 KB)

> Hi,
>
> At the SN Systems division of Sony, we have developed an IA64 ABI test-suite

This word IA64 - I do not think it means what you think it means :)

> for clang/llvm based on the lit framework. We would like to submit this to
> the LLVM community. The test-suite currently supports clang in both
> LP64/x86-64 and ILP32/x86 targets, with the ability of adding others. The
> tests perform target-side execution and work with both cross and native
> targets.

Sounds pretty cool. Do you have the code anywhere? Where does it
fit in? What do the integration patches look like to have it run by
default?

-eric

IA64 is Itanic…Err “Itanium” which != x86_64. x86_64 was created by AMD, not Intel.

But the ABI is called the Itanium ABI, is it not?

–paulr

From: "Sunil Srivastava" <sunil_srivastava@playstation.sony.com>
To: cfe-dev@cs.uiuc.edu
Sent: Wednesday, July 2, 2014 5:51:20 PM
Subject: [cfe-dev] Proposal for an ABI testsuite for clang

> Hi,
>
> At the SN Systems division of Sony, we have developed an IA64 ABI
> test-suite for clang/llvm based on the lit framework. We would like
> to submit this to the LLVM community.

I am strongly in favor of you submitting this for code review as soon as you can. A good ABI test suite is something we definitely should have.

-Hal

It seems like the primary benefit of this test suite is that it allows verification that two different C++ compilers are functionally equivalent, because it’s completely LLVM IR agnostic.

I don’t think these are the kinds of tests that developers will want to write when implementing new features of the C++ language (say, variable templates). For that, we already have Clang IRGen tests, which are easier to work on incrementally, rather than these tests, which appear to be more data-driven.

Still, this would be great. :)

Since the goal is compatibility with a previous version, why does the test suite itself contain the “right answers” that it checks against? Shouldn’t the “right answers” be determined by a “golden master” version of the toolchain?

Roughly, what I’m saying is: instead of “check(…, expected)”, do “print(…)”. Then compare what is printed with the output produced by the program when compiled with the reference toolchain. This approach seems more useful, since:

a) it tests what we actually care about, which is that the code is compatible with a reference toolchain. At the end of the day, compatibility with existing compiled code is more important than actually sticking to the ABI specification.

b) it lets you use whatever baseline you want. E.g. possibly some internal toolchain that unfortunately shipped with a bug; or the MSVC toolchain, which would be immediately useful for the folks working on MS ABI compatibility. In fact, this test suite could easily be used to create an instant checklist of incompatibilities between any two compilers that claim to be compatible (not even from the same vendor).

c) it allows expansion to any other target or ABI by simply expanding the set of “every possible …” to cover any extra details relevant to the new target or ABI. Hence the test suite is naturally target and ABI agnostic. Adding new checks for a different ABI automatically strengthens the checks for all ABIs.

As an example of c), consider the following: Suppose that to extend the test suite to MS ABI, you find that you need to generate a new code construct that exercises a particular aspect of the MS mangling which the existing Itanium cases didn’t cover. If you compile this construct with two different Itanium ABI compilers and the mangled names differ, then that is an incompatibility.
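
The print-and-compare scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Sample` struct, the `key=value` baseline format, and all function names are hypothetical and do not come from the proposed suite. The test program records layout facts in a report, and a separate step diffs that report against a baseline, which could have been produced by a reference toolchain or written by hand from the spec.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical test subject.
struct Sample {
    char c;
    double d;
    short s;
};

// A report maps a unique key to the value observed on this toolchain.
using Report = std::map<std::string, std::size_t>;

// The "print(...)" side: record facts instead of checking them in place.
Report measure_sample() {
    Report r;
    r["Sample.sizeof"] = sizeof(Sample);
    r["Sample.alignof"] = alignof(Sample);
    r["Sample.d.offset"] = offsetof(Sample, d);
    r["Sample.s.offset"] = offsetof(Sample, s);
    return r;
}

// Serialize as "key=value" lines, the form a baseline file could take.
std::string render_report(const Report& r) {
    std::ostringstream os;
    for (const auto& kv : r) os << kv.first << '=' << kv.second << '\n';
    return os.str();
}

// Parse a baseline written by render_report (or by hand).
Report parse_report(const std::string& text) {
    Report r;
    std::istringstream is(text);
    std::string line;
    while (std::getline(is, line)) {
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;  // skip malformed lines
        r[line.substr(0, eq)] = std::stoul(line.substr(eq + 1));
    }
    return r;
}

// Keys whose values differ, or that appear in only one of the two reports.
std::vector<std::string> diff_reports(const Report& baseline, const Report& actual) {
    std::vector<std::string> out;
    for (const auto& kv : baseline) {
        const auto it = actual.find(kv.first);
        if (it == actual.end() || it->second != kv.second) out.push_back(kv.first);
    }
    for (const auto& kv : actual)
        if (baseline.find(kv.first) == baseline.end()) out.push_back(kv.first);
    return out;
}

// A report must round-trip through its own textual form without differences.
inline void check_round_trip() {
    const Report now = measure_sample();
    assert(diff_reports(parse_report(render_report(now)), now).empty());
}
```

With this shape, swapping the baseline file is the only change needed to move between a by-the-spec comparison and a reference-toolchain comparison.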

In this sense, I think the goal of the ABI compatibility test suite should be to collect/generate source code constructs that exercise a particular dimension along which two C++ compilers could have an ABI incompatibility. This is mostly agnostic to the target or the compiler being used (a small amount of stuff would need some target hooks, like getting a mangled symbol name, but the bulk should be portable and platform agnostic).
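
As a rough idea of what such a target hook could look like on an Itanium-ABI platform, the sketch below uses `abi::__cxa_demangle` from `<cxxabi.h>` (available with GCC and Clang runtimes) and `std::type_info::name()`, which those implementations return in mangled form. The wrapper names `demangle` and `mangled_type_name` are invented for this example, and a different ABI such as the MS one would need a different hook.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>
#include <typeinfo>

// Demangle an Itanium-mangled name; returns "" if the runtime rejects the input.
std::string demangle(const char* mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    const std::string result = (status == 0 && out != nullptr) ? out : "";
    std::free(out);  // the buffer is allocated with malloc by the runtime
    return result;
}

// Hook: mangled name of a type as reported by the C++ runtime.
// Assumption: the implementation returns the Itanium encoding here.
template <typename T>
std::string mangled_type_name() {
    return typeid(T).name();
}

// Example checks a mangling test might make.
inline void check_mangling() {
    assert(demangle("_Z3fooi") == "foo(int)");    // void foo(int)
    assert(demangle("_ZN1N1fEv") == "N::f()");    // void N::f()
    assert(mangled_type_name<int>() == "i");      // builtin type code for int
}
```

Only this hook is platform specific; the source constructs being mangled can stay shared across targets.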

I think that what you guys have already done (the steps you list under “A set of ‘interesting’ classes for the test-suite is generated by:”) is excellent progress in this direction of collecting examples that exercise dimensions of potential incompatibility, since by construction the Itanium ABI (or any ABI spec) tries to cover all the relevant dimensions which could cause an ABI incompatibility.

– Sean Silva

From: "Sean Silva" <chisophugis@gmail.com>
To: "Sunil Srivastava" <sunil_srivastava@playstation.sony.com>
Cc: cfe-dev@cs.uiuc.edu
Sent: Wednesday, July 2, 2014 10:58:18 PM
Subject: Re: [cfe-dev] Proposal for an ABI testsuite for clang

> Since the goal is compatibility with a previous version, why does the
> test suite itself contain the "right answers" that it checks
> against? Shouldn't the "right answers" be determined by a "golden
> master" version of the toolchain?
>
> Roughly what I'm saying is instead of "check(..., expected)" instead
> do "print(...)". Then compare what is printed with the output
> produced by the program when compiled with the reference toolchain.
> This approach seems more useful, since:
>
> a) it tests what we actually care about, which is that the code is
> compatible with a reference toolchain. At the end of the day,
> compatibility with existing compiled code is more important than
> actually sticking to the ABI specification.

I understand why you say this, and comparing to a reference toolchain is also valuable, but having checks against the ABI specification itself is independently valuable. This is especially true when the Clang/LLVM toolchain *is* the vendor-provided reference toolchain. I support testing both modes, but I really like the testing-against-the-ABI-spec aspect of what was proposed. This would be a unique contribution to open-source compiler development in general, let alone LLVM.

-Hal

The two things aren't mutually exclusive. If you want to check against the
spec with the approach I suggested, then just generate an output file which
is what you think is correct according to the spec and use that as a
baseline (instead of generating the baseline output file by compiling the
code in the test suite with a reference compiler).

-- Sean Silva

I do hope that the intent is not to ensure that clang has bug-for-bug compatibility with earlier versions. We’ve fixed many bugs over time which would necessitate ABI breaks from earlier versions of clang.

Sean,

I have to strongly disagree with you up there. Testing the ABI is more
important than testing compatibility with other compilers. IF the
other compiler doesn't follow the ABI, it should be fixed, that's a no
brainer, and any decent compiler community will not argue against ABI
breakages. Testing against other compilers is *also* important, but
not to the detriment of ABI checking.

Also, checking the ABI against expected results make it explicit, and
helps a lot on debugging problems when they happen. An expected output
can have comments pointing out the ABI paragraph and document, as well
as having some regular expression / pattern matching structure that
"explains" the document in a programming-friendly way. This is
invaluable, and we don't normally have that because people don't
normally have time to write it. If someone did this, we should not
discard and request a poorer check.

But checking against another compiler using patterns and documentation
won't work, because there isn't any, just the good old diff. In that
case, I agree, diffing against a golden standard is the way to go, and
I also agree we should have that *in addition* to ABI tests.

My tuppence.

--renato

> Sean,
>
> I have to strongly disagree with you up there. Testing the ABI is more
> important than testing compatibility with other compilers.

They are both important. My suggestion addresses both of them (see below).
Think of my suggestion as a primitive that can be used for both purposes
(see at the end of this post also).

However, in the OP's PDF, it says that what actually motivated this work is
consistency with previous compilers: "In some industries, *consistency is
more important than correctness*. Hence the need for a test-suite to *guard
against changes*." (emphasis mine)
I am not involved with this project internally at Sony, but I know that an
extremely important use case for our clients is binary-only libraries.
Hence ensuring ABI compatibility with previous platform compilers is
important.

The ability to use a "spec" baseline or a "golden master" baseline by just
swapping the baseline file is key to reconciling the two conflicting
desires:
1. the open source project wants to be as "by-the-spec" ABI-compliant as
possible
2. maintainers of platform compilers want to ensure that their platform
compilers are compatible with previous code compiled for the platform.

Since it seems that #2 is the thing that actually motivated this work, I
think it is important to take that into account.

Also keep in mind that the test-suite generator program that generates the
"right answers by the spec" is also subject to human fallibility in the
same way as the ABI code and tests inside clang itself (although the
narrower focus of the generator might reduce the chance of error). The
extent to which this test suite actually provides an independent check on
Clang's ABI correctness depends on how it was developed; I would be
interested to know:

- how many times during the development of this test suite clang's behavior
diverged from the "expected" behavior as computed by the generator program,
and the generator program was later found to be in error.

- (less likely) how many times the generator program produced a test that
happened to align with a "by-the-spec incorrect" bug in clang, and the
agreement was interpreted as both being correct. (we can't really know this
number, of course)

- what would happen if this test suite does indeed find a bug with a
previous reference toolchain? Would that necessitate an internal fork of
the test suite that has a deliberately "incorrect" (by the spec) test?

I think that checking against another independently developed compiler
targeting the same ABI will provide at least as much of an independent
check on Clang's correctness as a written-from-scratch generator program,
if not more so. Of course, checking two compilers against a generator
transitively gives you that benefit, but is less accommodating of the
requirement of checking actual code compatibility against a given baseline
(that may have a bug; and would you bet that it doesn't?).

> IF the
> other compiler doesn't follow the ABI, it should be fixed, that's a no
> brainer, and any decent compiler community will not argue against ABI
> breakages. Testing against other compilers is *also* important, but
> not to the detriment of ABI checking.
>
> Also, checking the ABI against expected results make it explicit, and
> helps a lot on debugging problems when they happen. An expected output
> can have comments pointing out the ABI paragraph and document, as well
> as having some regular expression / pattern matching structure that
> "explains" the document in a programming-friendly way.

I don't think that what you are asking for is mutually exclusive with what
I'm suggesting. I think you're imagining the baseline output as just a flat
file containing a bunch of meaningless numbers printed in it. Realistically
each item printed in the output file would likely be keyed on a unique
identifier of some sort, and it would simply be a matter of going to the
place in the test file that contains that unique identifier. The citation
comments that you are suggesting could reside on the source code in exactly
the way you suggest. I'm not sure what you mean by '"explains" the document
in a programming-friendly way' so I can't comment on that.
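
To make the idea of keyed entries with citation comments concrete, here is a minimal sketch. Everything in it is hypothetical: the `Check` record, the `Pair` struct, the key names, and the citation strings are placeholders rather than real section references. Each expected value carries a unique identifier and a note saying where it comes from, so a failure message can point back to both.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical test subject, invented for this sketch.
struct Pair {
    char c;
    long l;
};

struct Check {
    std::string id;        // unique key, also usable as the key in a baseline file
    std::string citation;  // where the expected value comes from
    std::size_t expected;
    std::size_t actual;
};

// Expected values are derived from alignof/sizeof to keep the sketch target independent.
std::vector<Check> pair_checks() {
    return {
        {"Pair.l.offset", "ABI spec, data layout rules (placeholder citation)",
         alignof(long), offsetof(Pair, l)},
        {"Pair.sizeof", "ABI spec, data layout rules (placeholder citation)",
         alignof(long) + sizeof(long), sizeof(Pair)},
    };
}

// One-line description that names the key, both values, and the citation.
std::string describe(const Check& c) {
    std::ostringstream os;
    os << c.id << ": expected " << c.expected << ", got " << c.actual
       << " [" << c.citation << "]";
    return os.str();
}

// Descriptions of all checks whose actual value differs from the expected one.
std::vector<std::string> failures(const std::vector<Check>& checks) {
    std::vector<std::string> out;
    for (const Check& c : checks)
        if (c.expected != c.actual) out.push_back(describe(c));
    return out;
}

inline void run_pair_checks() { assert(failures(pair_checks()).empty()); }
```

The same identifiers could serve as keys in a printed baseline, so the in-source citations and an external baseline file need not be mutually exclusive.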

-- Sean Silva

Hi Sunil,

Looks cool. The included test file example looks “llvm-y” to me :) A few questions:

1.) How many of your 400 test files come from the manual ABI method, and how many from your test case generator?
2.) Test runtime probably scales fairly well with the number of cores, right? You say it takes one hour to run on one core, so an 8-core system should take less than 10 minutes?
3.) How many bugs have you found with this so far?

(I tend to be somewhat cautious about programs that generate exhaustive tests and then compare the results to golden files. In my experience, they tend to take a long time to run since they generate many “not interesting” test cases, and when changing something one has to update many golden files and then it’s easy to miss undesired changes. For an ABI test suite, the golden files will hopefully not change often though.

I like the approach taken during clang/win bringup: They wrote fuzzers to generate lots of test cases, and then checked in hand-reduced versions of the test cases that turned out to be interesting, using the regular clang testing architecture.)
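
For illustration only, a test-case generator of the kind mentioned here could start out as small as the sketch below. It is not the fuzzer used during the clang/win bringup; the function name, the type list, and the bit-field policy are all assumptions made for this example. It emits the source text of one randomly shaped struct, reproducibly for a given seed, which could then be compiled by two compilers and the resulting layouts compared.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <sstream>
#include <string>

// Emit the source of one randomly shaped struct named S<seed>.
std::string generate_struct(std::uint32_t seed, int field_count) {
    assert(field_count >= 0);
    static const char* const kTypes[] = {"char", "short", "int", "long", "double"};
    const unsigned kIntegralTypes = 4;  // the first four entries may become bit-fields
    std::mt19937 rng(seed);             // same seed, same struct
    std::ostringstream os;
    os << "struct S" << seed << " {\n";
    for (int i = 0; i < field_count; ++i) {
        const unsigned pick = rng() % 5;
        os << "  " << kTypes[pick] << " f" << i;
        // Roughly one integral field in three becomes a bit-field.
        if (pick < kIntegralTypes && rng() % 3 == 0)
            os << " : " << (1 + rng() % 7);  // widths 1..7 fit every integral type used
        os << ";\n";
    }
    os << "};\n";
    return os.str();
}
```

Because the output depends only on the seed, an interesting case can be regenerated later and reduced by hand before being checked in.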

Nico

That approach was unreasonably effective. At this point I'm very confident
that we make the same record layout and vtable layout decisions as MSVC
2013. When the next release is out, it will be easy to see if anything
changed. :)

Hi,

> At the SN Systems division of Sony, we have developed an IA64 ABI
> test-suite for clang/llvm based on the lit framework. We would like to
> submit this to the LLVM community.

We are ready to submit the first set of test files. I would like some feedback on where in the tree it should go, how the review process should work, etc.

I am quoting the README, which gives more detail on the structure of the testsuite and instructions for running it.

> Hi,
>
>> At the SN Systems division of Sony, we have developed an IA64 ABI
>> test-suite for clang/llvm based on the lit framework. We would like to
>> submit this to the LLVM community.
>
> We are ready to submit first set of test files. I would like some feedback as to where in the tree it should go; how should the review process go.. etc.

Personally I think a subdirectory of projects/test-suite would be
good. It would be nice if all of this could be abstracted around make
check in some way fitting in there nicely. I.e. when I run the
testsuite via make TEST=simple check it'll automagically run the ABI
testsuite as well. This isn't particularly ideal given that the rest
of the testsuite is more designed around performance etc.

Others may have other opinions here. I'll CC a couple of them.

-eric

From: "Eric Christopher" <echristo@gmail.com>
To: "Sunil Srivastava" <sunil_srivastava@playstation.sony.com>, "Bob Wilson" <bob.wilson@apple.com>, "Daniel Dunbar"
<daniel@zuster.org>, "Chris Matthews" <chris.matthews@apple.com>
Cc: cfe-dev@cs.uiuc.edu
Sent: Friday, July 18, 2014 12:37:45 PM
Subject: Re: [cfe-dev] FW: Proposal for an ABI testsuite for clang

> Hi,
>
>> At the SN Systems division of Sony, we have developed an IA64 ABI
>> test-suite for clang/llvm based on the lit framework. We would
>> like to
>> submit this to the LLVM community.
>
> We are ready to submit first set of test files. I would like some
> feedback as to where in the tree it should go; how should the
> review process go.. etc.
>

> Personally I think a subdirectory of projects/test-suite would be
> good. It would be nice if all of this could be abstracted around make
> check in some way fitting in there nicely. I.e. when I run the
> testsuite via make TEST=simple check it'll automagically run the ABI
> testsuite as well. This isn't particularly ideal given that the rest
> of the testsuite is more designed around performance etc.
>
> Others may have other opinions here. I'll CC a couple of them.

Given that these are essentially regression tests, and use lit (just like all of our other regression tests), why would we not put them into test/ABI (or similar) along with Clang's other regression tests?

-Hal

Was under the impression there were execution tests in there, if not,
then I agree that test/ABI would make the most sense. The ability to
do things via gcc for it would make a bit more sense to be in
projects/test-suite.

-eric

From: "Eric Christopher" <echristo@gmail.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: cfe-dev@cs.uiuc.edu, "Sunil Srivastava" <sunil_srivastava@playstation.sony.com>, "Bob Wilson"
<bob.wilson@apple.com>, "Daniel Dunbar" <daniel@zuster.org>, "Chris Matthews" <chris.matthews@apple.com>
Sent: Friday, July 18, 2014 2:24:21 PM
Subject: Re: [cfe-dev] FW: Proposal for an ABI testsuite for clang

> Was under the impression there were execution tests in there, if not,
> then I agree that test/ABI would make the most sense. The ability to
> do things via gcc for it would make a bit more sense to be in
> projects/test-suite.

I think they might be, but we have lit execution regression tests for the sanitizers in compiler-rt, so I don't see why we can't copy the relevant lit infrastructure and have them there as well?

-Hal

We could, I suppose, but it's been a hard rule that execution tests
weren't in those testsuites up to now. Do we have a reason to put them
there? I don't really have a firm opinion here, but very few people
had replied otherwise.

-eric