Newbie question: are LLVM backend regression tests for Thumb1 targets possible on a simulator?

Hello,

As a newbie, I'd appreciate some support with setting up regression tests.

Specifically, I am interested in tail call optimization for
the ARM v6m targets. This feature currently seems to be completely
deactivated (v6m being based on thumb1 ?!). According to my
reading of the code, enabling it will involve some modifications to epilogue
generation.

My work on a gcc backend showed me that for a beginner like me, it is rather
likely that the first attempts will break something.

Thus, being completely new to the llvm project, I think that it's
essential to first establish a suitable regression test setup. My plan
is to run the tests for the v4t platform in thumb-only mode, since free
simulators are available for it and it behaves (at least with respect to the
epilogue) very much like v6m.

From former work on gcc I'm used to being able to run regression
tests in a simulator for the target platform (there with the expect
script mechanism). In the documentation on the web, however, I have not
yet found information on a comparable feature for llvm.

Therefore my specific questions:

- Is there a mechanism to run the backend testsuite in a simulation
framework, and if yes, could you give me a hint on where to find information?
- If not, how is regression testing for the thumb1 targets currently
implemented?

Yours,

Björn Haase

P.S.: Of course, any hint with respect to sibling call optimization
would also be appreciated as a second step. I have actually already found some helpful
information in the sources.

> Hello,
>
> As a newbie, I'd appreciate some support with setting up regression tests.
>
> Specifically, I am interested in tail call optimization for
> the ARM v6m targets. This feature currently seems to be completely
> deactivated (v6m being based on thumb1 ?!). According to my
> reading of the code, enabling it will involve some modifications to epilogue
> generation.
>
> My work on a gcc backend showed me that for a beginner like me, it is rather
> likely that the first attempts will break something.

:-)

> Thus, being completely new to the llvm project, I think that it's
> essential to first establish a suitable regression test setup. My plan
> is to run the tests for the v4t platform in thumb-only mode, since free
> simulators are available for it and it behaves (at least with respect to the
> epilogue) very much like v6m.

Why not test your v6m changes on a Cortex-m0 QEMU? Semihosted applications are pretty easy to get going, something like:

$ qemu-system-arm -semihosting -M integratorcp -cpu cortex-m0 -kernel a.out

Thumbv4t support is a bit spotty in llvm as it's not very well tested. The problem I run into most (and have fixed several times) is that thumbv4t doesn't have a lo->lo mov instruction that doesn't clobber cpsr. That being said, it does work for the most part... just something to watch out for.
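For running more than one binary, the QEMU invocation above could be wrapped in a small driver. The following is only a sketch: the function names `qemu_command` and `run_test` are made up for illustration, `-nographic` is an addition that is not part of the command above, and which CPU and machine models are accepted depends on the QEMU build, so the CPU is a parameter.

```python
import subprocess


def qemu_command(kernel, cpu="cortex-m0", machine="integratorcp"):
    """Build the argv list for a semihosted QEMU run of one test binary."""
    return [
        "qemu-system-arm",
        "-semihosting",   # guest I/O and exit go through the host
        "-M", machine,
        "-cpu", cpu,      # assumption: this model exists in your QEMU build
        "-nographic",     # addition for headless runs, not from the thread
        "-kernel", kernel,
    ]


def run_test(kernel, timeout_s=10, **kwargs):
    """Run one binary; return (exit_code, stdout), or (None, "") on timeout."""
    try:
        proc = subprocess.run(
            qemu_command(kernel, **kwargs),
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # A hung guest is reported as a failure instead of blocking the run.
        return None, ""
    return proc.returncode, proc.stdout
```

A call such as `run_test("a.out", cpu="cortex-m3")` would then run the binary on a different CPU model. The timeout matters because a miscompiled bare-metal program often hangs rather than exits.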

> From former work on gcc I'm used to being able to run regression
> tests in a simulator for the target platform (there with the expect
> script mechanism). In the documentation on the web, however, I have not
> yet found information on a comparable feature for llvm.
>
> Therefore my specific questions:
>
> - Is there a mechanism to run the backend testsuite in a simulation
> framework, and if yes, could you give me a hint on where to find information?

Not that I know of, no.

I routinely run the libc++ & libc++abi test suites in QEMU, but that's more of a whole-toolchain test, rather than just a backend test. You can get surprisingly far without simulator testing, so for most things, LLVM doesn't use it for testing. This makes it easier for any random developer to check out the llvm sources, build them, and run the testsuite, and not have to have simulators installed for all the different backends.

> - If not, how is regression testing for the thumb1 targets currently
> implemented?

We use the LIT framework http://llvm.org/docs/CommandGuide/lit.html to test the compiler. Mostly this means feeding in llvm IR, and using FileCheck http://llvm.org/docs/CommandGuide/FileCheck.html (which is basically a glorified grep) to verify that the assembly produced is the same as what the test expects to be generated.

To run these tests, use the 'make check-all' target from the build directory.
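To make the "glorified grep" idea concrete, here is a deliberately simplified model of ordered checking. It is not the real FileCheck: it only does plain substring matching, one line per check, and the assembly text is a hypothetical example rather than actual compiler output.

```python
def check_in_order(output, patterns):
    """Return the first pattern that fails to match, or None if all match.

    Each pattern must occur as a substring of some output line that comes
    after the line matched by the previous pattern.
    """
    lines = output.splitlines()
    pos = 0
    for pattern in patterns:
        for i in range(pos, len(lines)):
            if pattern in lines[i]:
                pos = i + 1  # the next pattern must match further down
                break
        else:
            return pattern
    return None


# Hypothetical Thumb1 assembly for a function f that calls g.
asm = """\
f:
        push    {r4, lr}
        bl      g
        pop     {r4, pc}
"""

print(check_in_order(asm, ["f:", "push", "bl", "pop"]))  # prints None
print(check_in_order(asm, ["f:", "pop", "push"]))        # prints push
```

The second call fails because `push` does not appear after `pop`, which is the kind of ordering constraint a test relies on when it wants to pin down what an epilogue looks like.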

Cheers,

Jon

> Why not test your v6m changes on a Cortex-m0 QEMU? Semihosted applications are pretty easy to get going, something like:
>
> $ qemu-system-arm -semihosting -M integratorcp -cpu cortex-m0 -kernel a.out

Thank you, Jonathan, for your reply. I will be checking this out. To my knowledge there was no free simulator for cortex M0! Great to see that there *is* an option.

> Thumbv4t support is a bit spotty in llvm as it's not very well tested. The problem I run into most (and have fixed several times) is that thumbv4t doesn't have a lo->lo mov instruction that doesn't clobber cpsr. That being said, it does work for the most part... just something to watch out for.

OK, good to know this. I considered using v4t only because I assumed that licensing prohibited free simulators for v6m.

> I routinely run the libc++ & libc++abi test suites in QEMU, but that's more of a whole-toolchain test, rather than just a backend test. You can get surprisingly far without simulator testing, so for most things, LLVM doesn't use it for testing. This makes it easier for any random developer to check out the llvm sources, build them, and run the testsuite, and not have to have simulators installed for all the different backends.
>
>> - If not, how is regression testing for the thumb1 targets currently
>> implemented?
>
> We use the LIT framework http://llvm.org/docs/CommandGuide/lit.html to test the compiler. Mostly this means feeding in llvm IR, and using FileCheck http://llvm.org/docs/CommandGuide/FileCheck.html (which is basically a glorified grep) to verify that the assembly produced is the same as what the test expects to be generated.

OK, I will look into this. It might take me some time to understand the mechanisms.

Thanks again for your support.

Björn.

I've been wondering too about how to get better ARM v6m compile-and-execute
testing going.

As you say Jon, the non-execution-based regression tests are surprisingly
good at catching issues; but they're no full substitute for executing the
code produced by the backend for a reasonably-sized test suite.

If it were somehow possible to compile and run the LLVM test-suite
for v6m, I think that would be a good step forward. It would also allow
us to get a buildbot going without too much effort, continuously checking
basic correctness of v6m code generation.

My guess is that the biggest hurdle would be to get linux or a similar
operating system going on a v6m/thumb1 simulator. Does anyone have an idea
if this is feasible or completely impossible?

Thanks,

Kristof

IIRC, any linux will require an MMU? If I understand correctly, the v6m instruction set is just a binary-compatible subset of v7. So what about generating code restricted to v6m and running the tests under linux on, say, an A15 machine?

Björn.

I'm working on a baremetal cross toolchain, and I would like to set up such a buildbot. There are several pieces needed before that can happen though, and the important one is remote testing support in LIT (which I'm working on)... in due time :-)

Jon

> I'm working on a baremetal cross toolchain, and I would like to set up such a buildbot. There are several pieces needed before that can happen though, and the important one is remote testing support in LIT (which I'm working on)... in due time :-)
>
> Jon

This would be perfect.

Actually, you suggested using

> $ qemu-system-arm -semihosting -M integratorcp -cpu cortex-m0 -kernel a.out

Do you have some special version of qemu? My distribution's default qemu package does not have a cortex-M0, nor does qemu head freshly taken from git:

$ qemu-arm -cpu ?
Available CPUs:
   arm926 arm946 arm1026 arm1136 arm1136-r2 arm1176 arm11mpcore cortex-m3 cortex-a8
   cortex-a8-r2 cortex-a9 cortex-a15 ti925t pxa250 sa1100 sa1110 pxa255 pxa260
   pxa261 pxa262 pxa270 pxa270-a0 pxa270-a1 pxa270-b0 pxa270-b1 xa270-c0
   pxa270-c5 any

Still, your procedure will work, just by using the m3 instead :-).

Yours,

Björn.

>> I'm working on a baremetal cross toolchain, and I would like to set up such a
>> buildbot. There are several pieces needed before that can happen though, and
>> the important one is remote testing support in LIT (which I'm working on)...
>> in due time :-)
>>
>> Jon
>
> This would be perfect.
>
> Actually, you suggested using
>
>> $ qemu-system-arm -semihosting -M integratorcp -cpu cortex-m0 -kernel a.out
>
> Do you have some special version of qemu? My distribution's default qemu package
> does not have a cortex-M0, nor does qemu head freshly taken from git:

Oh, maybe I'm thinking of an internal build of qemu. Sorry about that.

> $ qemu-arm -cpu ?
> Available CPUs:
>    arm926 arm946 arm1026 arm1136 arm1136-r2 arm1176 arm11mpcore cortex-m3 cortex-a8
>    cortex-a8-r2 cortex-a9 cortex-a15 ti925t pxa250 sa1100 sa1110 pxa255 pxa260
>    pxa261 pxa262 pxa270 pxa270-a0 pxa270-a1 pxa270-b0 pxa270-b1 xa270-c0
>    pxa270-c5 any
>
> Still your procedure will work, just by using the m3 instead :-).

Yeah, that should work for the most part, unless you emit thumb2 instructions (which will work on cortex-m3, but not m0). The list of them is pretty small though IIRC, so maybe you can write a script that disassembles & checks for them.
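Such a check could look roughly like the sketch below. It rests on several assumptions that are not from this thread: the input is text in the style of GNU `objdump -d` for Thumb code (address, one or two 16-bit halfwords, mnemonic), the only 32-bit encodings accepted for v6m are `bl`, `dmb`, `dsb`, `isb`, `mrs` and `msr`, and the sample listing is hand-written for illustration. The names `find_thumb2` and `V6M_32BIT` are hypothetical.

```python
import re

# 32-bit encodings assumed to be valid on v6m; any other instruction that
# occupies two halfwords is reported as Thumb2-only.
V6M_32BIT = {"bl", "dmb", "dsb", "isb", "mrs", "msr"}

# Assumed line shape: "  8002: f000 f803   bl  800c <g>"
LINE = re.compile(
    r"^\s*([0-9a-f]+):\s+([0-9a-f]{4})(\s[0-9a-f]{4})?\s+([a-z][a-z0-9.]*)"
)


def find_thumb2(disassembly):
    """Return (address, mnemonic) for each 32-bit instruction not in V6M_32BIT."""
    hits = []
    for line in disassembly.splitlines():
        m = LINE.match(line)
        if m is None:
            continue  # labels, blank lines, data words, section headers
        address, _, second_halfword, mnemonic = m.groups()
        if second_halfword and mnemonic not in V6M_32BIT:
            hits.append((address, mnemonic))
    return hits


# Hand-written sample listing, not real tool output.
sample = """\
00008000 <f>:
    8000: b510        push  {r4, lr}
    8002: f000 f803   bl    800c <g>
    8006: f240 0001   movw  r0, #1
    800a: bd10        pop   {r4, pc}
"""

print(find_thumb2(sample))  # prints [('8006', 'movw')]
```

The width of the encoding is used as the filter because every 16-bit Thumb instruction on the m3 is harmless here; only the two-halfword ones need a closer look.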

>> $ qemu-arm -cpu ?
>> Available CPUs:
>>    arm926 arm946 arm1026 arm1136 arm1136-r2 arm1176 arm11mpcore cortex-m3 cortex-a8
>>    cortex-a8-r2 cortex-a9 cortex-a15 ti925t pxa250 sa1100 sa1110 pxa255 pxa260
>>    pxa261 pxa262 pxa270 pxa270-a0 pxa270-a1 pxa270-b0 pxa270-b1 xa270-c0
>>    pxa270-c5 any
>>
>> Still your procedure will work, just by using the m3 instead :-).
>
> Yeah, that should work for the most part, unless you emit thumb2
> instructions (which will work on cortex-m3, but not m0). The list of
> them is pretty small though IIRC, so maybe you can write a script that
> disassembles & checks for them.

Cortex-M0 implements the v6m architecture.
Cortex-M3 implements the v7m architecture.

Having had a quick look at the v6m and v7m ARM ARMs, besides the
extra instructions v7m supports, I think the main other difference
that's very relevant for a code generator is that v6m always generates
a fault when an unaligned access occurs, whereas v7m can support unaligned
accesses for many of the load and store instructions. I think it'd
be important to set the CCR register on the v7m simulation so that
it always generates an alignment fault in case there's an unaligned
access, i.e. setting CCR.UNALIGN_TRP to 1.

To check that clang indeed only produces v6m instructions when
telling it to target Cortex-m0, I think we could rely on LLVM's
MC layer correctly modelling if an instruction is part of v6m or not,
as our internal MC Hammer test suite[1] has finally, since about 1
month ago, started passing completely for the v6m architecture.
Potentially, just making sure that a build with assertions enabled
would result in an assertion failure being triggered when a non-v6m
instruction gets emitted could be good enough, as a starting point?

Thanks,

Kristof

[1] http://llvm.org/devmtg/2012-04-12/Slides/Richard_Barton.pdf

> To check that clang indeed only produces v6m instructions when
> telling it to target Cortex-m0, I think we could rely on LLVM's
> MC layer correctly modelling if an instruction is part of v6m or not,
> as our internal MC Hammer test suite[1] has finally, since about 1
> month ago, started passing completely for the v6m architecture.
> Potentially, just making sure that a build with assertions enabled
> would result in an assertion failure being triggered when a non-v6m
> instruction gets emitted could be good enough, as a starting point?

Yes. That would be better. Tim was musing about doing that the last time I ran into the v4t lo->lo copy thing.

Jon

And as a second level of security, one may rely on the checks implemented in gas.

> I've been wondering too about how to get better ARM v6m compile-and-execute
> testing going.
>
> As you say Jon, the non-execution-based regression tests are surprisingly
> good at catching issues; but they're no full substitute for executing the
> code produced by the backend for a reasonably-sized test suite.
>
> If it were somehow possible to compile and run the LLVM test-suite
> for v6m, I think that would be a good step forward. It would also allow
> us to get a buildbot going without too much effort, continuously checking
> basic correctness of v6m code generation.
>
> My guess is that the biggest hurdle would be to get linux or a similar
> operating system going on a v6m/thumb1 simulator. Does anyone have an idea
> if this is feasible or completely impossible?

Not sure about the feasibility of this particular setup, but qemu + linux works pretty well for testing other targets.

-eric

> Do you have some special version of qemu? My distribution's default qemu
> package does not have a cortex-M0, nor does qemu head freshly taken from git:

The trouble with Cortex-M0 (from a virtualised testing perspective) is
that it doesn't have an MMU, so any "real" operating system
environment will be far more artificial than we'd like, and probably
very low down on qemu's list of priorities to boot.

Testing a Thumb1 CPU with an MMU (I'm sure there are ARM11* CPUs with
MMU but before Thumb2, but would need to dig into which is closest to
v6m) would cover *most* issues, but obviously not the ways in which
Cortex-M0 is closer to v7 than v6.

Tim.

Why not test on real hardware?

https://developer.mbed.org/platforms/DipCortex-M0/

cheers,
--renato

Of course, it would not be bad to also run some limited tests on real-world hardware. However, my answer to "why not" is:

Because you will have great difficulty fitting existing regression test code (designed for bigger targets) onto a system constrained to, say, 16k of flash and 4k of RAM. My experience with running the gcc testsuite on microcontrollers (AVR in this case) is that you encounter so many pseudo-failures, just because the test cases don't account for the limitations of the DUT, that it's tedious work to get any useful information out of the results. Imagine, e.g., a test that uses alloca, checks for correct stack pointer re-adjustment from frame pointers, and simply allocates two 2k buffers in RAM.

In my opinion, a regression test framework in a simulation environment is the more pressing need as a first step. Of course, an optional second step with tests on real-world HW would not hurt. Actually, qemu does not seem to be that bad for this purpose.

Björn.

I get what you mean, and I agree with you. Testing "if it should work"
is one thing, and testing "if it actually works" is another. Plus the
difficulties of testing in real hardware that I know only too well.

Though, trying to create a simulated environment that will be relevant
for the system you care about might be as complicated as real hardware
tests, and it will never give you the guarantee that you're not
missing the point.

I would run Thumb1 tests on any Thumb1 emulator (no need to make it
look like an M0) and reduce the set of tests on a real M0 for the
things that you can actually test, and do it.

cheers,
--renato