C++ Module failures on ARM

Folks,

I have been trying to get this buildbot green for a while...

http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/

but for some reason, these four tests fail:

FAIL: Clang::fmodules-validate-once-per-build-session.c
FAIL: Clang::prune.m
FAIL: Clang::validate-system-headers.m
FAIL: LLVM::archive-update.test

I have no idea why, since the same build on other ARM boards works perfectly.

I have completely removed the buildbot directory, re-created it,
restarted, and still the same four errors occur. I have updated the
system, removed spurious packages, and matched the versions of all
relevant tools (compiler, cmake, ninja, etc.) to those on the other
bots, for instance:

http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15

which has been green ever since. I also run NEON and non-NEON ARMv7 tests on
my test box and all tests pass.

It seems that the problem is with versions of the modules that get
included, for example:

http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/4811/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Aarchive-update.test

archive-update.test:34:8: error: expected string not found in input
NEWER: newer
       ^
<stdin>:1:1: note: scanning from here
older
^

Does anyone have any idea on how to fix this issue? I don't think it's
in Clang itself, but I can't think of anything else to do in the
machine... My last resort will be to kill that machine and re-install
from scratch, but that would be to admit failure. :)

cheers,
--renato

PS: These problems started after a power cut, which left all my bots
in a terrible state. All the others recovered well, but not this one...

These all look like something is going wrong with timestamps in your filesystem.

The dates on the files seem fine… 01/01/2000 for the old, current for the new.

And if I repeat the process manually, it works on that board. I’m thinking there’s some concurrency issue that doesn’t show on the other machines, but one that is consistent, as it happens every time.

I’ll stop that bot and build manually. Thanks for looking!

Odd. I'd guess that archive-update.test is going to be the easiest of these
to debug. Is it the final command that's failing? Can you find out what
timestamp llvm-ar thinks the relevant files have?
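
For anyone debugging something similar: a quick way to see the modification times that a timestamp-based update would compare is to read them straight from stat. This is only a sketch in Python, not what llvm-ar does internally; the helper name `is_newer` and the file names are made up for illustration.

```python
import os
import tempfile


def is_newer(candidate, existing):
    """Return True if candidate's mtime is strictly later than existing's.

    Hypothetical helper: it mimics the general idea of an "update only if
    newer" check, not llvm-ar's actual implementation.
    """
    return os.stat(candidate).st_mtime_ns > os.stat(existing).st_mtime_ns


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        older = os.path.join(tmp, "older")
        newer = os.path.join(tmp, "newer")
        for path in (older, newer):
            with open(path, "w") as f:
                f.write("data\n")
        # Backdate one file to 2000-01-01 00:00:00 UTC (epoch 946684800),
        # the date the test uses for its "older" copy.
        os.utime(older, (946684800, 946684800))
        for path in (older, newer):
            print(path, os.stat(path).st_mtime)
        print("newer is newer:", is_newer(newer, older))
```

If the last line prints False on a freshly created file, the file system is handing out timestamps earlier than 2000, which would be enough to make this kind of test fail.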

Ha! Found it, thanks!!

I managed to get a script that always fails by fudging the
archive-update.test.script with dumps.

The output of a normal run is:

  File: ‘archive-update.test.tmp.newer/evenlen’
  Size: 6 Blocks: 8 IO Block: 4096 regular file
Device: 804h/2052d Inode: 32638159 Links: 1
Access: (0600/-rw-------) Uid: ( 1000/ linaro) Gid: ( 1000/ linaro)
Access: 2015-04-17 11:09:21.529709000 +0100
Modify: 2015-04-17 11:09:21.529709000 +0100
Change: 2015-04-17 11:09:21.529709000 +0100
Birth: -
  File: ‘archive-update.test.tmp.older/evenlen’
  Size: 6 Blocks: 8 IO Block: 4096 regular file
Device: 804h/2052d Inode: 32768202 Links: 1
Access: (0600/-rw-------) Uid: ( 1000/ linaro) Gid: ( 1000/ linaro)
Access: 2000-01-01 00:00:00.000000000 +0000
Modify: 2000-01-01 00:00:00.000000000 +0000
Change: 2015-04-17 11:09:21.519709000 +0100
Birth: -

while the broken machine has:

  File: ‘archive-update.test.tmp.newer/evenlen’
  Size: 6 Blocks: 8 IO Block: 4096 regular file
Device: 804h/2052d Inode: 28579614 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 1000/ linaro) Gid: ( 1000/ linaro)
Access: 1970-01-17 02:07:14.312737000 +0100
Modify: 1970-01-17 02:07:14.312737000 +0100
Change: 1970-01-17 02:07:14.312737000 +0100
Birth: -
  File: ‘archive-update.test.tmp.older/evenlen’
  Size: 6 Blocks: 8 IO Block: 4096 regular file
Device: 804h/2052d Inode: 28573869 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 1000/ linaro) Gid: ( 1000/ linaro)
Access: 2000-01-01 00:00:00.000000000 +0000
Modify: 2000-01-01 00:00:00.000000000 +0000
Change: 1970-01-17 02:07:14.302737000 +0100

Note the peace-loving machine really loves the 70's... :)
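
To spell out why this breaks the test: the freshly created "newer" file gets a 1970 mtime, which is earlier than the 2000-01-01 date given to the "older" file, so any newer-than comparison comes out backwards. A small check with the values copied from the broken machine's stat output (fractional seconds truncated to microseconds; the variable names are mine):

```python
from datetime import datetime, timedelta, timezone

# Modify times copied from the broken machine's stat output.
newer_mtime = datetime(1970, 1, 17, 2, 7, 14, 312737,
                       tzinfo=timezone(timedelta(hours=1)))
older_mtime = datetime(2000, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

# Seconds since the Unix epoch for the supposedly "newer" file.
print(int(newer_mtime.timestamp()))   # 1386434, about 16 days after the epoch

# The comparison a timestamp-based update relies on is inverted.
print(newer_mtime > older_mtime)      # False
```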

I think it has to do with the fact that the script chains its
statements with &&, and the file system seems to apply the timestamp
updates only after all of them are done, unless you touch with an
explicit date and time. When I run the steps manually, it works
because by then the file has the correct timestamp.
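
A quick sanity check for this kind of breakage is to create a file and compare its mtime with the system clock. A rough sketch, assuming a 60-second tolerance is loose enough; the function names and the tolerance are my own choices, and it does not tell you whether the clock or the file system is at fault, only that they disagree.

```python
import os
import tempfile
import time


def fresh_file_skew(directory):
    """Create a temp file in `directory`; return mtime minus wall-clock time."""
    before = time.time()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write("probe\n")
        return os.stat(path).st_mtime - before
    finally:
        # Always clean up the probe file, even if stat fails.
        os.remove(path)


def is_plausible(skew_seconds, tolerance=60.0):
    """True if a new file's timestamp is within `tolerance` of the clock."""
    return abs(skew_seconds) <= tolerance


if __name__ == "__main__":
    skew = fresh_file_skew(".")
    print("skew: %.3f s, plausible: %s" % (skew, is_plausible(skew)))
```

On a board that stamps new files with 1970-01-17 while the clock reads 2015, the skew would be roughly minus 1.4 billion seconds, so the check fails loudly.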

All in all, broken file system. I've seen this before in this machine
and I think I know how to fix it.

Thanks for the help!
--renato