"grep -w" irregularity

Not entirely sure how to categorize this particular problem, but it's clearly platform-related: "grep -w" appears to behave differently on the x86_64 Linux buildbot than on my local Mac OS 10.4.11 and Ubuntu x86_64 machines. In the CellSPU's shift_ops.ll test case, "grep -w shlh" returns the expected 9 occurrences locally, whereas the x86_64 buildbot finds 10.

Any suggestions for a workaround, other than to ditch using "grep -w"?

-scooter

2008/12/30 Scott Michel <scottm@aero.org>

Not entirely sure how to categorize this particular problem, but it’s
clearly platform test related: “grep -w” appears to operate
differently on the x86_64 linux buildbot versus my local Mac OS
10.4.11 and Ubuntu x86_64. In the CellSPU’s shift_ops.ll test case,
“grep -w shlh” returns the correct 9 expected occurrences, whereas the
x86_64 buildbot finds 10.

Does the asm output differ, or does grep output itself differ on these two platforms, with the same asm input file? I just took the .s output file from a run on an x86/Linux box, and tested it with grep on multiple systems, including the ones you list, and they all agree the answer is 9.

Can you diff the assembly files generated on those two platforms (though they should be identical, since llc specifies the architecture)?
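
One way to do that comparison, as a minimal sketch: the file names `local.s` and `buildbot.s` below are hypothetical placeholders for the two llc outputs, and Python's standard `difflib` stands in for whatever diff tool is at hand.

```python
import difflib


def diff_asm(local_lines, remote_lines,
             local_name="local.s", remote_name="buildbot.s"):
    """Return unified-diff lines between two assembly listings.

    The default names are placeholders, not files from this thread.
    """
    return list(difflib.unified_diff(
        local_lines, remote_lines,
        fromfile=local_name, tofile=remote_name, lineterm=""))


def diff_asm_files(local_path, remote_path):
    """Read two .s files and diff them line by line."""
    with open(local_path) as f:
        local_lines = f.read().splitlines()
    with open(remote_path) as f:
        remote_lines = f.read().splitlines()
    return diff_asm(local_lines, remote_lines, local_path, remote_path)
```

An empty result means the two listings are identical, which would point away from llc and toward the tools that post-process its output.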

Any suggestions for a workaround, other than to ditch using “grep -w”?

Personally, I think that testing optimization correctness via ‘grep’, as we do now, should be replaced by unit tests, which are much more precise.

Misha

Also, it’s possible that you have a 32/64-bit issue in the CellSPU backend. Have you tried running llc built for a 64-bit host?

-Chris

Misha:

It’s not like I can ssh into the x86_64 buildbot, so I really can’t tell. All I can tell is that I get blamed for a failed build.

-scooter

Chris:

No. It’s a difference between grep on the x86_64 buildbot vs. the rest of the world, apparently.

-scooter

Chris:

On my local x86_64 Ubuntu 7.10 machine, the shift_ops.ll is an unexpected success (i.e., “grep -w shlh %t1.s | count 9” succeeds.)

I get the same unexpected success on my x86_64 Mac 10.4.11.

On the x86_64 buildbot, the same test fails. The culprit is grep, evidently. It’s just that simple.

I suspect there’s not really an issue with endianness, since all the test does is (a) generate code using the backend and (b) grep for certain instructions. Nothing is actually executed.

-scooter

Chris:

On my _local_ x86_64 Ubuntu 7.10 machine, the shift_ops.ll is an unexpected success (i.e., "grep -w shlh %t1.s | count 9" succeeds.)

I get the same unexpected success on my x86_64 Mac 10.4.11.

On the x86_64 buildbot, the same test fails. The culprit is grep, evidently. It's just that simple.

Not necessarily. That builder could be getting a different .s file from LLC.

I suspect there's not really an issue with endianness, since all the test does is (a) generate code using the backend and (b) grep for certain instructions. Nothing is actually executed.

LLC is still run. If there is a bug in the code generator, it could easily manifest itself this way. 32/64-bit portability issues, buffer overruns and other undefined behavior could easily cause this sort of thing. Please ask the owner of that builder nicely to send you the .s file that it is producing. If it is identical to the one you get then I'll believe it is a grep difference, but that doesn't sound like the most likely issue.

-Chris

Chris:

On my _local_ x86_64 Ubuntu 7.10 machine, the shift_ops.ll is an
unexpected success (i.e., "grep -w shlh %t1.s | count 9" succeeds.)

I get the same unexpected success on my x86_64 Mac 10.4.11.

On the x86_64 buildbot, the same test fails. The culprit is grep,
evidently. It's just that simple.

Not necessarily. That builder could be getting a different .s file
from LLC.

Given that I have an x86_64 machine here and the buildbot is x86_64, and given that my local x86_64 machine "unexpectedly" succeeds, I'm less inclined to suspect LLC at this point.

Granted, I'm probably not running the same Linux distribution that the buildbot is running. But still, my local Linux x86_64 box succeeds where the buildbot fails, using the same svn version (i.e., no diffs between the LLVM repo and my local copy.)

LLC is still run. If there is a bug in the code generator, it could
easily manifest itself this way. 32/64-bit portability issues, buffer
overruns and other undefined behavior could easily cause this sort of
thing. Please ask the owner of that builder nicely to send you the .s
file that it is producing. If it is identical to the one you get then
I'll believe it is a grep difference, but that doesn't sound like the
most likely issue.

I'd be completely shocked if llc were _not_ run, since the test invokes it. I'll contact the buildbot owner -- maybe it's something funky between Linux distributions (oh, now there's a total surprise!)

-scooter

The buildbot is running 8.04 of ubuntu.
This actually does appear to be a bug in grep.

The grep -w command on the buildbot produces the following output:

  shlh $3, $3, $4
  shlh $3, $4, $3
  shlh $3, $3, $4
  shlh $3, $4, $3
  shlh $3, $3, $4
  shlh $3, $4, $3
  shlh $3, $4, $3
  shlh $3, $4, $3
  .size shlhi_i16_7,.-shlhi_i16_7
  shlh $3, $4, $3

man grep says the following for -w:

  -w, --word-regexp
              Select only those lines containing matches that form whole
              words. The test is that the matching substring must either be
              at the beginning of the line, or preceded by a non-word
              constituent character. Similarly, it must be either at the end
              of the line or followed by a non-word constituent character.
              Word-constituent characters are letters, digits, and the
              underscore.

Clearly, the .size line does not meet the ending criterion, since
neither of its matches is followed by a non-word constituent
character or end of line.
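
The rule quoted from the man page can be written down directly, as a rough sketch. The helper `matches_whole_word` is a hypothetical name, and it handles only a literal pattern with ASCII word characters; it is not how grep itself is implemented.

```python
import re

# Word-constituent characters per the man page: letters, digits, underscore.
WORD_CHARS = "A-Za-z0-9_"


def matches_whole_word(pattern, line):
    """True if the literal `pattern` occurs in `line` as a whole word."""
    regex = (r"(?<![" + WORD_CHARS + r"])"    # line start or non-word char before
             + re.escape(pattern)
             + r"(?![" + WORD_CHARS + r"])")  # line end or non-word char after
    return re.search(regex, line) is not None


# The ten lines the buildbot's grep reported.
lines = [
    "  shlh $3, $3, $4",
    "  shlh $3, $4, $3",
    "  shlh $3, $3, $4",
    "  shlh $3, $4, $3",
    "  shlh $3, $3, $4",
    "  shlh $3, $4, $3",
    "  shlh $3, $4, $3",
    "  shlh $3, $4, $3",
    "  .size shlhi_i16_7,.-shlhi_i16_7",
    "  shlh $3, $4, $3",
]

hits = [l for l in lines if matches_whole_word("shlh", l)]
print(len(hits))  # 9: the .size line is rejected
```

Under this reading of the documented rule, only the nine instruction lines qualify, which agrees with the count the test expects.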

I've attached the .s file it produces in case you want to file a bug
against grep.

bugingrep (6.54 KB)

Also, it only happens with the locale set to en_US.UTF-8 (the default on
Ubuntu 8.04), but it's still a bug even in a UTF-8 locale, since both
shlh's are followed immediately by an "i", which is not a "non-word
constituent character".

Granted, I’m probably not running the same Linux distribution that
the buildbot is running. But still, my local Linux x86_64 box
succeeds where the buildbot fails, using the same svn version (i.e.,
no diffs between the LLVM repo and my local copy.)

The buildbot is running 8.04 of ubuntu.
This actually does appear to be a bug in grep.

Thanks DannyB! Scott, I guess this means we should avoid grep -w :(

-Chris

* Scott Michel:

On my _local_ x86_64 Ubuntu 7.10 machine, the shift_ops.ll is an
unexpected success (i.e., "grep -w shlh %t1.s | count 9" succeeds.)

I get the same unexpected success on my x86_64 Mac 10.4.11.

On the x86_64 buildbot, the same test fails. The culprit is grep,
evidently. It's just that simple.

There have been issues with the GNU libc regular expression code. Try
running with "unset LANG" (or "LC_ALL=C") and see if it improves
things.

The problem is that the regexp code used to be unacceptably slow in
multi-byte locales such as UTF-8, and the patch Debian applied to
improve its speed wasn't 100% correct.
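
As a sketch of that suggestion, the count could be taken with the locale forced to C. The wrapper below is hypothetical (`count_word_matches` is not part of the LLVM test harness); it relies only on grep's standard `-c`, `-w`, and `-e` options and on the `LC_ALL` environment variable.

```python
import os
import shutil
import subprocess


def count_word_matches(pattern, path):
    """Count lines in `path` matched by `grep -w`, run in the C locale."""
    if shutil.which("grep") is None:
        raise FileNotFoundError("grep not found on PATH")
    # LC_ALL takes precedence over LANG and the other LC_* variables.
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(
        ["grep", "-c", "-w", "-e", pattern, "--", path],
        env=env, capture_output=True, text=True)
    # grep exits 0 on a match, 1 on no match, and >1 on error.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return int(result.stdout.strip())
```

Setting the variable only for the child process keeps the rest of the test run in whatever locale the builder uses.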

Considering most regexps can be done in linear time, it seems fairly
dumb to break them to get speed, instead of simply changing
algorithms.
(The fact that most implementations suck badly, well ....)

I reckon that's probably a good idea. I haven't filed a bug report against a GNU tool in years.

-scooter

Even a broken clock is right twice a day? (Me being the broken clock. :) )

-scooter

* Daniel Berlin:

There have been issues with the GNU libc regular expression code. Try
running with "unset LANG" (or "LC_ALL=C") and see if it improves
things.

The problem is that the regexp code used to be unacceptably slow in
multi-byte locales such as UTF-8, and the patch Debian applied to
improve its speed wasn't 100% correct.

Considering most regexps can be done in linear time, it seems fairly
dumb to break them to get speed, instead of simply changing
algorithms.

IIRC, it's not an issue of complexity classes. With multi-byte
character set conversion, the constant factor is just too large.