[3.7.0] Two late issues with cross compilation to mips

Hi,

Sorry for the late report but I’ve only just found these issues. llvm.org isn’t working for me at the moment but I’ll file tickets once it is.

The issues are:

  1. Almabench has some significant numerical differences and fails the reference check for some configs. I’m investigating this one at the moment but early indications are that it’s a similar (but different) problem to the one we had in LLVM 3.6.2.

  2. Read-only exception tables have broken compatibility with the ~2-year-old gcc toolchains I was using for release testing cross compilation. This isn’t a problem for most test-suite runs since we can just update the assembler, but it is causing trouble for microMIPS. More recent toolchains lack the microMIPS multilib I was using, and migrating to the new one is causing link failures. These failures are related to the ELF header bits that specify whether the SNaN/QNaN encodings are IEEE754-1985 or IEEE754-2008 compliant. I suspect that -mnan=2008 isn’t reaching the assembler.

  3. Clang is incompatible with changes to the mips-mti-linux-gnu sysroot from Imagination’s mips-mti-linux-gnu toolchain. Libraries are still multilib’d (albeit with a reduced set) but some of the include paths aren’t anymore. It’s also no longer correct to include sysroot/include (this path is added by common code) since this skips some function definitions. Instead, we must only include sysroot/usr/include like GCC does. There may be more details but so far the fix doesn’t look simple. As far as I can tell, clang’s multilib support expects includes and libraries to have the same layout (osSuffix() seems to control both). The good news is that it’s not a regression since we can use toolchains from before this layout change.

Daniel Sanders

Leading Software Design Engineer, MIPS Processor IP

Imagination Technologies Limited

www.imgtec.com

#2 has turned out to be user error combined with inconsistent leniency in the driver. -mnan=2008 strictly conforms to the documentation: it is passed on for MIPS32R3 and ignored for MIPS32R2 (which didn’t have IEEE754-2008 support). However, -mmicromips is (currently) permitted for MIPS32R2 even though it too was added in MIPS32R3. I’ve corrected my test configuration and I’ll tighten this up in later releases.

Hi Daniel,

I am on vacation until Aug 3 but I can take a look at these
problems. Which is the most important?

As to the issue #3 - do we need to keep compatibility with the old
mips-mti-linux-gnu toolchain layout?

Simon

#1 is the most important but I'm already debugging that. I don't want to interrupt your holiday but if you could look at #3 that would be great.

> As to the issue #3 - do we need to keep compatibility with the old mips-mti-linux-gnu toolchain layout?

It would be nice to support both the old and new layouts but I doubt that's possible with the current framework (see below). If it's not possible then we should use the new layout and encourage users to update their gcc toolchains since the new layout has been in use for at least 6 months.

I believe that supporting both layouts ties into the later target triple work to some degree. There are at least three mips-mti-linux-gnu toolchains with different sets of multilibs, and anyone can build more variants of the mips-mti-linux-gnu toolchain. So ideally, we should be able to configure the multilib layout at configure-time or using config files. Of course, implementing this kind of thing is too much work for this release.

Okay. I will take a look at issue #3.

I believe I've identified the problem with almabench but I haven't found the root cause in the compiler yet.

The problem is that a caller-saved register ($f14) is being moved across a call and this call sometimes clobbers the value. As a result, the value of the TWOPI constant used in the fmod() calls isn't always 2*PI.

According to -print-after-all, the pass that moves the instruction is Simple Register Coalescing. The bit I'm stuck on at the moment is that I'm not sure what information is supposed to prevent this move from happening. I thought there was supposed to be an ImplicitDefine on the call instruction for each clobbered register but this doesn't seem to be the case. Am I missing something obvious?

To reduce memory consumption, clobbered registers are handled with RegisterMask machine operands, which contain a bitset describing all of the registers clobbered.

- Matthias
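
For readers unfamiliar with register masks, here is a minimal standalone sketch of the idea. It is not LLVM code: the register numbers, `RegMask` alias, and `makePreservedMask` helper are hypothetical and exist only for illustration. The one thing it mirrors from LLVM is the polarity of the bits as I understand it: a set bit marks a register as preserved across the call, so a clear bit means the call clobbers it (the same test that `MachineOperand::clobbersPhysReg` performs).

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Hypothetical register numbers for this sketch only; real numbering comes
// from the target's generated register tables.
enum Reg : unsigned { D0 = 1, D6 = 2, D7 = 3, D10 = 4, S0 = 5, NumRegs = 6 };

// One bit per physical register, packed into 32-bit words.
// A set bit means "preserved across the call".
using RegMask = std::vector<uint32_t>;

// Build a mask from the list of registers the callee preserves.
RegMask makePreservedMask(std::initializer_list<unsigned> preserved) {
  RegMask mask((NumRegs + 31) / 32, 0);
  for (unsigned reg : preserved) {
    assert(reg < NumRegs);
    mask[reg / 32] |= 1u << (reg % 32);
  }
  return mask;
}

// A register is clobbered by the call when its bit is NOT set in the mask.
bool clobbersPhysReg(const RegMask &mask, unsigned reg) {
  assert(reg < NumRegs);
  return !(mask[reg / 32] & (1u << (reg % 32)));
}
```

With a mask built as `makePreservedMask({D10, S0})`, `clobbersPhysReg(mask, D7)` is true and `clobbersPhysReg(mask, D10)` is false. A single mask operand therefore replaces one implicit-def operand per clobbered register, which is where the memory saving comes from.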

Thanks. This is making a lot more sense now and it's looking like this issue isn't Mips specific.

Here's the IR dump before simple register coalescing (note: I've patched the IR printer to print the contents of the regmask):
4480B %vreg260<def> = LDC1 %vreg253, <cp#3>[TF=6]; mem:LD8[ConstantPool] AFGR64:%vreg260 GPR32:%vreg253
4496B %vreg261<def> = FMUL_D32 %vreg247, %vreg248; AFGR64:%vreg261,%vreg247,%vreg248
4512B ADJCALLSTACKDOWN 16, %SP<imp-def>, %SP<imp-use>
4528B %D6<def> = COPY %vreg243; AFGR64:%vreg243
4544B JAL <ga:@sin>, <regmask %FP %RA %D10 %D11 %D12 %D13 %D14 %D15 %F20 %F21 %F22 %F23 %F24 %F25 %F26 %F27 %F28 %F29 %F30 %F31 %S0 %S1 %S2 %S3 %S4 %S5 %S6 %S7 >, %RA<imp-def,dead>, %D6<imp-use,kill>, %SP<imp-def>, %D0<imp-def>
4560B ADJCALLSTACKUP 16, 0, %SP<imp-def>, %SP<imp-use>
4576B %vreg262<def> = COPY %D0<kill>; AFGR64:%vreg262
4592B %vreg263<def> = FMUL_D32 %vreg256, %vreg262; AFGR64:%vreg263,%vreg256,%vreg262
4608B ADJCALLSTACKDOWN 16, %SP<imp-def>, %SP<imp-use>
4624B %vreg264<def> = FADD_D32 %vreg261, %vreg263; AFGR64:%vreg264,%vreg261,%vreg263
4640B %D6<def> = COPY %vreg255; AFGR64:%vreg255
4656B JAL <ga:@cos>, <regmask %FP %RA %D10 %D11 %D12 %D13 %D14 %D15 %F20 %F21 %F22 %F23 %F24 %F25 %F26 %F27 %F28 %F29 %F30 %F31 %S0 %S1 %S2 %S3 %S4 %S5 %S6 %S7 >, %RA<imp-def,dead>, %D6<imp-use,kill>, %SP<imp-def>, %D0<imp-def>
4672B ADJCALLSTACKUP 16, 0, %SP<imp-def>, %SP<imp-use>
4688B %vreg265<def> = COPY %D0<kill>; AFGR64:%vreg265
4704B %vreg266<def> = FMUL_D32 %vreg258, %vreg265; AFGR64:%vreg266,%vreg258,%vreg265
4720B ADJCALLSTACKDOWN 16, %SP<imp-def>, %SP<imp-use>
4736B %D6<def> = COPY %vreg255; AFGR64:%vreg255
4752B JAL <ga:@sin>, <regmask %FP %RA %D10 %D11 %D12 %D13 %D14 %D15 %F20 %F21 %F22 %F23 %F24 %F25 %F26 %F27 %F28 %F29 %F30 %F31 %S0 %S1 %S2 %S3 %S4 %S5 %S6 %S7 >, %RA<imp-def,dead>, %D6<imp-use,kill>, %SP<imp-def>, %D0<imp-def>
4768B ADJCALLSTACKUP 16, 0, %SP<imp-def>, %SP<imp-use>
4784B %vreg267<def> = COPY %D0<kill>; AFGR64:%vreg267
4800B %vreg268<def> = FMUL_D32 %vreg257, %vreg267; AFGR64:%vreg268,%vreg257,%vreg267
4816B ADJCALLSTACKDOWN 16, %SP<imp-def>, %SP<imp-use>
4832B %vreg269<def> = FMUL_D32 %vreg0, %vreg264; AFGR64:%vreg269,%vreg0,%vreg264
4848B %vreg270<def> = FADD_D32 %vreg266, %vreg268; AFGR64:%vreg270,%vreg266,%vreg268
4864B %vreg271<def> = FMUL_D32 %vreg0, %vreg270; AFGR64:%vreg271,%vreg0,%vreg270
4880B %vreg272<def> = FMUL_D32 %vreg271, %vreg259; AFGR64:%vreg272,%vreg271,%vreg259
4896B %vreg273<def> = FMUL_D32 %vreg269, %vreg259; AFGR64:%vreg273,%vreg269,%vreg259
4912B %vreg274<def> = FADD_D32 %vreg24, %vreg273; AFGR64:%vreg274,%vreg24,%vreg273
4928B %vreg275<def> = FADD_D32 %vreg274, %vreg272; AFGR64:%vreg275,%vreg274,%vreg272
4944B %D6<def> = COPY %vreg275; AFGR64:%vreg275
4960B %D7<def> = COPY %vreg260; AFGR64:%vreg260
4976B JAL <ga:@fmod>, <regmask %FP %RA %D10 %D11 %D12 %D13 %D14 %D15 %F20 %F21 %F22 %F23 %F24 %F25 %F26 %F27 %F28 %F29 %F30 %F31 %S0 %S1 %S2 %S3 %S4 %S5 %S6 %S7 >, %RA<imp-def,dead>, %D6<imp-use,kill>, %D7<imp-use>, %SP<imp-def>, %D0<imp-def>

The %vreg260 at 4480B is being coalesced with the %D7 at 4960B, but the call-preserved masks in the JALs (jump and link) at 4544B, 4656B, and 4752B say that D7 isn't preserved. The pass doesn't call getRegMask() in any obvious way so it seems likely that it's not respecting the mask.
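
To make the missing check concrete, here is a small standalone model of it. This is only a sketch under stated assumptions, not the coalescer's code and not the eventual patch: `CallSite`, `clobberedInRange`, and the register numbers `kD7`/`kD10` are hypothetical names, and each mask is a single word in which a set bit means "preserved by the callee". The slot numbers are the ones from the dump above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a call instruction: its slot index in the dump and a
// one-word mask in which a set bit means "register preserved by the callee".
struct CallSite {
  unsigned slot;
  uint32_t preservedMask;
};

// Hypothetical register numbers for this sketch only.
constexpr unsigned kD7 = 3;
constexpr unsigned kD10 = 4;

// Returns true if any call strictly between defSlot and useSlot does not
// preserve `reg`. Joining a virtual register that is live over that range
// with `reg` would then be unsafe.
bool clobberedInRange(const std::vector<CallSite> &calls, unsigned defSlot,
                      unsigned useSlot, unsigned reg) {
  assert(reg < 32 && defSlot <= useSlot);
  for (const CallSite &call : calls) {
    bool inRange = call.slot > defSlot && call.slot < useSlot;
    bool preserved = (call.preservedMask & (1u << reg)) != 0;
    if (inRange && !preserved)
      return true;
  }
  return false;
}
```

Modelling the three JALs at slots 4544, 4656, and 4752 with masks that preserve D10 but not D7, `clobberedInRange(calls, 4480, 4960, kD7)` returns true, so joining %vreg260 with %D7 over that range should be rejected. The same query for `kD10` returns false.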

Thanks again. I've got to the bottom of this and submitted a patch at http://reviews.llvm.org/D11649. It seems the register coalescer doesn't look at regmask operands.