MIPS & GP register

Hi LLVM MIPS people,

I've been trying to keep track of the MIPS backend in order to eventually switch to clang/llvm from GCC for building our camera software. We've been using a build at revision 156432 for some time with no problems. I synced up to TOT clang/llvm today (revision 162004) to see if any optimizations had been improved, etc. The build I made with it started crashing cameras immediately. I managed to track it down to the removal of GP from the list of reserved registers in this checkin:

------------------------------------------------------------------------
r156694 | ahatanak | 2012-05-11 20:21:18 -0700 (Fri, 11 May 2012) | 3 lines

Stop reserving register $gp. Do not call isGPFI to check whether a frame object
is the $gp save slot.

Adding Mips::GP back into the reserved register list made everything go back to normal. I'm guessing that GP was removed from the list on purpose, so I don't think my change is necessarily correct, but it did fix our problems. I have attached a patch for reference anyway.

Are there some flags we should be passing to make sure GP doesn't get stomped on? To be honest, we put together the list of flags we use without much rhyme or reason. Here's what we have, which was working (filtered a bit to remove -I, -D, -W flags, grouped by flag type, and ordered by what I think matters):

    -march=mips32r2
    -mtune=4kem
    -msoft-float
    -EL

    -Xclang -triple -Xclang mipsel-sde-elf
    -Xclang -mrelocation-model -Xclang static

    -Xclang -mllvm -Xclang -mips-ssection-threshold=0
    -Xclang -mllvm -Xclang -enable-mips-delay-filler

    -Xassembler -G -Xassembler 0 -g

    -funsigned-char
    -fshort-wchar
    -fno-zero-initialized-in-bss
    -fasynchronous-unwind-tables
    -ffunction-sections
    -fdata-sections

    -Oz

Does that stuff look like it makes sense? A once-over by someone with some expertise would be much appreciated. I also see in the LLVM 3.1 release notes that "MIPS32 little-endian direct object code emission is functional" - does that mean we don't need a supporting GCC installation anymore? What do we do to enable this feature? Would our flags above need to change if we do that?

On the runtime side, clang/llvm generated a bunch of calls to floating point functions that our old-timey gcc library didn't contain - specifically __floatundidf and __floatundisf and a couple of other similarly named routines. We managed to get them out of a newer libgcc, but couldn't find anything in clang/llvm that had them. I found them in compiler-rt, but from what I gather that doesn't work for MIPS yet?
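For context, `__floatundidf` and `__floatundisf` are the soft-float helpers that convert an unsigned 64-bit integer to a double and a float respectively. The sketch below only illustrates what `__floatundidf` is expected to compute. The name `floatundidf_sketch` is made up for this example, the code is not taken from libgcc or compiler-rt, and it assumes the default round-to-nearest mode; a real runtime routine would typically build the result with integer bit manipulation instead.

```cpp
#include <cstdint>

// Hypothetical illustration of the value __floatundidf returns: the
// unsigned 64-bit input converted to the nearest double.
// Not code from libgcc or compiler-rt.
double floatundidf_sketch(std::uint64_t x) {
    const std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
    const std::uint32_t lo = static_cast<std::uint32_t>(x);
    // Each 32-bit half converts to double exactly, and hi * 2^32 is also
    // exact, so the single addition is the only step that can round.
    return static_cast<double>(hi) * 4294967296.0 + static_cast<double>(lo);
}
```

On a soft-float target the two 32-bit conversions would themselves become library calls, which is why a runtime library has to supply this whole family of routines.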

Thanks for any tips & advice!

mips_gp.patch (760 Bytes)

    -march=mips32r2
    -mtune=4kem
    -msoft-float
    -EL

    -Xclang -triple -Xclang mipsel-sde-elf
    -Xclang -mrelocation-model -Xclang static

    -Xclang -mllvm -Xclang -mips-ssection-threshold=0
    -Xclang -mllvm -Xclang -enable-mips-delay-filler

We generally try to discourage people from using -Xclang flags
wherever possible; they're really implementation details, and
considered an unstable interface.

For the triple and relocation model, the flags you're looking for are
"-target mipsel-sde-elf" and "-static". For the MIPS-specific flags,
I don't think there's an equivalent; please file bugs if it's actually
useful functionality we should be exposing with regular flags.

Does that stuff look like it makes sense? A once-over by someone with some expertise would be much appreciated. I also see in the LLVM 3.1 release notes that "MIPS32 little-endian direct object code emission is functional" - does that mean we don't need a supporting GCC installation anymore? What do we do to enable this feature? Would our flags above need to change if we do that?

"-integrated-as" will force it on. Not a MIPS expert, so no clue if
it actually works there. By itself, it's probably not particularly
useful in terms of eliminating other tools from your toolchain; like
the name of the flag says, it's basically just an integrated version of
"as".

-Eli

   -march=mips32r2
   -mtune=4kem
   -msoft-float
   -EL

   -Xclang -triple -Xclang mipsel-sde-elf
   -Xclang -mrelocation-model -Xclang static

   -Xclang -mllvm -Xclang -mips-ssection-threshold=0
   -Xclang -mllvm -Xclang -enable-mips-delay-filler

We generally try to discourage people from using -Xclang flags
wherever possible; they're really implementation details, and
considered an unstable interface.

For the triple and relocation model, the flags you're looking for are
"-target mipsel-sde-elf" and "-static".

"-static" I think is OK. "-target mipsel-sde-elf" doesn't work at all - it seems like it tries to pass that flag along to GCC, which then complains "error: unrecognized command line option ‘-triple’". I changed it to "-ccc-host-triple" and that seemed to fix that problem, but the assembler seems to freak out something fierce. At first, it said "FATAL:/usr/bin/../libexec/as/x86_64/as: I don't understand 'G' flag!", and when I removed the offending "-Xassembler" flags, it generates hundreds of errors, I think because it's calling "/usr/bin/gcc" and then getting an x86_64 assembler instead of the MIPS cross-assembler. Adding a -v flag seems to confirm this thinking. With "-ccc-host-triple mips-sde-elf" and the -v flag, I get this invocation to the assembler (-D, -W and -I flags removed):

     "/usr/bin/gcc" -funsigned-char -msoft-float -Oz -static
        -v -Xassembler -G -Xassembler 0 -ffunction-sections
        -fdata-sections -MD -march=mips32r2 -mtune=4kem -EL
        -fshort-wchar -fno-zero-initialized-in-bss
        -fasynchronous-unwind-tables -c
        -o Coach12p/RfiUiAssetsCompiled1.o -G 0 -x assembler
        /var/folders/mk/0mblc5810cjgs0nylrkjxqbm0000gq/T/RfiUiAssetsCompiled1-cNtWix.s

This GCC invocation gives the error message "i686-apple-darwin11-llvm-gcc-4.2: 0: No such file or directory", I think because of the -G 0 flag. Here's the matching assembler invocation:

     /usr/llvm-gcc-4.2/bin/../libexec/gcc/i686-apple-darwin11/4.2.1/as
        -arch x86_64 -force_cpusubtype_ALL -G 0
        -o Coach12p/RfiUiAssetsCompiled1.o
        /var/folders/mk/0mblc5810cjgs0nylrkjxqbm0000gq/T/RfiUiAssetsCompiled1-cNtWix.s

This is the invocation that says:

     FATAL:/usr/bin/../libexec/as/x86_64/as: I don't understand 'G' flag!

As I mentioned above, I tried taking out the "-Xassembler -G -Xassembler 0" flags to shut that error up and see if anything useful happened. Instead I got a bunch of errors about bad opcodes, which makes sense if it's trying to assemble MIPS assembly with an Intel assembler. Here are the invocations for reference:

     "/usr/bin/gcc" -funsigned-char -msoft-float -Oz -static
        -v -ffunction-sections -fdata-sections -MD
        -march=mips32r2 -mtune=4kem -EL -fshort-wchar
        -fno-zero-initialized-in-bss -fasynchronous-unwind-tables
        -c -o Coach12p/RfiUiAssetsCompiled1.o -x assembler
        /var/folders/mk/0mblc5810cjgs0nylrkjxqbm0000gq/T/RfiUiAssetsCompiled1-kTss18.s

     /usr/llvm-gcc-4.2/bin/../libexec/gcc/i686-apple-darwin11/4.2.1/as -arch x86_64
        -force_cpusubtype_ALL -o Coach12p/RfiUiAssetsCompiled1.o
        /var/folders/mk/0mblc5810cjgs0nylrkjxqbm0000gq/T/RfiUiAssetsCompiled1-kTss18.s

whereas when I do it with the "-Xclang -triple -Xclang mipsel-sde-elf" style it seems to call the cross-assembler correctly:

     "/usr/local/lytro/bin/mips-sde-elf-gcc" -funsigned-char
       -msoft-float -Oz -static -v -Xassembler -G -Xassembler 0
       -ffunction-sections -fdata-sections -MD -march=mips32r2
       -mtune=4kem -EL -fshort-wchar -fno-zero-initialized-in-bss
       -fasynchronous-unwind-tables -c -o Coach12p/RfiUiAssetsCompiled1.o
       -G 0 -x assembler
       /var/folders/mk/0mblc5810cjgs0nylrkjxqbm0000gq/T/RfiUiAssetsCompiled1-BmjBFI.s

      /usr/local/lytro/lib/gcc/mips-sde-elf/4.7.1/../../../../mips-sde-elf/bin/as
          -G 0 -EL -mips32r2 -O2 -no-mdebug -mabi=32 -march=mips32r2 -mtune=4kem
          --trap -G 0 -o Coach12p/RfiUiAssetsCompiled1.o
          /var/folders/mk/0mblc5810cjgs0nylrkjxqbm0000gq/T/RfiUiAssetsCompiled1-BmjBFI.s

For the MIPS-specific flags, I don't think there's an equivalent;
please file bugs if it's actually useful functionality we should
be exposing with regular flags.

Both of those flags are critical, I think. It's possible the delay slot filler isn't believed to be "ready for prime time" and so it's hidden? It does seem to work fine for us. Setting the SDATA section threshold is very important for us - our program is much too large and has far too many globals to fit in the 64 KB SDATA section. I'll file bugs and see what turns up. I can make the patches myself if you can give me the 30-second pointer in the right direction.
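To make the threshold discussion concrete, here is a minimal sketch of the usual small-data placement rule, assuming the threshold works like the `-G` option: an object is placed in small data only if its size is nonzero and no larger than the threshold. The names `goesInSmallData` and `kGpWindowBytes` are invented for this example and are not LLVM identifiers.

```cpp
#include <cstdint>

// A 16-bit signed offset from $gp reaches 32 KB on either side of it,
// which is where the 64 KB limit on the small-data area comes from.
constexpr std::uint32_t kGpWindowBytes = 1u << 16;

// Assumed placement rule (hypothetical helper, not LLVM's code): only
// nonzero-sized objects no larger than the threshold go to small data.
constexpr bool goesInSmallData(std::uint32_t sizeBytes,
                               std::uint32_t thresholdBytes) {
    return sizeBytes > 0 && sizeBytes <= thresholdBytes;
}
```

Under this rule a threshold of 0 keeps every global out of small data, which is the effect wanted when a program has too many globals to fit in the 64 KB window.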

I also see in the LLVM 3.1 release notes that "MIPS32 little-endian direct object code emission is functional" - does that mean we don't need a supporting GCC installation anymore? What do we do to enable this feature? Would our flags above need to change if we do that?

"-integrated-as" will force it on. Not a MIPS expert, so no clue if
it actually works there. By itself, it's probably not particularly
useful in terms of eliminating other tools from your toolchain; like
the name of the flag says, it's basically just an integrated version of
"as".

I tried turning it on and it seemed to work - it died on some inline assembly, but I think I can fix that and then try again. Thanks for the tip on the flag.

-- Carl

   -march=mips32r2
   -mtune=4kem
   -msoft-float
   -EL

   -Xclang -triple -Xclang mipsel-sde-elf
   -Xclang -mrelocation-model -Xclang static

   -Xclang -mllvm -Xclang -mips-ssection-threshold=0
   -Xclang -mllvm -Xclang -enable-mips-delay-filler

We generally try to discourage people from using -Xclang flags
wherever possible; they're really implementation details, and
considered an unstable interface.

For the triple and relocation model, the flags you're looking for are
"-target mipsel-sde-elf" and "-static".

"-static" I think is OK. "-target mipsel-sde-elf" doesn't work at all - it seems like it tries to pass that flag along to GCC, which then complains "error: unrecognized command line option ‘-triple’". I changed it to "-ccc-host-triple" and that seemed to fix that problem, but the assembler seems to freak out something fierce. At first, it said "FATAL:/usr/bin/../libexec/as/x86_64/as: I don't understand 'G' flag!", and when I removed the offending "-Xassembler" flags, it generates hundreds of errors, I think because it's calling "/usr/bin/gcc" and then getting an x86_64 assembler instead of the MIPS cross-assembler. Adding a -v flag seems to confirm this thinking. With "-ccc-host-triple mips-sde-elf" and the -v flag, I get this invocation to the assembler (-D, -W and -I flags removed):

     "/usr/bin/gcc" -funsigned-char -msoft-float -Oz -static
        -v -Xassembler -G -Xassembler 0 -ffunction-sections
        -fdata-sections -MD -march=mips32r2 -mtune=4kem -EL
        -fshort-wchar -fno-zero-initialized-in-bss
        -fasynchronous-unwind-tables -c
        -o Coach12p/RfiUiAssetsCompiled1.o -G 0 -x assembler
        /var/folders/mk/0mblc5810cjgs0nylrkjxqbm0000gq/T/RfiUiAssetsCompiled1-cNtWix.s

This GCC invocation gives the error message "i686-apple-darwin11-llvm-gcc-4.2: 0: No such file or directory", I think because of the -G 0 flag. Here's the matching assembler invocation:

     /usr/llvm-gcc-4.2/bin/../libexec/gcc/i686-apple-darwin11/4.2.1/as
        -arch x86_64 -force_cpusubtype_ALL -G 0
        -o Coach12p/RfiUiAssetsCompiled1.o
        /var/folders/mk/0mblc5810cjgs0nylrkjxqbm0000gq/T/RfiUiAssetsCompiled1-cNtWix.s

This is the invocation that says:

     FATAL:/usr/bin/../libexec/as/x86_64/as: I don't understand 'G' flag!

As I mentioned above, I tried taking out the "-Xassembler -G -Xassembler 0" flags to shut that error up and see if anything useful happened. Instead I got a bunch of errors about bad opcodes, which makes sense if it's trying to assemble MIPS assembly with an Intel assembler. Here are the invocations for reference:

     "/usr/bin/gcc" -funsigned-char -msoft-float -Oz -static
        -v -ffunction-sections -fdata-sections -MD
        -march=mips32r2 -mtune=4kem -EL -fshort-wchar
        -fno-zero-initialized-in-bss -fasynchronous-unwind-tables
        -c -o Coach12p/RfiUiAssetsCompiled1.o -x assembler
        /var/folders/mk/0mblc5810cjgs0nylrkjxqbm0000gq/T/RfiUiAssetsCompiled1-kTss18.s

     /usr/llvm-gcc-4.2/bin/../libexec/gcc/i686-apple-darwin11/4.2.1/as -arch x86_64
        -force_cpusubtype_ALL -o Coach12p/RfiUiAssetsCompiled1.o
        /var/folders/mk/0mblc5810cjgs0nylrkjxqbm0000gq/T/RfiUiAssetsCompiled1-kTss18.s

whereas when I do it with the "-Xclang -triple -Xclang mipsel-sde-elf" style it seems to call the cross-assembler correctly:

     "/usr/local/lytro/bin/mips-sde-elf-gcc" -funsigned-char
       -msoft-float -Oz -static -v -Xassembler -G -Xassembler 0
       -ffunction-sections -fdata-sections -MD -march=mips32r2
       -mtune=4kem -EL -fshort-wchar -fno-zero-initialized-in-bss
       -fasynchronous-unwind-tables -c -o Coach12p/RfiUiAssetsCompiled1.o
       -G 0 -x assembler
       /var/folders/mk/0mblc5810cjgs0nylrkjxqbm0000gq/T/RfiUiAssetsCompiled1-BmjBFI.s

      /usr/local/lytro/lib/gcc/mips-sde-elf/4.7.1/../../../../mips-sde-elf/bin/as
          -G 0 -EL -mips32r2 -O2 -no-mdebug -mabi=32 -march=mips32r2 -mtune=4kem
          --trap -G 0 -o Coach12p/RfiUiAssetsCompiled1.o
          /var/folders/mk/0mblc5810cjgs0nylrkjxqbm0000gq/T/RfiUiAssetsCompiled1-BmjBFI.s

That's weird... you're probably triggering some sort of bad case in
the driver logic which tries to call gcc to assemble and link on
targets where we don't know what to do. That logic is generally a bit
shaky to begin with.

For the MIPS-specific flags, I don't think there's an equivalent;
please file bugs if it's actually useful functionality we should
be exposing with regular flags.

Both of those flags are critical, I think. It's possible the delay slot filler isn't believed to be "ready for prime time" and so it's hidden? It does seem to work fine for us. Setting the SDATA section threshold is very important for us - our program is much too large and has far too many globals to fit in the 64 KB SDATA section. I'll file bugs and see what turns up. I can make the patches myself if you can give me the 30-second pointer in the right direction.

Clang::AddMIPSTargetArgs does the relevant parsing; options are
defined in include/clang/Driver/Options.td .

-Eli

Our guy that works on the Clang/LLVM driver is on vacation for another week.

That's weird... you're probably triggering some sort of bad case in
the driver logic which tries to call gcc to assemble and link on
targets where we don't know what to do. That logic is generally a bit
shaky to begin with.

It sounds like that means "time to file a bug". =) Which product should I file under?

For the MIPS-specific flags, I don't think there's an equivalent;
please file bugs if it's actually useful functionality we should
be exposing with regular flags.

Both of those flags are critical, I think. It's possible the delay slot filler isn't believed to be "ready for prime time" and so it's hidden? It does seem to work fine for us. Setting the SDATA section threshold is very important for us - our program is much too large and has far too many globals to fit in the 64 KB SDATA section. I'll file bugs and see what turns up. I can make the patches myself if you can give me the 30-second pointer in the right direction.

Clang::AddMIPSTargetArgs does the relevant parsing; options are
defined in include/clang/Driver/Options.td .

Thanks - I'll give it a look and see what I can do before filing bugs about them.

-- Carl

I don't think it's a bug, having looked at it some more. The problem is that our gcc is called "mips-sde-elf-gcc", and clang wants it to be "mipsel-sde-elf-gcc" to match the -ccc-host-triple flag we need to pass. Adding a "-ccc-gcc-name mips-sde-elf-gcc" flag fixes it.
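The behaviour described above can be modelled in a few lines. This is a hypothetical sketch of the naming rule as observed in this thread, not clang's driver code, and `crossGccName` is an invented name: the cross-compiler is assumed to be called `<triple>-gcc` unless an explicit name is supplied.

```cpp
#include <string>

// Hypothetical model of the lookup described above: default to
// "<triple>-gcc", but let an explicit override (the role played by
// -ccc-gcc-name in this thread) take precedence.
std::string crossGccName(const std::string& triple,
                         const std::string& gccNameOverride = "") {
    if (!gccNameOverride.empty()) {
        return gccNameOverride;
    }
    return triple + "-gcc";
}
```

With the triple `mipsel-sde-elf` and no override, this yields `mipsel-sde-elf-gcc`, which does not exist on the system described; passing `mips-sde-elf-gcc` as the override selects the installed cross-compiler.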

-- Carl

(forwarding to llvm-dev)


From: Akira Hatanaka <ahatanak@gmail.com>
Date: Fri, Aug 17, 2012 at 2:35 PM
Subject: Re: [LLVMdev] MIPS & GP register
To: Carl Norum <carl@lytro.com>

Will something like this fix the problem?

    if (!Subtarget.isLinux()) {
      reserve GP and GP_64
    }

To improve code, we have stopped reserving GP as a dedicated global register and have made it available to the register allocator. This works if we can initialize GP at the entry of every function, as we do now, but will not otherwise.
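As a toy model of the proposal (not LLVM code; the function names and the short list of always-reserved registers are assumptions for illustration), the conditional amounts to adding GP and GP_64 to the set of registers the allocator may never assign whenever the target is not Linux:

```cpp
#include <set>
#include <string>

// Toy model, not LLVM code: the set of registers the register allocator
// must never hand out. The always-reserved names are an illustrative subset.
std::set<std::string> reservedRegs(bool isLinux) {
    std::set<std::string> reserved = {"ZERO", "K0", "K1", "SP"};
    if (!isLinux) {
        // On non-Linux targets, keep $gp out of the allocator's hands so
        // code that relies on its value is not disturbed.
        reserved.insert("GP");
        reserved.insert("GP_64");
    }
    return reserved;
}

// A register is allocatable only if it is not in the reserved set.
bool isAllocatable(const std::string& reg, bool isLinux) {
    return reservedRegs(isLinux).count(reg) == 0;
}
```

In this model a bare-metal target keeps GP reserved, while a Linux target leaves it free for allocation.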

Yes, that’s fine as long as it works for our triple, though I’m not quite sure how that would work out. Our target is bare-metal MIPS. If I had my druthers we would freely use GP as well, but we have some vendor-provided libraries that rely on it not getting whacked. Maybe someday I can go in and rebuild all that stuff, but it’s definitely a future project for us.

-- Carl

OK here are some updated patches. Thanks for checking them out. Is this the right place to send them, or should I be sending to llvm-commits?

-- Carl

delay-filler.patch (1.08 KB)

reserve-gp.patch (760 Bytes)