Arm: disabling/disallowing Thumb instructions

Hi all,

When I use Clang, I can add -mno-thumb to the command line and Clang
generates pure Arm code without any use of Thumb instructions.

However, I am messing about with the Glasgow Haskell Compiler (GHC)
which generates LLVM IR code directly and then calls `opt` and `llc` on
that IR code. The generated IR code currently has:

    target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:64:128-a0:0:64-n32"
    target triple = "armv6-unknown-linux-gnueabihf"

in the header, but the generated assembly uses both Arm and Thumb
instructions.

Any clues appreciated.

Cheers,
Erik

That shouldn't be happening. The "armv6" ought to imply ARM mode. What
Thumb instructions are you seeing, and do you have a .ll file that
reproduces the issue with standard tools or is it just the way GHC is
driving LLVM?

Cheers.

Tim.

Hi Erik,

That's really odd. Are you sure the Thumb part was really generated by
the compiler, instead of some runtime chunks in GHC, or third-party
static libraries?

Can you dump assembly, or are you disassembling the object files? Or
is this a whole-program disassembly, where all libraries are linked
together?

cheers,
--renato

Tim Northover wrote:

> That shouldn't be happening. The "armv6" ought to imply ARM mode. What
> Thumb instructions are you seeing, and do you have a .ll file that
> reproduces the issue with standard tools or is it just the way GHC is
> driving LLVM?

Sorry, false alarm.

I was looking at an object file produced by GCC compiling a C file, not
an object file produced by GHC compiling a Haskell file.

GHC on Linux uses GCC by default to compile C (GHC's runtime system is
written in C) and LLC/OPT when compiling Haskell code. For some reason
gcc on armhf/linux by default produces Thumb code. GCC needs to be passed
-marm on the command line to force production of pure Arm code.

This has finally got me to the bottom of one of the most difficult bugs
I've ever worked on.

Cheers,
Erik

> GHC on Linux uses GCC by default to compile C (GHC's runtime system is
> written in C) and LLC/OPT when compiling Haskell code. For some reason
> gcc on armhf/linux by default produces Thumb code. GCC needs to be passed
> -marm on the command line to force production of pure Arm code.

Yes, GCC defaults to Thumb2. Yet another triple issue where things are
not what they seem.

Here, the "arm" in "arm-linux-gnueabihf" means ARM the architecture,
not the instruction set. In LLVM, we mean it as the instruction set, with
"thumb-linux-gnueabihf" as Thumb2.

You're not the first one to fall for that. As a matter of fact, that
was probably one of my first "bugs" in LLVM, too. :)
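
As a rough illustration of the triple convention described above, here is
a minimal Python sketch that classifies the first component of a target
triple the way LLVM reads it. The function name `isa_from_llvm_triple` is
hypothetical, and the prefix matching is a simplification for illustration,
not LLVM's actual triple parser.

```python
def isa_from_llvm_triple(triple: str) -> str:
    """Guess the instruction set LLVM would select from a triple's
    architecture component (simplified, hypothetical helper)."""
    # The architecture is the first dash-separated field of the triple.
    arch = triple.split("-", 1)[0].lower()
    if arch.startswith("thumb"):
        return "thumb"
    # Check the 64-bit names first, since "arm64" also starts with "arm".
    if arch.startswith("arm64") or arch.startswith("aarch64"):
        return "aarch64"
    if arch.startswith("arm"):
        return "arm"
    return "other"
```

Read this way, the triple from the original post,
armv6-unknown-linux-gnueabihf, selects the Arm instruction set in LLVM,
whereas GCC's choice depends on how it was configured or on -marm/-mthumb.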

> This has finally got me to the bottom of one of the most difficult bugs
> I've ever worked on.

I'm glad you worked things out. :)

cheers,
--renato

Renato Golin wrote:

> You're not the first one to fall for that. As a matter of fact, that
> was probably one of my first "bugs" in LLVM, too. :)

Yeah, this GHC one was a little more complex. The compiler itself was
completely fine. The linker was quite happy to link Arm and Thumb code
into an executable that worked.

The problem was in the GHC interactive environment, which has its own
runtime linker that loads object files. That runtime linker was loading
code compiled from Haskell, and hence generated as Arm code (via the LLVM
backend), but that code was being called from the C runtime code that was
compiled as Thumb.

The C code was calling into Haskell-compiled functions inside the executable
without a problem (the platform linker was doing the right thing). It was
also able to execute the Arm code loaded by the runtime linker. The SIGILL
was happening in the C runtime code after it *returned* from the Arm code
that was loaded by the runtime linker.
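
For anyone wondering why mixing the two instruction sets is delicate: on
32-bit Arm, the low bit of an address used by an interworking branch
(`bx`/`blx`) selects the instruction set, 1 for Thumb and 0 for Arm, and
ELF function symbols for Thumb code carry that bit. The thread does not say
which step the runtime linker got wrong, so the following is only a general
sketch of that convention with hypothetical helper names, not a description
of the actual bug.

```python
THUMB_BIT = 0x1  # bit 0 of a function or return address marks Thumb code


def target_is_thumb(addr: int) -> bool:
    """True if an interworking branch to addr would enter Thumb state."""
    return bool(addr & THUMB_BIT)


def code_address(addr: int) -> int:
    """Strip the state bit to get where the instructions actually live."""
    return addr & ~THUMB_BIT


def needs_state_switch(caller_is_thumb: bool, target_addr: int) -> bool:
    """True if control transfer to target_addr must change instruction set,
    so a plain same-state branch would decode the target incorrectly."""
    return caller_is_thumb != target_is_thumb(target_addr)
```

A loader that resolves calls between Thumb C code and Arm Haskell code has
to get this right in both directions, for the call and for the return.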

Bizarre!

Erik