Understanding targets

Heyho everyone,

I have a big noob question again! I basically want to know how I can find out which processors a given installation of Clang supports, and how I change the target processor.

Some background to make my question more clear:

Normally I develop 64-bit applications for Windows, originally with the MSVC compiler. I have since switched to the fantastic clang-cl and didn’t think much about processors. I take it that it automatically targets x86-64 because of Visual Studio?

However, I know that Clang supports more processors (or architectures? What is the difference?) than x86-64. So I wanted to know which ones are supported, and how do I tell Clang to use a different target?

To give even more background:

I used an old development environment, which provided a compiler for the old PlayStation CPU. Wikipedia says it is a “32-bit RISC MIPS R3000A-compatible MIPS R3051” (I simply thought it was an R3000). And I started to wonder: how do I know whether Clang supports that old CPU?

So I ran my llvm-config with --targets-built and got:

AArch64 AMDGPU ARM BPF Hexagon Lanai Mips MSP430 NVPTX PowerPC RISCV Sparc SystemZ WebAssembly X86 XCore

I see “Mips” and “RISCV” there, but no RISC MIPS or even an R3000. Does that mean that Clang does not support those old processors? Or does it? What do I do if I encounter a ‘wild’ Clang without llvm-config: can I still find out its targets? I have no idea, and this is why I’m asking.

Thank you in advance for any help!

Kind greetings

Björn Gaier

The term “target” is somewhat overloaded.

When llvm-config tells you it was built with the X86 target, that actually includes a variety of closely related architectures, such as x86_64, i386, and so on. Within the x86_64 architecture, there are many individual processor implementations that LLVM understands, such as Skylake, Bulldozer, and many many more.

What clang means by “target” is really the target triple, which includes the architecture, OS, and sometimes other information. For clang-cl the default triple is likely something along the lines of “x86_64-pc-windows-msvc” and tends to be derived from the characteristics of the environment where you are running clang.

Clang accepts a -target option which allows you to specify a different triple. The first component of the triple would have to be an architecture name that is supported by one of the LLVM targets that clang was built with. So, to support triples starting with “x86_64” you would need a clang that includes the X86 target, and so on.
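
As a rough illustration of the relationship described above, here is a minimal Python sketch that takes the first component of a triple and checks whether the matching backend appears in the `llvm-config --targets-built` output quoted in the first message. The mapping table is an assumption for illustration and lists only the architecture names mentioned in this thread; the helper names are hypothetical and not part of any LLVM tool.

```python
# Output of `llvm-config --targets-built`, copied from the first message.
TARGETS_BUILT = (
    "AArch64 AMDGPU ARM BPF Hexagon Lanai Mips MSP430 NVPTX PowerPC "
    "RISCV Sparc SystemZ WebAssembly X86 XCore"
)

# Illustrative and incomplete: triple architecture name -> LLVM backend.
# Only names mentioned in this thread are listed (assumption, not a full table).
ARCH_TO_BACKEND = {
    "x86_64": "X86",
    "i386": "X86",
    "mips": "Mips",
    "mipsel": "Mips",
}


def triple_arch(triple):
    """Return the first component of a target triple, e.g. 'mipsel'."""
    return triple.split("-", 1)[0]


def backend_available(triple, targets_built=TARGETS_BUILT):
    """True if the backend for this triple's architecture was built in."""
    backend = ARCH_TO_BACKEND.get(triple_arch(triple))
    return backend is not None and backend in targets_built.split()
```

With the list above, `backend_available("mipsel-linux-gnu")` is true because the Mips backend was built, while an unknown architecture name such as `r3000` is simply not found in the table.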

Regarding the Mips target, it looks like the supported 32-bit architecture names are “mips” and “mipsel” so you could experiment with using triples starting with those strings. I don’t know anything in particular about the Mips target other than what I just said. I have cc’d the code owner of the MIPS target, who might be able to help you there.

–paulr

Clang currently accepts the following MIPS CPU names. The list can be
found in "clang/lib/Basic/Targets/Mips.cpp". mips1 and mips5 are accepted
by Clang but unsupported by the code generator. I'm going to remove them
from this list.

mips1, mips2, mips3, mips4, mips5,
mips32, mips32r2, mips32r3, mips32r5, mips32r6,
mips64, mips64r2, mips64r3, mips64r5, mips64r6,
octeon, octeon+, p5600

The R3000 is a CPU that implements the mips1 instruction set architecture.
Unfortunately, you cannot generate code for this CPU using Clang.
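
To make the distinction between "accepted by Clang" and "supported by the code generator" concrete, here is a small Python sketch built only from the list in this message. It is a snapshot of what this message states, not a query against any real Clang installation, and the function name is hypothetical.

```python
# CPU names Clang accepts, as listed in this message.
ACCEPTED_CPUS = (
    "mips1", "mips2", "mips3", "mips4", "mips5",
    "mips32", "mips32r2", "mips32r3", "mips32r5", "mips32r6",
    "mips64", "mips64r2", "mips64r3", "mips64r5", "mips64r6",
    "octeon", "octeon+", "p5600",
)

# Names the message says are accepted but have no code generator support.
NO_CODEGEN = {"mips1", "mips5"}


def can_generate_code(cpu):
    """True if the name is accepted and the code generator supports it."""
    if cpu not in ACCEPTED_CPUS:
        raise ValueError(f"unknown MIPS CPU name: {cpu}")
    return cpu not in NO_CODEGEN
```

Under these assumptions `can_generate_code("mips1")` is false, which is the R3000 case, while `can_generate_code("mips32r2")` is true.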

We never implemented mips1 codegen as it was orders of magnitude harder than mips2 (mostly because of the delay slots on load instructions) and there's no 'generic' mips1 target (because coprocessor 0 wasn't standardized).
However, we had mips2 working as it was needed to build for Debian. Did it get broken or did they move their mips port up to something more recent?
I think David is actively using mips4 too. @David: Is that right?

Hello Paul and Simon, (Sorry - I'm not sure about the social conventions in mailing lists)

Both of your answers helped me a lot! So if I understand it correctly, Clang knows what 'mips1' and 'mips5' are but can't generate code for them? Why is that?

I actually have some more general questions about processors... If this is the wrong place for them, please ignore them, I'm just a bit confused.
So the R3000 is a "MIPS CPU"? What does that actually mean? Is the architecture MIPS? Or the producer? When I go to Wikipedia I see MIPS as the designer, so I take it that it is like saying "Intel CPU" or "AMD CPU", but that does not tell me anything about the assembly instructions it uses, right?
But then I also see "RISC" as the design, which, as I understood it, describes the assembly instructions? But why would I tell Clang to target "mips1" when the design of the R3000 is RISC? Why isn't RISCV correct then? Or RISC1 or so...

Also, how does that influence floating point arithmetic? I have often heard that those are separate processors, FPUs(?). So could it be that there is an additional processor besides the processor I know about? Like R3000 + FPU? Wouldn't Clang or any other compiler have to know about such a construct, or is that not the case?

Sorry again if this is too much off topic - I simply never thought about such stuff before °/////°

Thank you in advance and kind greetings
Björn

Hi,

1. By architecture I mean instruction set architecture (ISA) [1].
In extremely simplified form, an ISA can be considered the set of
machine instructions which a CPU can handle.

2. Intel, ARM, MIPS etc. design various ISAs. Intel designs and
manufactures CPUs. AMD uses almost the same instruction set as Intel
but uses a different "internal" CPU design and manufactures its CPUs
itself too. ARM, MIPS and some other companies only design CPUs. ARM
and MIPS processors are manufactured by other companies, like Mediatek
or Qualcomm for example. Sometimes chipmakers produce a CPU with the
"canonical" design. Sometimes they add extensions, new instructions
for example.

3. Sometimes a CPU name points almost directly to the supported ISA.
That is true for Intel CPUs. For ARM and MIPS it is more difficult.
There is no ISA named "Qualcomm Snapdragon 820", and you have to look
at the documentation to learn that this chip supports the ARMv8 ISA.
The MIPS R3000 [2] implements the MIPS I ISA [3].

4. RISC is an abbreviation for "reduced instruction set computer" [4].
It covers a large set of ISAs designed by different companies, MIPS in
particular. RISC-V (supported by LLVM) is an open-source hardware
instruction set architecture based on "reduced instruction set
computer" principles [5]. MIPS I is a RISC ISA, and RISC-V is a RISC
ISA too. But MIPS I is not the same as RISC-V, just as MIPS I is not
the same as SPARC.

5. Some ISAs define floating point instructions, others do not. A CPU
might support most of an ISA but not its floating point instructions.
You need to refer to the CPU documentation. For example, if your CPU
supports the MIPS32 R2 ISA (which defines FPU instructions) but does
not have an FPU, you can specify that and request emulation of the FPU
with the following Clang options: -mips32r2 -msoft-float.
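
As a hedged sketch of the last point, the following Python helper assembles a Clang command line from the options named above and adds -msoft-float only when the CPU has no FPU. The file name `demo.c` and the triple `mipsel-linux-gnu` are placeholders chosen for illustration, and the helper itself is hypothetical; it only builds the argument list and does not run the compiler.

```python
def clang_command(source, triple, isa_flag, has_fpu=True):
    """Build a clang argument list for a MIPS target.

    isa_flag is an ISA option such as "-mips32r2" (taken from the text above).
    When has_fpu is False, -msoft-float requests software floating point.
    """
    cmd = ["clang", "-target", triple, isa_flag, "-c", source]
    if not has_fpu:
        cmd.append("-msoft-float")
    return cmd


# Placeholder inputs: a MIPS32 R2 CPU without an FPU.
print(" ".join(clang_command("demo.c", "mipsel-linux-gnu", "-mips32r2",
                             has_fpu=False)))
```

Keeping the FPU decision as a separate parameter mirrors the point made above: the ISA level and the presence of an FPU are independent properties of a given chip.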

[1] Instruction set architecture - Wikipedia
[2] R3000 - Wikipedia
[3] MIPS architecture - Wikipedia
[4] Reduced instruction set computer - Wikipedia
[5] RISC-V - Wikipedia

As Daniel Sanders said:
[[
We never implemented mips1 codegen as it was orders of magnitude
harder than mips2 (mostly because of the delay slots on load
instructions) and there's no 'generic' mips1 target (because
coprocessor 0 wasn't standardized).
]]

MIPS 2 should work.

So the R3000 is a "MIPS CPU"? What does that actually mean? Is the architecture MIPS? Or the producer?

Both. MIPS (the company) developed the MIPS architecture, specifying
what instructions would exist and how they would be encoded. Then MIPS
(the company) designed CPUs that implemented this MIPS architecture.
One of those CPUs is R3000. Other companies and individuals have also
designed CPUs that implement the MIPS architecture.

When I go to Wikipedia I see MIPS as the designer, so I take it that it is like saying "Intel CPU" or "AMD CPU", but that does not tell me anything about the assembly instructions it uses, right?

Right, or at least not intrinsically. In practice MIPS (the company)
revolved around their instruction set, so it would be very odd for one
of their CPUs not to use it.

But then also I see as Design "RISC", as I understood it describes the assembly instructions?

RISC is an adjective describing certain instruction sets (see
Reduced instruction set computer - Wikipedia). The
lines have gotten a bit blurry and debatable, but broadly RISC
instruction sets tend to use more, simpler instructions to do the work.
Where x86 has "load from memory and multiply by this register" as a
single instruction, RISC instruction sets would have a separate load
instruction followed by a multiply instruction. They also tend to have
more registers, and more general-purpose ones.

So RISC on its own doesn't tell you what instructions are supported or
what their encodings are.

But why would I tell Clang to target "mips1" when the design of the R3000 is RISC?

The R3000 implemented version 1 of the MIPS architecture. As far as
the compiler is concerned, all CPUs implementing the mips1 version of
the instruction set are roughly the same. They all support the same
instructions and code compiled for one of them will run on the others.

Of course there are sometimes differences in how many cycles each
instruction takes to run and so on. And to support that Clang would
add a separate R3000 CPU that can be targeted.

Why isn't RISCV correct then? Or RISC1 or so...

RISC-V is categorically different from RISC. RISC-V *is* a separate
instruction set architecture (with specific instructions and
encodings), named because it's the 5th generation of RISC
designs from UC Berkeley. The earlier ones (RISC-I, RISC-II, SOAR and
SPUR) were research designs, not architectures a compiler like Clang
targets.

Also, how does that influence floating point arithmetic? I have often heard that those are separate processors, FPUs(?).

They used to be, back in the 80s and 90s, but were fairly quickly
integrated into the main CPU. Older RISC designs (like MIPS & ARM)
treat them like they're separate in the instruction set even after
they got integrated. They had separate instructions for dealing with
all kinds of coprocessors, and one of those kinds is the FPU, which
then got fossilized.

So could it be that there is an additional processor besides the processor I know about? Like R3000 + FPU? Wouldn't Clang or any other compiler have to know about such a construct, or is that not the case?

In theory, yes. In practice a CPU either has the FPU corresponding to
its era or it doesn't. Clang assumes it does, but lets you override
this with some options (-msoft-float, -msingle-float).

Cheers.

Tim.

Woah! Thank you so much! That helped me a lot! Now I understand the entire subject a lot better than before.