Request for Help: Teach ARM target to auto-detect cpu / subtarget features

Hi all,

I've just filed PR12794: Add ARM cpu / subtarget features auto-detection. And I would very much appreciate the community's help to implement this.

What motivated this? Well this:
http://www.phoronix.com/scan.php?page=news_item&px=MTA5OTM

I believe one of the reasons the benchmark numbers are totally bogus is that the compilation was done on ARM hosts. Given that the benchmarks were apparently compiled without -mcpu=cortex-a9, I suspect LLVM ended up generating code for a "generic" ARMv4 CPU. This article makes me sick to my stomach.

Thanks,

Evan

  I skimmed through MCTargetDesc/ARMAsmBackend.cpp; it seems llvm::createARMAsmBackend
only picks a different ARM ISA for Darwin. For Linux, I guess we need to tweak
ELFARMAsmBackend? Do we need to modify Clang as well?

Regards,
chenwj

The backend sounds like the wrong place to implement this feature.

I'd have thought the Clang driver would be the ideal place?

The right place to implement this is in lib/Support/Host.cpp. X86 has an implementation of sys::getHostCPUName(), but everything else just uses the:

std::string sys::getHostCPUName() {
  return "generic";
}

implementation.

-Chris

Hi James,

> The backend sounds like the wrong place to implement this feature.
>
> I'd have thought the Clang driver would be the ideal place?

> The right place to implement this is in lib/Support/Host.cpp. X86 has an implementation of sys::getHostCPUName(), but everything else just uses the:
>
> std::string sys::getHostCPUName() {
>   return "generic";
> }

  Do you happen to know of any examples that show how to get the CPU model, as X86
does in Host.cpp?

Regards,
chenwj
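
One possible starting point on ARM Linux, sketched here under assumptions rather than taken from Host.cpp: the kernel reports "CPU implementer" and "CPU part" fields in /proc/cpuinfo, and these can be mapped to a CPU name. The helper names cpuinfoField, guessARMHostCPU and guessARMHostCPUFromProc are hypothetical, and the part-number table is deliberately incomplete (only Cortex-A8 and Cortex-A9 from ARM Ltd., implementer 0x41).

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper (not an LLVM API): return the value of the first
// "Key : value" line in /proc/cpuinfo-style text, or "" if it is absent.
static std::string cpuinfoField(const std::string &Text,
                                const std::string &Key) {
  std::istringstream In(Text);
  std::string Line;
  while (std::getline(In, Line)) {
    // Only consider lines that start with the requested key.
    if (Line.compare(0, Key.size(), Key) != 0)
      continue;
    std::string::size_type Colon = Line.find(':');
    if (Colon == std::string::npos)
      continue;
    // Trim whitespace around the value.
    std::string::size_type Begin = Line.find_first_not_of(" \t", Colon + 1);
    if (Begin == std::string::npos)
      return "";
    std::string::size_type End = Line.find_last_not_of(" \t\r");
    return Line.substr(Begin, End - Begin + 1);
  }
  return "";
}

// Hypothetical sketch: map implementer/part IDs to a CPU name suitable for
// -mcpu. Falls back to "generic", matching the current default behaviour.
std::string guessARMHostCPU(const std::string &CpuinfoText) {
  std::string Implementer = cpuinfoField(CpuinfoText, "CPU implementer");
  std::string Part = cpuinfoField(CpuinfoText, "CPU part");
  if (Implementer == "0x41") { // 0x41 identifies ARM Ltd.
    if (Part == "0xc08")
      return "cortex-a8";
    if (Part == "0xc09")
      return "cortex-a9";
  }
  return "generic";
}

// Reads the real file; yields "generic" if it cannot be opened or parsed.
std::string guessARMHostCPUFromProc() {
  std::ifstream File("/proc/cpuinfo");
  std::stringstream Buffer;
  Buffer << File.rdbuf();
  return guessARMHostCPU(Buffer.str());
}
```

Keeping the parsing separate from the file read makes the mapping easy to test with captured cpuinfo text from different boards. A real implementation would need a fuller table and a strategy for other operating systems.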

> The right place to implement this is in lib/Support/Host.cpp. X86 has an implementation of sys::getHostCPUName(), but everything else just uses the:
>
> std::string sys::getHostCPUName() {
>   return "generic";
> }
>
> implementation.

Right. There are many advantages to doing this in the backend: it would work for everything, such as llc or arbitrary clients of the ARM backend. Also, the support library is used by the Clang frontend to handle -march=native.

Evan

Hi Chris,

> The right place to implement this is in lib/Support/Host.cpp. X86 has an implementation of sys::getHostCPUName(), but everything else just uses the:
>
> std::string sys::getHostCPUName() {
>   return "generic";
> }
>
> implementation.

  I tried making it return "armv7l" or "cortex-a9" on a PandaBoard, but the
bitcode emitted by clang still has

  target triple = "armv4t-unknown-linux-gnueabi"

rather than what I expected:

  target triple = "armv7l-unknown-linux-gnueabi"

Am I missing something? Thanks.

Regards,
chenwj

No, it should be armv7-unknown-linux-gnueabi. The first part of the triple is not the CPU name; it's the architecture string. I think you need to add some code so that clang derives the architecture from the CPU name. Alternatively, you might want to add an LLVM system routine that does this (or auto-detects the arch string). To check, invoke clang with -v: you will see what triple and CPU are being passed to the backend.

Evan
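
As an illustration of deriving the architecture string from the CPU name, here is a minimal sketch. It is not how clang does it: archForARMCPU and retargetTriple are hypothetical helpers, and the CPU table covers only a few well-known cores.

```cpp
#include <string>

// Hypothetical helper: map a CPU name to the architecture component of a
// triple. Returns "" for CPUs this small table does not know about.
std::string archForARMCPU(const std::string &CPU) {
  if (CPU == "cortex-a8" || CPU == "cortex-a9")
    return "armv7";
  if (CPU == "arm1176jzf-s")
    return "armv6";
  if (CPU == "arm7tdmi")
    return "armv4t";
  return "";
}

// Hypothetical helper: replace the first component of a triple with the
// architecture implied by CPU. The triple is returned unchanged when the
// CPU is unknown or the triple has no '-' separator.
std::string retargetTriple(const std::string &Triple, const std::string &CPU) {
  std::string Arch = archForARMCPU(CPU);
  std::string::size_type Dash = Triple.find('-');
  if (Arch.empty() || Dash == std::string::npos)
    return Triple;
  return Arch + Triple.substr(Dash);
}
```

With these definitions, retargetTriple("armv4t-unknown-linux-gnueabi", "cortex-a9") yields "armv7-unknown-linux-gnueabi", while an unrecognised CPU such as "generic" leaves the triple as it was.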