How to add new arch for llvm-cov show?

Hi all,

I'm trying to support llvm-cov for a new architecture, and I have successfully built compiler-rt for my arch. Following the steps in "Source-based Code Coverage" (Clang 16.0.0git documentation), I hit an error at the last step (llvm-cov show).
The command line was (suppose my arch is XXXX):

"llvm-cov show -arch=XXXX ./foo -instr-profile=foo.profdata"

and the error was

"Failed to load coverage: No object file for requested architecture."

I think I need to add my arch information somewhere (maybe an llvm-cov support list?), but I don't know where. Can someone give me some suggestions?

Best Regards,
Ruobin.

Hi Ruobin,

You’ll need to teach libObject about this architecture. Specifically, the coverage reader checks that calling getArch() on a loaded ObjectFile matches Triple(Arch).getArch() (see loadBinaryFormat in CoverageMappingReader.cpp).

best,
vedant

Hi vedant,

  The program failed the check "OF->getArch() != Triple(Arch).getArch()" in loadBinaryFormat in CoverageMappingReader.cpp and returned an error. This is because "OF->getArch()" returned null while "Triple(Arch).getArch()" returned XXXX (the name of my arch).
  The value returned by "OF->getArch()" is determined by "EF.getHeader()->e_machine", but I found that "e_machine" is defined somewhere in MCAssembler (my compiler uses binutils as its assembler). Although I made some hacks to pass this check, I still get other errors. So my question is: does llvm-cov have to work with MCAssembler, or is it possible to use it with binutils?
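
For reference, the e_machine value that an ELF object actually carries can be inspected without LLVM. The following is a minimal Python sketch, assuming the binary is an ELF file; the function names and the small table of machine values are illustrative choices, not part of LLVM or binutils, and the number assigned to a new architecture would have to be looked up in its own binutils port.

```python
import struct

# A few well-known e_machine values from the ELF specification.
# The entry for a new architecture is whatever its assembler/linker emits.
KNOWN_MACHINES = {
    0: "EM_NONE",
    3: "EM_386",
    40: "EM_ARM",
    62: "EM_X86_64",
    183: "EM_AARCH64",
    243: "EM_RISCV",
}


def read_e_machine(header: bytes) -> int:
    """Return the e_machine field from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian.
    if header[5] == 1:
        byte_order = "<"
    elif header[5] == 2:
        byte_order = ">"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_machine is a 16-bit field at offset 18,
    # after e_ident (16 bytes) and e_type (2 bytes).
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return machine


def describe_machine(machine: int) -> str:
    """Give a readable name for a machine value, or flag it as unknown."""
    return KNOWN_MACHINES.get(machine, f"unknown (0x{machine:x})")


# Example use on a real file (path is a placeholder):
#   with open("./foo", "rb") as f:
#       print(describe_machine(read_e_machine(f.read(20))))
```

If the printed value is one the reading side has no entry for, a mismatch in the architecture check is the expected outcome.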

Best,
Ruobin

> Hi vedant,
>
> The program failed the check "OF->getArch() != Triple(Arch).getArch()" in loadBinaryFormat in CoverageMappingReader.cpp and returned an error. This is because "OF->getArch()" returned null while "Triple(Arch).getArch()" returned XXXX (the name of my arch).
> The value returned by "OF->getArch()" is determined by "EF.getHeader()->e_machine", but I found that "e_machine" is defined somewhere in MCAssembler

I haven’t double-checked, but I thought this definition came from llvm/Support/ELF.h?

> (my compiler uses binutils as its assembler). Although I made some hacks to pass this check, I still get other errors. So my question is: does llvm-cov have to work with MCAssembler, or is it possible to use it with binutils?

Coverage support should be / is largely compatible with binutils, but you may have to watch out for this BFD bug: https://clang.llvm.org/docs/SourceBasedCodeCoverage.html#drawbacks-and-limitations

What is the error you see?

vedant

Hi vedant,

  1. The definition is from llvm/Support/ELF.h, but this machine information (e_machine) is given to the compiler in lib/MC/ELFObjectWriter.cpp. I grepped the whole llvm project and found that e_machine is assigned in only two files. One is lib/MC/ELFObjectWriter.cpp (with a comment saying "e_machine=target") and the other is tools/obj2yaml/elf2yaml.cpp. GDB stopped only at the former when using the x86_64 llvm-cov, so I concluded that MC provides this e_machine information.

  2. The new error was "Failed to load coverage: No coverage data found", because llvm-cov could not get NamesSection (in loadBinaryFormat in CoverageMappingReader.cpp). I think it is a problem with my ldscript, because I put __llvm_prf_names, __llvm_prf_cnts, __llvm_prf_data and __llvm_prf_vnds inside the .rodata section. llvm-cov looks at .rodata itself but not at what is inside it. What is the right place to put these 4 __llvm_prf_* sections?

Best,

Ruobin.

Hi vedant,

I also used the command "ld --verbose" to check the x86_64 ldscript and found no definition of the __llvm_prf_* sections, yet it passed successfully. Why?

Best,

Ruobin.

> Hi vedant,
>
> 1. The definition is from llvm/Support/ELF.h, but this machine information (e_machine) is given to the compiler in lib/MC/ELFObjectWriter.cpp. I grepped the whole llvm project and found that e_machine is assigned in only two files. One is lib/MC/ELFObjectWriter.cpp (with a comment saying "e_machine=target") and the other is tools/obj2yaml/elf2yaml.cpp. GDB stopped only at the former when using the x86_64 llvm-cov, so I concluded that MC provides this e_machine information.

First, I think this reinforces my theory that llvm’s object file reading libraries do not “understand” the architecture you’re working on.

Second — and I’m not super familiar with this part of the codebase, so apologies for any mistakes here — you might have missed the ELF file reader?

$ git grep -iE "\<e_?machine\>" lib

lib/Object/ELF.cpp: return getDynamicTagAsString(getHeader()->e_machine, Type);

> 2. The new error was "Failed to load coverage: No coverage data found", because llvm-cov could not get NamesSection (in loadBinaryFormat in CoverageMappingReader.cpp). I think it is a problem with my ldscript, because I put __llvm_prf_names, __llvm_prf_cnts, __llvm_prf_data and __llvm_prf_vnds inside the .rodata section. llvm-cov looks at .rodata itself but not at what is inside it. What is the right place to put these 4 __llvm_prf_* sections?

I’m not sure what changed, exactly, between the point you encountered the last error and this one. Could you elaborate?

This is just a shot in the dark, but you may need to teach getInstrProfSectionName about any custom linker directives needed for your architecture.

vedant

Hi vedant,

  1. First, I think your theory is right: llvm's object file reading libraries do not "understand" the architecture I'm working on. I'm using binutils as the assembler, which means llvm only produces asm and the object file is produced by binutils. I think the ELF header information is provided by my binutils now, so maybe I have to modify the binutils code to provide the ELF header to llvm?

Second, I'm sorry to say that I'm working on llvm-4.0.0, and in ELF.cpp there is no "return getDynamicTagAsString(getHeader()->e_machine, Type);". But I think it makes sense that the ELF file reader gets nothing, because my compiler doesn't write this information. It seems I have to find somewhere (maybe an ELF file writer) to write e_machine so that my reader can read it, but I have no idea where to write it now.

  2. In CoverageMappingReader.cpp, there is a check "OF->getArch() != Triple(Arch).getArch()" that produces an error if the two are not equal. "OF->getArch()" goes into a switch/case, and the code takes the default branch, which returns UnknownArch. I hacked the code to make the default branch return Triple::XXXX (only a temporary solution for the first point).
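
To make the failure mode above concrete, here is a toy model in Python of the behaviour described: a lookup from e_machine to an architecture name with an "unknown" default, followed by the comparison against the requested arch. This is only a sketch of the logic as reported in this thread, not LLVM's code; the table, the function names, and the machine value 0x1234 used in the comments are all hypothetical.

```python
# Toy model only: the table and function names are hypothetical, not LLVM APIs.
UNKNOWN_ARCH = "unknown"

# ELF e_machine values that the reading side "understands".
MACHINE_TO_ARCH = {
    3: "x86",        # EM_386
    62: "x86_64",    # EM_X86_64
    183: "aarch64",  # EM_AARCH64
}


def object_arch(e_machine: int) -> str:
    """Mimic a switch whose default branch yields an unknown arch."""
    return MACHINE_TO_ARCH.get(e_machine, UNKNOWN_ARCH)


def check_requested_arch(e_machine: int, requested: str) -> None:
    """Raise if the object's arch does not match the requested one."""
    if object_arch(e_machine) != requested:
        raise ValueError("No object file for requested architecture.")


# An object whose machine value is not in the table fails the check:
#   check_requested_arch(0x1234, "XXXX")   # raises ValueError
# Registering the value makes the same call succeed:
#   MACHINE_TO_ARCH[0x1234] = "XXXX"
```

In this model, adding an entry for the new machine value keeps every other unrecognised object rejected, whereas changing the default would accept them all as XXXX.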

Where should I place the __llvm_prf_* sections? Is it OK to put them into the .rodata section?

Best,

Ruobin

> Hi vedant,
>
> 1. First, I think your theory is right: llvm's object file reading libraries do not "understand" the architecture I'm working on. I'm using binutils as the assembler, which means llvm only produces asm and the object file is produced by binutils. I think the ELF header information is provided by my binutils now, so maybe I have to modify the binutils code to provide the ELF header to llvm?

Thanks, this helps me understand your workflow a bit better. And yes, that sounds right.

> Second, I'm sorry to say that I'm working on llvm-4.0.0, and in ELF.cpp there is no "return getDynamicTagAsString(getHeader()->e_machine, Type);". But I think it makes sense that the ELF file reader gets nothing, because my compiler doesn't write this information.

You mentioned earlier that you're passing assembly to binutils. Perhaps binutils expects an asm directive to inform it which architecture it's targeting?

> It seems I have to find somewhere (maybe an ELF file writer) to write e_machine so that my reader can read it, but I have no idea where to write it now.

If there's no asm directive the compiler can emit, you could hack binutils to have it emit the correct e_machine bits for your architecture only.

> 2. In CoverageMappingReader.cpp, there is a check "OF->getArch() != Triple(Arch).getArch()" that produces an error if the two are not equal. "OF->getArch()" goes into a switch/case, and the code takes the default branch, which returns UnknownArch. I hacked the code to make the default branch return Triple::XXXX (only a temporary solution for the first point).
> Where should I place the __llvm_prf_* sections? Is it OK to put them into the .rodata section?

I'm not really familiar with ELF, but as I understand it, .rodata is its own section (unlike MachO's __DATA_CONST, which is a segment). So I'm not sure that it's possible to put __llvm_prf* *in* another section.

vedant

Hi vedant,

I found the machine number (the e_machine value) that binutils defines for my arch and changed the number in llvm/Support/ELF.h to match it. I also defined the 4 __llvm_prf_* sections as 4 independent sections. Now llvm-cov works fine.
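
One way to confirm the second half of this fix is to list the section names in the linked binary and check that the profiling sections appear under their own names. The following is a minimal Python sketch under a few assumptions: the file is ELF, it does not use extended section numbering, and the list of names is just the four mentioned in this thread (llvm-cov may look for others as well). The function names are illustrative.

```python
import struct


def elf_section_names(data: bytes) -> list:
    """Return the section names from an ELF image's section header table.

    Assumes no extended section numbering (e_shnum and e_shstrndx are direct).
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    order = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    if is_64:
        # e_shoff at offset 40; e_shentsize, e_shnum, e_shstrndx from offset 58.
        (shoff,) = struct.unpack_from(order + "Q", data, 40)
        shentsize, shnum, shstrndx = struct.unpack_from(order + "HHH", data, 58)
    else:
        # e_shoff at offset 32; e_shentsize, e_shnum, e_shstrndx from offset 46.
        (shoff,) = struct.unpack_from(order + "I", data, 32)
        shentsize, shnum, shstrndx = struct.unpack_from(order + "HHH", data, 46)

    def section(index):
        """Return (sh_name, sh_offset, sh_size) for one section header."""
        base = shoff + index * shentsize
        (name_off,) = struct.unpack_from(order + "I", data, base)
        if is_64:
            offset, size = struct.unpack_from(order + "QQ", data, base + 24)
        else:
            offset, size = struct.unpack_from(order + "II", data, base + 16)
        return name_off, offset, size

    # The section name string table holds NUL-terminated names.
    _, str_off, str_size = section(shstrndx)
    strtab = data[str_off:str_off + str_size]
    names = []
    for i in range(shnum):
        name_off, _, _ = section(i)
        end = strtab.index(b"\0", name_off)
        names.append(strtab[name_off:end].decode("ascii", "replace"))
    return names


# The four section names discussed in this thread.
PROFILE_SECTIONS = (
    "__llvm_prf_names",
    "__llvm_prf_cnts",
    "__llvm_prf_data",
    "__llvm_prf_vnds",
)


def missing_profile_sections(data: bytes) -> list:
    """List which of the profiling sections are absent as named sections."""
    names = set(elf_section_names(data))
    return [s for s in PROFILE_SECTIONS if s not in names]
```

Calling missing_profile_sections on the bytes of the linked binary should return an empty list once each section is emitted independently; if the sections were merged into .rodata by the linker script, their names would not appear in the section header table and would be reported as missing.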

Thanks,

Ruobin.