Hi Sam,
Here is another case where the clang RISC-V backend generates 33% more code than gcc for a loop (8 instructions versus 6 in the loop body shown below).
C code:
float max(float * maxval_it, int len)
{
    float maxval = 0;
    float * end = maxval_it + len;
    while (maxval_it < end)
    {
        if (*maxval_it > maxval)
        {
            maxval = *maxval_it;
        }
        maxval_it++; // advance to the next element (matches the addi a0, a0, 4 in both listings)
    }
    return maxval;
}
Compile command:
riscv32-unknown-elf-g++ -nostartfiles -nostdlib -O2 -march=rv32imf -mabi=ilp32f -fno-builtin -S perf.c -o perf.g++
clang++ -O2 --target=riscv32 -march=rv32img -mabi=ilp32f -nostdlib -fno-builtin -S perf.c -o perf.lang
The gcc version is 7.2.0.
The llvm version is 10.0.0.
The loop code generated by gcc:
.L5:
flw fa5, 0(a0)
addi a0,a0,4
fgt.s a5,fa5,fa0
beqz a5, .L3
fmv.s fa0, fa5
.L3:
bgtu a1, a0, .L5
The loop code generated by the clang RISC-V backend:
.LBB0_2:
addi a0, a0, 4
fmv.s ft0, fa0
bgeu a0, a1, .LBB0_5
.LBB0_3:
flw fa0, 0(a0)
flt.s a2, ft0, fa0
bnez a2, .LBB0_2
fmv.s fa0, ft0
j .LBB0_2
Thanks~
Lori
-----Original Message-----
From: Lori Yao Yu
Sent: April 28, 2020 11:00
To: Sam Elliott <selliott@lowrisc.org>
Cc: LLVM Developers Mailing List <llvm-dev@lists.llvm.org>
Subject: Re: [llvm-dev] assembly code for array iteration generated by llvm is much slower than gcc
Hi Sam,
Yes, it is RISC-V assembly code. The test code is shown below. You can copy it into a C file named perf.c and compile it with the commands below.
We can see that gcc prefers to iterate over the array with a pointer, while llvm prefers to iterate with an index. As a result, llvm generates extra code to compute the memory address of each array element from the index.
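To make the distinction concrete, here is a minimal sketch of the two iteration styles at the C level. It is an illustration only, not output from either compiler: the function names sum_indexed and sum_pointer are hypothetical, and the code assumes the array holds exactly n * stride elements so that the pointer never moves further than one past the end.

```c
#include <stddef.h>

/* Index-based form: every access recomputes the address as
   c + i * stride, unless the compiler strength-reduces it. */
int sum_indexed(const int *c, size_t n, size_t stride)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += c[i * stride];
    }
    return sum;
}

/* Pointer-based form: the address lives in a pointer that is bumped
   by a fixed step each iteration, so no multiply is needed per access.
   Assumes the array has exactly n * stride elements. */
int sum_pointer(const int *c, size_t n, size_t stride)
{
    int sum = 0;
    const int *p = c;
    for (size_t i = 0; i < n; i++) {
        sum += *p;
        p += stride;
    }
    return sum;
}
```

Both functions visit the same elements and return the same sum. The difference is only in how the address of each element is produced, which is the part of the generated assembly being compared here.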
Test C code:
//perf.c
int func(int w1, int w2, int *b, int *c) {
    int wstart = 0;
    int i = 0;
    int j = 0;
    int sum = 0;
    int wend = 0;
    int dst_idx = 0;
    int dst_idx2 = 0;
    for (i = 0; i < w2; i++) {
        wstart = i * w1;
        wend = i / w1;
        sum = c[wstart];
        for (j = wstart + 1; j < wend; j++) {
            sum += c[j * w2];
            sum += c[j * w1];
        }
        dst_idx = w1 * i + w2;
        dst_idx2 = w2 * i + w1;
        b[dst_idx] = sum;
        b[dst_idx2] = sum/2;
    }
    // Return a value: falling off the end of a non-void function is undefined behaviour in C++.
    return sum;
}
Compile command:
riscv32-unknown-elf-g++ -nostartfiles -nostdlib -O2 -march=rv32imf -mabi=ilp32f -fno-builtin -S perf.c -o perf.g++
clang++ -O2 --target=riscv32 -march=rv32img -mabi=ilp32f -nostdlib -fno-builtin -S perf.c -o perf.lang
The gcc version is 7.2.0.
The llvm version is 10.0.0.
thanks!~
Lori
-----Original Message-----
From: Sam Elliott <selliott@lowrisc.org>
Sent: April 27, 2020 21:36
To: Lori Yao Yu <loriyu@panyi.ai>
Cc: LLVM Developers Mailing List <llvm-dev@lists.llvm.org>
Subject: Re: [llvm-dev] assembly code for array iteration generated by llvm is much slower than gcc
Hi,
Am I right in thinking that this is RISC-V assembly?
Please can you provide a testcase (a C file, or LLVM IR) that we can use to diagnose this issue further? It would also be useful to know what architecture (including extensions) and other compiler flags you are using.
We know that the assembly that LLVM generates for RISC-V is not always the most efficient, and we're working on this issue at the moment. We would welcome more testcases.
Sam