I am learning the LLVM backend for RISC-V. I have written selection patterns for several instructions, such as pack, grev, and sh1add. The instructions are from this page. However, I couldn't do the same for 2.1.2 Count Bits Set (cpop) and 2.1.1 Count Leading/Trailing Zeros (clz, ctz), because their reference implementations use branches to form a loop or check a condition.
I searched for examples and looked at the DAG output using the -view-isel-dags flag. However, I couldn't understand how to write a selection pattern that involves a branch. My goal is not just to write this one selection pattern, but to learn how to write patterns for different purposes.
Could you help me write a selection pattern with a branch? Or could you direct me to a source that I can follow?
I could also start with a pattern for summing array values (below) instead of the instructions above, if that is simpler.
int arr_sum(int *arr, int size)
{
    int res = 0;
    for (int i = 0; i < size; i++)
    {
        res += arr[i];
    }
    return res;
}
You can’t write patterns involving branches at the moment. The issue is that the SelectionDAG infrastructure always and only operates on one basic-block at a time.
Because of this, much like the byte-reverse sequence you looked into before, LLVM's optimizer looks for idioms that implement population count and replaces the whole loop with a call to @llvm.ctpop, which can be matched by a pattern because the control flow has been removed. For example: Compiler Explorer.
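To make the idiom concrete, here is a minimal sketch of the kind of loop meant above. The function name popcount_loop is made up for illustration. My assumption is that, at -O2 and on a target that reports a fast population-count instruction, the loop idiom recognizer rewrites this shape into @llvm.ctpop. The exact loop shapes it accepts depend on the LLVM version, so check the emitted IR rather than taking this as guaranteed.

```c
#include <stdint.h>

/* Hypothetical example: Kernighan-style population count.
 * Each iteration clears the lowest set bit, so the loop runs
 * once per set bit in x. */
unsigned popcount_loop(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}
```

If the rewrite happens, the loop and its branch are gone before instruction selection, and a single-basic-block pattern on ctpop is all the backend needs.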
Can I assume that every kind of backend support for a CPU can be derived from the ISA of that CPU plus the intrinsics the LLVM optimizer produces, like ctpop, ctlz, etc.?
I mean, a new instruction whose selection pattern can be written with the help of the LLVM optimizer can be supported. However, what if I cannot write a selection pattern using those intrinsics (ctpop, ctlz, etc.)?
Am I thinking of an extreme case?
I’m not quite sure what you’re asking. But there certainly can be (and are) instructions that are simply too complicated for the compiler to recognize and use from normal IR, either in patterns or earlier (at least without effort far beyond the benefit).
In those cases programmers that want them have to write inline assembly or C-level intrinsics to access those features.
I expect most crypto instructions are a good example of this category. LLVM doesn’t have a builtin AES recognizer so it’s asm or nothing.
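As a rough illustration of the C-level route, here is a sketch using the GCC/Clang builtins for the bit-counting operations discussed earlier in the thread. The wrapper names are invented for this example, and it assumes a 32-bit unsigned int. Clang lowers these builtins to the llvm.ctpop, llvm.ctlz and llvm.cttz intrinsics, so no idiom recognition is involved. A target-specific instruction such as AES would need its own target builtin or inline asm instead, which is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrappers around GCC/Clang builtins.
 * Assumes unsigned int is 32 bits wide. */
unsigned count_set_bits(uint32_t x)
{
    return (unsigned)__builtin_popcount(x);
}

unsigned count_leading_zeros(uint32_t x)
{
    assert(x != 0);   /* __builtin_clz(0) is undefined */
    return (unsigned)__builtin_clz(x);
}

unsigned count_trailing_zeros(uint32_t x)
{
    assert(x != 0);   /* __builtin_ctz(0) is undefined */
    return (unsigned)__builtin_ctz(x);
}
```

The programmer states the operation directly, so the compiler never has to rediscover it from a loop.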
This is what I wanted to ask. I got the answer.
Thank you so much.