I’m trying to add a new optimization rule that checks the prefix of a string.
Context and Example
Here is the link to the compiler explorer.
Say we had something like this:
pub fn example_function(a: &str) -> i64 {
    match a {
        "asdf" => -1,
        "asdf1" => 1,
        "asdf2" => 2,
        "asdf3" => 3,
        "asdf4" => 4,
        "asdf10" => 10,
        "asdf11" => 11,
        "asdf12" => 12,
        "asdf13" => 13,
        "asdf14" => 14,
        "asdf100" => 100,
        "asdf101" => 101,
        "asdf102" => 102,
        "asdf103" => 103,
        "asdf104" => 104,
        _ => 0,
    }
}
The compiler translates this into a long if-else ladder that compares the entire string again for every arm, like this:
.LBB0_4:
        mov eax, 1717859169
        mov ecx, dword ptr [rdi]
        xor ecx, eax
        movzx edx, byte ptr [rdi + 4]
        xor edx, 50
        or edx, ecx
        je .LBB0_5
        xor eax, dword ptr [rdi]
        movzx ecx, byte ptr [rdi + 4]
        xor ecx, 51
        or ecx, eax
        je .LBB0_7
        mov eax, 1717859169
        mov ecx, dword ptr [rdi]
        xor ecx, eax
        movzx edx, byte ptr [rdi + 4]
        xor edx, 52
        or edx, ecx
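You can see that the generated code checks whether the first 4 bytes of the input are “asdf” over and over (the constant 1717859169 is 0x66647361, i.e. the bytes “asdf” in little-endian order), even though this is already known after the first comparison. Only the fifth byte (50, 51, 52, i.e. '2', '3', '4') actually differs between these checks.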
I considered adding a rule on the Rust compiler’s side as well, but I suspect it would be better to implement it on the LLVM side.
This should be especially beneficial when the value doesn’t match any of the match arms, since a single failed prefix check lets you skip every remaining case.
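To make the intended effect concrete, here is a rough source-level sketch of what I'd like the optimized code to behave like. It is hand-written Rust, not output from any existing pass, and the function name `example_function_prefix_checked` is just something I made up for illustration. The shared "asdf" prefix is compared once, and only the remainder is matched.

```rust
// Hypothetical hand-written equivalent of the match above.
// It illustrates the desired behavior; it is not what any compiler pass emits today.
pub fn example_function_prefix_checked(a: &str) -> i64 {
    // Compare the common "asdf" prefix exactly once.
    match a.strip_prefix("asdf") {
        // Prefix matched: only the remaining suffix needs to be distinguished.
        Some(rest) => match rest {
            "" => -1,
            "1" => 1,
            "2" => 2,
            "3" => 3,
            "4" => 4,
            "10" => 10,
            "11" => 11,
            "12" => 12,
            "13" => 13,
            "14" => 14,
            "100" => 100,
            "101" => 101,
            "102" => 102,
            "103" => 103,
            "104" => 104,
            _ => 0,
        },
        // Prefix did not match: no arm can match, so fall through to the default.
        None => 0,
    }
}
```

With this shape, an input like "qwer" is rejected after one prefix comparison instead of walking through all fifteen arms.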