[RFC] New analysis for Polynomial Hash recognition

artagnon · May 9, 2025, 8:37pm

Various targets have specialized crypto instructions for elliptic curve cryptography, and polynomial hash functions like CRC and GHASH. For instance, see x86/crypto in the kernel, or see the upcoming RISC-V vector crypto extension. There are innumerable implementations of crypto algorithms in a source language that is not assembly: this is an opportunity for LLVM to recognize these naively-written algorithms, and replace them with a more optimal version. An overwhelming majority of the work to be done to optimize these algorithms is to recognize them in middle-end IR: the transform component is relatively trivial.

To this effect, I propose a new analysis to recognize polynomial hashes. Most of the explanation can be found in a header comment, so I’ll paste it here:

//===----------------------------------------------------------------------===//
//
// The HashRecognize analysis recognizes unoptimized polynomial hash functions
// with operations over a Galois field of characteristic 2, also called binary
// fields, or GF(2^n): this class of hash functions can be optimized using a
// lookup-table-driven implementation, or with target-specific instructions.
// Examples:
//
//  1. Cyclic redundancy check (CRC), which is a polynomial division in GF(2).
//  2. Rabin fingerprint, a component of the Rabin-Karp algorithm, which is a
//     rolling hash polynomial division in GF(2).
//  3. Rijndael MixColumns, a step in AES computation, which is a polynomial
//     multiplication in GF(2^3).
//  4. GHASH, the authentication mechanism in AES Galois/Counter Mode (GCM),
//     which is a polynomial evaluation in GF(2^128).
//
// All of them use an irreducible generating polynomial of degree m,
//
//    c_m * x^m + c_(m-1) * x^(m-1) + ... + c_0 * x^0
//
// where each coefficient c is can take values in GF(2^n), where 2^n is termed
// the order of the Galois field. For GF(2), each coefficient can take values
// either 0 or 1, and the polynomial is simply represented by m+1 bits,
// corresponding to the coefficients. The different variants of CRC are named by
// degree of generating polynomial used: so CRC-32 would use a polynomial of
// degree 32.
//
// The reason algorithms on GF(2^n) can be optimized with a lookup-table is the
// following: in such fields, polynomial addition and subtraction are identical
// and equivalent to XOR, polynomial multiplication is an AND, and polynomial
// division is identity: the XOR and AND operations in unoptimized
// implmentations are performed bit-wise, and can be optimized to be performed
// chunk-wise, by interleaving copies of the generating polynomial, and storing
// the pre-computed values in a table.
//
// A generating polynomial of m bits always has the MSB set, so we usually
// omit it. An example of a 16-bit polynomial is the CRC-16-CCITT polynomial:
//
//   (x^16) + x^12 + x^5 + 1 = (1) 0001 0000 0010 0001 = 0x1021
//
// Transmissions are either in big-endian or little-endian form, and hash
// algorithms are written according to this. For example, IEEE 802 and RS-232
// specify little-endian transmission.
//
//===----------------------------------------------------------------------===//

Now, cryptographic functions are usually operations in a Galois field of prime characteristic, but we limit ourselves to characteristic 2 for an initial narrowing of the problem space, and because these algorithms can be optimized with a table-lookup in the fallback case.

The analysis I propose currently recognizes the cyclic redundancy check (CRC) algorithm, and can be found at:

github.com/llvm/llvm-project

[HashRecognize] Introduce new analysis

main ← artagnon:hash-recognize

opened 06:18PM - 08 May 25 UTC

artagnon

+1583 -2

Introduce a fresh analysis for recognizing polynomial hashes, with the rationale… that several targets have specific instructions to optimize things like CRC and GHASH (eg. X86 and RISC-V crypto extension). We limit the scope to polynomial hashes computed in a Galois field of characteristic 2, since this class of operations can also be optimized in the absence of target-specific instructions to use a lookup table. At the moment, we only recognize the CRC algorithm. RFC: https://discourse.llvm.org/t/rfc-new-analysis-for-polynomial-hash-recognition/86268

The canonical reference for CRC can be found in the kernel.

The implementation rests on a lot of existing infrastructure, including KnownBits, ConstantRange, ValueTracking, and ScalarEvolution, and roughly works by computing KnownBits over the iterations of the loop, and checking that the result is zero.

If the rationale is sound, we can begin reviewing the patch, and land it without any risk, since it is purely an analysis. The next steps would be to thoroughly test the analysis, and add a bit of code in LoopIdiomRecognize to use the results and optimize CRC algorithms so we can check it against llvm-opt-benchmark.

Footnote: GCC got a similar analysis last year, and optimizes CRC as of GCC 15.

oscardssmith · May 9, 2025, 11:28pm

Would [RFC] carry-less multiplication instruction be useful here to allow representing galois field multiplications in LLVM IR?

artagnon · May 10, 2025, 11:57am

Yes, carry-less multiply is useful for multiplications in Galois fields of characteristic 2. Not sure if it warrants an intrinsic, as discussed on the RFC, as it is also possible to optimize CRC and GHASH using specialized target-specific instructions (in which case, it would just be one instruction).

Topic		Replies	Views
[RFC] CRC Recognition in LoopIdiomRecognizer Loop Optimizations	13	513	May 8, 2025
[RFC]: A "poly" dialect for polynomial arithmetic MLIR	8	947	October 25, 2023
MLIR News, 58th edition (5th December 2023) Newsletter llvm-weekly	0	685	December 4, 2023
[libc][GSoC 2025] Profiling and testing of the LLVM libc GPU math GSoC gpu , libc , offload , gsoc2025	17	699	April 6, 2025
Open MLIR Meeting 10/26/2023: A Polynomial Dialect Proposal Announcements	1	363	October 27, 2023

[RFC] New analysis for Polynomial Hash recognition

Related topics