The C language requires “short” arithmetic to be promoted to the size
of “int”; hence the conversions to “int” in the IR, and the later narrowing
back to “short”, which happens only when the optimizer can prove that
the result will be the same.
If your machine has only 16-bit registers and arithmetic, then you should
change clang. There won’t be any conversions in the IR (but there are
a variety of problems with LLVM’s optimizations that you will run into!).
If your machine has both 16-bit and 32-bit registers and arithmetic, then
you probably need to leave clang alone. I am inclined to read your email
as implying that this is the case for you.
Do you really need signed div and rem? Usually people don’t need their
quirky truncate-toward-zero results (in fact, more often than not they
need results consistent with two’s-complement shifts and masks).
If unsigned is OK, then it should (?) be possible to transform an unsigned
32-bit div and rem of unsigned short operands into a 16-bit unsigned div
and rem. (Can someone verify / confirm that I’m thinking correctly here?)
The only thing I can think of off the top of my head for getting 16-bit sdiv
and srem instructions emitted on a 32-bit machine is inline asm?
BTW, IIRC sdiv and srem also inhibit vectorization to 16-bit SIMD
instructions for the same reason. (Similarly, a shift amount that is legal
on the promoted 32-bit value can be undefined once the shift is narrowed
to 16 bits.) I wonder what work-arounds folks use in this context;
perhaps someone else on this list can chime in?