[PATCH] Add mul_hi implementation

Everything except long/ulong is handled by just casting to the next larger type,
doing the math and then shifting/casting the result.
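
To illustrate the widening approach for the 32-bit case, here is a minimal
sketch in plain C rather than OpenCL C. The function names mul_hi_i32 and
mul_hi_u32 are hypothetical and are not the names used in the patch. It
assumes that right-shifting a negative signed value is an arithmetic shift,
which OpenCL C guarantees but ISO C leaves implementation-defined.

```c
#include <stdint.h>

/* Hypothetical helper: high 32 bits of a signed 32x32 -> 64 multiply.
 * Assumes >> on a negative int64_t is an arithmetic shift. */
static int32_t mul_hi_i32(int32_t x, int32_t y)
{
    int64_t wide = (int64_t)x * (int64_t)y; /* cannot overflow 64 bits */
    return (int32_t)(wide >> 32);           /* keep the upper half */
}

/* Hypothetical helper: high 32 bits of an unsigned 32x32 -> 64 multiply. */
static uint32_t mul_hi_u32(uint32_t x, uint32_t y)
{
    uint64_t wide = (uint64_t)x * (uint64_t)y;
    return (uint32_t)(wide >> 32);
}
```

The same pattern would apply to char/short by widening to short/int and
shifting by 8 or 16 bits instead.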

For 64-bit types, we split each operand into its high and low halves and do
a FOIL-based multiplication. The algorithm was originally from StackOverflow;
it was adapted for OpenCL C and extended to handle ulong.
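
For reference, a split-and-FOIL multiply can be sketched as follows in plain
C. This is not the code from the patch or from the StackOverflow post. The
names mul_hi_u64 and mul_hi_i64 are hypothetical, and the arrangement here
(unsigned first, then a correction for the signed case) is an assumption
made for the sake of a short example. The signed version also assumes
two's-complement conversion from uint64_t to int64_t.

```c
#include <stdint.h>

/* Hypothetical helper: high 64 bits of an unsigned 64x64 -> 128 multiply. */
static uint64_t mul_hi_u64(uint64_t x, uint64_t y)
{
    uint64_t x_lo = x & 0xFFFFFFFFu, x_hi = x >> 32;
    uint64_t y_lo = y & 0xFFFFFFFFu, y_hi = y >> 32;

    /* The four partial products: First, Outer, Inner, Last. */
    uint64_t hi_hi = x_hi * y_hi;
    uint64_t hi_lo = x_hi * y_lo;
    uint64_t lo_hi = x_lo * y_hi;
    uint64_t lo_lo = x_lo * y_lo;

    /* Sum everything that lands in bits 32..95; this fits in 64 bits. */
    uint64_t cross = (lo_lo >> 32) + (hi_lo & 0xFFFFFFFFu) + lo_hi;

    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/* Hypothetical helper: signed variant built on the unsigned one.
 * Reinterpreting a negative x as unsigned adds 2^64 to it, which adds y
 * to the high word of the product, so that is subtracted back out
 * (and likewise for a negative y). */
static int64_t mul_hi_i64(int64_t x, int64_t y)
{
    uint64_t hi = mul_hi_u64((uint64_t)x, (uint64_t)y);
    if (x < 0)
        hi -= (uint64_t)y;
    if (y < 0)
        hi -= (uint64_t)x;
    return (int64_t)hi;
}
```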

If we have concerns about the source, we can use the POCL implementation
instead since I believe that also has a compatible license declaration.

Just a note on this one:
I've tested the char/uchar/short/ushort types on R600 (Cedar)
successfully, but the int/uint/long/ulong versions fail. I am 99%
sure that the failures are due to deficiencies in the R600 back-end
(i.e. it does not handle long arithmetic correctly), and not in the
code (I've tested compiled C versions of it).

--Aaron

I can't really help you with any licensing issues. You will need to do
your own research to determine whether or not StackOverflow postings are
licensed under a license that is compatible with libclc.

Aside from any licensing issues, this patch is:

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

StackOverflow posts are creative commons: "user contributed content licensed under cc-wiki with attribution required"

Ick. Some of the follow-up discussion from that page makes me a bit
nervous. I'll re-write this using a similar method so as to avoid any
possible issues.

--Aaron