Hi Tanya,
Thanks for the patch; it’s great to get your contribution to the OpenCL support.
Sorry for the delay in my response.
Creating operators that can take types as arguments for as_typen and convert is a very elegant solution. However, it’s quite intrusive, since these features could be implemented as standard functions and linked into the user’s code later. as_typen needs some special checks, so it could be a Clang built-in, but the conversions could essentially be implemented without changing Clang at all.
Having AsType or Convert as part of the IR is beneficial for error reporting and tools such as the static analyzer. There is no negative impact on the end user, only benefits.
vec_step is very similar to the sizeof and alignof operators, so I think it’s OK to implement it using the same expression type rather than duplicating code.
I agree that this is a good alternative to our implementation. I've looked at your patch, and it looks like we are doing the same work, but merging it with sizeof is a good idea to avoid code duplication.
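For reference, the value vec_step is expected to produce can be modelled in a few lines. This is only a sketch of the semantics as I read the OpenCL spec (a scalar yields 1, a 3-component vector yields 4); vecStep is a hypothetical helper, not something in Clang or in either patch.

```cpp
#include <cstddef>

// Hypothetical helper (not part of Clang): the value OpenCL's vec_step
// yields for a type with `numElements` components, where a scalar has 1.
constexpr std::size_t vecStep(std::size_t numElements) {
  // A 3-component vector occupies the storage of a 4-component one,
  // so its step is reported as 4; every other width maps to itself.
  return numElements == 3 ? 4 : numElements;
}

static_assert(vecStep(1) == 1, "scalar types have a step of 1");
static_assert(vecStep(3) == 4, "3-component vectors have a step of 4");
```

Like sizeof and alignof, the result depends only on the operand's type, which is why sharing the expression node looks reasonable.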
Index: include/clang/Basic/DiagnosticSemaKinds.td
--- include/clang/Basic/DiagnosticSemaKinds.td (revision 125808)
+++ include/clang/Basic/DiagnosticSemaKinds.td (working copy)
@@ -3723,6 +3723,26 @@
"%0 does not refer to the name of a parameter pack; did you mean %1?">;
def note_parameter_pack_here : Note<"parameter pack %0 declared here">;
…
+def err_cvt_arg_must_be_constant : Error<
+ "__builtin_convert requires constant 3rd and 4th argument">;
+def err_invalid_astype_of_different_size : Error<
+ "invalid astype between type '%0' and '%1' of different size">;
+def err_vec_step_bitfield : Error<
+ "invalid application of 'vec_step' to bitfield">;
Bitfields are not supported in OpenCL at all – is this error message really needed?
Probably not then.
Index: include/clang/Basic/TokenKinds.def
--- include/clang/Basic/TokenKinds.def (revision 125808)
+++ include/clang/Basic/TokenKinds.def (working copy)
@@ -356,6 +356,11 @@
KEYWORD(__vector , KEYALTIVEC)
KEYWORD(__pixel , KEYALTIVEC)
+// OpenCL Extensions.
+KEYWORD(__builtin_astype , KEYALL)
+KEYWORD(__builtin_convert , KEYALL)
+KEYWORD(__builtin_vec_step , KEYALL)
+
The latest version of Clang already has a KEYOPENCL flag, so keywords can be flagged as OpenCL-only. This way vec_step could be a keyword itself, instead of relying on macros later to turn it into __builtin_vec_step. It also saves checking for the right target language later.
Ok, I'll change this.
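To illustrate the idea of gating a keyword on a language flag, here is a simplified model. It is not Clang's actual lexer code: the enum, the KeywordEntry table, and isKeyword are all hypothetical and only borrow the flag names from TokenKinds.def.

```cpp
#include <cstring>

// Hypothetical miniature of a flag-gated keyword table; the real
// TokenKinds.def machinery in Clang is more involved than this.
enum KeywordFlags : unsigned {
  KEYALTIVEC = 1u << 0,
  KEYOPENCL  = 1u << 1
};

struct KeywordEntry {
  const char *Name;
  unsigned Flags;  // languages in which Name is treated as a keyword
};

const KeywordEntry Keywords[] = {
  {"__pixel", KEYALTIVEC},
  {"vec_step", KEYOPENCL},
};

// Returns true when Name should be lexed as a keyword given the flags
// enabled for the current translation unit.
bool isKeyword(const char *Name, unsigned EnabledFlags) {
  for (const KeywordEntry &K : Keywords)
    if (std::strcmp(K.Name, Name) == 0)
      return (K.Flags & EnabledFlags) != 0;
  return false;
}
```

With this shape, vec_step remains an ordinary identifier outside OpenCL, and no later check of the target language is needed.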
Index: lib/CodeGen/CGExprScalar.cpp
--- lib/CodeGen/CGExprScalar.cpp (revision 125808)
+++ lib/CodeGen/CGExprScalar.cpp (working copy)
@@ -2534,6 +2537,86 @@
return CGF.EmitBlockLiteral(block);
}
+Value *ScalarExprEmitter::VisitAsTypeExpr(AsTypeExpr *E) {
+ Value *Src = CGF.EmitScalarExpr(E->getSrcExpr());
+ const llvm::Type * DstTy = ConvertType(E->getDstType());
+
+ // Going from vec4->vec3 or vec3->vec4 is a special case and requires
+ // a shuffle vector instead of a bitcast.
+ const llvm::Type *SrcTy = Src->getType();
+ if (isa<llvm::VectorType>(DstTy) && isa<llvm::VectorType>(SrcTy)) {
+ unsigned numElementsDst = cast<llvm::VectorType>(DstTy)->getNumElements();
+ unsigned numElementsSrc = cast<llvm::VectorType>(SrcTy)->getNumElements();
+
+ if ((numElementsDst == 3 && numElementsSrc == 4)
+ || (numElementsDst == 4 && numElementsSrc == 3)) {
The OpenCL spec defines the behavior of vec4->vec3, but for vec3->vec4 it says:
float3 f;
// Error. Result and operand have different sizes
float4 g = as_float4(f);
Also, a shuffle vector alone is not enough: a conversion from int4 to float3 is legal, but it will crash the compiler if it is lowered only as a shuffle. There must be a bitcast here as well.
Ok. I'll modify.
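The int4 to float3 case can be modelled outside the compiler as two steps: reinterpret the bits as the destination element type, then drop the fourth lane. This is a sketch only. Int4, Float3, and asFloat3 are hypothetical stand-ins for the OpenCL types and as_float3, and it assumes 32-bit IEEE-754 floats.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for OpenCL's int4 and float3 (not Clang types).
using Int4 = std::array<std::int32_t, 4>;
using Float3 = std::array<float, 3>;

static_assert(sizeof(float) == sizeof(std::int32_t),
              "sketch assumes 32-bit float");

// Emulates as_float3(int4): bitcast first, then narrow.
Float3 asFloat3(const Int4 &src) {
  // Step 1, the "bitcast": reinterpret the 16 bytes as four floats.
  // A shuffle cannot do this, because it keeps the element type.
  std::array<float, 4> wide;
  std::memcpy(wide.data(), src.data(), sizeof(wide));

  // Step 2, the "shuffle": keep lanes 0..2 and drop lane 3.
  Float3 out = {{wide[0], wide[1], wide[2]}};
  return out;
}
```

For example, the bit patterns 0x3F800000 and 0x40000000 in the first two lanes come out as 1.0f and 2.0f, and whatever is in the fourth lane is discarded.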
+ if (dst_fp && src_fp)
+ ID = llvm::Intrinsic::convertff;
+ else if (dst_fp && !src_fp)
+ ID = src_s ? llvm::Intrinsic::convertfsi : llvm::Intrinsic::convertfui;
+ else if (!dst_fp && src_fp)
+ ID = dst_s ? llvm::Intrinsic::convertsif : llvm::Intrinsic::convertuif;
+ else if (dst_s && src_s)
+ ID = llvm::Intrinsic::convertss;
+ else if (dst_s && !src_s)
+ ID = llvm::Intrinsic::convertsu;
+ else if (!dst_s && src_s)
+ ID = llvm::Intrinsic::convertus;
+ else
+ ID = llvm::Intrinsic::convertuu;
+
These intrinsics are not documented. Are they safe to use? Do they fulfill the OpenCL spec’s precision requirements?
I believe they meet the OpenCL spec's precision requirements, as they are working in our implementation. Do you have a specific example where you think they won't?
As for being safe to use, what do you mean?
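One way to make the question concrete is to compare the intrinsics against a small reference model of a single conversion. The sketch below models convert_uchar_sat(float) as I understand the spec: out-of-range inputs clamp, NaN becomes 0, and in-range values round toward zero by default. convertUcharSat is a hypothetical helper for testing, not an LLVM or Clang API.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical reference model of OpenCL's convert_uchar_sat(float).
std::uint8_t convertUcharSat(float x) {
  if (std::isnan(x))
    return 0;                            // saturating mode maps NaN to 0
  if (x <= 0.0f)
    return 0;                            // clamp to the low end of uchar
  if (x >= 255.0f)
    return 255;                          // clamp to the high end of uchar
  return static_cast<std::uint8_t>(x);   // in range: truncate toward zero
}
```

Running the same inputs (300.0f, -5.0f, 254.9f, NaN) through the intrinsic-based lowering and comparing the results would show whether the two agree on these edge cases.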
Thanks,
Tanya