How can I implement in MLIR code the definition of a 16-bit (or 64-bit) ConstantFloatOp?

Hello.
Could you please tell me how I can create a 16-bit (or 64-bit) ConstantFloatOp in MLIR C++ code?
The problem is that I am only able to create a 32-bit floating-point ConstantFloatOp, using:
    mlir::MLIRContext* ctx = op->getContext();
    float f = 3.0;
    auto valOp1 = builder.create<mlir::arith::ConstantFloatOp>(loc, llvm::APFloat(f),
                                                               mlir::FloatType::getF32(ctx));

Is it possible to create a 16-bit (or 64-bit) floating-point constant, and if so, how?
For example, if I write:
    mlir::MLIRContext* ctx = op->getContext();
    float f = 3.0;
    auto valOp1 = builder.create<mlir::arith::ConstantFloatOp>(loc, llvm::APFloat(f),
                                                               mlir::FloatType::getF16(ctx));

then I get the following runtime error:
    <unknown>:0: error: FloatAttr type doesn't match the type implied by its value
    vpux-opt: llvm-project/mlir/include/mlir/IR/StorageUniquerSupport.h:139: static ConcreteT mlir::detail::StorageUserBase<ConcreteT, BaseT, StorageT, UniquerT, Traits>::get(mlir::MLIRContext*, Args ...) [with Args = {mlir::Type, llvm::APFloat}; ConcreteT = mlir::FloatAttr; BaseT = mlir::Attribute; StorageT = mlir::detail::FloatAttrStorage; UniquerT = mlir::detail::AttributeUniquer; Traits = {}]: Assertion `succeeded(ConcreteT::verify(getDefaultDiagnosticEmitFn(ctx), args...))' failed.

Please help me implement a 16-bit (or 64-bit) ConstantFloatOp.
Thank you very much,
Alex

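You need to create the APFloat with half-precision semantics. An APFloat constructed from a C++ float carries IEEE single-precision semantics, which does not match an f16 type, hence the FloatAttr verification error. For an example, see Deserializer.cpp in llvm/llvm-project at commit 7d76da539fca28c6e2b920d760223fc39b15d21a.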
