Passes piglit on turks and carrizo
fp64 passes CTS on carrizo
v2: fix formatting
check fp32 denormal support at runtime
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Passes piglit on turks and carrizo
fp64 passes CTS on carrizo
v2: fix formatting
check fp32 denormal support at runtime
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Passes piglit on turks and carrizo
fp64 passes CTS on carrizo
v2: fix formatting
check fp32 denormal support at runtime
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Passes piglit on turks and carrizo
fp64 passes cts on carrizo
v2: fix formatting
check fp32 denormal support at runtime
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Passes piglit on turks and carrizo
fp64 passes ctx on carrizo
v2: check fp32 denormal support at runtime
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Hi Jan,
The indentation still seems wrong in the rootn case. Otherwise, no more comments from
my side.
Jeroen
Hi Jan,
The indentation still seems wrong in the rootn case. Otherwise, no more comments from
my side.
fixed locally. I assume your LGTM from v1 still applies?
Jan
Hi Jan,
The indentation still seems wrong in the rootn case. Otherwise, no more comments from
my side.fixed locally. I assume your LGTM from v1 still applies?
Indeed.
It seems my other answer got stuck in the moderator queue, but the bottom line of that email
is that I agree the denormal issues are better addressed separately.
Jeroen
Hi Jan,
Triggered by your patch below, I started looking a bit more closely at __clc_fp32_subnormals_supported.
Am I correct that this function currently always returns false no matter the target? Or does some
override occur somewhere?
Thanks,
Jeroen
Hi Jan,
Triggered by your patch below, I started looking a bit more closely at __clc_fp32_subnormals_supported.
Am I correct that this function currently always returns false no matter the target? Or does some
override occur somewhere?
Looks like it. Tom might be better qualified to answer this.
The original idea was to link default or disabled to select proper
operation at kernel link time, but it looks likes it only applies to
fp64.
That is weird since fp64 denormals are required by specs. I'd say we
can drop the entire mechanism.
Jan
Hi Jan,
Triggered by your patch below, I started looking a bit more closely at __clc_fp32_subnormals_supported.
Am I correct that this function currently always returns false no matter the target? Or does some
override occur somewhere?Looks like it. Tom might be better qualified to answer this.
The original idea was to link default or disabled to select proper
operation at kernel link time, but it looks likes it only applies to
fp64.
That is weird since fp64 denormals are required by specs. I'd say we
can drop the entire mechanism.
That’s not totally true. Check out the cl-denorms-are-zero option.
Jeroen
>
> > Hi Jan,
> >
> > Triggered by your patch below, I started looking a bit more closely at __clc_fp32_subnormals_supported.
> > Am I correct that this function currently always returns false no matter the target? Or does some
> > override occur somewhere?
>
> Looks like it. Tom might be better qualified to answer this.
> The original idea was to link default or disabled to select proper
> operation at kernel link time, but it looks likes it only applies to
> fp64.
> That is weird since fp64 denormals are required by specs. I'd say we
> can drop the entire mechanism.That’s not totally true. Check out the cl-denorms-are-zero option.
sorry, this one slipped under my radar. You're right. The mechanism was
aimed at AMD gpus that only care about fp64 denormals (fp32 denorms are
preferred disabled for performance/compliance reasons).
I don't mind someone reworking it to be more general (and idealy in CLC
instead of llvm ir).
regards,
Jan
Hi Jan,
Triggered by your patch below, I started looking a bit more closely at __clc_fp32_subnormals_supported.
Am I correct that this function currently always returns false no matter the target? Or does some
override occur somewhere?Looks like it. Tom might be better qualified to answer this.
The original idea was to link default or disabled to select proper
operation at kernel link time, but it looks likes it only applies to
fp64.
That is weird since fp64 denormals are required by specs. I’d say we
can drop the entire mechanism.That’s not totally true. Check out the cl-denorms-are-zero option.
sorry, this one slipped under my radar. You’re right. The mechanism was
aimed at AMD gpus that only care about fp64 denormals (fp32 denorms are
preferred disabled for performance/compliance reasons).
I don’t mind someone reworking it to be more general (and idealy in CLC
instead of llvm ir).
Thanks for the clarification. In that light the implementation makes sense to me.
Jeroen