[PATCH 1/1] math: Don't use llvm intrinsic for native_log

AMDGPU targets don't have an instruction for it, so it needs to be expanded to C * log2 anyway.

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

AMDGPU targets don't have an instruction for it,
so it'll be expanded to C * log2 anyway.

v2: use native_log2 instead of the more precise sw implementation

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

It's still better because the point of using the intrinsics is the
algebraic optimizations that apply to them

Are there optimizations that would apply to log and not log2?
I'd expect exposing the constant early would be beneficial to
optimization.

I don't mind changing this to:
#if __clang_major__ > 6
  return __clc_native_log(val);
#else
  return native_log2(val) * (1.0f / M_LOG2E_F);
#endif

when Vedran's patch lands.

Jan

AMDGPU targets don't have an instruction for it,
so it'll be expanded to C * log2 anyway.

v2: use native_log2 instead of the more precise sw implementation

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

Ping. This is currently crashing.
Even after https://reviews.llvm.org/D29942 lands, we'll need this for
older LLVM versions.

Jan

Even with this, 'pow' crashes LLVM for me at the moment on my SI card.
I've got family in town today, but tonight or tomorrow I should have
time to rebuild my stack and try again.

--Aaron

Hi, I'm not sure what 'this' refers to. I assume it's D29942, since the
native_log patch is not related to pow. Note that D29942 only addresses
the f32 and f16 types; f64 still crashes. You'll need my other patch
('math: Don't use llvm intrinsic for pow') to fix that.

I assume Matt would prefer to keep using the intrinsic by default, and
the expansion in that patch is not quite correct: it does not handle 0
or negative input, so in fact it's better suited for powr.
I'll be posting another version of that patch as well. I just have too
many patches in flight atm.

Jan

Yeah, sorry. This email was a massive brain-fart on my part. Too much
stuff going on at home this last week, and I haven't had time to keep
up on emails.

In the case of native_* functions, any result that doesn't crash is a
valid result, so in this case, I'm fine with your proposed function
expansion.

--Aaron