[PATCH 1/3] Introduce per device defines

Make cl_khr_fp64 define per-device.
This patch does not change the generated Makefile

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

This halves the size of builtin.opt.{cedar,barts}.bc

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

Make cl_khr_fp64 define per-device.
This patch does not change the generated Makefile

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
---
I've tried to find a way to make libclc more device specific.
This series handles fp64, but I plan to extend it to other defines like
__CLC_HAVE_FMA, or (with Tom's recent patches) __CLC_HAVE_LDEXP.

Alternatives I could think of was to try to get the information from clang,
but I don't think it should provide that kind of low level information.
Or add r600 support to librt.

Clang is already defining cl_khr_fp64 for SI+ devices. I think these other
define belong in clang too. Clang's X86 front-end defines a __FMA__ macro,
plus macros for other instructions, so I think we should follow the same
convention for r600/si.

This halves the size of builtin.opt.{cedar,barts}.bc

LGTM.

LGTM.

> Make cl_khr_fp64 define per-device.
> This patch does not change the generated Makefile
>
> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
> ---
> I've tried to find a way to make libclc more device specific.
> This series handles fp64, but I plan to extend it to other defines like
> __CLC_HAVE_FMA, or (with Tom's recent patches) __CLC_HAVE_LDEXP.
>
> Alternatives I could think of was to try to get the information from clang,
> but I don't think it should provide that kind of low level information.
> Or add r600 support to librt.
>

Clang is already defining cl_khr_fp64 for SI+ devices. I think these other
define belong in clang too. Clang's X86 front-end defines a __FMA__ macro,
plus macros for other instructions, so I think we should follow the same
convention for r600/si.

I found the clang target definitions. It does not really solve the
problem of having the information in two different places, but I guess
clang is a better place than libclc. I've prepared a patch to get it
working on all asics.
Do we want this solution for NVPTX/until 3.7 is released?
so the process would be to add a desired define to libclc/configure, and
clang, and remove it from configure when the respective clang version
gets released?

jan

> > Make cl_khr_fp64 define per-device.
> > This patch does not change the generated Makefile
> >
> > Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
> > ---
> > I've tried to find a way to make libclc more device specific.
> > This series handles fp64, but I plan to extend it to other defines like
> > __CLC_HAVE_FMA, or (with Tom's recent patches) __CLC_HAVE_LDEXP.
> >
> > Alternatives I could think of was to try to get the information from clang,
> > but I don't think it should provide that kind of low level information.
> > Or add r600 support to librt.
> >
>
> Clang is already defining cl_khr_fp64 for SI+ devices. I think these other
> define belong in clang too. Clang's X86 front-end defines a __FMA__ macro,
> plus macros for other instructions, so I think we should follow the same
> convention for r600/si.

I found the clang target definitions. It does not really solve the
problem of having the information in two different places, but I guess
clang is a better place than libclc. I've prepared a patch to get it
working on all asics.
Do we want this solution for NVPTX/until 3.7 is released?
so the process would be to add a desired define to libclc/configure, and
clang, and remove it from configure when the respective clang version
gets released?

jan

I would leave NVPTX as is and let people who work on it make the
decision about what to do.

In general, I think we can remove defines from libclc ToT that exit
in clang ToT.

-Tom

> > > Make cl_khr_fp64 define per-device.
> > > This patch does not change the generated Makefile
> > >
> > > Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
> > > ---
> > > I've tried to find a way to make libclc more device specific.
> > > This series handles fp64, but I plan to extend it to other defines like
> > > __CLC_HAVE_FMA, or (with Tom's recent patches) __CLC_HAVE_LDEXP.
> > >
> > > Alternatives I could think of was to try to get the information from clang,
> > > but I don't think it should provide that kind of low level information.
> > > Or add r600 support to librt.
> > >
> >
> > Clang is already defining cl_khr_fp64 for SI+ devices. I think these other
> > define belong in clang too. Clang's X86 front-end defines a __FMA__ macro,
> > plus macros for other instructions, so I think we should follow the same
> > convention for r600/si.
>
> I found the clang target definitions. It does not really solve the
> problem of having the information in two different places, but I guess
> clang is a better place than libclc. I've prepared a patch to get it
> working on all asics.
> Do we want this solution for NVPTX/until 3.7 is released?
> so the process would be to add a desired define to libclc/configure, and
> clang, and remove it from configure when the respective clang version
> gets released?
>
> jan
>

I would leave NVPTX as is and let people who work on it make the
decision about what to do.

In general, I think we can remove defines from libclc ToT that exit
in clang ToT.

should I go ahead with 1/3, and adapt 2/3 to remove cl_khr_fp64 for all
amd targets, now that clang provides the define?
what about EdB's effort to maintain 3.6 compatibility?

jan

Make cl_khr_fp64 define per-device.
This patch does not change the generated Makefile

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
---
I've tried to find a way to make libclc more device specific.
This series handles fp64, but I plan to extend it to other defines like
__CLC_HAVE_FMA, or (with Tom's recent patches) __CLC_HAVE_LDEXP.

Alternatives I could think of was to try to get the information from clang,
but I don't think it should provide that kind of low level information.
Or add r600 support to librt.

Clang is already defining cl_khr_fp64 for SI+ devices. I think these other
define belong in clang too. Clang's X86 front-end defines a __FMA__ macro,
plus macros for other instructions, so I think we should follow the same
convention for r600/si.

I found the clang target definitions. It does not really solve the
problem of having the information in two different places, but I guess
clang is a better place than libclc. I've prepared a patch to get it
working on all asics.
Do we want this solution for NVPTX/until 3.7 is released?
so the process would be to add a desired define to libclc/configure, and
clang, and remove it from configure when the respective clang version
gets released?

jan

I would leave NVPTX as is and let people who work on it make the
decision about what to do.

In general, I think we can remove defines from libclc ToT that exit
in clang ToT.

should I go ahead with 1/3, and adapt 2/3 to remove cl_khr_fp64 for all
amd targets, now that clang provides the define?
what about EdB's effort to maintain 3.6 compatibility?

To what extent do the recent patches depend on changes in clang/llvm
(besides the cl_khr_fp64 change you mention above)? If there are quite
a few of these, then it doesn’t seem worth keeping 3.6 compatibility, and
I would be in favour of creating a release_36 branch for the last known
good version for llvm 3.6; just as we did for llvm 3.5.

Jeroen

>
>>>>> Make cl_khr_fp64 define per-device.
>>>>> This patch does not change the generated Makefile
>>>>>
>>>>> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
>>>>> ---
>>>>> I've tried to find a way to make libclc more device specific.
>>>>> This series handles fp64, but I plan to extend it to other defines like
>>>>> __CLC_HAVE_FMA, or (with Tom's recent patches) __CLC_HAVE_LDEXP.
>>>>>
>>>>> Alternatives I could think of was to try to get the information from clang,
>>>>> but I don't think it should provide that kind of low level information.
>>>>> Or add r600 support to librt.
>>>>>
>>>>
>>>> Clang is already defining cl_khr_fp64 for SI+ devices. I think these other
>>>> define belong in clang too. Clang's X86 front-end defines a __FMA__ macro,
>>>> plus macros for other instructions, so I think we should follow the same
>>>> convention for r600/si.
>>>
>>> I found the clang target definitions. It does not really solve the
>>> problem of having the information in two different places, but I guess
>>> clang is a better place than libclc. I've prepared a patch to get it
>>> working on all asics.
>>> Do we want this solution for NVPTX/until 3.7 is released?
>>> so the process would be to add a desired define to libclc/configure, and
>>> clang, and remove it from configure when the respective clang version
>>> gets released?
>>>
>>> jan
>>>
>>
>> I would leave NVPTX as is and let people who work on it make the
>> decision about what to do.
>>
>> In general, I think we can remove defines from libclc ToT that exit
>> in clang ToT.
>
> should I go ahead with 1/3, and adapt 2/3 to remove cl_khr_fp64 for all
> amd targets, now that clang provides the define?
> what about EdB's effort to maintain 3.6 compatibility?

To what extent do the recent patches depend on changes in clang/llvm
(besides the cl_khr_fp64 change you mention above)? If there are quite
a few of these, then it doesn’t seem worth keeping 3.6 compatibility, and
I would be in favour of creating a release_36 branch for the last known
good version for llvm 3.6; just as we did for llvm 3.5.

Right now there's only __HAS_LDEXPF__ that is used for r600.
If the define is not present (like older llvm), it should use the __clc
version of that function. so it should be safe (albeit slower) to use
llvm3.6. it was my intention to follow this pattern (provide only
positive information defines).

cl_krh_fp64 is provided only by clang3.7.
changing patch 2/3 to remove -Dcl_khr_fp64 from all AMD targets will
remove fp64 functions from AMD hw on llvm3.6. I have no idea whether
fp64 works at all with 3.6 (I'd say it does not).

the posted versions of 1/3, and 2/3 are safe for llvm3.6

jan

Hi Tom, Jeroen,

any objections to pushing 1/3?
it never got review

thank you,
jan

Hi Tom, Jeroen,

any objections to pushing 1/3?
it never got review

Sorry for the delay. Why is this still needed if the defines are in clang?
Is it for supporting older versions of llvm?

-Tom

> Hi Tom, Jeroen,
>
> any objections to pushing 1/3?
> it never got review
>

Sorry for the delay. Why is this still needed if the defines are in
clang?
Is it for supporting older versions of llvm?

yes. I planned to sen v2 of 2/3 dropping all defines for amd hw, but
since then EdB's patch with llvm3.6 support got in, so I guessed that
ppl care about llvm
3.6.

Jan

> > Hi Tom, Jeroen,
> >
> > any objections to pushing 1/3?
> > it never got review
> >
>
> Sorry for the delay. Why is this still needed if the defines are in
> clang?
> Is it for supporting older versions of llvm?

yes. I planned to sen v2 of 2/3 dropping all defines for amd hw, but
since then EdB's patch with llvm3.6 support got in, so I guessed that
ppl care about llvm
3.6.

Is it possible to only add the defines when compiling with llvm 3.6?

-Tom

> > > Hi Tom, Jeroen,
> > >
> > > any objections to pushing 1/3?
> > > it never got review
> >
> > Sorry for the delay. Why is this still needed if the defines are in
> > clang?
> > Is it for supporting older versions of llvm?
>
> yes. I planned to sen v2 of 2/3 dropping all defines for amd hw, but
> since then EdB's patch with llvm3.6 support got in, so I guessed that
> ppl care about llvm
> 3.6.

Is it possible to only add the defines when compiling with llvm 3.6?

Jan, thanks for carrying about llvm 3.6.
However when I made the patch in order to retain compatibly with older
version, I didn't want it to be a burden.
So, if the here no easy and clean way to keep those define for older version,
may be, as Jeroen suggests earlier, it's time for Tom to create a release_36
branch. Then, right after creating the branch, remove the file SOURCES_LLVM3.6
and the LLVM3.6 dir and only claim ToT compatibility.

> > > > Hi Tom, Jeroen,
> > > >
> > > > any objections to pushing 1/3?
> > > > it never got review
> > >
> > > Sorry for the delay. Why is this still needed if the defines are in
> > > clang?
> > > Is it for supporting older versions of llvm?
> >
> > yes. I planned to sen v2 of 2/3 dropping all defines for amd hw, but
> > since then EdB's patch with llvm3.6 support got in, so I guessed that
> > ppl care about llvm
> > 3.6.
>
> Is it possible to only add the defines when compiling with llvm 3.6?

Jan, thanks for carrying about llvm 3.6.
However when I made the patch in order to retain compatibly with older
version, I didn't want it to be a burden.
So, if the here no easy and clean way to keep those define for older version,
may be, as Jeroen suggests earlier, it's time for Tom to create a release_36
branch. Then, right after creating the branch, remove the file SOURCES_LLVM3.6
and the LLVM3.6 dir and only claim ToT compatibility.

Hi,

modifying the patch to support llvm3.6 is easy, my plan is to change the
defines field to array with entry for each supported llvm version.
However I have currently neither access to my dev box, nor reliable
internet connection, so it might take some time, before I post v2.

jan

Hi Tom, Jeroen,

any objections to pushing 1/3?
it never got review

Sorry for the delay. Why is this still needed if the defines are in
clang?
Is it for supporting older versions of llvm?

yes. I planned to sen v2 of 2/3 dropping all defines for amd hw, but
since then EdB's patch with llvm3.6 support got in, so I guessed that
ppl care about llvm
3.6.

Is it possible to only add the defines when compiling with llvm 3.6?

Jan, thanks for carrying about llvm 3.6.
However when I made the patch in order to retain compatibly with older
version, I didn't want it to be a burden.
So, if the here no easy and clean way to keep those define for older version,
may be, as Jeroen suggests earlier, it's time for Tom to create a release_36
branch. Then, right after creating the branch, remove the file SOURCES_LLVM3.6
and the LLVM3.6 dir and only claim ToT compatibility.

Hi,

modifying the patch to support llvm3.6 is easy, my plan is to change the
defines field to array with entry for each supported llvm version.
However I have currently neither access to my dev box, nor reliable
internet connection, so it might take some time, before I post v2.

jan

Hi,

Since 3.6 works for now thanks to efforts EdB, I suggest we keep it as
long as it’s not too hard to maintain.

Jeroen