[PATCH] configure: Allow targets to override generic cl implementations with LLVM IR

Hi Tom,

I am happy with the idea of allowing targets to override .cl's with
.ll's (or vice versa).

However, I don't think this would correctly handle the file
layout we currently have for add_sat (and sub_sat), which is
currently implemented using three files: add_sat.cl, add_sat.ll
and add_sat_impl.ll. (I don't like the fact that this family of
functions has to be implemented using three files, but it turns out
to be necessary for PTX, which only supports two non-default calling
conventions).

If you can modify this patch to not break add_sat and sub_sat (for
example, by renaming some of the files), I'd be happy to accept it.
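
To make the concern concrete, here is a rough sketch of the behaviour I
expect. It is not the actual configure.py code: select_sources is a
simplified stand-in for the sources_seen loop, the directory names and
SOURCES entries are hypothetical, and I am assuming target-specific
directories are visited before the generic one. The name add_sat_shim.ll
is made up purely to illustrate a rename.

```python
import os

def select_sources(libdirs, dedupe_on_base):
    """Simplified stand-in for the sources_seen loop in configure.py.

    libdirs is an ordered list of (libdir, SOURCES entries) pairs, with
    target-specific directories assumed to come before the generic one.
    """
    sources_seen = set()
    selected = []
    for libdir, sources in libdirs:
        for src in sources:
            # With the patch, the key is the name without its extension.
            key = os.path.splitext(src)[0] if dedupe_on_base else src
            if key not in sources_seen:
                sources_seen.add(key)
                selected.append(libdir + "/" + src)
    return selected

# Hypothetical layout: a target overrides get_global_id with IR.
target = ("target/lib", ["workitem/get_global_id.ll"])
generic = ("generic/lib", [
    "workitem/get_global_id.cl",
    "integer/add_sat.cl",
    "integer/add_sat.ll",
    "integer/add_sat_impl.ll",
])

# The override works, but add_sat.ll is dropped because it shares a
# base name with add_sat.cl.
patched = select_sources([target, generic], dedupe_on_base=True)

# Renaming the shim (hypothetical name) avoids the collision.
generic_renamed = ("generic/lib", [
    "workitem/get_global_id.cl",
    "integer/add_sat.cl",
    "integer/add_sat_shim.ll",
    "integer/add_sat_impl.ll",
])
renamed = select_sources([target, generic_renamed], dedupe_on_base=True)

print(patched)
print(renamed)
```

In this sketch the first list has no add_sat.ll entry, while the second
keeps all three add_sat files and still picks the target's
get_global_id.ll over the generic .cl.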

Thanks,

> From: Tom Stellard <thomas.stellard@amd.com>
>
> ---
> configure.py | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/configure.py b/configure.py
> index 66c6410..0449a0e 100755
> --- a/configure.py
> +++ b/configure.py
> @@ -112,8 +112,12 @@ for target in targets:
>      manifest_deps.add(subdir_list_file)
>      for src in open(subdir_list_file).readlines():
>        src = src.rstrip()
> -      if src not in sources_seen:
> -        sources_seen.add(src)
> +      # Only add the base filename (e.g. add get_global_id instead of
> +      # get_global_id.cl) to sources_seen.
> +      # This allows targets to override generic .cl sources with .ll sources.
> +      src_base = os.path.splitext(src)[0]
> +      if src_base not in sources_seen:
> +        sources_seen.add(src_base)
>          obj = os.path.join(target, 'lib', src + '.bc')
>          objects.append(obj)
>          src_file = os.path.join(libdir, src)

> Hi Tom,
>
> I am happy with the idea of allowing targets to override .cl's with
> .ll's (or vice versa).
>
> However, I don't think this would correctly handle the file
> layout we currently have for add_sat (and sub_sat), which is
> currently implemented using three files: add_sat.cl, add_sat.ll
> and add_sat_impl.ll. (I don't like the fact that this family of
> functions has to be implemented using three files, but it turns out
> to be necessary for PTX, which only supports two non-default calling
> conventions).

I took a look at the add_sat implementation, and I'm not quite sure how
the code is using an alternate calling convention. Would you mind
explaining a little more what is happening here?

Would it be possible to move some of this code into the NVPTX
implementation?

-Tom

> I took a look at the add_sat implementation, and I'm not quite sure how
> the code is using an alternate calling convention. Would you mind
> explaining a little more what is happening here?

Currently, all OpenCL C functions compiled for NVPTX use the
ptx_device calling convention for function calls. This means that
we cannot simply write the add_sat functions in IR in a generic way,
as the caller will expect the callee to use ptx_device on PTX and
the default calling convention on all other architectures. To solve
this, the functions are implemented with the default calling convention
in add_sat_impl.ll, and add_sat.ll acts as a shim between the default
and the OpenCL C calling conventions by calling the respective functions
in add_sat_impl.ll.

I'd love for there to be a better way to handle this. Perhaps if you are
at the LLVM developers meeting tomorrow we can discuss it then.

> Would it be possible to move some of this code into the NVPTX
> implementation?

Unfortunately this scheme could not be implemented without either
touching the generic part of the code or duplicating code for every
architecture.

Thanks,