I think you should use "mesa" for the middle (vendor) part of the triple instead of leaving it empty.
---
amdgcn-mesa3d/lib/OVERRIDES | 1 +
amdgcn-mesa3d/lib/SOURCES | 1 +
amdgcn-mesa3d/lib/workitem/get_num_groups.cl | 10 ++++++++++
3 files changed, 12 insertions(+)
create mode 100644 amdgcn-mesa3d/lib/OVERRIDES
create mode 100644 amdgcn-mesa3d/lib/SOURCES
create mode 100644 amdgcn-mesa3d/lib/workitem/get_num_groups.cl

diff --git a/amdgcn-mesa3d/lib/OVERRIDES b/amdgcn-mesa3d/lib/OVERRIDES
new file mode 100644
index 0000000..c9bd69b
--- /dev/null
+++ b/amdgcn-mesa3d/lib/OVERRIDES
@@ -0,0 +1 @@
+workitem/get_num_groups.ll
diff --git a/amdgcn-mesa3d/lib/SOURCES b/amdgcn-mesa3d/lib/SOURCES
new file mode 100644
index 0000000..77eda67
--- /dev/null
+++ b/amdgcn-mesa3d/lib/SOURCES
@@ -0,0 +1 @@
+workitem/get_num_groups.cl
diff --git a/amdgcn-mesa3d/lib/workitem/get_num_groups.cl b/amdgcn-mesa3d/lib/workitem/get_num_groups.cl
new file mode 100644
index 0000000..3a4a6f4
--- /dev/null
+++ b/amdgcn-mesa3d/lib/workitem/get_num_groups.cl
@@ -0,0 +1,10 @@
+#include <clc/clc.h>
+
+_CLC_DEF size_t get_num_groups(uint dim) {
+ switch (dim) {
+ case 0: return __builtin_amdgcn_workgroup_count_x();
+ case 1: return __builtin_amdgcn_workgroup_count_y();
+ case 2: return __builtin_amdgcn_workgroup_count_z();
I planned to switch r600 to reading these (and global size and local
size) from the implicitarg ptr. Is there an advantage to gcn reading them
from dispatch info rather than from the implicitarg ptr?
Jan
These intrinsics read these values from the user SGPRs and not the
dispatch info. This is really the only difference between what Mesa
does and what HSA does.
-Tom
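
For readers following along without a GPU toolchain, the dispatch in the patch can be modelled on the host. This is only a sketch in plain C, not the libclc source: `stub_workgroup_count`, `set_stub_counts` and `model_get_num_groups` are hypothetical names standing in for the `__builtin_amdgcn_workgroup_count_*` builtins, and the `default` branch is an assumption (the quoted hunk is cut off before that point) based on the OpenCL rule that an out-of-range dimension index yields 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three workgroup-count values. In the patch
 * these come from __builtin_amdgcn_workgroup_count_{x,y,z}; plain globals are
 * used here so the sketch compiles on a host. */
static size_t stub_workgroup_count[3] = {1, 1, 1};

/* Set the simulated grid shape (number of work-groups per dimension). */
void set_stub_counts(size_t x, size_t y, size_t z) {
    /* A launched grid has at least one work-group in every dimension. */
    assert(x > 0 && y > 0 && z > 0);
    stub_workgroup_count[0] = x;
    stub_workgroup_count[1] = y;
    stub_workgroup_count[2] = z;
}

/* Host-side model of the switch in get_num_groups.cl. */
size_t model_get_num_groups(unsigned dim) {
    switch (dim) {
    case 0: return stub_workgroup_count[0];
    case 1: return stub_workgroup_count[1];
    case 2: return stub_workgroup_count[2];
    default: return 1; /* assumption: invalid dimension, OpenCL says 1 */
    }
}
```

Calling `set_stub_counts(8, 4, 2)` and then `model_get_num_groups(1)` gives 4 in this model; any dimension index above 2 gives 1.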
The questions are mostly out of curiosity; I can't really comment on
HSA patches.
I take it SGPRs are faster?
If these intrinsics follow the HSA ABI, shouldn't they be in the
amdgcn-amdhsa directory?
The first patch introduced the amdgcn-mesa3d directory, but if mesa/clover
is switched to use the HSA ABI, shouldn't the libclc built for amdgcn-amdhsa
work for clover as well?
thanks,
Jan
These are defined in the HSA ABI, but are never actually used since
they are not really implemented in the microcode. They also introduce
the possibility of using more user SGPRs than the maximum of 16.
-Matt
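
The point about the 16-register limit can be illustrated with a small budget check. This is a rough sketch in plain C: the limit of 16 comes from the message above, but the per-field register counts and field names are assumptions recalled from the HSA-style kernel setup description and should be checked against the AMDGPU backend documentation before relying on them. All identifiers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_USER_SGPRS = 16 }; /* limit quoted in the thread */

struct user_sgpr_field {
    const char *name;
    unsigned count; /* SGPRs consumed when the field is enabled */
};

/* Assumed sizes of the commonly enabled user SGPR fields. */
static const struct user_sgpr_field base_fields[] = {
    {"private_segment_buffer", 4},
    {"dispatch_ptr", 2},
    {"queue_ptr", 2},
    {"kernarg_segment_ptr", 2},
    {"dispatch_id", 2},
    {"flat_scratch_init", 2},
    {"private_segment_size", 1},
};

/* Assumed: one SGPR per workgroup-count dimension. */
static const struct user_sgpr_field grid_count_fields[] = {
    {"grid_workgroup_count_x", 1},
    {"grid_workgroup_count_y", 1},
    {"grid_workgroup_count_z", 1},
};

#define FIELD_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Sum the SGPRs used by a list of enabled fields. */
unsigned sum_user_sgprs(const struct user_sgpr_field *fields, size_t n) {
    assert(fields != NULL || n == 0);
    unsigned total = 0;
    for (size_t i = 0; i < n; ++i)
        total += fields[i].count;
    return total;
}

unsigned base_total(void) {
    return sum_user_sgprs(base_fields, FIELD_COUNT(base_fields));
}

unsigned grid_total(void) {
    return sum_user_sgprs(grid_count_fields, FIELD_COUNT(grid_count_fields));
}

/* Nonzero when the requested registers fit in the user SGPR budget. */
int fits_budget(unsigned total) {
    return total <= MAX_USER_SGPRS;
}
```

Under these assumed sizes the base fields already take 15 registers, so enabling all three workgroup counts on top of them would need 18 and no longer fit.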
I'm not sure what this implies. Does gcn hw need a fw update to use this?
The rest of the questions still remain:
Why a --mesa3d directory if it should use the same ABI as HSA?
Is it more beneficial for gcn clover to share code with r600 clover or
with gcn hsa?
Jan