RFC: R600, a new backend for AMD GPUs

Hi,

We've been working on an LLVM backend for the previous generation of AMD
GPUs (HD 2XXX - HD 6XXX) and we would like to submit it for inclusion in the
main LLVM tree. The latest code can be found in this git repository:
http://cgit.freedesktop.org/~tstellar/llvm/ in the r600-initial-review
branch or if you prefer you can download the entire tree with this link:
http://cgit.freedesktop.org/~tstellar/llvm/snapshot/llvm-r600-initial-review.tar.gz
The R600 backend is located in lib/Target/AMDIL.

First, a brief description of the backend:

The r600 backend is being developed as a part of the Open Source compute
stack in Mesa (http://www.mesa3d.org/), which uses the Gallium API.
It uses large portions of the AMDIL backend which was open-sourced
last December and you'll notice the TargetMachine for this backend
(AMDGPUTargetMachine) is a sub-class of AMDILTargetMachine. We are also
currently working on an LLVM backend for Southern Islands GPUs, and we
would like to get that code into the LLVM tree as well, once it has been
approved for release. The Southern Islands backend will be used for
compute shaders and also graphics shaders in the AMD open source 3D driver.

One thing that I would like to point out is that all of the
code from the AMDIL backend is licensed under a BSD license with an
additional clause that deals with United States export laws (non-AMDIL
code is licensed with the University of Illinois Open Source License).
Will the LLVM project accept a backend with code licensed under this BSD
license? We would prefer to keep this license, but if it isn't
acceptable, we can try to relicense it.

Second, I am looking for two categories of feedback for the r600 backend:

1. What changes do I need to make to get the backend included in the LLVM tree?
2. What changes can I make to improve the backend overall?

My top priority is to get the backend into the LLVM tree, so when you
provide feedback, I would appreciate it if you could be clear about which
changes are required for inclusion and which are just general
improvements.

Lastly, I did a very brief run through of the code to check the coding style,
but I know there are still some violations. For example, a lot of the
file headers are missing file descriptions. I didn't want to spend a
lot of time on coding style changes prior to the initial review in case
I was asked to make big changes to the code, so I will address these
issues once I have received an OK on the organization of the code.
However, please still point out coding style errors to me, and I'll be
sure to fix them during the final pass.

Looking forward to your feedback.

Thanks,
Tom Stellard

Tom,
Two things. One is missing tests. I have some I could send you, but they are mainly OpenCL based for the AMDIL backend, not for the R600.

That brings me to the second thing. Are the AMDIL backend and the R600 backend the same, or not? At this point, they really do feel like they are separate back ends, with one dependent on the other.

As there is no other backend that is dependent on another backend in the tree, how would that work if the back ends diverge? Should we work on integrating both closer, or separate them completely?

Micah

> Tom,
> Two things. One is missing tests. I have some I could send you, but they are mainly OpenCL based for the AMDIL backend, not for the R600.

I've started working on the tests, and I've pushed a few up to my llvm
repo. I can add the AMDIL tests too if you want to send them. I think
they will be helpful.

> That brings me to the second thing. Are the AMDIL backend and the R600 backend the same, or not? At this point, they really do feel like they are separate back ends, with one dependent on the other.
>
> As there is no other backend that is dependent on another backend in the tree, how would that work if the back ends diverge? Should we work on integrating both closer, or separate them completely?

I think the R600 backend is probably more like an alternative code
generator for the AMDIL backend rather than a stand-alone backend.
I think keeping them closely integrated is a good idea for now, because
they share a lot of code. Maybe we could look into doing something
similar to what the ARM backend does and create a common base class for
shared code.
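
A rough sketch of that common-base-class idea follows. Every name in it is
hypothetical and the "compilation" is reduced to string manipulation; it is
not the real AMDIL, R600, or ARM code. It only shows the shape: the shared
steps live in one base class and each code generator overrides the final
emission step.

```cpp
#include <string>

// Hypothetical shared base: holds the code both generators have in common.
class SharedGPUTargetBase {
public:
  virtual ~SharedGPUTargetBase() = default;

  // Shared driver: run the common step, then the generator-specific one.
  std::string compile(const std::string &Kernel) const {
    return emit(legalize(Kernel));
  }

protected:
  // Stand-in for lowering/legalization code used by both generators.
  std::string legalize(const std::string &Kernel) const {
    return "legal(" + Kernel + ")";
  }

  // The only part each code generator has to provide itself.
  virtual std::string emit(const std::string &Legalized) const = 0;
};

// Hypothetical generator that emits an intermediate language.
class ILTextTarget : public SharedGPUTargetBase {
protected:
  std::string emit(const std::string &Legalized) const override {
    return "il:" + Legalized;
  }
};

// Hypothetical generator that emits a device ISA.
class ISABinaryTarget : public SharedGPUTargetBase {
protected:
  std::string emit(const std::string &Legalized) const override {
    return "isa:" + Legalized;
  }
};
```

With this layout neither generator derives from the other, so either one
can change its emission step without touching its sibling.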

-Tom

Hello everyone,

  I have a design choice question about MCAsmStreamer. Recently I needed
to override a single method from it (EmitCommonSymbol) for the Hexagon
backend while leaving all other functionality intact. I was surprised to
realize that I had to create a whole new class, HexagonMCAsmStreamer :
public MCStreamer, which is a 99% duplicate of MCAsmStreamer.
  My question is - what was the reason for designing the MCAsmStreamer the
way it is? Why could it not be made "derivable" for targets? I could see
that PTX had to do it the same way, and maybe they had a good reason to
override multiple methods, but I see no justification in my case for
creating a duplicate code base of that complexity and size.
  If there is no compelling reason for the current implementation, I would
be more than happy to provide a patch.
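
A minimal sketch of the kind of derivation being asked about is below. The
classes are simplified stand-ins with made-up names, signatures, and
directives, not the real MCStreamer or MCAsmStreamer declarations. The
point is only that a target class could inherit everything from a derivable
assembly streamer and override the single method it cares about.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the abstract streamer interface.
class StreamerBase {
public:
  virtual ~StreamerBase() = default;
  virtual void emitLabel(const std::string &Name) = 0;
  virtual void emitCommonSymbol(const std::string &Name, unsigned Size,
                                unsigned Align) = 0;
};

// Hypothetical stand-in for a generic, derivable assembly streamer.
class AsmStreamer : public StreamerBase {
protected:
  std::ostream &OS; // visible to subclasses so they can print directives

public:
  explicit AsmStreamer(std::ostream &OS) : OS(OS) {}

  void emitLabel(const std::string &Name) override { OS << Name << ":\n"; }

  void emitCommonSymbol(const std::string &Name, unsigned Size,
                        unsigned Align) override {
    OS << "\t.comm\t" << Name << "," << Size << "," << Align << "\n";
  }
};

// Target streamer: inherits everything and overrides exactly one method.
class TargetAsmStreamer : public AsmStreamer {
public:
  using AsmStreamer::AsmStreamer; // reuse the base constructor

  void emitCommonSymbol(const std::string &Name, unsigned Size,
                        unsigned Align) override {
    // ".target_comm" is an invented directive used only for illustration.
    OS << "\t.target_comm\t" << Name << "," << Size << "," << Align << "\n";
  }
};

// Drives the target streamer through the base interface.
std::string demo() {
  std::ostringstream Out;
  TargetAsmStreamer S(Out);
  StreamerBase &Base = S;
  Base.emitLabel("foo");               // inherited behaviour
  Base.emitCommonSymbol("buf", 16, 8); // overridden behaviour
  return Out.str();
}
```

Here demo() exercises one inherited method and the one overridden method,
so the target-specific class stays a few lines long instead of duplicating
the whole streamer.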

Thanks.

Sergei Larin

Has anyone had a chance to review this code yet? What about the
licensing issue? Getting this backend included in LLVM is currently
a blocker for the Mesa Open Source compute stack on AMD hardware as
well as the Southern Islands Open Source 3D driver, so I would really
appreciate it if someone could take a look at the code and let me know what
is needed to get it upstream.

Thanks,
Tom Stellard

I've just received approval to relicense all the code under the
University of Illinois Open Source License, so this shouldn't be an
issue any more.

-Tom

This is good news; I was going to mention that I think this would be necessary to move forward.

I can start taking a look through this, but I’m not a backend expert and so you may need to work to get some others involved in the review as well.

Hi,

I'm not sure if anyone has had a chance to look at this yet, but I've
just updated my tree
http://cgit.freedesktop.org/~tstellar/llvm/ r600-initial-review,
with the following changes:

1. Relicensed everything under the University of Illinois Open Source
License
2. Rebased against current LLVM TOT
3. Added support for Southern Islands GPUs

Looking forward to your feedback.

-Tom

Hi,

Based on some feedback I received on IRC today, it sounds like this
is just too much code to review at once. I'm going to try to strip
the R600 backend down to the bare minimum that is needed for our Mesa
drivers and then repost the code to the list. Hopefully, this will make it
easier for people to review.

-Tom

Hi Tom,

I have a higher-level question regarding this back-end. If I have an LLVM IR module and run it through this back-end, it seems like the only output option is a binary format. Is this a device binary, or another intermediate format?

If the input LLVM IR module was a compute kernel, how would I go about executing it on an AMD GPU? Can I use the APP SDK to load the binary, perhaps through the CAL interfaces? How about the OpenCL binary interface?

Hi Justin,

The binary produced by this backend is meant to be consumed by AMD's Open
Source 3D/Compute drivers, which are part of the Mesa3D[1] project. The
backend is integrated into the driver, so you don't need to compile
shaders offline. Currently we are using the backend for graphics and
compute shaders in our r600g driver (HD2xxx-HD6xxx GPUs) and for graphics
in our radeonsi driver (HD7xxx GPUs). In the future we will use it for
compute shaders on radeonsi too.

In order to use the backend for graphics on r600g, you need to build
Mesa with the --enable-r600-llvm-compiler option. For compute, the
installation instructions are here:
http://dri.freedesktop.org/wiki/GalliumCompute

We're working hard to get everything upstream into LLVM so that we can
have compute shaders working out of the box and our users don't need to
manually apply patches.

Let me know if you have any other questions.

-Tom

[1] http://www.mesa3d.org/

Okay, so there is no way to use the backend to produce loadable compute kernels for the proprietary drivers, or on Mac/Windows?

Right, this is not currently possible.

-Tom

fwiw..

Sometimes the bleeding edge features found only on Windows (e.g. zero copy) are interesting to evaluate.
But by and large, HPC folks refuse to even entertain the idea of booting nodes of a cluster to Windows.
An open source solution like this is very appealing for large-scale scientific workloads.

Marcus

Justin,

This backend is for the Open Source Linux driver, not the AMD Catalyst OpenCL driver. The backend for AMDIL that I open sourced in December can be used on the APP SDK with some modifications via calclCompile, and its output can be loaded with a tool that creates ELF files via clCompileProgramFromBinary in OpenCL.

We are working on making this process more useful, but right now it requires outside tools to work.

Micah

From: Stellard, Thomas
Sent: Monday, May 28, 2012 9:07 AM
To: Justin Holewinski
Cc: Villmow, Micah; Tom Stellard; llvmdev@cs.uiuc.edu
Subject: Re: [LLVMdev] RFC: R600, a new backend for AMD GPUs

Right, this is not currently possible.

[Villmow, Micah] This is not possible with the R600 code generator, yes, but it
is possible with some ELF fiddling with the AMDIL code generator which the R600
is based on.

Is there a version of the AMDIL back-end that is compatible with LLVM 3.0/3.1?

It is in the pipeline to be made public; we just have not had time to finish it.

Micah