GSoC proposal: TGSI compiler back-end.

Although I'm sending this as a GSoC proposal, I'm well aware that the
amount of work that a project of this kind involves largely exceeds the
scope of the GSoC program. I think that's okay: my work here wouldn't
be finished at the end of this summer by any means, it would merely be a
start.

TGSI is the intermediate representation that all open-source GPU drivers
using the Gallium3D architecture understand. Until now it's mainly been
used for graphics (vertex and fragment shaders and such), but doing
general-purpose computing with it is possible in principle (actually,
necessary for GL4), and it's been the object of a number of extensions
and improvements during the last couple of years to make it more
suitable for that purpose.

The objective of the project would be to set a basis for a compiler
back-end targeting the TGSI language.

The first to benefit from such a back-end would be the Nouveau nv50 and
nve4 drivers, which would get OpenCL support without much additional
work. Other Gallium drivers would also benefit from it as they
implement the missing language and API bits.

I think a reasonable objective for this summer would be to be able to
generate correct TGSI code for a decent subset of the CL tests from the
piglit suite [1]. In any case I believe that it would be more important
to concentrate on writing a clean code base (that would be likely to be
reviewed by others and accepted into mainline in the future) and
addressing the main design challenges (ideally with the broadest
possible consensus from the community and the least possible quantity of
target-specific band-aids working around limitations of the common
infrastructure) than to concentrate on feature completeness.

Among the issues that have to be addressed is the fact that TGSI is a
stack-less architecture (though it seems that some register-based
calling convention could be used and some sort of emulated stack could
be used for the program's automatic storage), the fact that the language
has only limited (if any) support for unstructured control flow
(apparently 'Target/R600/AMDGPUStructurizeCFG.cpp' could be helpful if
we turned it into a generic transform pass, but I'm not convinced it's
the best choice yet), the fact that the language cannot represent any
arithmetic or memory access with a word width of less than 32 bits
(though that seems less of an issue...), and the fact that the MC layer
doesn't seem to fit the object format that Gallium expects right now
especially well.
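
To make the sub-32-bit point concrete, here is a minimal C++ sketch (an
illustration only, not code from the back-end) of how byte-sized accesses
can be emulated on top of 32-bit loads and stores with shifts and masks.
The names `load_u8` and `store_u8`, the word-addressed `mem` buffer and
the little-endian byte order are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: read one byte using only an aligned 32-bit load.
std::uint8_t load_u8(const std::uint32_t *mem, std::uint32_t byte_addr) {
    assert(mem != nullptr);
    std::uint32_t word = mem[byte_addr / 4];   // aligned 32-bit load
    unsigned shift = (byte_addr % 4) * 8;      // little-endian byte lane
    return static_cast<std::uint8_t>((word >> shift) & 0xffu);
}

// Hypothetical helper: write one byte with a 32-bit read-modify-write.
void store_u8(std::uint32_t *mem, std::uint32_t byte_addr, std::uint8_t value) {
    assert(mem != nullptr);
    unsigned shift = (byte_addr % 4) * 8;
    std::uint32_t mask = 0xffu << shift;
    std::uint32_t &word = mem[byte_addr / 4];
    // Clear the target lane, then merge the new byte into it.
    word = (word & ~mask) | (static_cast<std::uint32_t>(value) << shift);
}
```

The store rewrites the whole word, so in this form it is not safe against
concurrent writers to neighbouring bytes.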

Looking forward to your feedback.

[1] http://cgit.freedesktop.org/piglit

Francisco Jerez <currojerez@riseup.net> writes:

Pity not to see any interest in this since I brought up the idea two
weeks ago. I've uploaded a first attempt at writing a TGSI back-end
here [2]. It's able to generate code -- though only in assembly form
and with many loose ends still.

Also, would it be possible for Tom Stellard (CC'ed) to mentor me? He's
been working on the R600 back-end (which is similar in purpose and
limitations) and the Mesa/Gallium3D project so he might be the right
person?

If it's OK I'll be preparing a more formal proposal during the next few
days.

[2] https://github.com/curro/llvm/commit/a1aad41463c36220f2c5b03645843f39e6bf1b9d

Francisco Jerez <currojerez@riseup.net> writes:

[...]

I'm attaching a preliminary version of my proposal -- would be happy to
get some feedback about it.

llvm-tgsi-backend-proposal.text (4.29 KB)

Francisco Jerez <currojerez@riseup.net> writes:

>[...]

Hi Francisco,

I would be happy to be a mentor for this project if it is accepted. I
have a few comments about your proposal:

> I'm attaching a preliminary version of my proposal -- would be happy to
> get some feedback about it.
>
> GSoC proposal: TGSI compiler back-end.
>
> - Proposal
>
> TGSI is the intermediate representation that all open-source GPU
> drivers using the Gallium3D architecture understand. Until now it's
> mainly been used for graphics (vertex, fragment shaders, etc.), but
> doing general-purpose computing with it is possible in principle
> (actually, necessary for GL4), and it's been the object of a number of
> extensions and improvements to make it more suitable for that purpose.
>
> The TGSI IR has some peculiarities that are unusual in a typical CPU
> instruction architecture (and slightly annoying to deal with) -- it's
> a vector-centric architecture with a variable set of typeless
> registers, no stack and no proper support for irreducible control
> flow.
>
> The objective of this project would be to write an LLVM compiler
> back-end with the TGSI IR as target.
>
> - Benefits
>
> This back-end is the last piece missing for a working and fully
> open-source implementation of OpenCL running on the nVidia nv50 and
> nve4 architectures -- though there's nothing nVidia-specific in the
> TGSI language, and code generated by this back-end will be expected to
> be usable by any other driver implementing the compute API of
> Gallium3D.
>
> - Biographical background
>
> I'm currently a master's student in the field of theoretical physics.
>
> I've already (successfully) participated in the GSoC program with a
> device driver development project (which had to do with
> reverse-engineering nVidia's TV encoders) mentored by the X.Org
> Foundation in 2009. Since then I've remained a frequent contributor
> to the Nouveau and Mesa projects.
>
> Last year I wrote most of an OpenCL implementation running on nVidia
> hardware as part of the X.Org Foundation's EVoC program [1] -- the only
> piece missing being the compiler.
>
> I've gained some experience with LLVM by writing a proof-of-concept
> TGSI back-end which is minimally working [2] -- the goal of this
> project would be to bring it to a useful state.
>
> - Timeline
>
> Summary of the work that would be done:

I'm not sure what the current status of your TGSI backend is, but I would
recommend getting assembly generation working first, since this will
enable you to write lit tests.

>   * Get object file generation working.
>     (approx. June 17 - July 8)
>
>     The output format will be the one expected by Mesa. The
>     implementation will take advantage of the existing MC assembler
>     API as much as possible.

Can you elaborate a little more on the output format you will be using?
For example, will you be generating ELF binaries with special metadata
sections (this is what R600 currently does) or will you be creating your
own object format?

>   * Fix handling of the multiple OpenCL address spaces.
>     (approx. July 8 - July 22)
>
>     Operations on __global, __local, and __private memory will be
>     dealt with using the resource access opcodes, __kernel function
>     parameters will be accessed through a special resource meant for
>     parameter passing, __constant memory will be mapped to constant
>     buffers.
>
>   * Get function calls working reliably.
>     (approx. July 22 - August 5)
>
>     This will involve fixing the passing of aggregate types and
>     anything that doesn't fit in a 32-bit register, fixing stack
>     allocations (i.e. the "alloca" instruction), and fixing calls to
>     functions that use the "kernel" calling convention from non-kernel
>     functions.
>
>   * Get control flow working reliably.
>     (approx. August 5 - August 19)
>
>     This will involve writing a control flow structurizing pass -- It
>     might be possible to promote the R600 one to a common analysis
>     pass and reuse it.

I have a feeling this task may take longer than two weeks. When you
write the final version of your proposal, I think you should have a
definitive plan for how you will implement the structurization. Whether
it's reusing existing R600 code (this is my recommendation) or writing
something from scratch.

Also, I would really prefer if your structurization solution was target
independent and could live outside of the backend in the common code,
because a good structurization solution would be a great benefit to the
LLVM project.

>   * Get the missing arithmetic and data conversion instructions
>     working.
>     (approx. August 19 - August 26)
>
>     Most of the floating point, integer and vector operations required
>     by the OpenCL spec will be functional by the end of this period.
>
>   * Work on the standard library and intrinsics.
>     (approx. August 26 - September 16)
>
>     This will involve getting a reasonable subset of the OpenCL
>     standard library working, including math functions, thread
>     synchronisation functions, atomic functions, memory barriers and
>     surface sampling/write-back functions.

I'm assuming you are planning to use libclc (http://libclc.llvm.org) for
this.

While implementing standard library builtins is important, I think this
task may be a little bit outside the scope of this project. I would
recommend dropping this from the schedule and adding it as a task to
work on if you finish everything else early. This way you can give
yourself more time to work on the actual backend.

>   * Documentation and remaining clean-up work.
>     (approx. September 16 - September 23)

I think your proposal should also include a plan for getting the backend
into mainline LLVM, because this is really the ultimate goal of the
project. Your plan should include where the code repository will be
stored and how you will engage with the community to help you review
the code. I think this is really important not just for you, but also
for the LLVM community to know what they need to do as far as helping
get the backend into the main tree.

> By the end of each period all the relevant OpenCL language tests from
> the piglit suite [3] and opencl-example [4] will be expected to pass.
> New tests will be written for implemented features that don't have
> sufficient coverage from the existing test suites.

I know you'll be using the nouveau drivers to test this backend on real
hardware, and I think that's OK, but I do think you need to be careful
about not spending too much time fixing bugs in the nouveau driver. I
think piglit passes is a good goal, but I would also like to see OpenCL
or LLVM IR based lit tests added as a goal, because TGSI code gen
is the main focus of this project.

Thank you for submitting an early draft of your proposal, I think it is
really good to get developer feedback early. I would encourage you to
continue to submit drafts up until the deadline to maximize the input
you get from LLVM developers.

-Tom

Tom Stellard <tom@stellard.net> writes:

[...]
> Hi Francisco,

Hi Tom,

> I would be happy to be a mentor for this project if it is accepted. I
> have a few comments about your proposal:

Great.

> I'm not sure what the current status of your TGSI backend is, but I would
> recommend getting assembly generation working first, since this will
> enable you to write lit tests.

That already sort of works... The only thing is that the assembly files
that it produces are somewhat non-standard because they include section
annotations and other unusual syntax that wouldn't be recognized by the
normal TGSI parser... It might be worth looking into it at some point
but I don't think it's very high-priority, what I have seems to be
enough to make lit happy.

>>   * Get object file generation working.
>>     (approx. June 17 - July 8)
>>
>>     The output format will be the one expected by Mesa. The
>>     implementation will take advantage of the existing MC assembler
>>     API as much as possible.
>
> Can you elaborate a little more on the output format you will be using?
> For example, will you be generating ELF binaries with special metadata
> sections (this is what R600 currently does) or will you be creating your
> own object format?

I'd be fine with using ELF, but it would definitely need special
metadata sections as you say (for kernel prototypes and so on), and
clover would have to be fixed to deal with it correctly -- OTOH the
minimalistic format implemented in 'clover/core/module.cpp' seems to do
everything we need, so another option would be to stick to it.
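
For illustration, a kernel prototype could be serialised into a flat byte
blob that would fit in either container, an ELF metadata section or a
custom format. The following C++ sketch is hypothetical: the `KernelProto`
and `KernelArg` types, the `ArgKind` values and the byte layout are all
invented for the example and do not correspond to what clover or the R600
back-end actually use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical argument kinds; not taken from clover or R600.
enum class ArgKind : std::uint8_t { Scalar, GlobalPtr, LocalPtr, ConstantPtr };

struct KernelArg {
    ArgKind kind;
    std::uint32_t size;  // size of the argument in bytes
};

struct KernelProto {
    std::string name;
    std::vector<KernelArg> args;
};

// Append a 32-bit value in little-endian byte order.
static void put_u32(std::vector<std::uint8_t> &out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Read a little-endian 32-bit value and advance the cursor.
static std::uint32_t get_u32(const std::vector<std::uint8_t> &in, std::size_t &pos) {
    assert(pos + 4 <= in.size());
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(in[pos++]) << (8 * i);
    return v;
}

// Layout: name length, name bytes, argument count, then (kind, size) pairs.
std::vector<std::uint8_t> encode_proto(const KernelProto &k) {
    std::vector<std::uint8_t> out;
    put_u32(out, static_cast<std::uint32_t>(k.name.size()));
    out.insert(out.end(), k.name.begin(), k.name.end());
    put_u32(out, static_cast<std::uint32_t>(k.args.size()));
    for (const KernelArg &a : k.args) {
        out.push_back(static_cast<std::uint8_t>(a.kind));
        put_u32(out, a.size);
    }
    return out;
}

// Inverse of encode_proto; asserts on truncated input.
KernelProto decode_proto(const std::vector<std::uint8_t> &in) {
    std::size_t pos = 0;
    KernelProto k;
    std::uint32_t name_len = get_u32(in, pos);
    assert(pos + name_len <= in.size());
    k.name.assign(in.begin() + pos, in.begin() + pos + name_len);
    pos += name_len;
    std::uint32_t count = get_u32(in, pos);
    for (std::uint32_t i = 0; i < count; ++i) {
        assert(pos < in.size());
        ArgKind kind = static_cast<ArgKind>(in[pos++]);
        std::uint32_t size = get_u32(in, pos);
        k.args.push_back({kind, size});
    }
    return k;
}
```

Whichever container is chosen, the consumer only needs to locate the blob
and call the decoder, so the encoding itself does not settle the ELF
versus custom format question.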

>>   * Fix handling of the multiple OpenCL address spaces.
>>     (approx. July 8 - July 22)
>>
>>     Operations on __global, __local, and __private memory will be
>>     dealt with using the resource access opcodes, __kernel function
>>     parameters will be accessed through a special resource meant for
>>     parameter passing, __constant memory will be mapped to constant
>>     buffers.
>>
>>   * Get function calls working reliably.
>>     (approx. July 22 - August 5)
>>
>>     This will involve fixing the passing of aggregate types and
>>     anything that doesn't fit in a 32-bit register, fixing stack
>>     allocations (i.e. the "alloca" instruction), and fixing calls to
>>     functions that use the "kernel" calling convention from non-kernel
>>     functions.
>>
>>   * Get control flow working reliably.
>>     (approx. August 5 - August 19)
>>
>>     This will involve writing a control flow structurizing pass -- It
>>     might be possible to promote the R600 one to a common analysis
>>     pass and reuse it.
>
> I have a feeling this task may take longer than two weeks. When you
> write the final version of your proposal, I think you should have a
> definitive plan for how you will implement the structurization. Whether
> it's reusing existing R600 code (this is my recommendation) or writing
> something from scratch.

Reusing the R600 code would be possible for sure with just a few
changes, but I think it would be nice to split the algorithm in an
"analysis" and a "transformation" pass to leave the target the choice on
how irreducible edges in the CFG should be handled -- Depending on the
hardware and the specific case it might be better to remove irreducible
edges by duplicating basic blocks, by introducing temporary "control
flow" variables (as the SI structurizer does), or by not doing anything
at all (e.g. on nVidia hardware arbitrary branches are actually
supported, they're just somewhat inefficient).
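
The "analysis" half of such a split could be as small as reporting which
CFG edges make the graph irreducible, leaving the choice of remedy to the
target. A minimal C++ sketch under stated assumptions: it works on a plain
adjacency list rather than on LLVM's data structures, block 0 is the
entry, every block is reachable from it, and all names (`Graph`,
`dominators`, `irreducible_edges`) are hypothetical. It relies on the
textbook characterisation that a flow graph is reducible exactly when the
target of every retreating edge of a depth-first traversal dominates its
source.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Successor lists per basic block; block 0 is assumed to be the entry.
using Graph = std::vector<std::vector<int>>;
using Edge = std::pair<int, int>;

// dom[b][d] is true when block d dominates block b (iterative data-flow).
std::vector<std::vector<bool>> dominators(const Graph &g) {
    const std::size_t n = g.size();
    std::vector<std::vector<int>> preds(n);
    for (std::size_t b = 0; b < n; ++b)
        for (int s : g[b]) preds[s].push_back(static_cast<int>(b));

    std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
    dom[0].assign(n, false);
    dom[0][0] = true;
    for (bool changed = true; changed;) {
        changed = false;
        for (std::size_t b = 1; b < n; ++b) {
            // Intersect the dominator sets of all predecessors, then add b.
            std::vector<bool> next(n, true);
            for (int p : preds[b])
                for (std::size_t d = 0; d < n; ++d)
                    next[d] = next[d] && dom[p][d];
            next[b] = true;
            if (next != dom[b]) { dom[b] = next; changed = true; }
        }
    }
    return dom;
}

// Collect retreating edges: edges whose target is still on the DFS stack.
static void dfs(const Graph &g, int b, std::vector<int> &state,
                std::vector<Edge> &retreating) {
    state[b] = 1;  // on the stack
    for (int s : g[b]) {
        if (state[s] == 0) dfs(g, s, state, retreating);
        else if (state[s] == 1) retreating.push_back({b, s});
    }
    state[b] = 2;  // finished
}

// Retreating edges whose target does not dominate their source.
std::vector<Edge> irreducible_edges(const Graph &g) {
    assert(!g.empty());
    const std::vector<std::vector<bool>> dom = dominators(g);
    std::vector<int> state(g.size(), 0);
    std::vector<Edge> retreating, result;
    dfs(g, 0, state, retreating);
    for (const Edge &e : retreating)
        if (!dom[e.first][e.second]) result.push_back(e);
    return result;
}
```

A transformation pass, or the target itself, could then decide per edge
whether to duplicate blocks, introduce a control-flow variable, or leave
the branch alone.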

It would also be nice if inter-pass dependencies were handled correctly
and we didn't have to disable any other optimization passes that
decanonicalize the control flow as R600 does.

I agree that two weeks might be too little for what I have in mind, but
I guess if we drop the standard library point below (or we make it
optional) it should be plenty of time to do it right.

> Also, I would really prefer if your structurization solution was target
> independent and could live outside of the backend in the common code,
> because a good structurization solution would be a great benefit to the
> LLVM project.

Yeah, that was my idea too.

>>   * Get the missing arithmetic and data conversion instructions
>>     working.
>>     (approx. August 19 - August 26)
>>
>>     Most of the floating point, integer and vector operations required
>>     by the OpenCL spec will be functional by the end of this period.
>>
>>   * Work on the standard library and intrinsics.
>>     (approx. August 26 - September 16)
>>
>>     This will involve getting a reasonable subset of the OpenCL
>>     standard library working, including math functions, thread
>>     synchronisation functions, atomic functions, memory barriers and
>>     surface sampling/write-back functions.
>
> I'm assuming you are planning to use libclc (http://libclc.llvm.org) for
> this.

Yes.

> While implementing standard library builtins is important, I think this
> task may be a little bit outside the scope of this project. I would
> recommend dropping this from the schedule and adding it as a task to
> work on if you finish everything else early. This way you can give
> yourself more time to work on the actual backend.

OK, I'll make this one optional.

>>   * Documentation and remaining clean-up work.
>>     (approx. September 16 - September 23)
>
> I think your proposal should also include a plan for getting the backend
> into mainline LLVM, because this is really the ultimate goal of the
> project. Your plan should include where the code repository will be
> stored and how you will engage with the community to help you review
> the code. I think this is really important not just for you, but also
> for the LLVM community to know what they need to do as far as helping
> get the backend into the main tree.

I'm a little lost on this point... My plan is just to keep working on
it until it's good enough to be considered suitable for mainline,
meanwhile it could live in a separate repository in freedesktop or
github. Not sure what else would be expected from me -- of course, I'm
willing to keep fixing bugs, API breakages and reviewing related patches
once it's merged to mainline.

>> By the end of each period all the relevant OpenCL language tests from
>> the piglit suite [3] and opencl-example [4] will be expected to pass.
>> New tests will be written for implemented features that don't have
>> sufficient coverage from the existing test suites.
>
> I know you'll be using the nouveau drivers to test this backend on real
> hardware, and I think that's OK, but I do think you need to be careful
> about not spending too much time fixing bugs in the nouveau driver. I
> think piglit passes is a good goal, but I would also like to see OpenCL
> or LLVM IR based lit tests added as a goal, because TGSI code gen
> is the main focus of this project.

Yes, good point, I agree that for now it would make more sense to focus
on having extensive coverage in form of lit tests.

> Thank you for submitting an early draft of your proposal, I think it is
> really good to get developer feedback early. I would encourage you to
> continue to submit drafts up until the deadline to maximize the input
> you get from LLVM developers.

OK, I will do that. Thank you for taking the time to read and comment
on my proposal.

Tom Stellard <tom@stellard.net> writes:
>[...]
>>
>> * Get object file generation working.
>> (approx. June 17 - July 8)
>>
>> The output format will be the one expected by Mesa. The
>> implementation will take advantage of the existing MC assembler
>> API as much as possible.
>>
>
> Can you elaborate a little more on the output format you will be using?
> For example, will you be generating ELF binaries with special metadata
> sections (this is what R600 currently does) or will you be creating your
> own object format?

I'd be fine with using ELF, but it would definitely need special
metadata sections as you say (for kernel prototypes and so on), and
clover would have to be fixed to deal with it correctly -- OTOH the
minimalistic format implemented in 'clover/core/module.cpp' seems to do
everything we need, so another option would be to stick to it.

Generating ELF binaries with the LLVM API is really easy to do and you
can use R600 as a minimalistic example. Also, keep in mind that OpenCL
1.2 has API calls for linking kernel objects so that is something that
will need to be supported. I'm not sure if it will be easier to do
linking with ELF or with a custom format, but that's something you
should look into.

If you want to use a custom format, I would recommend doing a little
research ahead of time so you know exactly what needs to be done.
Probably the best place to start would be to look at the PureStreamer
class and figure out what additional features you might need.

>>
>> * Documentation and remaining clean-up work.
>> (approx. September 16 - September 23)
>>
>
> I think your proposal should also include a plan for getting the backend
> into mainline LLVM, because this is really the ultimate goal of the
> project. Your plan should include where the code repository will be
> stored and how you will engage with the community to help you review
> the code. I think this is really important not just for you, but also
> for the LLVM community to know what they need to do as far as helping
> get the backend into the main tree.
>
I'm a little lost on this point... My plan is just to keep working on
it until it's good enough to be considered suitable for mainline,
meanwhile it could live in a separate repository in freedesktop or
github. Not sure what else would be expected from me -- of course, I'm
willing to keep fixing bugs, API breakages and reviewing related patches
once it's merged to mainline.

Basically what I'm trying to say is that I think the goal should be to
merge the backend by the end of the summer. It is difficult to get
new backends accepted into the main tree mostly due to the fact that
the core developers don't usually have much spare time to do reviews.
I think if you put off trying to merge the code until after the summer,
the backend may fall off the radar of the LLVM developers, and it will
be even harder to get someone to look at it.

-Tom

Tom Stellard <tom@stellard.net> writes:

[...]

>>
>> * Get object file generation working.
>> (approx. June 17 - July 8)
>>
>> The output format will be the one expected by Mesa. The
>> implementation will take advantage of the existing MC assembler
>> API as much as possible.
>>
>
> Can you elaborate a little more on the output format you will be using?
> For example, will you be generating ELF binaries with special metadata
> sections (this is what R600 currently does) or will you be creating your
> own object format?

I'd be fine with using ELF, but it would definitely need special
metadata sections as you say (for kernel prototypes and so on), and
clover would have to be fixed to deal with it correctly -- OTOH the
minimalistic format implemented in 'clover/core/module.cpp' seems to do
everything we need, so another option would be to stick to it.

Generating ELF binaries with the LLVM API is really easy to do and you
can use R600 as a minimalistic example. Also, keep in mind that OpenCL
1.2 has API calls for linking kernel objects so that is something that
will need to be supported. I'm not sure if it will be easier to do
linking with ELF or with a custom format, but that's something you
should look into.

Yeah, I think you're right, ELF is probably our best bet with a view to
OpenCL 1.2. I've fixed my proposal.

[...]

>>
>> * Documentation and remaining clean-up work.
>> (approx. September 16 - September 23)
>>
>
> I think your proposal should also include a plan for getting the backend
> into mainline LLVM, because this is really the ultimate goal of the
> project. Your plan should include where the code repository will be
> stored and how you will engage with the community to help you review
> the code. I think this is really important not just for you, but also
> for the LLVM community to know what they need to do as far as helping
> get the backend into the main tree.
>
I'm a little lost on this point... My plan is just to keep working on
it until it's good enough to be considered suitable for mainline,
meanwhile it could live in a separate repository in freedesktop or
github. Not sure what else would be expected from me -- of course, I'm
willing to keep fixing bugs, API breakages and reviewing related patches
once it's merged to mainline.

Basically what I'm trying to say is that I think the goal should be to
merge the backend by the end of the summer. It is difficult to get
new backends accepted into the main tree mostly due to the fact that
the core developers don't usually have much spare time to do reviews.
I think if you put off trying to merge the code until after the summer,
the backend may fall off the radar of the LLVM developers, and it will
be even harder to get someone to look at it.

OK, I've rephrased the project objective slightly to make that point a
bit clearer.

Thank you, a revised proposal follows.

llvm-tgsi-backend-proposal.text (4.83 KB)