This is a follow-up on my email from August.
I have finally released my OpenCL backend and control-flow
restructuring framework for LLVM (AST-Extractor, or axtor for short). The
framework restructures function CFGs such that they can be expressed
entirely without GOTOs or switch/loop trickery. This makes it
possible to emit source code for strictly control-flow-structured
languages (OpenCL, GLSL). The code includes a drop-in OpenCL driver that
allows source-to-source OpenCL code transformations on existing OpenCL
The OpenCL backend has been under development for a while now and was
tested against the NVIDIA, AMD and Rodinia demo/benchmark suites with
recent NVIDIA/AMD drivers. Results for NVIDIA and AMD show, with few
exceptions, that the source-to-source loop does not introduce any
performance penalty on the generated kernels (known exception: AES on
recent AMD drivers).
However, kernels with sampler types are currently unsupported, and the
source-to-source loop may introduce slight imprecisions to floating
The project builds against the current SVN version of LLVM and Clang.
The GLSL backend has received less attention (still at 2.9) and will
be ported to LLVM-svn later.
To have a look at the source, go to https://bitbucket.org/gnarf/axtor/
where it is hosted under the GPL license.
Please get back to me if you have any questions or want to work on the
code. I won't be able to check my email regularly before April, but you
will get a reply sooner or later.
Have you looked at the control flow structizer that we have in the Open Source AMDIL backend?
I just had a quick look at your structurizer. Here is what I found
(correct me if I am mistaken):
* Our approaches for handling Loops with multiple exits are identical.
* Axtor implements Controlled-Node Splitting and can cope with
* Axtor translates switches to cascading IF-instructions
* You are cloning nodes for predecessors to restructure IF-structures.
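
To illustrate the switch point in the list above, here is a minimal sketch in plain C of what lowering a switch to a cascade of IFs looks like at the source level. The function names are hypothetical and the example assumes a switch without fall-through; it is not taken from the axtor sources.

```c
/* Original form: a switch with three targets and no fall-through. */
int with_switch(int x)
{
    int r;
    switch (x) {
    case 0:  r = 10; break;
    case 1:  r = 20; break;
    default: r = -1; break;
    }
    return r;
}

/* Same behaviour expressed as cascading IFs, which a strictly
 * structured target language can represent directly. */
int with_if_cascade(int x)
{
    int r;
    if (x == 0) {
        r = 10;
    } else if (x == 1) {
        r = 20;
    } else {
        r = -1;   /* former default case */
    }
    return r;
}
```

Cases that fall through into each other would need extra handling that this sketch does not cover.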
In addition to node cloning, I implemented in Axtor another method of dealing
with unstructured IFs. That method basically does the same as the
When parsing an IF, it collects all branches to the conflicting blocks that you
would otherwise clone and puts all those blocks behind a landing block.
That block then becomes the exit of the IF.
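
A minimal sketch of the landing block idea in plain C, under some assumptions: the function names are hypothetical, and the integer `target` variable used to dispatch behind the landing block is only one possible way to remember which block a branch was heading for. It is not necessarily how axtor encodes it.

```c
/* Unstructured form: the block "shared" is entered both from inside
 * the then-branch and from the else-path, so it is not a clean join
 * point. Node cloning would duplicate "r += 1000" instead. */
int unstructured(int p, int q)
{
    int r = 0;
    if (p) {
        r += 1;
        if (q)
            goto shared;
        r += 10;
        goto done;
    }
    r += 100;
shared:
    r += 1000;
done:
    return r;
}

/* Structured form: every conflicting branch records its destination
 * and continues to a single landing block after the IF, which then
 * dispatches to the shared block. No block is cloned. */
int structured(int p, int q)
{
    int r = 0;
    int target;              /* 0 = done, 1 = shared (assumed encoding) */
    if (p) {
        r += 1;
        if (q) {
            target = 1;
        } else {
            r += 10;
            target = 0;
        }
    } else {
        r += 100;
        target = 1;
    }
    /* landing block: the single exit of the IF */
    if (target == 1) {
        r += 1000;           /* former "shared" block, exists only once */
    }
    return r;
}
```

Both functions return the same value for every combination of `p` and `q`, and the shared block appears exactly once in the structured version.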
I favour that approach for several reasons:
Firstly, it is not safe to clone blocks that contain memory
barriers/fences (at least not wrt the OpenCL specification, because the
paths of the threads leading to a barrier might not all be governed by
Secondly, I assumed that it is easier for the "receiving" OpenCL
compiler to recover the original CFG with the landing block approach. It
seems much harder to identify duplicate blocks than to trace a successor
through the landing block.
The idea behind axtor was to make functions with arbitrary CFGs work on
GPUs (usual exceptions apply: no function/block pointers), such that a reliable
OpenCL backend becomes feasible.
The person who wrote our structurizer agrees with your analysis. Too bad the licenses are incompatible; it would be nice to merge similar efforts.
I am currently looking into the options for re-licensing or multi-licensing it under a
more permissive license.