Generating superblocks (SEME regions w/o loops and calls) in LLVM

Hi all,

While developing compile-time instrumentation for ThreadSanitizer,
I need to generate SEME (single-entry, multiple-exit) regions that
contain no loops and no call instructions.
(I'll call them superblocks hereafter, although some researchers do
allow loops in their definition of superblocks.)

The goal is to get the largest piece of IR in which the memory
operations can be enumerated at compile time, so that the addresses of
the memory accesses can be recorded into a fixed-size buffer.
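To illustrate why loops and calls get in the way: if every memory
operation in a region executes at most once per entry, each one can be
given a fixed slot in a buffer whose size is known up front. A loop or a
call would allow an unbounded number of accesses per entry. Below is a
minimal sketch of that idea only; AccessBuffer, RecordAccess, Flush and
kMaxMops are hypothetical names made up for illustration, not the actual
ThreadSanitizer runtime interface.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical upper bound on memory operations in one superblock.
constexpr std::size_t kMaxMops = 8;

// One buffer per superblock execution. Slot i belongs to the i-th
// memory operation enumerated in the superblock.
struct AccessBuffer {
  std::array<std::uintptr_t, kMaxMops> addrs{};  // zero-initialized
};

// Stand-in for the code an instrumentation pass would insert before a
// memory operation: store the accessed address into its fixed slot.
inline void RecordAccess(AccessBuffer &buf, std::size_t slot,
                         const void *addr) {
  assert(slot < kMaxMops);
  buf.addrs[slot] = reinterpret_cast<std::uintptr_t>(addr);
}

// Stand-in for the code inserted at every superblock exit: collect the
// slots that were written on this path and reset the buffer. Slots of
// operations on paths not taken stay zero and are skipped.
inline std::vector<std::uintptr_t> Flush(AccessBuffer &buf,
                                         std::size_t num_mops) {
  assert(num_mops <= kMaxMops);
  std::vector<std::uintptr_t> seen;
  for (std::size_t i = 0; i < num_mops; ++i)
    if (buf.addrs[i] != 0) seen.push_back(buf.addrs[i]);
  buf.addrs.fill(0);
  return seen;
}
```

Because the region has a single entry, the buffer only needs to be
reset in one place, and because it may have several exits, the flush
would have to be emitted at each of them.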

So far I have been using a home-brewed structure to hold the superblocks:

struct SBlock {
  BlockSet blocks;
  llvm::BasicBlock *entry;
  BlockSet exits;
  InstSet mops_to_instrument;
  int num_mops;
  SBlock() : num_mops(0) {}
};

and several functions that split basic blocks to eliminate calls from
them, traverse the control-flow graph, and create the SBlock instances
from the basic blocks.
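For readers who want something concrete, here is one possible way such a
construction could look on a toy graph, with no LLVM dependency. It is a
sketch under assumptions, not the code from my pass: ToyBlock, ToySBlock
and BuildSuperblocks are hypothetical names, blocks are plain integers
with block 0 as the function entry, and calls are assumed to have
already been split into blocks of their own. A block is absorbed into a
region only when all of its predecessors are already inside, which keeps
the region single-entry and free of internal cycles.

```cpp
#include <cassert>
#include <deque>
#include <set>
#include <vector>

// Toy basic block after call splitting: either a lone call or call-free.
struct ToyBlock {
  std::vector<int> succs;
  bool is_call = false;
};

struct ToySBlock {
  int entry = -1;
  std::set<int> blocks;
  std::set<int> exits;  // blocks that return or have an edge leaving
};

std::vector<ToySBlock> BuildSuperblocks(const std::vector<ToyBlock> &cfg) {
  const int n = static_cast<int>(cfg.size());
  std::vector<ToySBlock> result;
  if (n == 0) return result;

  std::vector<std::vector<int>> preds(n);
  for (int b = 0; b < n; ++b)
    for (int s : cfg[b].succs) {
      assert(s >= 0 && s < n);
      preds[s].push_back(b);
    }

  std::vector<bool> seeded(n, false);  // already chosen as a region entry
  std::deque<int> seeds{0};
  seeded[0] = true;
  auto seed = [&](int b) {
    if (!seeded[b]) { seeded[b] = true; seeds.push_back(b); }
  };

  while (!seeds.empty()) {
    const int entry = seeds.front();
    seeds.pop_front();
    if (cfg[entry].is_call) {  // calls belong to no superblock
      for (int s : cfg[entry].succs) seed(s);
      continue;
    }
    ToySBlock sb;
    sb.entry = entry;
    sb.blocks.insert(entry);
    // Grow to a fixpoint: absorb a successor only if it is call-free,
    // is not an entry itself, and has all predecessors in the region.
    for (bool changed = true; changed;) {
      changed = false;
      const std::vector<int> snapshot(sb.blocks.begin(), sb.blocks.end());
      for (int b : snapshot)
        for (int s : cfg[b].succs) {
          if (cfg[s].is_call || seeded[s] || sb.blocks.count(s)) continue;
          bool all_inside = true;
          for (int p : preds[s]) all_inside = all_inside && sb.blocks.count(p);
          if (all_inside) { sb.blocks.insert(s); changed = true; }
        }
    }
    // Edges to blocks outside the region, and back edges to the entry,
    // leave the region; their targets start new superblocks.
    for (int b : sb.blocks) {
      if (cfg[b].succs.empty()) sb.exits.insert(b);
      for (int s : cfg[b].succs)
        if (s == entry || !sb.blocks.count(s)) {
          sb.exits.insert(b);
          seed(s);
        }
    }
    result.push_back(sb);
  }
  return result;
}
```

In this sketch a loop header can never be absorbed, because its latch
predecessor is only reachable through the header itself, so every loop
header ends up as the entry of its own region and the back edge counts
as an exit. The greedy growth is one design choice among several; it
does not try to prove that the resulting regions are the largest ones
possible.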

Now I want to simplify my instrumentation pass and move that
superblock creation logic out of it.
Is there a strong need to make such functionality common? Can other
passes benefit from it?

There's also the Trace class, which can be used to hold the
superblocks, but I haven't found any code that generates them at
compile time (in my case those should not depend on any dynamic
analysis).

Thanks in advance,
Alexander Potapenko
Software Engineer
Google Moscow

This would be very interesting to me as a way to further improve the
quality of code scheduling for our VLIW target.
Given the very preliminary nature of this discussion, I can only say that we
would want to see universal support for a variety of extended BB
representations: EBB, superblock, hyperblock, treegion (though the core of
it would probably be virtually identical).
It should also fit seamlessly into Evan's proposal for bundle representation
(see this thread:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-December/045846.html).

Sergei Larin

Hi Alexander,

Did you have a look at the RegionInfo pass? It currently detects a kind of refined SESE region. I use these regions in Polly, and as far as I know the Intel OpenCL SDK also uses them in some way. They are not SEME regions, but RegionInfo may either fit your needs as it is or we could think about extending it.

If you want to give it a try you can use:

opt -view-regions-only file.ll

It would be great if we could have just a single RegionInfo analysis that can then be used by other passes to detect and/or generate the kind of regions they need.

Cheers
Tobi