Graphviz and LLVM-TV

Hi

I'm trying to get a graphviz output (DOT) of a code I'm compiling. I want to see the DFG/CFG of the LLVM assembly, how the operations are chained together. The documentation mentions something about calling certain methods from within gdb, but isn't there some option when invoking the compiler (I've seen some -print-cfg and -dot-cfg options mentioned in some source files, but I can't find which tool supports them)

Also, LLVM-TV seems outdated. I've tried to compile it with 2.5 LLVM and got various errors due to API changes. Tried to correct them, but got overwhelmed. Has the project been abandoned?

thanks,
Ioannis

Hi

I'm trying to get a graphviz output (DOT) of a code I'm compiling. I
want to see the DFG/CFG of the LLVM assembly, how the operations are
chained together. The documentation mentions something about calling
certain methods from within gdb, but isn't there some option when
invoking the compiler (I've seen some -print-cfg and -dot-cfg options
mentioned in some source files, but I can't find which tool supports them)

This should help:
http://llvm.org/docs/ProgrammersManual.html#ViewGraph

Also, LLVM-TV seems outdated. I've tried to compile it with 2.5 LLVM and
got various errors due to API changes. Tried to correct them, but got
overwhelmed. Has the project been abandoned?

Yes, it is really really old and abandoned.

-Chris

Chris Lattner wrote:

Hi

I'm trying to get a graphviz output (DOT) of a code I'm compiling. I
want to see the DFG/CFG of the LLVM assembly, how the operations are
chained together. The documentation mentions something about calling
certain methods from within gdb, but isn't there some option when
invoking the compiler (I've seen some -print-cfg and -dot-cfg options
mentioned in some source files, but I can't find which tool supports them)

This should help:
http://llvm.org/docs/ProgrammersManual.html#ViewGraph

Hey Chris. Thanks for the prompt response.
That's what I was referring to before. If I'm getting this right, I should do a debug build of LLVM and the run it from gdb? So say, I run llvmc from gdb, pass on the code I want to compile and then what? Should I do a break point somewhere first, or wait for the program to finish and then call the methods?(which I don't think it would work). I'm confused. I was hoping for some CLI option or a separate tool that just parses in LLVM assembly and spits out DOT.

Also, LLVM-TV seems outdated. I've tried to compile it with 2.5 LLVM and
got various errors due to API changes. Tried to correct them, but got
overwhelmed. Has the project been abandoned?

Yes, it is really really old and abandoned.

too bad. looked interesting

thanks,
Ioannis

The flags for dot output work as any other optimization or analysis
pass. So you can specify them at any tool that allows to pass
optimization flags.
You can use e.g. the "opt" binary and run it on any code that is already
in llvm intermediate code.

Use:

opt -dot-cfg file.bc

Any other tool that passes the flags to opt should also work.

Tobi

Thanks Tobi for the tip.

however I also need the Data-Flow-Graph of each basic block/functions. As I said, I need a view of how the instructions 'link' to each other (via registers or memory aliasing or whatnot)

thanks again and sorry for the ping-pong,
Ioannis

Tobias Grosser wrote:

Hi Ioannis,

Thanks Tobi for the tip.

however I also need the Data-Flow-Graph of each basic block/functions.
As I said, I need a view of how the instructions 'link' to each other
(via registers or memory aliasing or whatnot)

I believe that there is not yet a DOT printer for this kind of
information.

grep DOTGraphTraits -R *

should give you all data types, for which dot printing is implemented,
and the data flow does not seem to exist. However there is already a
graph representation of the data flow information in
"include/llvm/Support/DataFlow.h":

ok, thanks for these pointers. I will look into this

Ioannis

Tobias Grosser wrote:

I’ve fixed the compilation and linking errors and updated the instructions on how to build it, take a look at README.txt in SVN.

Note that it requires poolalloc module and LLVM, but poolalloc hasn’t been updated to work with top-of-trunk LLVM, so I included a recent SVN revision which worked for me.
Poolalloc build is also broken right now, but it works enough to be used with llvm-tv (you only need libLLVMDataStructure).

The current issue is the wxWidgets exception that you’ll get when trying to select a module in the list:

./src/gtk/evtloop.cpp(64): assert “!m_impl” failed in ~wxEventLoop(): should have been deleted in Run()

I haven’t been able to figure out what’s going on here. If you are interested enough in llvm-tv to help me figure it out, let me know.

Misha

OK, I’ve fixed that too-- take a look and let me know if you have any issues building/using LLVM-TV.

Thanks Misha.

I've got a peculiar syntax error with the wxS() function. Seems that there is some kind of name conflict. I've renamed it to wxStr.

llvm-tv window runs, however I have problems with the opt-snap. It does not recognise the -gcse CLI argument and if I omit that, it segfaults, though I do get the views on the llvm-tv window (don't know if they are complete/correct)

Processing: /tmp/llvm-tv-ioannis/snapshots/0-licm.bc
0 opt 0x0840abbe
1 libstdc++.so.6 0x007c7a81 operator delete(void*) + 33
2 opt 0x08393d35 llvm::Pass::~Pass() + 85
3 opt 0x083970de llvm::PMDataManager::~PMDataManager() + 62
Segmentation fault

regards,
Ioannis

Misha Brukman wrote:

wxS_to_wxStr.patch (4.37 KB)

I've tried to write a DFGPrinter based on the CFGPrinter, as you suggested, but encountered problems.

The GraphWriter expects GraphTraits<GraphType>::nodes_begin()/nodes_end(). The way this is implemented in CFG.h, a function is a graph of basic blocks. A GraphTraits<Function*> inherits from GraphTraits<BasicBlock*>, and implements those nodes_begin()/nodes_end() wrapper functions. Should I modify CFG.h and make now BasicBlock a graph of Instruction(s) ?

The DataFlow.h deals with Value and User. Now BasicBlock is derived from Value and Instruction from User, but I don't understand how that helps.

It's a bit confusing to be honest. Can you help?

thanks,
Ioannis

Tobias Grosser wrote:

Here are some examples on how to use the dataflow graphs:

Value *V = ...
for(df_iterator<Value*> UI = df_begin(V), UE = df_end(V); UI != UE; ++UI) {
...
}

typedef SmallPtrSet<const Value*, 16> SmallValueSet;
SmallValueSet DFSet;
const User* U = ...;
for (idf_ext_iterator<const User*, SmallValueSet> I=idf_ext_begin(U,
DFSet), E=idf_ext_end(U, DFSet); I != E; ++I) {
..
}

There is no common root for the dataflow graph from which you can
enumerate all dataflows, but you could take
each instruction in a function that you're interested in, and output a
dataflow graph rooted at that instruction,
unless that instruction was part of some other dataflow graph already.

Perhaps something like:

SmallValueSet PrintedSet;
for (Function::arg_iterator I=F.arg_begin(), E=F.arg_end(); I != E; ++I) {
       if (!PrintedSet.count(*I)) {
             ... print graph rooted at V, add each node printed to
PrintedSet
      }
}
for (inst_iterator I=inst_begin(F), E=inst_end(F); I != E; ++I) {
       if (!PrintedSet.count(*I)) {
             ... print graph rooted at V, add each node printed to
PrintedSet
      }
}

Best regards,
--Edwin

Edwin,

thank you for your effort, but I'm not sure I understand.
Are you describing a graph traversal problem? Is the data model stored in a predecessor/successor fashion, which requires you to 'walk' the graph in order to visit all nodes? (and what happens when you have disjointed DFGs?).

inline comments follow...

Török Edwin wrote:

  

I've tried to write a DFGPrinter based on the CFGPrinter, as you suggested, but encountered problems.

The GraphWriter expects GraphTraits<GraphType>::nodes_begin()/nodes_end(). The way this is implemented in CFG.h, a function is a graph of basic blocks. A GraphTraits<Function*> inherits from GraphTraits<BasicBlock*>, and implements those nodes_begin()/nodes_end() wrapper functions. Should I modify CFG.h and make now BasicBlock a graph of Instruction(s) ?

The DataFlow.h deals with Value and User. Now BasicBlock is derived from Value and Instruction from User, but I don't understand how that helps.

It's a bit confusing to be honest. Can you help?
  
Here are some examples on how to use the dataflow graphs:

Value *V = ...
for(df_iterator<Value*> UI = df_begin(V), UE = df_end(V); UI != UE; ++UI) {
...
}

typedef SmallPtrSet<const Value*, 16> SmallValueSet;
SmallValueSet DFSet;
const User* U = ...;
for (idf_ext_iterator<const User*, SmallValueSet> I=idf_ext_begin(U,
DFSet), E=idf_ext_end(U, DFSet); I != E; ++I) {
..
}

I don't understand why I'd need a depth-first iterator.

There is no common root for the dataflow graph from which you can
enumerate all dataflows, but you could take
each instruction in a function that you're interested in, and output a
dataflow graph rooted at that instruction,
unless that instruction was part of some other dataflow graph already.

from this I'm getting that you suggest finding all "root" instructions and traverse each chain of instruction (data-flow) separately. Considering that most dataflows are not simple expanding trees, and are instead merging and spliting at several points, I'm not sure what good that would do.

Perhaps something like:

SmallValueSet PrintedSet;
for (Function::arg_iterator I=F.arg_begin(), E=F.arg_end(); I != E; ++I) {
       if (!PrintedSet.count(*I)) {
             ... print graph rooted at V, add each node printed to
PrintedSet
      } }
for (inst_iterator I=inst_begin(F), E=inst_end(F); I != E; ++I) {
       if (!PrintedSet.count(*I)) {
             ... print graph rooted at V, add each node printed to
PrintedSet
      } }

Best regards,
--Edwin
  

Most likely I'm missing some information here. There's probably some internal structure, probably how the data-model is build, that requires this sort of traversal. Can you elaborate please.

What I'm looking for, to start with, is something rather simple (or at least it should be). A per BasicBlock view of their DFGs. I'm not interested in the inter-BasicBlock connections, for now. I'm tempted to just iterate through the instructions container of each BasicBlock and output the graphviz manually. However, I'd prefer using any facilities present. The way I see it, all I need to do is iterate through the instructions of the BasicBlock, enumerate them to the graphviz data model, then iterate through them once more, checking their predecessor/successor instructions (inputs/outputs), to register their connectivity (edges). Right?

thanks,
Ioannis

PS: By the way. How come and you guys are not using Graphviz's library API? Avoiding the dependency?

Edwin,

thank you for your effort, but I'm not sure I understand.
Are you describing a graph traversal problem? Is the data model stored
in a predecessor/successor fashion, which requires you to 'walk' the
graph in order to visit all nodes? (and what happens when you have
disjointed DFGs?).

Sorry for the confusion, it was only an example on how to use the
dataflow graphs from DataFlow.h, you don't need depth-first iteration to
implement dot graphs for DFGs.

Disjoint DFGs are the reason I mentioned the need to walk all
instructions/arguments in order to get all DFGs.
If you are only interested in a DFG starting from a particular Value*,
then forget about the iteration I mentioned.

I think your fundamental problem was that there is already a Graph for
Function*, the CFG, and you want a DFG Graph.

I think you could do something like this:
template <typename T>
class DFG {
private:
  T p;
public:
  DFG(T t) : p(t) {}
  T operator*() { return p; }
};
template <> struct GraphTraits<DFG<Function*> > : public
GraphTraits<Value*> {
  typedef inst_iterator nodes_iterator;
  static nodes_iterator nodes_begin(DFG<Function *> F) {
    return inst_begin(*F);
  }
  static nodes_iterator nodes_end(DFG<Function *> F) {
    return inst_end(*F);
  }
};

...
  ViewGraph(DFG<Function*>(F), "test");

Then you could implement a DOTGraphTraits for DFG<Function*>.

inline comments follow...

Török Edwin wrote:

I've tried to write a DFGPrinter based on the CFGPrinter, as you
suggested, but encountered problems.

The GraphWriter expects
GraphTraits<GraphType>::nodes_begin()/nodes_end(). The way this is
implemented in CFG.h, a function is a graph of basic blocks. A
GraphTraits<Function*> inherits from GraphTraits<BasicBlock*>, and
implements those nodes_begin()/nodes_end() wrapper functions. Should
I modify CFG.h and make now BasicBlock a graph of Instruction(s) ?

The DataFlow.h deals with Value and User. Now BasicBlock is derived
from Value and Instruction from User, but I don't understand how
that helps.

It's a bit confusing to be honest. Can you help?
      
Here are some examples on how to use the dataflow graphs:

Value *V = ...
for(df_iterator<Value*> UI = df_begin(V), UE = df_end(V); UI != UE;
++UI) {
...
}

typedef SmallPtrSet<const Value*, 16> SmallValueSet;
SmallValueSet DFSet;
const User* U = ...;
for (idf_ext_iterator<const User*, SmallValueSet> I=idf_ext_begin(U,
DFSet), E=idf_ext_end(U, DFSet); I != E; ++I) {
..
}

I don't understand why I'd need a depth-first iterator.

It was just an example to show something that uses the GraphTraits from
DataFlow.h.

There is no common root for the dataflow graph from which you can
enumerate all dataflows, but you could take
each instruction in a function that you're interested in, and output a
dataflow graph rooted at that instruction,
unless that instruction was part of some other dataflow graph already.

from this I'm getting that you suggest finding all "root" instructions
and traverse each chain of instruction (data-flow) separately.
Considering that most dataflows are not simple expanding trees, and
are instead merging and spliting at several points, I'm not sure what
good that would do.

int foo(int b, int c)
{
int a = b+4;
int z = a+c;
int y = c+1;
..
}

If you start writing out the dataflow for 'b', then the graph contains
only 'a' and 'z'. If you start at 'c' the graph will contain only 'z'
and 'y'.
This is unlike CFG graphs where the entrypoint in a function is the root
of the CFG for that function, the DFG graphs are disjoint in this case.

Perhaps something like:

SmallValueSet PrintedSet;
for (Function::arg_iterator I=F.arg_begin(), E=F.arg_end(); I != E;
++I) {
       if (!PrintedSet.count(*I)) {
             ... print graph rooted at V, add each node printed to
PrintedSet
      } }
for (inst_iterator I=inst_begin(F), E=inst_end(F); I != E; ++I) {
       if (!PrintedSet.count(*I)) {
             ... print graph rooted at V, add each node printed to
PrintedSet
      } }

Best regards,
--Edwin
  

Most likely I'm missing some information here. There's probably some
internal structure, probably how the data-model is build, that
requires this sort of traversal. Can you elaborate please.

What I'm looking for, to start with, is something rather simple (or at
least it should be). A per BasicBlock view of their DFGs. I'm not
interested in the inter-BasicBlock connections, for now. I'm tempted
to just iterate through the instructions container of each BasicBlock
and output the graphviz manually. However, I'd prefer using any
facilities present. The way I see it, all I need to do is iterate
through the instructions of the BasicBlock, enumerate them to the
graphviz data model, then iterate through them once more, checking
their predecessor/successor instructions (inputs/outputs), to register
their connectivity (edges). Right?

Well the GraphTraits in DataFlow.h are really simple, LLVM's graph
algorithms expect
a child_begin()/child_end(), so DataFlow.h only maps
child_begin/child_end to use_begin/use_end, and op_begin/op_end for the
Def-Use, and Use-Def graphs respectively.
That however doesn't know of any basicblock boundaries, it enumerates
*all* uses of a value, so I think that writing out a full DFG is easier
than writing out a partial one.

thanks,
Ioannis

PS: By the way. How come and you guys are not using Graphviz's library
API? Avoiding the dependency?

Writing out a graph for dot is easy enough to do :wink:

Best regards,
--Edwin

Edwin,

thanks, it starts making sense

inline comments...

Török Edwin wrote:

  

Edwin,

thank you for your effort, but I'm not sure I understand.
Are you describing a graph traversal problem? Is the data model stored
in a predecessor/successor fashion, which requires you to 'walk' the
graph in order to visit all nodes? (and what happens when you have
disjointed DFGs?).
    
Sorry for the confusion, it was only an example on how to use the
dataflow graphs from DataFlow.h, you don't need depth-first iteration to
implement dot graphs for DFGs.

Disjoint DFGs are the reason I mentioned the need to walk all
instructions/arguments in order to get all DFGs.
If you are only interested in a DFG starting from a particular Value*,
then forget about the iteration I mentioned.

I think your fundamental problem was that there is already a Graph for
Function*, the CFG, and you want a DFG Graph.

yes that's right, the CFG is getting in the way

I think you could do something like this:
template <typename T>
class DFG {
private:
  T p;
public:
  DFG(T t) : p(t) {}
  T operator*() { return p; }
};
template <> struct GraphTraits<DFG<Function*> > : public
GraphTraits<Value*> {
  typedef inst_iterator nodes_iterator;
  static nodes_iterator nodes_begin(DFG<Function *> F) {
    return inst_begin(*F);
  }
  static nodes_iterator nodes_end(DFG<Function *> F) {
    return inst_end(*F);
  }
};

...
  ViewGraph(DFG<Function*>(F), "test");

Then you could implement a DOTGraphTraits for DFG<Function*>.

ok I'll give it a try.

inline comments follow...

Török Edwin wrote:
    

I've tried to write a DFGPrinter based on the CFGPrinter, as you
suggested, but encountered problems.

The GraphWriter expects
GraphTraits<GraphType>::nodes_begin()/nodes_end(). The way this is
implemented in CFG.h, a function is a graph of basic blocks. A
GraphTraits<Function*> inherits from GraphTraits<BasicBlock*>, and
implements those nodes_begin()/nodes_end() wrapper functions. Should
I modify CFG.h and make now BasicBlock a graph of Instruction(s) ?

The DataFlow.h deals with Value and User. Now BasicBlock is derived
from Value and Instruction from User, but I don't understand how
that helps.

It's a bit confusing to be honest. Can you help?
      

Here are some examples on how to use the dataflow graphs:

Value *V = ...
for(df_iterator<Value*> UI = df_begin(V), UE = df_end(V); UI != UE;
++UI) {
...
}

typedef SmallPtrSet<const Value*, 16> SmallValueSet;
SmallValueSet DFSet;
const User* U = ...;
for (idf_ext_iterator<const User*, SmallValueSet> I=idf_ext_begin(U,
DFSet), E=idf_ext_end(U, DFSet); I != E; ++I) {
..
}

I don't understand why I'd need a depth-first iterator.
    
It was just an example to show something that uses the GraphTraits from
DataFlow.h.

There is no common root for the dataflow graph from which you can
enumerate all dataflows, but you could take
each instruction in a function that you're interested in, and output a
dataflow graph rooted at that instruction,
unless that instruction was part of some other dataflow graph already.

from this I'm getting that you suggest finding all "root" instructions
and traverse each chain of instruction (data-flow) separately.
Considering that most dataflows are not simple expanding trees, and
are instead merging and spliting at several points, I'm not sure what
good that would do.
    
int foo(int b, int c)
{
int a = b+4;
int z = a+c;
int y = c+1;
..
}

If you start writing out the dataflow for 'b', then the graph contains
only 'a' and 'z'. If you start at 'c' the graph will contain only 'z'
and 'y'.
This is unlike CFG graphs where the entrypoint in a function is the root
of the CFG for that function, the DFG graphs are disjoint in this case.

Perhaps something like:

SmallValueSet PrintedSet;
for (Function::arg_iterator I=F.arg_begin(), E=F.arg_end(); I != E;
++I) {
       if (!PrintedSet.count(*I)) {
             ... print graph rooted at V, add each node printed to
PrintedSet
      } }
for (inst_iterator I=inst_begin(F), E=inst_end(F); I != E; ++I) {
       if (!PrintedSet.count(*I)) {
             ... print graph rooted at V, add each node printed to
PrintedSet
      } }

Best regards,
--Edwin
  

Most likely I'm missing some information here. There's probably some
internal structure, probably how the data-model is build, that
requires this sort of traversal. Can you elaborate please.

What I'm looking for, to start with, is something rather simple (or at
least it should be). A per BasicBlock view of their DFGs. I'm not
interested in the inter-BasicBlock connections, for now. I'm tempted
to just iterate through the instructions container of each BasicBlock
and output the graphviz manually. However, I'd prefer using any
facilities present. The way I see it, all I need to do is iterate
through the instructions of the BasicBlock, enumerate them to the
graphviz data model, then iterate through them once more, checking
their predecessor/successor instructions (inputs/outputs), to register
their connectivity (edges). Right?
    
Well the GraphTraits in DataFlow.h are really simple, LLVM's graph
algorithms expect
a child_begin()/child_end(), so DataFlow.h only maps
child_begin/child_end to use_begin/use_end, and op_begin/op_end for the
Def-Use, and Use-Def graphs respectively.
That however doesn't know of any basicblock boundaries, it enumerates
*all* uses of a value, so I think that writing out a full DFG is easier
than writing out a partial one.

ok I see

thanks,
Ioannis

Edwin,

thanks, it starts making sense

inline comments...

Török Edwin wrote:

  

Edwin,

thank you for your effort, but I'm not sure I understand.
Are you describing a graph traversal problem? Is the data model stored
in a predecessor/successor fashion, which requires you to 'walk' the
graph in order to visit all nodes? (and what happens when you have
disjointed DFGs?).
    
Sorry for the confusion, it was only an example on how to use the
dataflow graphs from DataFlow.h, you don't need depth-first iteration to
implement dot graphs for DFGs.

Disjoint DFGs are the reason I mentioned the need to walk all
instructions/arguments in order to get all DFGs.
If you are only interested in a DFG starting from a particular Value*,
then forget about the iteration I mentioned.

I think your fundamental problem was that there is already a Graph for
Function*, the CFG, and you want a DFG Graph.

yes that's right, the CFG is getting in the way

I think you could do something like this:
template <typename T>
class DFG {
private:
  T p;
public:
  DFG(T t) : p(t) {}
  T operator*() { return p; }
};
template <> struct GraphTraits<DFG<Function*> > : public
GraphTraits<Value*> {
  typedef inst_iterator nodes_iterator;
  static nodes_iterator nodes_begin(DFG<Function *> F) {
    return inst_begin(*F);
  }
  static nodes_iterator nodes_end(DFG<Function *> F) {
    return inst_end(*F);
  }
};

...
  ViewGraph(DFG<Function*>(F), "test");

Then you could implement a DOTGraphTraits for DFG<Function*>.

ok I'll give it a try.

inline comments follow...

Török Edwin wrote:
    

I've tried to write a DFGPrinter based on the CFGPrinter, as you
suggested, but encountered problems.

The GraphWriter expects
GraphTraits<GraphType>::nodes_begin()/nodes_end(). The way this is
implemented in CFG.h, a function is a graph of basic blocks. A
GraphTraits<Function*> inherits from GraphTraits<BasicBlock*>, and
implements those nodes_begin()/nodes_end() wrapper functions. Should
I modify CFG.h and make now BasicBlock a graph of Instruction(s) ?

The DataFlow.h deals with Value and User. Now BasicBlock is derived
from Value and Instruction from User, but I don't understand how
that helps.

It's a bit confusing to be honest. Can you help?
      

Here are some examples on how to use the dataflow graphs:

Value *V = ...
for(df_iterator<Value*> UI = df_begin(V), UE = df_end(V); UI != UE;
++UI) {
...
}

typedef SmallPtrSet<const Value*, 16> SmallValueSet;
SmallValueSet DFSet;
const User* U = ...;
for (idf_ext_iterator<const User*, SmallValueSet> I=idf_ext_begin(U,
DFSet), E=idf_ext_end(U, DFSet); I != E; ++I) {
..
}

I don't understand why I'd need a depth-first iterator.
    
It was just an example to show something that uses the GraphTraits from
DataFlow.h.

There is no common root for the dataflow graph from which you can
enumerate all dataflows, but you could take
each instruction in a function that you're interested in, and output a
dataflow graph rooted at that instruction,
unless that instruction was part of some other dataflow graph already.

from this I'm getting that you suggest finding all "root" instructions
and traverse each chain of instruction (data-flow) separately.
Considering that most dataflows are not simple expanding trees, and
are instead merging and spliting at several points, I'm not sure what
good that would do.
    
int foo(int b, int c)
{
int a = b+4;
int z = a+c;
int y = c+1;
..
}

If you start writing out the dataflow for 'b', then the graph contains
only 'a' and 'z'. If you start at 'c' the graph will contain only 'z'
and 'y'.
This is unlike CFG graphs where the entrypoint in a function is the root
of the CFG for that function, the DFG graphs are disjoint in this case.

Perhaps something like:

SmallValueSet PrintedSet;
for (Function::arg_iterator I=F.arg_begin(), E=F.arg_end(); I != E;
++I) {
       if (!PrintedSet.count(*I)) {
             ... print graph rooted at V, add each node printed to
PrintedSet
      } }
for (inst_iterator I=inst_begin(F), E=inst_end(F); I != E; ++I) {
       if (!PrintedSet.count(*I)) {
             ... print graph rooted at V, add each node printed to
PrintedSet
      } }

Best regards,
--Edwin
  

Most likely I'm missing some information here. There's probably some
internal structure, probably how the data-model is build, that
requires this sort of traversal. Can you elaborate please.

What I'm looking for, to start with, is something rather simple (or at
least it should be). A per BasicBlock view of their DFGs. I'm not
interested in the inter-BasicBlock connections, for now. I'm tempted
to just iterate through the instructions container of each BasicBlock
and output the graphviz manually. However, I'd prefer using any
facilities present. The way I see it, all I need to do is iterate
through the instructions of the BasicBlock, enumerate them to the
graphviz data model, then iterate through them once more, checking
their predecessor/successor instructions (inputs/outputs), to register
their connectivity (edges). Right?
    
Well the GraphTraits in DataFlow.h are really simple, LLVM's graph
algorithms expect
a child_begin()/child_end(), so DataFlow.h only maps
child_begin/child_end to use_begin/use_end, and op_begin/op_end for the
Def-Use, and Use-Def graphs respectively.
That however doesn't know of any basicblock boundaries, it enumerates
*all* uses of a value, so I think that writing out a full DFG is easier
than writing out a partial one.

ok I see

thanks,
Ioannis

Ioannis Nousias <ioannis.nousias <at> googlemail.com> writes:

Edwin,

thanks, it starts making sense

inline comments...

Török Edwin wrote:
>
>> Edwin,
>>
>> thank you for your effort, but I'm not sure I understand.
>> Are you describing a graph traversal problem? Is the data model stored
>> in a predecessor/successor fashion, which requires you to 'walk' the
>> graph in order to visit all nodes? (and what happens when you have
>> disjointed DFGs?).
>>
>
> Sorry for the confusion, it was only an example on how to use the
> dataflow graphs from DataFlow.h, you don't need depth-first iteration to
> implement dot graphs for DFGs.
>
> Disjoint DFGs are the reason I mentioned the need to walk all
> instructions/arguments in order to get all DFGs.
> If you are only interested in a DFG starting from a particular Value*,
> then forget about the iteration I mentioned.
>
> I think your fundamental problem was that there is already a Graph for
> Function*, the CFG, and you want a DFG Graph.
>
>
yes that's right, the CFG is getting in the way

> I think you could do something like this:
> template <typename T>
> class DFG {
> private:
> T p;
> public:
> DFG(T t) : p(t) {}
> T operator*() { return p; }
> };
> template <> struct GraphTraits<DFG<Function*> > : public
> GraphTraits<Value*> {
> typedef inst_iterator nodes_iterator;
> static nodes_iterator nodes_begin(DFG<Function *> F) {
> return inst_begin(*F);
> }
> static nodes_iterator nodes_end(DFG<Function *> F) {
> return inst_end(*F);
> }
> };
>
> ...
> ViewGraph(DFG<Function*>(F), "test");
>
> Then you could implement a DOTGraphTraits for DFG<Function*>.
>
>
ok I'll give it a try.

>> inline comments follow...
>>
>> Török Edwin wrote:
>>
>>>
>>>
>>>> I've tried to write a DFGPrinter based on the CFGPrinter, as you
>>>> suggested, but encountered problems.
>>>>
>>>> The GraphWriter expects
>>>> GraphTraits<GraphType>::nodes_begin()/nodes_end(). The way this is
>>>> implemented in CFG.h, a function is a graph of basic blocks. A
>>>> GraphTraits<Function*> inherits from GraphTraits<BasicBlock*>, and
>>>> implements those nodes_begin()/nodes_end() wrapper functions. Should
>>>> I modify CFG.h and make now BasicBlock a graph of Instruction(s) ?
>>>>
>>>> The DataFlow.h deals with Value and User. Now BasicBlock is derived
>>>> from Value and Instruction from User, but I don't understand how
>>>> that helps.
>>>>
>>>> It's a bit confusing to be honest. Can you help?
>>>>
>>>>
>>> Here are some examples on how to use the dataflow graphs:
>>>
>>> Value *V = ...
>>> for(df_iterator<Value*> UI = df_begin(V), UE = df_end(V); UI != UE;
>>> ++UI) {
>>> ...
>>> }
>>>
>>>
>>> typedef SmallPtrSet<const Value*, 16> SmallValueSet;
>>> SmallValueSet DFSet;
>>> const User* U = ...;
>>> for (idf_ext_iterator<const User*, SmallValueSet> I=idf_ext_begin(U,
>>> DFSet), E=idf_ext_end(U, DFSet); I != E; ++I) {
>>> ..
>>> }
>>>
>>>
>>>
>>>
>> I don't understand why I'd need a depth-first iterator.
>>
>
> It was just an example to show something that uses the GraphTraits from
> DataFlow.h.
>
>
>>
>>> There is no common root for the dataflow graph from which you can
>>> enumerate all dataflows, but you could take
>>> each instruction in a function that you're interested in, and output a
>>> dataflow graph rooted at that instruction,
>>> unless that instruction was part of some other dataflow graph already.
>>>
>>>
>>>
>> from this I'm getting that you suggest finding all "root" instructions
>> and traverse each chain of instruction (data-flow) separately.
>> Considering that most dataflows are not simple expanding trees, and
>> are instead merging and spliting at several points, I'm not sure what
>> good that would do.
>>
>
> int foo(int b, int c)
> {
> int a = b+4;
> int z = a+c;
> int y = c+1;
> ..
> }
>
> If you start writing out the dataflow for 'b', then the graph contains
> only 'a' and 'z'. If you start at 'c' the graph will contain only 'z'
> and 'y'.
> This is unlike CFG graphs where the entrypoint in a function is the root
> of the CFG for that function, the DFG graphs are disjoint in this case.
>
>
>>> Perhaps something like:
>>>
>>> SmallValueSet PrintedSet;
>>> for (Function::arg_iterator I=F.arg_begin(), E=F.arg_end(); I != E;
>>> ++I) {
>>> if (!PrintedSet.count(*I)) {
>>> ... print graph rooted at V, add each node printed to
>>> PrintedSet
>>> } }
>>> for (inst_iterator I=inst_begin(F), E=inst_end(F); I != E; ++I) {
>>> if (!PrintedSet.count(*I)) {
>>> ... print graph rooted at V, add each node printed to
>>> PrintedSet
>>> } }
>>>
>>> Best regards,
>>> --Edwin
>>>
>>>
>> Most likely I'm missing some information here. There's probably some
>> internal structure, probably how the data-model is build, that
>> requires this sort of traversal. Can you elaborate please.
>>
>> What I'm looking for, to start with, is something rather simple (or at
>> least it should be). A per BasicBlock view of their DFGs. I'm not
>> interested in the inter-BasicBlock connections, for now. I'm tempted
>> to just iterate through the instructions container of each BasicBlock
>> and output the graphviz manually. However, I'd prefer using any
>> facilities present. The way I see it, all I need to do is iterate
>> through the instructions of the BasicBlock, enumerate them to the
>> graphviz data model, then iterate through them once more, checking
>> their predecessor/successor instructions (inputs/outputs), to register
>> their connectivity (edges). Right?
>>
>
>
> Well the GraphTraits in DataFlow.h are really simple, LLVM's graph
> algorithms expect
> a child_begin()/child_end(), so DataFlow.h only maps
> child_begin/child_end to use_begin/use_end, and op_begin/op_end for the
> Def-Use, and Use-Def graphs respectively.
> That however doesn't know of any basicblock boundaries, it enumerates
> *all* uses of a value, so I think that writing out a full DFG is easier
> than writing out a partial one.
>
>
ok I see

thanks,
Ioannis

Hi Edwin, Ioannis, all,

I am trying to generate the DFG of machine functions.

Initially, I added a pass to generate the DFG of LLVM IR functions. This
was based on the mail thread -
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-September/025582.html. This
pass worked fine and I was able to generate DFG of LLVM IR functions.

Later, I ported the DFG pass code for machine functions. I ported the
InstIterator.h (which was required to use inst_iterator in the *template <>
struct GraphTraits<DFG<Function*> >*) to MachineInstrIterator.h to iterate
over machine instructions in a machine function. But the build is throwing
the following error:

llvm[2]: Compiling MC_DFG.cpp for Debug+Asserts build
/home/abhi/work/llvm31/llvm/include/llvm/Support/GraphWriter.h: In member
function ‘void llvm::GraphWriter<GraphType>::writeNodes() [with GraphType =
llvm::MCDFGraph<llvm::MachineFunction*>]’:
/home/abhi/work/llvm31/llvm/include/llvm/Support/GraphWriter.h:100:
instantiated from ‘void llvm::GraphWriter<GraphType>::writeGraph(const
std::string&) [with GraphType = llvm::MCDFGraph<llvm::MachineFunction*>]’
/home/abhi/work/llvm31/llvm/include/llvm/Support/GraphWriter.h:304:
instantiated from ‘llvm::raw_ostream& llvm::WriteGraph(llvm::raw_ostream&,
const GraphType&, bool, const llvm::Twine&) [with GraphType =
llvm::MCDFGraph<llvm::MachineFunction*>]’
/home/abhi/work/llvm31/llvm/lib/CodeGen/MC_DFG.cpp:249: instantiated from
here
/home/abhi/work/llvm31/llvm/include/llvm/Support/GraphWriter.h:139: error:
no matching function for call to
‘llvm::GraphWriter<llvm::MCDFGraph<llvm::MachineFunction*>

::isNodeHidden(llvm::MachineInstr&)’

/home/abhi/work/llvm31/llvm/include/llvm/Support/GraphWriter.h:143: note:
candidates are: bool llvm::GraphWriter<GraphType>::isNodeHidden(typename
llvm::GraphTraits<T>::NodeType&) [with GraphType =
llvm::MCDFGraph<llvm::MachineFunction*>]
/home/abhi/work/llvm31/llvm/include/llvm/Support/GraphWriter.h:147: note:
                bool llvm::GraphWriter<GraphType>::isNodeHidden(typename
llvm::GraphTraits<T>::NodeType* const*) [with GraphType =
llvm::MCDFGraph<llvm::MachineFunction*>]
/home/abhi/work/llvm31/llvm/include/llvm/Support/GraphWriter.h:151: note:
                bool llvm::GraphWriter<GraphType>::isNodeHidden(typename
llvm::GraphTraits<T>::NodeType*) [with GraphType =
llvm::MCDFGraph<llvm::MachineFunction*>]
/home/abhi/work/llvm31/llvm/include/llvm/Support/GraphWriter.h:140: error:
no matching function for call to
‘llvm::GraphWriter<llvm::MCDFGraph<llvm::MachineFunction*>

::writeNode(llvm::MachineInstr&)’

/home/abhi/work/llvm31/llvm/include/llvm/Support/GraphWriter.h:155: note:
candidates are: void llvm::GraphWriter<GraphType>::writeNode(typename
llvm::GraphTraits<T>::NodeType&) [with GraphType =
llvm::MCDFGraph<llvm::MachineFunction*>]
/home/abhi/work/llvm31/llvm/include/llvm/Support/GraphWriter.h:159: note:
                void llvm::GraphWriter<GraphType>::writeNode(typename
llvm::GraphTraits<T>::NodeType* const*) [with GraphType =
llvm::MCDFGraph<llvm::MachineFunction*>]
/home/abhi/work/llvm31/llvm/include/llvm/Support/GraphWriter.h:163: note:
                void llvm::GraphWriter<GraphType>::writeNode(typename
llvm::GraphTraits<T>::NodeType*) [with GraphType =
llvm::MCDFGraph<llvm::MachineFunction*>]
gmake[2]: ***
[/home/abhi/work/llvm31/llvm-build-new/lib/CodeGen/Debug+Asserts/MC_DFG.o]
Error 1
gmake[2]: Leaving directory
`/home/abhi/work/llvm31/llvm-build-new/lib/CodeGen'
gmake[1]: *** [CodeGen/.makeall] Error 2
gmake[1]: Leaving directory `/home/abhi/work/llvm31/llvm-build-new/lib'
gmake: *** [all] Error 1

The error seems to be a mismatch between types of the argument passed to
the isNodeHidden and writeNode function in the GraphWriter.h file. Although I am
not sure where this type mismatch is originating from or how to fix it, I have a
hunch this has something to do with the implementation of
MachineInstrIterator.h. Any idea what could be the issue here?

Here are some of the code snippets:

//templates for DFG and specializations of GraphTraits

  template <typename T>
  class MCDFGraph {
  private:
    T p;
  public:
    MCDFGraph(T t) : p(t) {}
    T operator*() { return p; }
  };
  template <> struct GraphTraits<MCDFGraph<MachineFunction*> > : public
  GraphTraits<Value*> {
    typedef mc_inst_iterator nodes_iterator;
    static nodes_iterator nodes_begin(MCDFGraph<MachineFunction *> F) {
      return mc_inst_begin(*F);
    }
    static nodes_iterator nodes_end(MCDFGraph<MachineFunction *> F) {
      return mc_inst_end(*F);
    }
  };
  template<>
  struct DOTGraphTraits<MCDFGraph<MachineFunction*> > : public
DefaultDOTGraphTraits {
    DOTGraphTraits (bool isSimple=false) : DefaultDOTGraphTraits(isSimple)
{}
    static std::string getGraphName(MCDFGraph<MachineFunction *> F) {
      return "DFG for the function";
    }
    static std::string getSimpleNodeLabel(Value *Node,
                                          const MCDFGraph<MachineFunction
*> &F) {
      std::string Str;
      raw_string_ostream OS(Str);
      WriteAsOperand(OS, Node, false);
      return OS.str();
    }
    static std::string getCompleteNodeLabel(Value *Node,
                                            const MCDFGraph<MachineFunction
*> &F) {
      std::string Str;
      raw_string_ostream OS(Str);
      if (!Node->getName().empty()) {
        WriteAsOperand(OS, Node, false);
        OS << ":\n";
      }
      OS << *Node;
      std::string OutStr = OS.str();
      if (OutStr[0] == '\n') OutStr.erase(OutStr.begin());
      // Process string output to make it nicer...
      for (unsigned i = 0; i != OutStr.length(); ++i)
        if (OutStr[i] == '\n') { // Left justify
          OutStr[i] = '\\';
          OutStr.insert(OutStr.begin()+i+1, 'l');
        } else if (OutStr[i] == ';') { // Delete
comments!
          unsigned Idx = OutStr.find('\n', i+1); // Find end of
line
          OutStr.erase(OutStr.begin()+i, OutStr.begin()+Idx);
          --i;
        }
      return OutStr;
    }
    std::string getNodeLabel(Value *&Node,
                             const MCDFGraph<MachineFunction *> &Graph) {
      if (isSimple())
        return getSimpleNodeLabel(Node, Graph);
      else
        return getCompleteNodeLabel(Node, Graph);
    }
  };

//Calls to ViewGraph and WriteGraph from the respective passes'
runOnMachineFunction

    ViewGraph(MCDFGraph<MachineFunction*>(&F), "Function:" +
(F.getFunction())->getName());

          WriteGraph(File, (MCDFGraph<MachineFunction*>)&F);

The MachineInstrIterator.h is quite lengthy, please let me know if I need
to post it.

Regards,
Abhishek