Idioms for retrieving global symbols and inheritance

Hello,

Hi Pratik

  I have a couple of doubts, as listed below:

Oh, those are never good to have :)

1. To list all the global variables in the module, I am iterating
using type_iterator, and for each Type I get, I am using value_iterator
to iterate over its Values. In the second iteration I am getting
unexpected results. For each type obtained from type_iterator->second,
value_iterator->first produces the same list that
type_iterator->first would produce. Also, an attempt to access
value_iterator->second->getName() produces junk names, and
value_iterator->second->hasName() causes segfaults. All of this
compiles fine. Sample code and output follow.

Code:
-------------
for ( SymbolTable::type_iterator itr1 = symbTab.type_begin(),
      itrend1 = symbTab.type_end(); itr1 != itrend1; itr1++ ) {
  string typeName = itr1->first;
  if ( lldbprfx.compare( typeName.substr( 0, 5 ) ) != 0 ) {
    cerr << typeName << endl;
    for ( SymbolTable::value_const_iterator
          itr2 = symbTab.value_begin( itr1->second ),
          itrend2 = symbTab.value_end( itr1->second );
          itr2 != itrend2; itr2++ ) {
      if ( lldbprfx.compare( ( itr2->first ).substr( 0, 5 ) ) != 0 ) {
        // Produces the same list that is produced by itr1
        cerr << "\t" << itr2->first << endl;
        // cerr << itr2->second->getName() << endl;
        // This outputs junk (including unprintable characters)
      }
    }
  }
}

Please note that LLVM doesn't require *anything* to be named. It's
entirely optional. So using the symbol table as the main construct for
accessing "all globals" isn't going to get you very far in many
instances. The correct iterator to use is in the Module class,
const_global_iterator. See include/llvm/Module.h for details. What you
want is something like this:

for (Module::const_global_iterator GI = M->global_begin(),
     GE = M->global_end(); GI != GE; ++GI) {
  if (GI->hasName())
    cerr << GI->getName() << endl; // Print name
  cerr << GI->getType()->getElementType() << endl; // Print Type
}
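
To make the point about optional names concrete, here is a small
self-contained sketch. It does not use LLVM at all: ToyGlobal,
ToyModule and both describe functions are hypothetical stand-ins
invented for illustration. It only shows why a name-keyed table can
miss globals that a walk over the module's own list will find.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in (not an LLVM class): a global whose name may be empty.
struct ToyGlobal {
    std::string name;      // empty means the global is unnamed
    std::string typeDesc;  // printable description of its type
    bool hasName() const { return !name.empty(); }
};

// Toy module: owns every global and indexes only the named ones.
// Call indexNamed() after the globals vector is final, because the
// table stores pointers into that vector.
struct ToyModule {
    std::vector<ToyGlobal> globals;
    std::map<std::string, const ToyGlobal*> symbolTable;

    void indexNamed() {
        symbolTable.clear();
        for (const ToyGlobal& g : globals)
            if (g.hasName())
                symbolTable[g.name] = &g;
    }
};

// Walk the module's own list: sees every global, named or not.
inline std::vector<std::string> describeAllGlobals(const ToyModule& m) {
    std::vector<std::string> out;
    for (const ToyGlobal& g : m.globals) {
        std::string label = g.hasName() ? g.name : std::string("<unnamed>");
        out.push_back(label + " : " + g.typeDesc);
    }
    return out;
}

// Walk the symbol table: sees only globals that happen to have a name.
inline std::vector<std::string> describeNamedGlobals(const ToyModule& m) {
    std::vector<std::string> out;
    for (const auto& entry : m.symbolTable)
        out.push_back(entry.first + " : " + entry.second->typeDesc);
    return out;
}
```

In this toy model, a module with two named globals and one unnamed
global yields three entries from describeAllGlobals but only two from
describeNamedGlobals.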

-----------
Output: for a small sample file
-----------
struct.TiXmlString
        struct.TiXmlString
        struct.TiXmlString::Rep
struct.TiXmlString::Rep
        struct.TiXmlString
        struct.TiXmlString::Rep

2. To retrieve subtypes of a given type, subtype_iterator can be used,
but how to obtain type names from the subtype pointers obtained is not
clear.

You've wandered into one of LLVM's grittier bits :) Note that
subtype_iterator iterates over a vector of PATypeHandle, that is, a
"Potentially Abstract Type Handle", which is defined in
AbstractTypeUser.h. This class is used to keep the use list of abstract
and potentially abstract types up-to-date. What you want to do is call
the "get" method on the handle to get the actual corresponding Type*.
Once you have that, you can traverse the types in the SymbolTable to
see if it has a name. Alternatively, you might just want the type
description (generally more useful), which can be obtained via the
Type::getDescription() method. Or you could print it directly to an
ostream with the print method.
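
The steps above (unwrap the handle, search the type names, fall back to
a description) can be sketched with toy classes. Nothing here is the
LLVM API: ToyTypeHandle, ToyType, TypeNameTable, findTypeName and
labelSubtypes are hypothetical names chosen only to mirror the shape of
the lookup.

```cpp
#include <map>
#include <string>
#include <vector>

struct ToyType;

// Toy handle that wraps a type pointer; get() unwraps it.
class ToyTypeHandle {
public:
    explicit ToyTypeHandle(const ToyType* t) : ty_(t) {}
    const ToyType* get() const { return ty_; }
private:
    const ToyType* ty_;
};

// Toy type: a structural description plus handles to its subtypes.
struct ToyType {
    std::string description;
    std::vector<ToyTypeHandle> subtypes;
};

// Toy table of type names, keyed by name (types need not appear in it).
using TypeNameTable = std::map<std::string, const ToyType*>;

// Reverse lookup: scan the table for an entry pointing at ty.
// Returns an empty string when the type has no name.
inline std::string findTypeName(const TypeNameTable& names,
                                const ToyType* ty) {
    for (const auto& entry : names)
        if (entry.second == ty)
            return entry.first;
    return std::string();
}

// For each subtype: unwrap the handle, prefer a registered name,
// and fall back to the structural description otherwise.
inline std::vector<std::string> labelSubtypes(const TypeNameTable& names,
                                              const ToyType& parent) {
    std::vector<std::string> labels;
    for (const ToyTypeHandle& handle : parent.subtypes) {
        const ToyType* sub = handle.get();
        std::string name = findTypeName(names, sub);
        labels.push_back(name.empty() ? sub->description : name);
    }
    return labels;
}
```

The reverse lookup is a linear scan, which is one reason the
description is often the more convenient thing to print.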

In general, for an analyzed C++ code, is it possible to obtain
inheritance information?

Yes, but you pretty much have to do it yourself. LLVM knows nothing
about inheritance per se. So however your front-end chooses to
represent inheritance could be detected, but it's entirely front-end
specific.
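
As an illustration of what "do it yourself" might look like, the sketch
below assumes a hypothetical front-end that lays out a derived class as
a struct whose first field is the base class struct. That convention is
an assumption made for this example, not something LLVM guarantees, and
ToyStruct, guessBase and guessBaseChain are invented names.

```cpp
#include <string>
#include <vector>

// Toy struct type. Each field is either another struct (non-null
// pointer) or some scalar we do not model (null pointer).
struct ToyStruct {
    std::string name;
    std::vector<const ToyStruct*> fields;
};

// Heuristic under the assumed convention: if the first field is a
// struct, treat it as the base class. Returns nullptr otherwise.
inline const ToyStruct* guessBase(const ToyStruct& s) {
    if (s.fields.empty())
        return nullptr;
    return s.fields.front();
}

// Follow first-field links upward and collect the guessed base names,
// nearest base first.
inline std::vector<std::string> guessBaseChain(const ToyStruct& s) {
    std::vector<std::string> chain;
    for (const ToyStruct* b = guessBase(s); b != nullptr; b = guessBase(*b))
        chain.push_back(b->name);
    return chain;
}
```

A heuristic like this cannot tell a base class from an ordinary first
member that happens to be a struct, which is exactly why the answer
depends on the front-end.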

TIA

You're welcome.

Please note that sometime in the near future (probably 1.8), the whole
Type Plane construct of the symbol table will go away. That is, there
will be two symbol tables, one for values and one for types. This will
make using the symbol table much easier (and much faster).
Unfortunately, such a change has significant design impacts, and I
haven't been able to devote enough time recently to resolving all the
issues. You can read up more on this in PR411 ("[SymbolTable]
Reconsider one symbol table for each type") if you're interested.
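
To picture the difference between per-type planes and a single value
table, here is a toy comparison. It is only a sketch: the aliases and
lookup functions are hypothetical, types are represented by plain
strings, and values by ints.

```cpp
#include <map>
#include <string>

// Layout A (toy): one "plane" of name -> value per type.
using Plane = std::map<std::string, int>;
using PlanedTable = std::map<std::string, Plane>;  // key: type description

// Finding a value needs both the type and the name.
inline const int* lookupPlaned(const PlanedTable& table,
                               const std::string& type,
                               const std::string& name) {
    auto plane = table.find(type);
    if (plane == table.end())
        return nullptr;
    auto value = plane->second.find(name);
    return value == plane->second.end() ? nullptr : &value->second;
}

// Layout B (toy): a single name -> value table, with type names kept
// in a separate table that is not shown here.
using FlatTable = std::map<std::string, int>;

// Finding a value needs only the name.
inline const int* lookupFlat(const FlatTable& table,
                             const std::string& name) {
    auto value = table.find(name);
    return value == table.end() ? nullptr : &value->second;
}
```

In the planed layout a caller that knows only the name has to search
every plane, whereas the flat layout answers with one lookup.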

Reid.