enumerating the SBTypes from an SBModule

Hi,

Is there a way to enumerate all the SBTypes that are defined inside an SBModule? I see that lldb_private::Module has a method TypeList* GetTypeList(); but it is not exposed by the official API. Is there a workaround or did I miss something? (I have tried using SBModule::FindTypes with "*", "" or NULL but it doesn't work that way apparently as it strictly searches for an explicit type name).

Cheers,

S.

I have investigated a bit and added a hack to SBModule::FindTypes(..) to get all the types when the requested type name is null. (I have also changed lldb_private::TypeList so that all its contents can be fetched into a std::vector in one go instead of calling GetTypeAtIndex, which is way too slow when the list grows big.)

It works in a strange way: the first time I call it I only get about 10 types, but each time I request the list the number of types gets bigger (much bigger: after about 10 calls I reach a maximum of ~64K types). My guess is that it's due to lazy evaluation/parsing. I tried to understand the type parsing code but it's fairly complex and I don't really want to break anything with my little experiments.

I have attached a patch showing the changes I did.

I would welcome any help, thanks in advance,

S.

GetAllTypes.diff (1.56 KB)

Types are parsed lazily in modules so you will only get what types have been parsed up to the time you make the function call. My issues with trying to enumerate types in a module are:
1 - it isn't always easy to ask a large DWARF object "get me the number of unique types you contain". Often DWARF has duplicate type info that we realize can be coalesced when we parse a type.
2 - it is hard to force each symbol file parser to support "get type at index"

The best we can probably do is to add a

SBTypeList
SBModule::GetAllTypes()

Then lldb_private::SymbolVendor and lldb_private::SymbolFile would need to have new virtual functions added to them:

virtual size_t
lldb_private::SymbolVendor::GetAllTypes(lldb_private::TypeList &type_list);

virtual size_t
lldb_private::SymbolFile::GetAllTypes(lldb_private::TypeList &type_list);

Then each symbol file parser would be responsible for parsing all types and returning a uniqued type list.
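
As a rough illustration of that layering, here is a minimal self-contained sketch. It does not use the real LLDB classes: TypeList, SymbolFile and SymbolVendor below are simplified stand-ins, DuplicatingSymbolFile is a hypothetical parser, and uniquing by name alone is an assumption made for brevity (real DWARF uniquing is more involved).

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Stand-in for lldb_private::TypeList; stores type names only.
struct TypeList {
    std::vector<std::string> names;
};

// Stand-in for lldb_private::SymbolFile.
class SymbolFile {
public:
    virtual ~SymbolFile() = default;
    // Default: a parser that cannot enumerate types appends nothing.
    virtual std::size_t GetAllTypes(TypeList & /*type_list*/) { return 0; }
};

// Hypothetical parser whose raw debug info repeats some types.
class DuplicatingSymbolFile : public SymbolFile {
public:
    explicit DuplicatingSymbolFile(std::vector<std::string> raw)
        : m_raw(std::move(raw)) {}

    // Appends each distinct name once and returns how many were added.
    std::size_t GetAllTypes(TypeList &type_list) override {
        std::set<std::string> seen;
        std::size_t added = 0;
        for (const std::string &name : m_raw) {
            if (seen.insert(name).second) { // first time this name is seen
                type_list.names.push_back(name);
                ++added;
            }
        }
        return added;
    }

private:
    std::vector<std::string> m_raw;
};

// Stand-in for lldb_private::SymbolVendor: forwards to its symbol file.
class SymbolVendor {
public:
    explicit SymbolVendor(std::unique_ptr<SymbolFile> file)
        : m_file(std::move(file)) {}
    virtual ~SymbolVendor() = default;

    virtual std::size_t GetAllTypes(TypeList &type_list) {
        return m_file ? m_file->GetAllTypes(type_list) : 0;
    }

private:
    std::unique_ptr<SymbolFile> m_file;
};
```

The point of the sketch is only the division of labour: the vendor delegates, and each parser decides how to parse everything and how to coalesce duplicates before appending to the caller's list.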

As for the performance issue you solved with your std::vector, I would change this patch to do the following:

1 - modify lldb_private::TypeListImpl to have a new Append function that takes a "const lldb_private::TypeList &type_list". You then might need to add a function to lldb_private::TypeList like:

void
lldb_private::TypeList::ForEach (std::function <bool(lldb::TypeSP &type_sp)> const &callback);

We use this trick in the "BreakpointSiteList::ForEach" to provide an efficient way to iterate over a collection. An example of using a lambda function to iterate over the breakpoint site list can be seen in Process::DisableAllBreakpointSites(). The bool return value for the callback function indicates whether to continue iterating over the list.
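
To make the callback idea concrete, here is a minimal self-contained sketch of such an iteration function. Type, TypeSP and TypeList are simplified stand-ins rather than the real lldb_private classes, and CollectNames is a hypothetical caller added only to show the early-exit behaviour.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for lldb_private::Type; only a name is modelled here.
struct Type {
    std::string name;
};
using TypeSP = std::shared_ptr<Type>;

// Stand-in for lldb_private::TypeList with a ForEach-style method.
class TypeList {
public:
    void Append(TypeSP type_sp) { m_types.push_back(std::move(type_sp)); }

    // Calls callback for each type in order and stops as soon as the
    // callback returns false.
    void ForEach(std::function<bool(TypeSP &type_sp)> const &callback) {
        for (TypeSP &type_sp : m_types) {
            if (!callback(type_sp))
                break;
        }
    }

private:
    std::vector<TypeSP> m_types;
};

// Hypothetical caller: collect at most max_count type names in one pass.
std::vector<std::string> CollectNames(TypeList &list, std::size_t max_count) {
    std::vector<std::string> names;
    if (max_count == 0)
        return names;
    list.ForEach([&](TypeSP &type_sp) -> bool {
        names.push_back(type_sp->name);
        return names.size() < max_count; // false ends the iteration early
    });
    return names;
}
```

Compared with repeated GetTypeAtIndex calls, the collection is walked exactly once and the caller never needs to know how the elements are stored.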

The trickiest part of this will be the SymbolFileDWARF::GetAllTypes(). If you need help with this let me know.

Greg Clayton

> Types are parsed lazily in modules so you will only get what types have been parsed up to the time you make the function call. My issues with trying to enumerate types in a module are:
> 1 - it isn't always easy to ask a large DWARF object "get me the number of unique types you contain". Often DWARF has duplicate type info that we realize can be coalesced when we parse a type.
> 2 - it is hard to force each symbol file parser to support "get type at index"

Looking at the results I got made me realise that (there were some obscure duplicates).

> The best we can probably do is to add a
>
> SBTypeList
> SBModule::GetAllTypes()
>
> Then lldb_private::SymbolVendor and lldb_private::SymbolFile would need to have new virtual functions added to them:
>
> virtual size_t
> lldb_private::SymbolVendor::GetAllTypes(lldb_private::TypeList &type_list);
>
> virtual size_t
> lldb_private::SymbolFile::GetAllTypes(lldb_private::TypeList &type_list);

Ok. Adding the new methods was the easy part :-).

> Then each symbol file parser would be responsible for parsing all types and returning a uniqued type list.
>
> As for the performance issue you solved with your std::vector, I would change this patch to do the following:
>
> 1 - modify lldb_private::TypeListImpl to have a new Append function that takes a "const lldb_private::TypeList &type_list". You then might need to add a function to lldb_private::TypeList like:
>
> void
> lldb_private::TypeList::ForEach (std::function <bool(lldb::TypeSP &type_sp)> const &callback);
>
> We use this trick in the "BreakpointSiteList::ForEach" to provide an efficient way to iterate over a collection. An example of using a lambda function to iterate over the breakpoint site list can be seen in Process::DisableAllBreakpointSites(). The bool return value for the callback function indicates whether to continue iterating over the list.

I did that, but don't I need to change SBTypeList to add some way to construct it from an existing TypeListImpl (or to access its TypeListImpl member)? Or at least make SBModule a friend of SBTypeList?

> The trickiest part of this will be the SymbolFileDWARF::GetAllTypes(). If you need help with this let me know.

Ok, indeed that's the scary part for a newbie. What is the best place to look in order to get some ideas of how to implement that?

Many thanks for your very helpful answers again!

S.

Send a patch with this function hollowed out and I will fill it in for you.

Awesome, here is my tentative patch:

GetAllTypes.diff (7.46 KB)

I added the ability to get a list of types from an SBModule or SBCompileUnit.

% svn commit
Sending include/lldb/API/SBCompileUnit.h
Sending include/lldb/API/SBModule.h
Sending include/lldb/API/SBType.h
Sending include/lldb/Core/MappedHash.h
Sending include/lldb/Core/UniqueCStringMap.h
Sending include/lldb/Symbol/SymbolFile.h
Sending include/lldb/Symbol/SymbolVendor.h
Sending include/lldb/Symbol/Type.h
Sending include/lldb/Symbol/TypeList.h
Sending include/lldb/lldb-enumerations.h
Sending scripts/Python/interface/SBCompileUnit.i
Sending scripts/Python/interface/SBModule.i
Sending source/API/SBCompileUnit.cpp
Sending source/API/SBModule.cpp
Sending source/Plugins/SymbolFile/DWARF/HashedNameToDIE.h
Sending source/Plugins/SymbolFile/DWARF/NameToDIE.cpp
Sending source/Plugins/SymbolFile/DWARF/NameToDIE.h
Sending source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
Sending source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
Sending source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp
Sending source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.h
Sending source/Plugins/SymbolFile/Symtab/SymbolFileSymtab.cpp
Sending source/Plugins/SymbolFile/Symtab/SymbolFileSymtab.h
Sending source/Symbol/SymbolVendor.cpp
Sending source/Symbol/Type.cpp
Sending source/Symbol/TypeList.cpp
Transmitting file data ..........................
Committed revision 184251.

The new functions are:

//------------------------------------------------------------------
/// Get all types matching \a type_mask from debug info in this
/// module.
///
/// @param[in] type_mask
/// A bitfield that consists of one or more bits logically OR'ed
/// together from the lldb::TypeClass enumeration. This allows
/// you to request only structure types, or only class, struct
/// and union types. Passing in lldb::eTypeClassAny will return
/// all types found in the debug information for this module.
///
/// @return
/// A list of types in this module that match \a type_mask
//------------------------------------------------------------------
lldb::SBTypeList
SBModule::GetTypes (uint32_t type_mask = lldb::eTypeClassAny);

//------------------------------------------------------------------
/// Get all types matching \a type_mask from debug info in this
/// compile unit.
///
/// @param[in] type_mask
/// A bitfield that consists of one or more bits logically OR'ed
/// together from the lldb::TypeClass enumeration. This allows
/// you to request only structure types, or only class, struct
/// and union types. Passing in lldb::eTypeClassAny will return
/// all types found in the debug information for this compile
/// unit.
///
/// @return
/// A list of types in this compile unit that match \a type_mask
//------------------------------------------------------------------
lldb::SBTypeList
SBCompileUnit::GetTypes (uint32_t type_mask = lldb::eTypeClassAny);

This lets you request types by filling out a mask that contains one or more bits from the lldb::TypeClass enumeration, so you get only the types you really want.
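
The mask test itself is a plain bitwise AND. The sketch below shows the idea in isolation; the TypeClass values, the TypeInfo struct and FilterTypes are all hypothetical stand-ins, since the real bit values live in lldb-enumerations.h and are not reproduced here.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical type-class bits, not the actual lldb::TypeClass values.
enum TypeClass : uint32_t {
    kClass  = 1u << 0,
    kStruct = 1u << 1,
    kUnion  = 1u << 2,
    kEnum   = 1u << 3,
    kAny    = 0xFFFFFFFFu,
};

// Hypothetical record describing one parsed type.
struct TypeInfo {
    std::string name;
    uint32_t type_class; // exactly one TypeClass bit
};

// Keep only the types whose class bit is set in type_mask.
std::vector<TypeInfo> FilterTypes(const std::vector<TypeInfo> &all,
                                  uint32_t type_mask) {
    std::vector<TypeInfo> out;
    for (const TypeInfo &t : all) {
        if ((t.type_class & type_mask) != 0)
            out.push_back(t);
    }
    return out;
}
```

With these stand-ins, a mask of kStruct | kUnion keeps structs and unions only, while kAny keeps everything, which mirrors how lldb::eTypeClassAny is described above.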

Now you can do things like:

(lldb) script
module = lldb.target.module['a.out']
types = module.GetTypes()
for t in types:
  print(t)

Or you can iterate through the compile units and get the types for each individual compile unit:

(lldb) script
module = lldb.target.module['a.out']
for i in range(module.GetNumCompileUnits()):
  cu = module.GetCompileUnitAtIndex(i)
  types = cu.GetTypes()
  for t in types:
    print(t)

Excellent! This is exactly what I needed and the addition of the class mask fits my needs perfectly!

Thanks so much Greg!

S.