Finding Things In SymbolTable

The SymbolTable is a map of Type* to a map of names to Value*. This
means that in order to lookup something by name you first have to know
what type it is. For the basic types this is fine because you generally
always know which basic type you're after. But, for derived types, this
can be quite complicated. It isn't always possible to know the exact
type.

In XPL, types are just names. That is there's no such thing as a type
expression. The name of the type _is_ the expression. So, in XPL, we
declare something like:

<Signature name="My_Signature" result="void"></Signature>

(that's a signature type, or FunctionType in LLVM speak, that has no
arguments and returns nothing).

Now, we define a function like this:

<Function name="My_Function" signature="My_Signature">
...
</Function>

The problem is that when I hit the Function definition, all I have is
"My_Signature" to identify the function's type. LLVM currently doesn't
let me find this signature because I'd have to know its type first. The
whole point of putting it in the symbol table is so that I can look up
its type later on with only its name.

Here's some things I'm thinking about changing in LLVM to make it easier
to support this kind of query:

(1) Make it a map of Type::PrimitiveID instead of a map of Type*. The
same interface could be retained by simply obtaining the primitive id of
of any Type* passed into the SymbolTable's interface. For the basic
types, this probably is a wash because nearly every sane complier will
use the static constants defined in Type so there won't be any bloat in
the symbol table. That is, every "int" is an instance of Type::IntTy.
For derived types, lookups could take a little longer because the first
plane (of PrimitiveID) is now less specialized than the existing plane
(of Type*).

(2) Insert another plane, making the symbol table a map of PrimitiveId
to a map of Type* to a map of names to Value*. That is, we categorize
the various types by their PrimitiveId as well as just their Type*
value. This might actually speed things up a bit for derived types. It
would definitely speed things up for (3) ...

(3) Adding various methods to the interface of SymbolTable to look up
names by their PrimitiveID. This can be done regardless of implementing
(1) or (2) above. However, if neither (1) or (2) is done, the algorithm
for these methods is O(N) .. a linear search for the matching item.

Any thoughts on this? Am I missing something?

Reid.

<sorry for the delay, things have been crazy here moving between

The SymbolTable is a map of Type* to a map of names to Value*. This
means that in order to lookup something by name you first have to know
what type it is. For the basic types this is fine because you generally
always know which basic type you're after. But, for derived types, this
can be quite complicated. It isn't always possible to know the exact
type.

Ok, the first thing to realize is that the current symbol table really
can't change in any meaningful way. The SymbolTable class represents the
_LLVM_ symbol table, which is used to figure out how to print stuff out
and how to resolve references, in an entirely LLVM specific way. As such,
any changes to the SymbolTable class should be extensions on what we have:
eliminating the current SymbolTable class would break too many ingrained
LLVM semantics.

The problem is that when I hit the Function definition, all I have is
"My_Signature" to identify the function's type. LLVM currently doesn't
let me find this signature because I'd have to know its type first. The
whole point of putting it in the symbol table is so that I can look up
its type later on with only its name.

So there are two possible solutions to this problem. When you are
actually generating LLVM code it probably makes sense to keep an extra
XPL-specific symbol table on the side, which you use to keep track of the
information that you need.

On the other hand, it might be reasonable to add a simple multimap from
names to LLVM values to the LLVM SymbolTable. This would be occasionally
useful for things like looking up function names, but would add a serious
amount of bloat to the SymbolTable for something that is infrequently
useful.

Any thoughts on this? Am I missing something?

It sounds feasible, but I'm not sure it is something that should be added
to the LLVM SymbolTable class. This class really isn't meant to be the
"one true solution" for all possible name lookup requests, it's really
meant to support to LLVM core. Transformations, backends, and front-ends
that need something different should probably build support on top of the
LLVM symbol table stuff to support their needs (e.g., the llvm::Mangler
interface that Brian wrote for the backends).

Is this reasonable?

-Chris