StructType field names

Hello,

I'm trying to construct a string like "a[1].y" in an optimization pass by digging deeper and deeper into a GetElementPtrInst. I can handle the array/pointer part, but when it comes to the structure field name "y", I cannot figure out how to get anything but the index into the structure. Is there a way to do that, or is this information discarded by LLVM?

thanks,
Anthony

Anthony Danalis wrote:

Hello,

I'm trying to construct a string like "a[1].y" in an optimization pass by digging deeper and deeper into a GetElementPtrInst. I can successfully deal with the array/pointer part, but when it comes to the structure field name "y", I cannot figure out how to get anything but the index into the structure. Is there a way to do that, or is this information discarded by llvm?

LLVM uses structural equivalence and uniquing for types, so field names are not part of the type at all. For example, if I have the following two types:

  struct A {
    int x, y;
  };

  struct B {
    int z, w;
  };

They'll both become something like:

  { i32, i32 }

As you noticed, all the GEP encodes is the field index into the type, so all you know is that the access is to the first or second field of the type. A name would not make sense there, because you could not tell whether the first field was called "x" or "z" anyway. I could imagine some front-end work that embeds the information you would need to reconstruct this, but I imagine it would take some doing.

Hope this helps,
Luke
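
Luke's point can be illustrated in plain C, independent of LLVM. The sketch below is only an illustration under stated assumptions: the helper name second_field is hypothetical, and it assumes a C11 compiler (for static_assert). It shows that once the names are gone, an access to the second field of struct A is indistinguishable from one to the second field of struct B.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct A { int x, y; };
struct B { int z, w; };

/* Both second fields sit at the same byte offset, so a position alone
   cannot tell "y" from "w". Checked at compile time (C11). */
static_assert(offsetof(struct A, y) == offsetof(struct B, w),
              "second fields share one offset");

/* Hypothetical helper: reads the second int of any two-int struct by
   position only, which is all that remains once field names are dropped. */
int second_field(const void *two_ints) {
    int value;
    memcpy(&value, (const char *)two_ints + offsetof(struct A, y), sizeof value);
    return value;
}
```

Calling second_field on a struct A initialized as {1, 2} returns 2, and the same call on a struct B initialized as {3, 4} returns 4; nothing in the helper knows which field name was used in the source.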

Thanks Luke,

I was afraid that this would be the case. I can see why this information is useless for most people/optimizations.
However, it is still useful if you are writing an analysis pass that is supposed to tell the developer things about her code, and you want the output messages to be human readable.

Regarding the "x" or "z" dilemma, I noticed that when you print the GEP instruction, the generated string says "struct A" (or whatever the struct name really is), so all I would need beyond that is a list of structs and the names of their fields. I do understand, though, that this is a very low-priority feature for anybody to implement, and since I do not want to get involved with the front end myself, I will probably write an external Perl script that, given a struct name and an offset, returns a field name.

My question for the list is this:
Are there any optimizations that could change a struct type? That is, could LLVM deduce that the first field of a given struct is never used or defined, and change the struct type (and all the offsets wherever they appear) so that this field no longer exists? In other words, is it safe to do an external, trivial mapping from struct name + offset to field name, or are there optimizations that would break it?

thanks,
Anthony
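
The external mapping described above could look roughly like the following. This is a minimal sketch in C rather than Perl, and everything in it is an assumption made for illustration: the table contents, the names field_table and field_name, and the use of a field index (what the GEP provides) as the lookup key. It is only meaningful as long as no pass has rewritten the struct type, which is exactly the open question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 4

/* Hypothetical table: one row per struct, fields in declaration order. */
struct struct_fields {
    const char *struct_name;
    const char *fields[MAX_FIELDS];  /* unused slots are NULL */
};

static const struct struct_fields field_table[] = {
    { "A", { "x", "y" } },
    { "B", { "z", "w" } },
};

/* Returns the name of field `index` in `struct_name`, or NULL if the
   struct or the index is not in the table. */
const char *field_name(const char *struct_name, size_t index) {
    assert(struct_name != NULL);
    size_t rows = sizeof field_table / sizeof field_table[0];
    for (size_t r = 0; r < rows; ++r) {
        if (strcmp(field_table[r].struct_name, struct_name) != 0)
            continue;
        if (index >= MAX_FIELDS)
            return NULL;
        return field_table[r].fields[index];  /* NULL for unused slots */
    }
    return NULL;
}
```

With this table, field_name("A", 1) yields "y" and field_name("B", 0) yields "z"; an unknown struct or an out-of-range index yields NULL, so a caller can fall back to printing the raw index.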

If you want to give programmers feedback at the source level, Clang is almost certainly a better choice.

–Vikram
Associate Professor, Computer Science
University of Illinois at Urbana-Champaign
http://llvm.org/~vadve

Debugging information (i.e., the llvm.dbg.* intrinsics) is the way that
LLVM IR is correlated with the user's source code. If debugging
information for a variable is available, it will provide all the field
names, without being subject to type uniquing or name mangling.

However, LLVM's support for debug information in optimized code
is still a work in progress and under active development.

My question for the list is this:
Are there any optimizations that could change a struct type? That is,
could llvm deduce that the first field of a given struct is never used/
defined and change the struct type (and all the offsets wherever they
appear) so that this field does not exist anymore?

Basically, any transformation that preserves program behavior is
valid, so you cannot rely on a struct type staying unchanged.
However, debugging information must not be left inconsistent:
a transformation must either update it to describe how the
transformed code corresponds to the user's code, or delete it.

Dan