PROPOSAL: Introduce NamedMetadata

In LLVM IR, metadata is used to attach auxiliary information to
various IR constructs. Currently, metadata is represented using
MDNode and MDString. Metadata can refer to LLVM values, but these
references are not counted as regular "uses" of those values because
metadata is maintained 'on the side'. This ensures that the optimizer
is not influenced by auxiliary information. For example,

!1 = { i32 42 }

define void @foo() {
  %x = call i8 @llvm.something(metadata !1)
}

See http://nondot.org/~sabre/LLVMNotes/EmbeddedMetadata.txt for more
information.
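
To make "not counted as regular uses" concrete, here is a rough toy
model in Python. It is only a sketch: the class and function names
(Value, Instruction, MDNode, is_trivially_dead) are hypothetical
stand-ins and are not LLVM's C++ API. It assumes that an ordinary
operand registers itself on the value's use list while a metadata
reference does not.

```python
# Toy model only: hypothetical classes, not LLVM's C++ API.

class Value:
    def __init__(self, name):
        self.name = name
        self.uses = []              # regular users only


class Instruction(Value):
    def __init__(self, name, operands):
        super().__init__(name)
        self.operands = list(operands)
        for op in self.operands:
            op.uses.append(self)    # a regular operand registers a use


class MDNode:
    def __init__(self, elements):
        # Metadata is kept 'on the side': nothing is added to the
        # use list of the values it mentions.
        self.elements = list(elements)


def is_trivially_dead(value):
    # A stand-in for any optimizer query that looks only at use lists.
    return not value.uses


x = Value("x")
y = Value("y")
add = Instruction("add", [x, x])    # two regular uses of x
md = MDNode([y, 42])                # mentions y, but is not a use
```

In this model the query on y gives the same answer whether or not md
exists, which is the sense in which the optimizer is not influenced.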

This metadata support is not useful if the auxiliary information is
not referenced by any LLVM IR entity. This is a limiting factor. The
proposed solution is to introduce NamedMetadata. NamedMetadata is
derived from GlobalValue. A NamedMetadata is an array of metadata
nodes and metadata strings.

  @class.42 = ![ !5, !11, !2 ] ; NamedMetadata

  !5 = metadata !"int"
  !11 = metadata !"class.21 pointer"
  !2 = metadata !{ i32 8 } ; size

The NamedMetadata element list contains MDStrings and MDNodes only.
NamedMetadata is always implicitly typed as metadata. The LLVM bitcode
reader and writer will keep track of NamedMetadata in a Module. A
module can use metadata that is not listed in any NamedMetadata value.
A module can have multiple NamedMetadata values. NamedMetadata
implicitly uses AppendingLinkage. A NamedMetadata value has zero
uses.
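
The rules above can be sketched as a toy model in Python. This is an
illustration under assumptions, not the proposed implementation: the
names Module.get_or_insert and link are hypothetical, and "appending"
is modelled simply as concatenating same-named lists when two modules
are merged.

```python
# Toy model only: hypothetical names, not LLVM's C++ API.

class MDString:
    def __init__(self, text):
        self.text = text


class MDNode:
    def __init__(self, elements):
        self.elements = list(elements)


class NamedMetadata:
    def __init__(self, name):
        self.name = name
        self.elements = []

    def append(self, element):
        # The element list holds MDString and MDNode only.
        if not isinstance(element, (MDString, MDNode)):
            raise TypeError("elements must be MDString or MDNode")
        self.elements.append(element)


class Module:
    def __init__(self):
        self.named_metadata = {}    # name -> NamedMetadata

    def get_or_insert(self, name):
        if name not in self.named_metadata:
            self.named_metadata[name] = NamedMetadata(name)
        return self.named_metadata[name]


def link(dest, src):
    # Appending-style merge: same-named lists are concatenated.
    for name, nmd in src.named_metadata.items():
        target = dest.get_or_insert(name)
        for element in nmd.elements:
            target.append(element)


a, b = Module(), Module()
a.get_or_insert("class.42").append(MDString("int"))
b.get_or_insert("class.42").append(MDNode([8]))
b.get_or_insert("class.7").append(MDString("char"))
link(a, b)
```

After the merge, module a holds two named lists, and "class.42"
contains the elements contributed by both modules in order.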

NamedMetadata can be used to describe front-end-specific types for
the optimizer's use. Another potential use is to encode debug
information for a global variable. I do not intend to make @llvm.used
a NamedMetadata value.

> This metadata support is not useful if the auxiliary information is
> not referenced by any LLVM IR entity. This is a limiting factor. The
> proposed solution is to introduce NamedMetadata. The NamedMetadata is
> derived from GlobalValue.

So, the idea is that some Analysis can take an LLVM IR entity, mangle
information about that entity into a string, and then use that string
to look up the associated NamedMetadata?

An alternative to this would be to introduce a form of MDNode that
has a designated Value that it's associated with. Then, for any
given Value, one could look up all the Metadatas(?) associated
with it.
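
A rough sketch of this alternative, as a toy model in Python. All
names here (AttachedMDNode, MetadataIndex, attach, lookup) are
hypothetical and only illustrate the idea of a node that records the
Value it describes, plus a side table for the reverse lookup.

```python
# Toy model only: hypothetical names, not LLVM's C++ API.

class Value:
    def __init__(self, name):
        self.name = name


class AttachedMDNode:
    def __init__(self, subject, elements):
        self.subject = subject          # the designated Value
        self.elements = list(elements)


class MetadataIndex:
    def __init__(self):
        # Keyed by object identity (the default hash for plain classes).
        self._by_value = {}

    def attach(self, subject, elements):
        node = AttachedMDNode(subject, elements)
        self._by_value.setdefault(subject, []).append(node)
        return node

    def lookup(self, value):
        # Returns a copy so callers cannot mutate the index.
        return list(self._by_value.get(value, []))


index = MetadataIndex()
gv = Value("a")
other = Value("b")
index.attach(gv, ["debug info for a"])
index.attach(gv, ["front-end type for a"])
```

Here no string mangling is needed: the lookup goes straight from the
Value to its nodes, and a Value with nothing attached yields an empty
list.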

> The NamedMetadata is an array of metadata
> nodes and metadata strings.
>
>   @class.42 = ![ !5, !11, !2 ] ; NamedMetadata
>
>   !5 = metadata !"int"
>   !11 = metadata !"class.21 pointer"
>   !2 = metadata !{ i32 8 } ; size
>
> The NamedMetadata element list contains MDString and MDNodes only. The
> NamedMetadata is always implicitly typed as metadata. LLVM bitcode
> reader and writer will keep track of NamedMetadata in a Module. A
> module can use metadata that is not listed in any NamedMetadata value.
> A module can have multiple NamedMetadata values. NamedMetadata
> implicitly uses AppendingLinkage.

And DefaultVisibility and a default Section and an Alignment of 1?
GlobalValues carry some amount of baggage here.

Dan

No. I think I used a confusing example.

The idea is to provide a way to have metadata that is not directly
used by any LLVM IR entity. Here the metadata entity can itself use
any other LLVM IR entity. For example:

; ModuleID = 'blah.c'

!1 = { i32 459008, metadata !"blah.c", "b", i32 4, i32 6 }
!2 = { i32 458804, metadata !"blah.c", "a", i32 4, i32 6, i32* @a }
@a = global i32 42

define void @foo() {
  call void @llvm.dbg.declare(!1)
}

I talked with Devang about this and we came up with this syntax:

!llvm.dbg.gvs = { !2 }

which is nicely analogous to the syntax for named vs. unnamed values,
and fits with having NamedMetadata inherit directly from Value, rather
than from GlobalValue, since the only thing it needs is a name, not all
the other stuff from User + Constant + GlobalValue.

Dan

Why not have a named GlobalValue with an MDNode initializer? How is this different from what we had before?

Nick

GlobalValue initializer accepts only Constants.

Devang Patel wrote:

>> Why not have a named GlobalValue with an MDNode initializer? How is this
>> different from what we had before?
>
> GlobalValue initializer accepts only Constants.

So in my initial implementation, before I submitted it for review, I tried making them not be Constants, only to discover that if I did that then I couldn't use them as GV initializers.

And it's not preposterous for them to be Constants. Functions are an example of a nameable zero-operand non-uniqued constant.

Is there any particular benefit to keeping them out of Constant and adding NamedMetadata?

Nick

There are many reasons why MDNodes are not Constants. Metadata can be
recursive. Metadata can refer to instructions, basic blocks or any
other non-Constant Values. Metadata is not stored in the Module
directly. However, someone may want to maintain a collection of
selected MDNodes in a module. And that's where NamedMetadata is used.
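
As a loose illustration of the first two points, here is a toy model
in Python. The classes are hypothetical stand-ins, not LLVM's C++
API. It builds two nodes that refer to each other, one of which also
refers to an instruction, and walks them with a visited set so the
cycle does not cause an endless loop.

```python
# Toy model only: hypothetical classes, not LLVM's C++ API.

class Instruction:
    def __init__(self, name):
        self.name = name


class MDNode:
    def __init__(self, elements=()):
        self.elements = list(elements)


def reachable_nodes(root):
    """Collect every MDNode reachable from root, visiting each once."""
    seen_ids = set()
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen_ids:
            continue                    # already visited: break the cycle
        seen_ids.add(id(node))
        found.append(node)
        stack.extend(e for e in node.elements if isinstance(e, MDNode))
    return found


inst = Instruction("%x")
parent = MDNode()
child = MDNode([parent, inst])          # refers to a non-Constant value
parent.elements.append(child)           # closes the cycle parent <-> child
```

Any code that walks metadata in this model needs some form of
visited tracking, since following elements naively would never
terminate on the cycle.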

I like Dan's suggestion to not inherit NamedMetadata from Constant.

This is a great idea.

-Chris