Metadata in LLVM back-end

DoktorC · July 29, 2020, 7:33am

Hi everyone,

I'll try to answer each of these questions; the answers likely won't be
exhaustive, but I hope they will serve as a starting point for an interesting
proposal (from my point of view and that of Son Tuan VU and David Greene):

- "What does it mean?": it means to preserve specific information, represented as
metadata assigned to instructions, from the IR level, down to the codegen phases.

- "How does it work?": metadata should be preserved during the several
back-end transformations; for instance, during the lowering phase, DAGCombine
performs several optimizations on the IR, potentially combining several
instructions. The new instruction should then be assigned metadata obtained
as a proper combination of the original ones (e.g., a union of metadata
information).

It might be possible to have a dedicated data-structure for such metadata info,
and an instance of such structure assigned to each instruction.

- "What is it useful for?": I think it is quite context-specific; but,
in general, it is useful when some "higher-level"
information (e.g., information that can be discovered only before the back-end
stage of the compiler) is required in the back-end to perform "semantic"-related
optimizations.

To give a (quite generic) example where such codegen metadata may be useful: in the field
of "secure compilation", preservation of security properties during the compilation
phases is essential; such properties are specified in the high-level specifications of
the program, and may be expressed with IR metadata. The possibility to keep such IR
metadata in the codegen phases may allow preservation of properties that may be invalidated
by codegen phases.

Cheers,
-- Lorenzo

David_Greene · July 31, 2020, 8:47pm

Thanks for keeping this going, Lorenzo.

Lorenzo Casalino via llvm-dev <llvm-dev@lists.llvm.org> writes:

The first questions need to be “what does it mean?”, “how does it
work?”, and “what is it useful for?”. It is hard to evaluate a
proposal without that.

Hi everyone,

- "What does it mean?": it means to preserve specific information,
represented as metadata assigned to instructions, from the IR level,
down to the codegen phases.

An important part of the definition is "how late?" For my particular
uses it would be right up until lowering of asm pseudo-instructions,
even after regalloc and scheduling. I don't know whether someone might
need metadata even later than that (at asm/obj emission time?) but if
metadata is supported on Machine IR then it shouldn't be an issue.

As with IR-level metadata, there should be no guarantee that metadata is
preserved and that it's a best-effort thing. In other words, relying on
metadata for correctness is probably not the thing to do.

- "How does it work?": metadata should be preserved during the several
back-end transformations; for instance, during the lowering phase,
DAGCombine performs several optimizations on the IR, potentially
combining several instructions. The new instruction should then be
assigned metadata obtained as a proper combination of the
original ones (e.g., a union of metadata information).

I want to make it clear that this is expensive to do, in that the number
of changes to the codegen pipeline is quite extensive and widespread. I
know because I've done it*. It will help if there are utilities
people can use to merge metadata during DAG transformation and the more
we make such transfers and combinations "automatic" the easier it will
be to preserve metadata.

Once the mechanisms are there it also takes effort to keep them going.
For example if a new DAG transformation is done people need to think
about metadata. This is where "automatic" help makes a real difference.

* By "it" I mean communicate information down to late phases of codegen.
I don't have a "metadata in codegen" patch as such. I simply cobbled
something together in our downstream fork that works for some very
specific use-cases.
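
To make the idea of an "automatic" merge utility a bit more concrete, here is a minimal sketch, written in Python rather than LLVM's C++ so that it stays self-contained. `Node`, `merge_metadata` and `combine` are hypothetical names, not LLVM APIs, and taking the union is only one possible merge policy (the one suggested in the quoted text).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a DAG node; not an LLVM class."""
    opcode: str
    operands: tuple = ()
    metadata: frozenset = field(default_factory=frozenset)


def merge_metadata(*nodes):
    """Hypothetical merge policy: union of the metadata of all inputs."""
    merged = frozenset()
    for node in nodes:
        merged |= node.metadata
    return merged


def combine(opcode, replaced, operands=()):
    """Build the node that replaces `replaced`, merging metadata on the way.

    If every combine goes through a helper like this one, a new
    transformation does not have to copy metadata by hand.
    """
    return Node(opcode, tuple(operands), merge_metadata(*replaced))


# Example: fold store(trunc(x)) into a single truncating store.
x = Node("load")
trunc = Node("trunc", (x,), frozenset({"secret"}))
store = Node("store", (trunc,), frozenset({"no-spill"}))
truncstore = combine("truncstore", replaced=(trunc, store), operands=(x,))
```

In this sketch the combined node ends up carrying both the (made-up) "secret" and "no-spill" tags, without the transformation itself mentioning metadata at all.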

It might be possible to have a dedicated data-structure for such
metadata info, and an instance of such structure assigned to each
instruction.

I'm not entirely sure what you mean by this.

- "What is it useful for?": I think it is quite context-specific; but,
in general, it is useful when some "higher-level" information
(e.g., that can be discovered only before the back-end stage of the
compiler) is required in the back-end to perform "semantic"-related
optimizations.

That's my use-case. There's semantic information codegen would like to
know but is really much more practical to discover at the LLVM IR level
or even passed from the frontend. Much information is lost by the time
codegen is hit and it's often impractical or impossible for codegen to
derive it from first principles.

To give a (quite generic) example where such codegen metadata may be
useful: in the field of "secure compilation", preservation of security
properties during the compilation phases is essential; such properties
are specified in the high-level specifications of the program, and may
be expressed with IR metadata. The possibility to keep such IR
metadata in the codegen phases may allow preservation of properties
that may be invalidated by codegen phases.

That's a great use-case. I do wonder about your use of "essential"
though. Is it needed for correctness? If so an intrinsics-based
solution may be better.

My use-cases mostly revolve around communication with a proprietary
frontend and thus aren't useful to the community, which is why I haven't
pursued this with any great vigor before this.

I do have uses that convey information from LLVM analyses but
unfortunately I can't share them for now.

All of my use-cases are related to optimization. No "metadata" is
needed for correctness.

I have pondered whether intrinsics might work for my use-cases. My fear
with intrinsics is that they will interfere with other codegen analyses
and transformations. For example they could be a scheduling barrier.

I also have wondered about how intrinsics work within SelectionDAG. Do
they impact dagcombine and other transformations? The reason I call out
SelectionDAG specifically is that most of our downstream changes related
to conveying information are in DAG-related files (dagcombine, legalize,
etc.). Perhaps intrinsics could suffice for the purposes of getting
metadata through SelectionDAG with conversion to "first-class" metadata
at the Machine IR level. Maybe this is even an intermediate step toward
"full metadata" throughout the compilation.

-David

clattner · August 2, 2020, 7:37pm

Thanks Lorenzo,

I was looking for a ‘one level deeper’ analysis of how this works.

The issue is this: either information is preserved across certain sorts of transformations or it is not. If not, it either goes stale (problematic for anything that looks at it later) or is invalidated/removed.

The fundamental issue in IR design is factoring the representation of information from the code that needs to inspect and update it. “Metadata” designs try to make it easy to add out of band information to the IR in various ways, with a goal of reducing the impact on the rest of the compiler.

However, I’ve never seen them work out well. Either the data becomes stale, or you end up changing a lot of the compiler to support it. Look at debug info metadata in LLVM for example, it has both problems :-). This is why MLIR has moved to make source location information and attributes a first class part of the IR.

-Chris

DoktorC · August 6, 2020, 2:47pm

@David

Thanks for keeping this going, Lorenzo.

Lorenzo Casalino via llvm-dev <llvm-dev@lists.llvm.org> writes:

The first questions need to be “what does it mean?”, “how does it
work?”, and “what is it useful for?”. It is hard to evaluate a
proposal without that.

Hi everyone,

- "What does it mean?": it means to preserve specific information,
represented as metadata assigned to instructions, from the IR level,
down to the codegen phases.

An important part of the definition is "how late?" For my particular
uses it would be right up until lowering of asm pseudo-instructions,
even after regalloc and scheduling. I don't know whether someone might
need metadata even later than that (at asm/obj emission time?) but if
metadata is supported on Machine IR then it shouldn't be an issue.

"How late" is context-specific: in my case too, I needed such information
to be preserved until pseudo-instruction expansion. Conservatively, it
could be preserved until the last pass of the codegen pipeline.

Regarding its use in the later steps, I would not say it is not required,
since I worked on a specific topic of secure compilation and do not have
the whole picture in mind; nonetheless, it would be possible to test how
things work out with the codegen and reason later about future
developments.

As with IR-level metadata, there should be no guarantee that metadata is
preserved and that it's a best-effort thing. In other words, relying on
metadata for correctness is probably not the thing to do.

Ok, I made a mistake stating that metadata should be *preserved*; what
I really meant is to preserve the *information* that such metadata
represent.

- "How does it work?": metadata should be preserved during the several
back-end transformations; for instance, during the lowering phase,
DAGCombine performs several optimizations on the IR, potentially
combining several instructions. The new instruction should then be
assigned metadata obtained as a proper combination of the
original ones (e.g., a union of metadata information).

I want to make it clear that this is expensive to do, in that the number
of changes to the codegen pipeline is quite extensive and widespread. I
know because I've done it*. It will help if there are utilities
people can use to merge metadata during DAG transformation and the more
we make such transfers and combinations "automatic" the easier it will
be to preserve metadata.

Once the mechanisms are there it also takes effort to keep them going.
For example if a new DAG transformation is done people need to think
about metadata. This is where "automatic" help makes a real difference.

* By "it" I mean communicate information down to late phases of codegen.
I don't have a "metadata in codegen" patch as such. I simply cobbled
something together in our downstream fork that works for some very
specific use-cases.

I know what you have been through, and I can only agree with you: for the
project I mentioned above, I had to perform several changes to the whole IR
lowering phase in order to correctly propagate high-level information;
it wasn't cheap and required a lot of effort.

It might be possible to have a dedicated data-structure for such
metadata info, and an instance of such structure assigned to each
instruction.

I'm not entirely sure what you mean by this.

I was imagining a per-instruction data-structure collecting metadata info
related to that specific instruction, instead of having several metadata info
directly embedded in each instruction.

- "What is it useful for?": I think it is quite context-specific; but,
in general, it is useful when some "higher-level" information
(e.g., that can be discovered only before the back-end stage of the
compiler) is required in the back-end to perform "semantic"-related
optimizations.

That's my use-case. There's semantic information codegen would like to
know but is really much more practical to discover at the LLVM IR level
or even passed from the frontend. Much information is lost by the time
codegen is hit and it's often impractical or impossible for codegen to
derive it from first principles.

To give a (quite generic) example where such codegen metadata may be
useful: in the field of "secure compilation", preservation of security
properties during the compilation phases is essential; such properties
are specified in the high-level specifications of the program, and may
be expressed with IR metadata. The possibility to keep such IR
metadata in the codegen phases may allow preservation of properties
that may be invalidated by codegen phases.

That's a great use-case. I do wonder about your use of "essential"
though.

By *essential* I mean fundamental to satisfying a specific target
security property.

Is it needed for correctness? If so an intrinsics-based
solution may be better.

Uhm...it might sound like a naive question, but what do you mean by
*correctness*?

My use-cases mostly revolve around communication with a proprietary
frontend and thus aren't useful to the community, which is why I haven't
pursued this with any great vigor before this.

I do have uses that convey information from LLVM analyses but
unfortunately I can't share them for now.

All of my use-cases are related to optimization. No "metadata" is
needed for correctness.

I have pondered whether intrinsics might work for my use-cases. My fear
with intrinsics is that they will interfere with other codegen analyses
and transformations. For example they could be a scheduling barrier.

I also have wondered about how intrinsics work within SelectionDAG. Do
they impact dagcombine and other transformations? The reason I call out
SelectionDAG specifically is that most of our downstream changes related
to conveying information are in DAG-related files (dagcombine, legalize,
etc.). Perhaps intrinsics could suffice for the purposes of getting
metadata through SelectionDAG with conversion to "first-class" metadata
at the Machine IR level. Maybe this is even an intermediate step toward
"full metadata" throughout the compilation.

I used intrinsics as a means of carrying metadata but, in my
experience, I am not sure they are a valid alternative:

- For each LLVM IR instruction used in my project (e.g., store), a
semantically equivalent intrinsic is declared, with particular
parameters representing metadata (i.e., first-class metadata are
represented by specific intrinsic parameters).

- During lowering, each ad-hoc intrinsic must be properly handled,
manually adding the proper legalization operations, DAG combinations
and so on.

- During the conversion of LLVM IR to MIR (i.e., mapping intrinsics to
pseudo-instructions), metadata are passed to the MIR representation of
the program.

In particular, the second point raises a critical problem in terms of
optimizations (e.g., intrinsic store + intrinsic trunc are not
automatically converted into an intrinsic truncated store). The backend
must then be instructed to perform such optimizations, which are
already performed on non-intrinsic instructions (e.g., store + trunc is
already converted into a truncated store).
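
The store + trunc example can be illustrated with a toy combiner. This is only a sketch in Python under simplifying assumptions: nodes are plain tuples, `fold_truncstore` is a made-up function, and the `md.store` / `md.trunc` opcodes are hypothetical mirrored intrinsics, not anything that exists in LLVM.

```python
def fold_truncstore(node):
    """Toy DAG combine: store(trunc(x)) -> truncstore(x).

    Nodes are tuples of the form (opcode, operand, ...).
    """
    opcode, *operands = node
    if opcode == "store" and operands and operands[0][0] == "trunc":
        return ("truncstore", operands[0][1])
    return node  # pattern did not match; leave the node untouched


plain = ("store", ("trunc", "x"))
# Hypothetical mirrored intrinsics carrying a metadata argument.
mirrored = ("md.store", ("md.trunc", "x", "secret"), "secret")

folded_plain = fold_truncstore(plain)
folded_mirrored = fold_truncstore(mirrored)
```

The existing pattern only knows the ordinary opcodes, so the mirrored pair passes through unchanged unless an equivalent rule is written for it by hand.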

Instead of re-inventing the wheel, and since the backend would have to
be modified anyway in order to support optimizations on intrinsics, I
would rather add some sort of mechanism to support metadata attachment
as first-class elements of the IR/MIR, along with automatic merging of
metadata, for instance.

David_Greene · August 7, 2020, 8:54pm

Lorenzo Casalino via llvm-dev <llvm-dev@lists.llvm.org> writes:

As with IR-level metadata, there should be no guarantee that metadata is
preserved and that it's a best-effort thing. In other words, relying on
metadata for correctness is probably not the thing to do.

Ok, I made a mistake stating that metadata should be *preserved*; what
I really meant is to preserve the *information* that such metadata
represent.

We do have one way of doing that now that's nearly foolproof in terms of
accidental loss: intrinsics. Intrinsics AFAIK are never just deleted
and have to be explicitly handled at some point. Intrinsics may not
work well for your use-case for a variety of reasons but they are an
option.

I'm mostly just writing this to get thoughts in my head organized.

* By "it" I mean communicate information down to late phases of codegen.
I don't have a "metadata in codegen" patch as such. I simply cobbled
something together in our downstream fork that works for some very
specific use-cases.

I know what you have been through, and I can only agree with you: for
the project I mentioned above, I had to perform several changes to the
whole IR lowering phase in order to correctly propagate high-level
information; it wasn't cheap and required a lot of effort.

I know your pain.

It might be possible to have a dedicated data-structure for such
metadata info, and an instance of such structure assigned to each
instruction.

I'm not entirely sure what you mean by this.

I was imagining a per-instruction data-structure collecting metadata info
related to that specific instruction, instead of having several metadata info
directly embedded in each instruction.

Interesting. At the IR level metadata isn't necessarily unique, though
it can be made so. If multiple pieces of information were amalgamated
into one structure that might reduce the ability to share the in-memory
representation, which has a cost. I like the ability of IR metadata to
be very flexible while at the same time being relatively cheap in terms
of resource utilization.

I don't always like that IR metadata is not scoped. It makes it more
difficult to process the IR for a Function in isolation. But that's a
relatively minor quibble for me. It's a tradeoff between convenience
and resource utilization.

That's a great use-case. I do wonder about your use of "essential"
though.

With *essential* I mean fundamental for satisfying a specific target
security property.

Is it needed for correctness? If so an intrinsics-based solution
may be better.

Uhm...it might sound as a naive question, but what do you mean with
*correctness*?

I mean will the compiler generate incorrect code or otherwise violate
some contract. In your secure compilation example, if the compiler
*promises* that the generated code will be "secure" then that's a
contract that would be violated if the metadata were lost.

I employed intrinsics as a mean for carrying metadata, but, by my
experience, I am not sure they can be resorted as a valid alternative:

- For each llvm-ir instruction employed in my project (e.g., store),
a semantically equivalent intrinsic is declared, with particular
parameters representing metadata (i.e., first-class metadata are
represented by specific intrinsic's parameters).

- During the lowering, each ad-hoc intrinsic must be properly
handled, manually adding the proper legalization operations, DAG
combinations and so on.

- During MIR conversion of the llvm-ir (i.e., mapping intrinsics to
pseudo-instructions), metadata are passed to the MIR representation
of the program.

In particular, the second point raises a critical problem in terms of
optimizations (e.g., intrinsic store + intrinsic trunc are not
automatically converted into an intrinsic truncated store). Then, the
backend must be instructed to perform such optimizations, which are
actually already performed on non-intrinsic instructions (e.g., store
+ trunc is already converted into a truncated store).

Gotcha. That certainly is a lot of burden. Do the intrinsics *have to*
mirror the existing instructions exactly or could a more generic
intrinsic be defined that took some data as an argument, for example a
pointer to a static string? Then each intrinsic instance could
reference a static string unique to its context.

I have not really thought this through, just throwing out ideas in a
devil's advocate sort of way.

In my case using intrinsics would have to tie the intrinsic to the
instruction it is annotating. This seems similar to your use-case.
This is straightforward to do if everything is SSA but once we've gone
beyond that things get a lot more complicated. The mapping of
information to specific instructions really does seem like the most
difficult bit.

Instead of re-inventing the wheel, and since the backend should be
nonetheless modified in order to support optimizations on intrinsics,
I would rather prefer to insert some sort of mechanism to support
metadata attachment as first-class elements of the IR/MIR, and
automatic merging of metadata, for instance.

Can you explain a bit more what you mean by "first-class?"

In any case, I wonder if metadata at codegen level is actually something
that the community would benefit from (thus justifying a potentially
huge and/or long series of patches), or whether it is something in which
only a small group would be interested.

I would also like to know this. Have others found the need to convey
information down to codegen and if so, what approaches were considered
and tried?

Maybe this is a niche requirement but I really don't think it is. I
think it more likely that various hacks/modifications have been made
over the years to sufficiently approximate a desired outcome and that
this has led to not insignificant technical debt.

Or maybe I just think that because I've worked on a 40-year-old compiler
for my entire career.

-David

David_Greene · August 7, 2020, 9:09pm

Chris Lattner via llvm-dev <llvm-dev@lists.llvm.org> writes:

The issue is this: either information is preserved across certain
sorts of transformations or it is not. If not, it either goes stale
(problematic for anything that looks at it later) or is
invalidated/removed.

The fundamental issue in IR design is factoring the representation of
information from the code that needs to inspect and update it.
“Metadata” designs try to make it easy to add out of band information
to the IR in various ways, with a goal of reducing the impact on the
rest of the compiler.

However, I’ve never seen them work out well. Either the data becomes
stale, or you end up changing a lot of the compiler to support it.
Look at debug info metadata in LLVM for example, it has both problems
:-). This is why MLIR has moved to make source location information
and attributes a first class part of the IR.

I basically agree with your analysis. Some information is so pervasive
that it really should be a part of the IR proper. But other information
may not be. The kind of information I'm thinking of basically boils
down to optimization hints. It's fine and semantically sound to drop
it, though not ideal if it can be avoided.

I see debug info as being in a quite different class. With the -g
option we are making a promise to our users. So using a mechanism that
by design doesn't make promises seems a poor fit.

A long long time ago in the dark ages before git and Phabricator I
submitted a patch for review that would have added comment information
to machine instructions. It was basically a string member on every
MachineInstr. At the time it was deemed too expensive and rightly so.
Instead I ended up adding some flag values that the AsmPrinter uses as a
hint to generate various comments. I'm still not very happy with that
"solution" and a more general-purpose mechanism for annotating
IR/SelectionDAG/MIR objects would be quite welcome.

A generic first-class annotation construct would cover both use-cases.
If you and the wider community are open to adding first-class generic
information annotation, I'm eager to work on it!

-David

DoktorC · August 18, 2020, 6:27am

Lorenzo Casalino via llvm-dev <llvm-dev@lists.llvm.org> writes:

As with IR-level metadata, there should be no guarantee that metadata is
preserved and that it's a best-effort thing. In other words, relying on
metadata for correctness is probably not the thing to do.

Ok, I made a mistake stating that metadata should be *preserved*; what
I really meant is to preserve the *information* that such metadata
represent.

We do have one way of doing that now that's nearly foolproof in terms of
accidental loss: intrinsics. Intrinsics AFAIK are never just deleted
and have to be explicitly handled at some point. Intrinsics may not
work well for your use-case for a variety of reasons but they are an
option.

I'm mostly just writing this to get thoughts in my head organized.

The only problem with intrinsics, for me, was the need to mirror the
already existing instructions. As you pointed out, if there's a way to map
intrinsics to instructions, there would be no reason to mirror the latter,
and we could just use the former to carry metadata.

It might be possible to have a dedicated data-structure for such
metadata info, and an instance of such structure assigned to each
instruction.

I'm not entirely sure what you mean by this.

I was imagining a per-instruction data-structure collecting metadata info
related to that specific instruction, instead of having several metadata info
directly embedded in each instruction.

Interesting. At the IR level metadata isn't necessarily unique, though
it can be made so. If multiple pieces of information were amalgamated
into one structure that might reduce the ability to share the in-memory
representation, which has a cost. I like the ability of IR metadata to
be very flexible while at the same time being relatively cheap in terms
of resource utilization.

I don't always like that IR metadata is not scoped. It makes it more
difficult to process the IR for a Function in isolation. But that's a
relatively minor quibble for me. It's a tradeoff between convenience
and resource utilization.

Uhm...could I ask you to elaborate a bit more on the "limitation on
in-memory representation sharing"? It is not clear to me how this would
cause a problem.

That's a great use-case. I do wonder about your use of "essential"
though.

With *essential* I mean fundamental for satisfying a specific target
security property.

Is it needed for correctness? If so an intrinsics-based solution
may be better.

Uhm...it might sound as a naive question, but what do you mean with
*correctness*?

I mean will the compiler generate incorrect code or otherwise violate
some contract. In your secure compilation example, if the compiler
*promises* that the generated code will be "secure" then that's a
contract that would be violated if the metadata were lost.

You got the point: if metadata are not provided or are lost, the codegen
phase is not able to fulfill the contract (in my use case, generating
code that is "secure").

I employed intrinsics as a mean for carrying metadata, but, by my
experience, I am not sure they can be resorted as a valid alternative:

- For each llvm-ir instruction employed in my project (e.g., store),
a semantically equivalent intrinsic is declared, with particular
parameters representing metadata (i.e., first-class metadata are
represented by specific intrinsic's parameters).

- During the lowering, each ad-hoc intrinsic must be properly
handled, manually adding the proper legalization operations, DAG
combinations and so on.

- During MIR conversion of the llvm-ir (i.e., mapping intrinsics to
pseudo-instructions), metadata are passed to the MIR representation
of the program.

In particular, the second point raises a critical problem in terms of
optimizations (e.g., intrinsic store + intrinsic trunc are not
automatically converted into an intrinsic truncated store). Then, the
backend must be instructed to perform such optimizations, which are
actually already performed on non-intrinsic instructions (e.g., store
+ trunc is already converted into a truncated store).

Gotcha. That certainly is a lot of burden. Do the intrinsics *have to*
mirror the existing instructions exactly or could a more generic
intrinsic be defined that took some data as an argument, for example a
pointer to a static string? Then each intrinsic instance could
reference a static string unique to its context.

I have not really thought this through, just throwing out ideas in a
devil's advocate sort of way.

I like brainstorming

In my case using intrinsics would have to tie the intrinsic to the
instruction it is annotating. This seems similar to your use-case.
This is straightforward to do if everything is SSA but once we've gone
beyond that things get a lot more complicated. The mapping of
information to specific instructions really does seem like the most
difficult bit.

No, intrinsics do not have to mirror existing instructions; yes, they
can be used just to carry around specific data as arguments.
Nonetheless, there we have our (implementation) problem: how to map
info (e.g., intrinsics) to instructions, and vice versa?

I am really curious about how you would do it in the pre-RA phase.

Instead of re-inventing the wheel, and since the backend should be
nonetheless modified in order to support optimizations on intrinsics,
I would rather prefer to insert some sort of mechanism to support
metadata attachment as first-class elements of the IR/MIR, and
automatic merging of metadata, for instance.

Can you explain a bit more what you mean by "first-class?"

Never mind, I used the wrong terminology: I just meant to directly
embed metadata in the IR/MIR.

In any case, I wonder if metadata at codegen level is actually something
that the community would benefit from (thus justifying a potentially
huge and/or long series of patches), or whether it is something in which
only a small group would be interested.

I would also like to know this. Have others found the need to convey
information down to codegen and if so, what approaches were considered
and tried?

Maybe this is a niche requirement but I really don't think it is. I
think it more likely that various hacks/modifications have been made
over the years to sufficiently approximate a desired outcome and that
this has led to not insignificant technical debt.

Or maybe I just think that because I've worked on a 40-year-old compiler
for my entire career.

-David

Best regards,
Lorenzo

David_Greene · August 19, 2020, 8:37pm

Lorenzo Casalino via llvm-dev <llvm-dev@lists.llvm.org> writes:

I was imagining a per-instruction data-structure collecting metadata info
related to that specific instruction, instead of having several metadata info
directly embedded in each instruction.

Interesting. At the IR level metadata isn't necessarily unique, though
it can be made so. If multiple pieces of information were amalgamated
into one structure that might reduce the ability to share the in-memory
representation, which has a cost.

Uhm...could I ask you to elaborate a bit more on the "limitation on
in-memory representation sharing"? It is not clear to me how this
would cause a problem.

I just mean that at the IR level, if you have a metadata node with, say,
a string "foo bar" and another one with "foo" and put one on an
instruction and the other on another instruction, they won't share an
in-memory representation, whereas if you had separate nodes with "foo"
and "bar" and put both on a single instruction and just "foo" on another
instruction the "foo" metadata would be shared.
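
The sharing argument can be sketched with a toy uniquing table. This is an illustration in Python, not LLVM's actual metadata implementation; `ToyMDString` and `MetadataContext` are hypothetical names, and the only assumption is that equal strings are uniqued to a single node object.

```python
class ToyMDString:
    """Hypothetical metadata string node."""

    def __init__(self, text):
        self.text = text


class MetadataContext:
    """Toy uniquing table: equal strings map to one shared node object."""

    def __init__(self):
        self._nodes = {}

    def get(self, text):
        # Return the existing node for `text`, or create and remember one.
        if text not in self._nodes:
            self._nodes[text] = ToyMDString(text)
        return self._nodes[text]


# Amalgamated: "foo bar" and "foo" are different nodes, nothing is shared.
amalgamated = MetadataContext()
inst_a = [amalgamated.get("foo bar")]
inst_b = [amalgamated.get("foo")]

# Separate: the single "foo" node is attached to both instructions.
separate = MetadataContext()
inst_c = [separate.get("foo"), separate.get("bar")]
inst_d = [separate.get("foo")]
```

With separate nodes, `inst_c` and `inst_d` reference the very same "foo" object, whereas the amalgamated "foo bar" node has nothing in common with the plain "foo" node.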

In my case using intrinsics would have to tie the intrinsic to the
instruction it is annotating. This seems similar to your use-case.
This is straightforward to do if everything is SSA but once we've gone
beyond that things get a lot more complicated. The mapping of
information to specific instructions really does seem like the most
difficult bit.

No, intrinsics do not have to mirror existing instructions; yes,
they can be used just to carry around specific data as arguments.
Nonetheless, there we have our (implementation) problem: how to map
info (e.g., intrinsics) to instructions, and vice versa?

I am really curious about how you would do it in the pre-RA phase.

Pre-RA it's relatively easy as long as we're still in SSA. The
intrinsic would simply take the instruction it should annotate as an
operand. After SSA it obviously becomes more difficult. I don't have a
lot of good answers for that right now. The live range for the value
defined by the annotated instruction and used by the intrinsic would
contain both instructions so maybe that could be used to connect them.

If the annotated instruction doesn't have an output value (like a store
on machine architectures) you would use the chain output in SelectionDAG
but there's no analogue in the MachineInstr representation.
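
As a rough sketch of the SSA case described above, an annotation pseudo-instruction can simply name the value it annotates as an operand. The Python below is a toy model, not LLVM code: `Inst`, the `annotate` opcode and `annotations_for` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    """Toy SSA instruction: `result` is the unique value name it defines."""
    result: str
    opcode: str
    operands: tuple = ()


def annotations_for(block, target):
    """Collect the payloads of all annotate pseudo-instructions that take
    the value defined by `target` as their first operand."""
    return [
        inst.operands[1]
        for inst in block
        if inst.opcode == "annotate" and inst.operands[0] == target.result
    ]


load = Inst("%1", "load", ("%p",))
add = Inst("%2", "add", ("%1", "%1"))
# Hypothetical annotation intrinsic: (annotated value, payload).
note = Inst("%3", "annotate", ("%1", "secret"))
block = [load, add, note]

load_notes = annotations_for(block, load)
add_notes = annotations_for(block, add)
```

The lookup works only because every annotated instruction defines a uniquely named value. An instruction without a result, such as a store, has no such name to pass as an operand, which is the gap noted above.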

-David

DoktorC · August 31, 2020, 8:01am

Lorenzo Casalino via llvm-dev <llvm-dev@lists.llvm.org> writes:

I was imagining a per-instruction data-structure collecting metadata info
related to that specific instruction, instead of having several metadata info
directly embedded in each instruction.

Interesting. At the IR level metadata isn't necessarily unique, though
it can be made so. If multiple pieces of information were amalgamated
into one structure that might reduce the ability to share the in-memory
representation, which has a cost.

Uhm...could I ask you to elaborate a bit more on the "limitation on
in-memory representation sharing"? It is not clear to me how this
would cause a problem.

I just mean that at the IR level, if you have a metadata node with, say,
a string "foo bar" and another one with "foo" and put one on an
instruction and the other on another instruction, they won't share an
in-memory representation, whereas if you had separate nodes with "foo"
and "bar" and put both on a single instruction and just "foo" on another
instruction the "foo" metadata would be shared.

But isn't that an implementation aspect? I mean, you can have metadata
nodes whose members are pointers; if two nodes have to share the same
member instance, they can share the same pointer.

After all, even when two instructions refer to a structurally equivalent
Constant object (see the llvm::Constant class reference), they actually
share the same pointer to the same Constant object.
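
To make the sharing argument concrete, here is a minimal uniquing pool in
Python. MDString and MDPool are illustrative names only, not LLVM's
classes. Structurally equal nodes come back as the same object, so
separate "foo" and "bar" nodes can be shared across instructions, while
an amalgamated "foo bar" node is a different object from "foo":

```python
class MDString:
    """Toy metadata node holding a single string."""

    def __init__(self, text):
        self.text = text


class MDPool:
    """Toy uniquing pool: one node object per distinct string."""

    def __init__(self):
        self._nodes = {}

    def get(self, text):
        node = self._nodes.get(text)
        if node is None:
            node = self._nodes[text] = MDString(text)
        return node


pool = MDPool()

# Separate nodes: the "foo" node is shared by both instructions.
inst_a = [pool.get("foo"), pool.get("bar")]
inst_b = [pool.get("foo")]

# Amalgamated node: "foo bar" and "foo" are unrelated objects.
inst_c = [pool.get("foo bar")]
```

With pointer members, as suggested above, an amalgamated structure could
still hold references into such a pool; what is lost is only the sharing
of the outer structure itself.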

Pre-RA it's relatively easy as long as we're still in SSA. The
intrinsic would simply take the instruction it should annotate as an
operand. After SSA it obviously becomes more difficult. I don't have a
lot of good answers for that right now. The live range for the value
defined by the annotated instruction and used by the intrinsic would
contain both instructions so maybe that could be used to connect them.

If the annotated instruction doesn't have an output value (like a store
on machine architectures) you would use the chain output in SelectionDAG
but there's no analogue in the MachineInstr representation.

The usage of intrinsics as wrappers for instructions to be annotated is a
really nice idea! Although this would require instructing almost all
passes of the codegen pipeline to skip them (which, for instance, is already
done for llvm.dbg.* intrinsics).

Nonetheless, although I like the idea, without a strategy to track
output-less MachineInstructions, it won't go really far.

Furthermore, after register allocation there is a non-negligible effort
to properly annotate instructions which share the same output register...

Concerning the usage of the live ranges to tie annotated instruction and
intrinsic, I have some doubts:

1. After register allocation, since metadata intrinsics are skipped
(otherwise, they would be involved in the register allocation
process, increasing the register pressure), the instruction stream
would present both virtual and physical registers, which I am not
sure is totally ok.

2. Is liveness information still available after register allocation?
Assuming a positive answer, live intervals may be split due to
register allocation, making the connection between intrinsic and
annotated instruction really difficult.

An enumeration of the MachineInstructions, which is preserved through the
codegen passes, would allow the creation of a 1:1 map between intrinsic and
annotated instruction; but, unfortunately, there seems to be no such
enumeration in LLVM (maybe SlotIndexes could be used in a creative way).
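
A sketch of what such an enumeration could look like, as a toy Python
model. The uid field and the side table are hypothetical; nothing like
them is claimed to exist in LLVM. Each instruction gets an ID at creation
time that is never reused, and metadata lives in a table keyed by that ID,
so reordering the stream does not break the mapping:

```python
import itertools


class Instr:
    _next_uid = itertools.count()

    def __init__(self, opcode):
        self.opcode = opcode
        # Hypothetical stable ID: assigned once, independent of position.
        self.uid = next(Instr._next_uid)


side_table = {}  # uid -> set of metadata items


def annotate(instr, item):
    side_table.setdefault(instr.uid, set()).add(item)


def metadata_of(instr):
    return side_table.get(instr.uid, set())


block = [Instr("LOAD"), Instr("AND"), Instr("STORE")]
annotate(block[2], "secret-store")

block.reverse()  # stand-in for a pass that reorders instructions
```

After the reorder the STORE sits at index 0 and still resolves to its
metadata. A numbering derived from position in the stream would not
survive the same change without being updated.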

Sorry for the long delay!

-- Lorenzo

David_Greene · August 31, 2020, 12:10pm

Lorenzo Casalino via llvm-dev <llvm-dev@lists.llvm.org> writes:

If the annotated instruction doesn't have an output value (like a store
on machine architectures) you would use the chain output in SelectionDAG
but there's no analogue in the MachineInstr representation.

The usage of intrinsics as wrappers for instructions to be annotated is
a really nice idea! Although this would require instructing almost all
passes of the codegen pipeline to skip them (which, for instance, is
already done for llvm.dbg.* intrinsics).

It's not free, certainly.

Nonetheless, although I like the idea, without a strategy to track
output-less MachineInstructions, it won't go really far

Agreed. There are probably ways to hack it in, but true metadata would
be much better.

Furthermore, after register allocation there is a non-negligible effort
to properly annotate instructions which share the same output register...

Concerning the usage of the live ranges to tie annotated instruction and
intrinsic, I have some doubts:

1. After register allocation, since metadata intrinsics are skipped
(otherwise, they would be involved in the register allocation
process, increasing the register pressure), the instruction stream
would present both virtual and physical registers, which I am not
sure it is totally ok.

They would have to participate in register allocation. I think the only
downside would be an intrinsic that artificially extends the live range
of a value by using it past its true dead point, either because the use
really is the "last" one or because it fills a "hole" in the live range
that otherwise would exist (for example a use in one of the if-then-else
branches that would otherwise not exist).

If the intrinsics really shadow "real" instructions then it should be
possible to place them such that this is not an issue; for example, you
could place them immediately before the "real" instruction.

It's possible they could introduce extra spills and reloads, in that if
a value is spilled it would be reloaded before the intrinsic. If the
intrinsic were placed immediately before the "real" instruction then the
reload would very likely be re-used for the "real" instruction so this
is probably not an issue in practice.

2. Liveness information are still available after register
allocation? Assuming a positive answer, live intervals may be
split due to register allocation, making connection between
intrinsic and annotated instruction really difficult.

Intervals are available post-RA. They still contain information about
defs so it is *possible* to track things back though the information
tends to degrade.

An enumeration of the MachineInstructions, which is preserved through
the codegen passes, would allow the creation of a 1:1 map between
intrinsic and annotated instruction; but, unfortunately, there seems
to be no such enumeration in LLVM (maybe SlotIndexes could be used in
a creative way).

Yeah, SlotIndexes are what is used in the live ranges.

Sorry for the long delay!

No problem. It's good to hash these things out and identify areas of
weakness that metadata could fill.

-David

DoktorC · September 7, 2020, 8:26am

Lorenzo Casalino via llvm-dev <llvm-dev@lists.llvm.org> writes:

Furthermore, after register allocation there is a non-negligible effort
to properly annotate instructions which share the same output register...

Concerning the usage of the live ranges to tie annotated instruction and
intrinsic, I have some doubts:

1. After register allocation, since metadata intrinsics are skipped
(otherwise, they would be involved in the register allocation
process, increasing the register pressure), the instruction stream
would present both virtual and physical registers, which I am not
sure it is totally ok.

They would have to participate in register allocation.

Should they? I mean: the register allocation "simply" creates a map
(VirtReg -> PhysReg), and actual register re-writing takes place in a
subsequent machine pass.

So, we could avoid their participation in register allocation, reducing
register pressure and spill/reload work. As a downside, we would have
intrinsics with virtual registers as outputs, but it is not a problem,
since they do not perform any real computation.
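
A toy illustration of that split between "compute the map" and "rewrite
the operands", in Python. The ANNOTATE opcode and the tuple layout are
made up; this is not how LLVM's rewriter is implemented. The rewriting
step applies the VirtReg -> PhysReg map to every instruction except the
annotation pseudo-instructions:

```python
ANNOTATE = "ANNOTATE"  # hypothetical pseudo-opcode


def rewrite(stream, virt_to_phys):
    """Apply the register assignment, leaving ANNOTATE operands untouched."""
    out = []
    for opcode, operands in stream:
        if opcode == ANNOTATE:
            out.append((opcode, list(operands)))  # stays on virtual registers
        else:
            out.append((opcode, [virt_to_phys.get(r, r) for r in operands]))
    return out


stream = [
    ("AND", ["%res", "%a", "%b"]),
    (ANNOTATE, ["%res"]),
    ("STORE", ["%res", "%slot"]),
]
assignment = {"%res": "r0", "%a": "r1", "%b": "r2", "%slot": "r3"}
rewritten = rewrite(stream, assignment)
```

The result mixes physical and virtual register names in one stream, which
is exactly the situation I was unsure about in point 1.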

I think the only
downside would be an intrinsic that artificially extends the live range
of a value by using it past its true dead point, either because the use
really is the "last" one or because it fills a "hole" in the live range
that otherwise would exist (for example a use in one of the if-then-else
branches that would otherwise not exist).

If the intrinsics really shadow "real" instructions then it should be
possible to place them such that this is not an issue; for example, you
could place them immediately before the "real" instruction.

I do not think this would be possible: before register allocation, code is
in SSA form, thus the annotated instruction *must* precede the intrinsic
annotating it. An alternative is to place the annotating intrinsic before
the instruction that ends the specific live-range (not necessarily as an
immediate predecessor).

Just to point out a problem to cope with: instruction scheduling must be
aware of this particular positioning of annotation intrinsics.

It's possible they could introduce extra spills and reloads, in that if
a value is spilled it would be reloaded before the intrinsic. If the
intrinsic were placed immediately before the "real" instruction then the
reload would very likely be re-used for the "real" instruction so this
is probably not an issue in practice.

Yes, I agree

Kind regards,
-- Lorenzo

David_Greene · September 8, 2020, 3:57pm

Lorenzo Casalino <lorenzo.casalino93@gmail.com> writes:

Lorenzo Casalino via llvm-dev <llvm-dev@lists.llvm.org> writes:

Furthermore, after register allocation there is a non-negligible effort
to properly annotate instructions which share the same output register...

Concerning the usage of the live ranges to tie annotated instruction and
intrinsic, I have some doubts:

1. After register allocation, since metadata intrinsics are skipped
(otherwise, they would be involved in the register allocation
process, increasing the register pressure), the instruction stream
would present both virtual and physical registers, which I am not
sure it is totally ok.

They would have to participate in register allocation.

Should they? I mean: the register allocation "simply" creates a map
(VirtReg -> PhysReg), and actual register re-writing takes place in a
subsequent machine pass.

Maybe they could be skipped? I don't know if there's any precedent for
that.

So, we could avoid their participation in register allocation,
reducing register pressure and spill/reload work. As a downside, we
would have intrinsics with virtual registers as outputs, but it is not
a problem, since they do not perform any real computation.

If we can get that to work, yes I guess having no-op intrinsics with
virtual registers would be ok. I don't know how the backend post-RA
would cope with that though. There might be lots of asserts that assume
physical registers.

If the intrinsics really shadow "real" instructions then it should be
possible to place them such that this is not an issue; for example, you
could place them immediately before the "real" instruction.

I do not think this would be possible: before register allocation, code is
SSA form, thus the annotated instruction *must* precede the intrinsic
annotating it.

Oh yes of course. Duh.

An alternative is to place the annotating intrinsic before the
instruction that ends the specific live-range (not necessarily as an
immediate predecessor).

I'm not sure exactly what you mean, but it strikes me just now that if
the intrinsic is connected to the target instruction via the target
instruction's output value, then putting the intrinsic right after the
target instruction should not have any live range issues, unless the
target instruction were truly dead, in which case the intrinsic would
keep it alive. But since the intrinsic would eventually go away, I
assume we could eliminate the target instruction at the same time.

If the target instruction output is used *somewhere* it has a live range
and adding another use just after the def should not affect register
allocation appreciably. It could of course affect spill choice
heuristics like number of uses of a value but that's probably in the
noise.

It could, however, affect folding (e.g. mem operands) because a single
use of a load would turn into two uses, preventing folding. It's not
clear to me whether you would *want* folding in your use-case since you
apparently need to do something special with the load anyway.
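
A small Python model of the folding concern. The tuple layout and the
ANNOTATE opcode are invented for illustration. A load is treated as
foldable only when its result has exactly one reader; an annotation that
reads the value adds a second reader unless the count is told to ignore
it:

```python
def num_uses(reg, stream, ignore_opcodes=()):
    """Count reads of reg, optionally skipping instructions by opcode."""
    return sum(srcs.count(reg)
               for opcode, _dst, srcs in stream
               if opcode not in ignore_opcodes)


def can_fold_load(reg, stream, ignore_opcodes=()):
    # The load may be folded into its user only if that user is the sole reader.
    return num_uses(reg, stream, ignore_opcodes) == 1


stream = [
    ("LOAD", "%v", ["%ptr"]),
    ("ANNOTATE", None, ["%v"]),   # hypothetical annotation reading %v
    ("ADD", "%sum", ["%v", "%x"]),
]
```

With the annotation counted, can_fold_load("%v", stream) is false; passing
ignore_opcodes=("ANNOTATE",) restores the single-use answer. LLVM does
something comparable for debug values (MachineRegisterInfo has
hasOneNonDBGUse), so every such query would need an equivalent for
annotation intrinsics.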

Just to point out a problem to cope with: instruction scheduling must be
aware of this particular positioning of annotation intrinsics.

Probably true. This is a difficult problem, one I have dealt with. If
you want to keep two instructions "close" during scheduling it is a real
pain. ScheduleDAG has a concept for "glue" nodes but it's pretty hacky
and difficult to maintain in the presence of upstream churn. My initial
attempt to avoid the need for codegen metadata took this approach and it
was quite infeasible. My second approach to hack in the information in
other ways wasn't much more successful.

I think we've uncovered a number of tricky issues when trying to encode
metadata via intrinsics. To me, at least, they clearly point to the
need for a first-class solution and I think you agree with that too.
Chris also seemed to at least give tentative support to the idea.

I wonder if we're at the point of drafting an initial RFC for review.

-David

DoktorC · September 15, 2020, 9:31am

Lorenzo Casalino <lorenzo.casalino93@gmail.com> writes:

Lorenzo Casalino via llvm-dev <llvm-dev@lists.llvm.org> writes:

Furthermore, after register allocation there is a non-negligible effort
to properly annotate instructions which share the same output register...

Concerning the usage of the live ranges to tie annotated instruction and
intrinsic, I have some doubts:

1. After register allocation, since metadata intrinsics are skipped
(otherwise, they would be involved in the register allocation
process, increasing the register pressure), the instruction stream
would present both virtual and physical registers, which I am not
sure it is totally ok.

They would have to participate in register allocation.

Should they? I mean: the register allocation "simply" creates a map
(VirtReg -> PhysReg), and actual register re-writing takes place in a
subsequent machine pass.

Maybe they could be skipped? I don't know if there's any precedent for
that.

I think that they could be neglected, since they just carry information;
there's no point in allocating physical registers for their unused output.

So, we could avoid their participation in register allocation,
reducing register pressure and spill/reload work. As a downside, we
would have intrinsics with virtual registers as outputs, but it is not
a problem, since they do not perform any real computation.

If we can get that to work, yes I guess having no-op intrinsics with
virtual registers would be ok. I don't know how the backend post-RA
would cope with that though. There might be lots of asserts that assume
physical registers.

Yes, if I recall correctly, there are a lot of checks on the type of
register.

If the intrinsics really shadow "real" instructions then it should be
possible to place them such that this is not an issue; for example, you
could place them immediately before the "real" instruction.

I do not think this would be possible: before register allocation, code is
SSA form, thus the annotated instruction *must* precede the intrinsic
annotating it.

Oh yes of course. Duh.

An alternative is to place the annotating intrinsic before the
instruction that ends the specific live-range (not necessarily as an
immediate predecessor).

I'm not sure exactly what you mean

I mean, to avoid artificial extension of the live-range, place the
annotating intrinsic (I) before the instruction (K) that kills the
live-range (but the intrinsic (I) does not have to be an *immediate*
predecessor of (K) in the instruction stream).

For instance, assume we have the following SSA stream (I am using the
ARM Thumb2 MIR since I've been working mainly on that backend):

#i %res = t2ANDrr %src_1_i, %src_2_i
...
#j %null = llvm.metadata %res, (some metadata)
...
#l %c = t2STRi12 %res, %stack_slot_res

Where instruction #l kills the live-range representing %res, and
instruction #j is covered by the live-range of %res, which spans from
#i to #l.

Giving a total ordering to the stream of instructions, #i <= #j <= #l.
As you can infer, the intrinsic represented by instruction #j does not
have to be the immediate predecessor of #l (that is, there can exist an
instruction #k such that #j < #k < #l).

In this way, the live-range won't be extended (at least, in this trivial
case...)
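
The same example as a toy Python check, for straight-line code only. The
ANNOTATE opcode stands in for the hypothetical llvm.metadata intrinsic and
the tuple layout is made up. The live-range of %res ends at its last
reader, so the question is simply which instruction that is:

```python
def last_user(reg, stream):
    """Opcode of the last instruction reading reg (straight-line code only)."""
    users = [opcode for opcode, _dst, srcs in stream if reg in srcs]
    return users[-1] if users else None


AND = ("t2ANDrr", "%res", ["%src_1", "%src_2"])
OTHER = ("t2ADDrr", "%x", ["%y", "%z"])       # unrelated instruction #k
STORE = ("t2STRi12", None, ["%res", "%stack_slot_res"])
NOTE = ("ANNOTATE", None, ["%res"])           # hypothetical annotation

inside = [AND, NOTE, OTHER, STORE]      # annotation before the kill
after_kill = [AND, OTHER, STORE, NOTE]  # annotation after the kill
```

In the first stream the store is still the last reader, so the range is
unchanged; in the second the annotation becomes the last reader and the
range is stretched past the store.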

but it strikes me just now that if
the intrinsic is connected to the target instruction via the target
instruction's output value, then putting the intrinsic right after the
target instruction should not have any live range issues, unless the
target instruction were truly dead, in which case the intrinsic would
keep it alive. But since the intrinsic would eventually go away, I
assume we could eliminate the target instruction at the same time.

If the target instruction output is used *somewhere* it has a live range
and adding another use just after the def should not affect register
allocation appreciably.

Yes!

It could of course affect spill choice
heuristics like number of uses of a value but that's probably in the
noise.

It could, however, affect folding (e.g. mem operands) because a single
use of a load would turn into two uses, preventing folding. It's not
clear to me whether you would *want* folding in your use-case since you
apparently need to do something special with the load anyway.

Uhm...yes, folding requires particular attention; but, in my project, I
avoided the problem by "disabling" folding, so I didn't care really much
about that aspect.

Just to point out a problem to cope with: instruction scheduling must be
aware of this particular positioning of annotation intrinsics.

Probably true. This is a difficult problem, one I have dealt with. If
you want to keep two instructions "close" during scheduling it is a real
pain. ScheduleDAG has a concept for "glue" nodes but it's pretty hacky
and difficult to maintain in the presence of upstream churn. My initial
attempt to avoid the need for codegen metadata took this approach and it
was quite infeasible. My second approach to hack in the information in
other ways wasn't much more successful.

It is just an idea, but could MI Bundles be profitably employed?

I think we've uncovered a number of tricky issues when trying to encode
metadata via intrinsics. To me, at least, they clearly point to the
need for a first-class solution and I think you agree with that too.
Chris also seemed to at least give tentative support to the idea.

Yep!

I wonder if we're at the point of drafting an initial RFC for review.

Uh, this is a good question. To be honest, it would be the first time for me.
For sure, we could start by pinpointing the main problems and challenges
-- that we identified -- that the employment of intrinsics would face.

David_Greene · September 15, 2020, 2:58pm

Lorenzo Casalino <lorenzo.casalino93@gmail.com> writes:

An alternative is to place the annotating intrinsic before the
instruction that ends the specific live-range (not necessarily as an
immediate predecessor).

I'm not sure exactly what you mean

I mean, to avoid artificial extension of the live-range, place the
annotating intrinsic (I) before the instruction (K) that kills the
live-range (but the intrinsic (I) does not have to be an *immediate*
predecessor of (K) in the instruction stream).

Ok, got it. Thanks!

It could, however, affect folding (e.g. mem operands) because a single
use of a load would turn into two uses, preventing folding. It's not
clear to me whether you would *want* folding in your use-case since you
apparently need to do something special with the load anyway.

Uhm...yes, folding requires particular attention; but, in my project, I
avoided the problem by "disabling" folding, so I didn't care really much
about that aspect.

That makes sense for your project but it is another case of intrinsics
causing problems for general use.

Just to point out a problem to cope with: instruction scheduling must be
aware of this particular positioning of annotation intrinsics.

Probably true. This is a difficult problem, one I have dealt with. If
you want to keep two instructions "close" during scheduling it is a real
pain. ScheduleDAG has a concept for "glue" nodes but it's pretty hacky
and difficult to maintain in the presence of upstream churn. My initial
attempt to avoid the need for codegen metadata took this approach and it
was quite infeasible. My second approach to hack in the information in
other ways wasn't much more successful.

It is just an idea, but could MI Bundles be profitably employed?

Possibly. Those didn't exist when I did my work.

I wonder if we're at the point of drafting an initial RFC for review.

Uh, this is a good question. To be honest, it would be the first time
for me. For sure, we could start by pinpointing the main problems and
challenges -- that we identified -- that the employment of intrinsics
would face.

That's the place to start, I think. Gather a list of requirements/use
cases along with the challenges we've discussed. Then it's a matter of
engineering a solution that fulfills the requirements while hitting as
few of the challenges as possible. Let's start by simply gathering some
lists. I'll take a quick stab and you and others can add to/edit it.

Requirements

DoktorC · October 10, 2020, 11:13am

That's the place to start, I think. Gather a list of requirements/use
cases along with the challenges we've discussed. Then it's a matter of
engineering a solution that fulfills the requirements while hitting as
few of the challenges as possible. Let's start by simply gathering some
lists. I'll take a quick stab and you and others can add to/edit it.

Requirements
------------
- Convey information not readily available in existing IR constructs to
very late-stage codegen (after regalloc/scheduling, right through
asm/object emission)

I see this more as the GOAL of the RFC, rather than a requirement.

- Flexible format - it should be as simple as possible to express the
desired information while minimizing changes to APIs

I do not want to raise a philosophical discussion (although, I would
find it quite interesting), but "flexible" does not necessarily mean
"simple".

We could split this requirement as:

- Flexible format - the format should be expressive enough to enable
modeling of *virtually* any kind of information.

- Simple interface - expressing information and attaching it to MIR
elements (e.g., instructions) should be "easy" (what does *easy* mean?)

- Preserve information by default, only drop if explicitly told (I'm
trying to capture the requirements for your use-case here and this
differs from IR-level metadata)

What about giving end-users the possibility to define a custom
default policy, as well as the possibility to define different types
of policies?

Further, we must cope with the combination of instructions: how is the
information associated with two instructions eligible for combination
combined?

- Information transformation - the information associated with two
instructions A, B, which are combined into an instruction C, should be
properly transformed according to a user-specific policy.

A default policy may be "assign both information of A and B to C"
(gather-all/assign-all policy?)

- No bifurcation between "well-known"/"built-in" information and things
added later/locally

May I ask you to elaborate a bit more about this point?

- Should not impact compile time excessively (what is "excessive?")

Probably, such estimation should be performed on

What about the granularity level?

- Granularity level - metadata information should be attachable with
different levels of granularity:

- *Coarse*: MachineFunction level
- *Medium*: MachineBasicBlock level
- *Fine*: MachineInstruction level

Clearly, there are other degrees of granularity and/or dimensions to be
considered (e.g., LiveInterval, MIBundles, Loops, ...).

Challenges of using intrinsics and other alternatives
-----------------------------------------------------
- Post-SSA annotation/how to associate intrinsics with
  instructions/registers/types

- Instruction selection fallout (inhibiting folding, etc.)

- Register allocation impacts (extending live ranges, etc.)

- Scheduling challenges (ensuring intrinsics can be found
  post-scheduling, etc.)

- Extending existing constructs (which ones?) requires hard-coding
  aspects of information, reducing flexibility

This is currently rather weasily-worded, because I didn't want to impose
too many restrictions right off the bat.

                  -David

Sorry for the long delay!

-- Lorenzo

David_Greene · October 20, 2020, 4:36pm

Lorenzo Casalino <lorenzo.casalino93@gmail.com> writes:

Requirements
------------
- Convey information not readily available in existing IR constructs to
very late-stage codegen (after regalloc/scheduling, right through
asm/object emission)

I see this more as the GOAL of the RFC, rather than a requirement.

Fair enough.

- Flexible format - it should be as simple as possible to express the
desired information while minimizing changes to APIs

I do not want to raise a philosophical discussion (although, I would
find it quite interesting), but "flexible" does not necessarily mean
"simple".

We could split this requirement as:

Good idea to separate these.

- Flexible format - the format should be expressive enough to enable
modeling of *virtually* any kind of information.

- Simple interface - expressing information and attaching it to MIR
elements (e.g., instructions) should be "easy" (what does *easy* mean?)

I would say "easy" means:

- Utilities are available to make maintaining information as transparent
(automatic) as possible.

- When not automatic, it is straightforward to apply the necessary APIs
to keep information updated.

- Preserve information by default, only drop if explicitly told (I'm
trying to capture the requirements for your use-case here and this
differs from IR-level metadata)

What about giving end-users the possibility to define a custom
default policy, as well as the possibility to define different types
of policies?

Possibly, though that might be overkill. We don't want to bog this down
so much that it doesn't make progress. I would lean toward picking a
policy and then incrementally adding features as needed.

Further, we must cope with the combination of instructions: the
information associated to two instructions eligible for combination,
how are combined?

- Information transformation - the information associated to two
instruction A, B, which are combined into an instruction C, should
be properly transformed according to a user-specific policy.

A default policy may be "assign both information of A and B to C"
(gather-all/assign-all policy?)

Again, I would lean toward just assigning both pieces of information and
providing utilities to scrub the result if necessary. If it turns out
that other cases are common, we can add other default policies.
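
As a rough sketch of that default (Python, with made-up names and a
made-up (kind, value) encoding of metadata items): combining takes the
union, and a separate utility scrubs whatever the pass knows is no longer
valid.

```python
def combine(md_a, md_b):
    """Gather-all policy: the combined instruction keeps everything."""
    return set(md_a) | set(md_b)


def scrub(md, should_drop):
    """Remove the items for which should_drop(item) is true."""
    return {item for item in md if not should_drop(item)}


# Metadata items are modeled as (kind, value) pairs; purely illustrative.
md_a = {("sec", "secret")}
md_b = {("sec", "public"), ("loop", "hot")}

md_c = combine(md_a, md_b)
md_c_scrubbed = scrub(md_c, lambda item: item[0] == "loop")
```

Note that the union can leave C with both "secret" and "public" under the
same kind. Whether that is meaningful depends on the client, which is
where a user-specific policy would come in if the simple default turns out
not to be enough.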

- No bifurcation between "well-known"/"built-in" information and things
added later/locally

May I ask you to elaborate a bit more about this point?

Sure. The current IR metadata is bifurcated. Some pieces of
information are more "first-class" than others. For example there are
specialized metadata nodes (see the LLVM Language Reference Manual) while
other pieces of metadata are simple strings or numbers.

It would be simplest/easiest if metadata were handled uniformly.

- Should not impact compile time excessively (what is "excessive?")

Probably, such estimation should be performed on

Did something get cut off here?

What about the granularity level?

- Granularity level - metadata information should be attachable with
different levels of granularity:

- *Coarse*: MachineFunction level
- *Medium*: MachineBasicBlock level
- *Fine*: MachineInstruction level

Clearly, there are other degrees of granularity and/or dimensions to be
considered (e.g., LiveInterval, MIBundles, Loops, ...).

It's probably a good idea to list at least the levels of granularity we
expect to need. I'd start with function/block/instruction as I can
imagine uses for all three. I am less sure about the other levels you
mention. We can add more capability later if needed.
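
One possible shape for this, as a toy Python model. MetadataStore and its
key scheme are assumptions for illustration, not a proposed LLVM API. The
same attach/get interface serves all three levels, and an unknown level is
rejected rather than silently accepted:

```python
from collections import defaultdict


class MetadataStore:
    # The three levels proposed as a starting point.
    LEVELS = ("function", "block", "instruction")

    def __init__(self):
        self._data = {level: defaultdict(set) for level in self.LEVELS}

    def attach(self, level, key, item):
        if level not in self._data:
            raise ValueError(f"unsupported granularity: {level}")
        self._data[level][key].add(item)

    def get(self, level, key):
        # Return a copy so callers cannot mutate the store by accident.
        return set(self._data[level].get(key, ()))


store = MetadataStore()
store.attach("function", "f", "no-speculation")
store.attach("block", ("f", "bb0"), "cold")
store.attach("instruction", ("f", "bb0", 3), "secret")
```

Adding a level later (say, loops) would then mean extending LEVELS and
choosing a key for it, without touching existing clients.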

Sorry for the long delay!

No problem! I know I'm extremely busy as I'm sure we all are.

Since you initially raised the topic, do you want to take the lead in
writing up an RFC? I can certainly do it too but I want to give you
right of first refusal.

-David

DoktorC · October 21, 2020, 8:49am

Lorenzo Casalino <lorenzo.casalino93@gmail.com> writes:

- Flexible format - it should be as simple as possible to express the
desired information while minimizing changes to APIs

I do not want to raise a philosophical discussion (although, I would
find it quite interesting), but "flexible" does not necessarily mean
"simple".

We could split this requirement as:

Good idea to separate these.

- Flexible format - the format should be expressive enough to enable
modeling of *virtually* any kind of information.

- Simple interface - expressing information and attaching it to MIR
elements (e.g., instructions) should be "easy" (what does *easy* mean?)

I would say "easy" means:

- Utilities are available to make maintaining information as transparent
(automatic) as possible.

- When not automatic, it is straightforward to apply the necessary APIs
to keep information updated.

Ok, perfect!

- Preserve information by default, only drop if explicitly told (I'm
trying to capture the requirements for your use-case here and this
differs from IR-level metadata)

What about giving end-users the possibility to define a custom
default policy, as well as the possibility to define different types
of policies?

Possibly, though that might be overkill. We don't want to bog this down
so much that it doesn't make progress. I would lean toward picking a
policy and then incrementally adding features as needed.

Further, we must cope with the combination of instructions: the
information associated to two instructions eligible for combination,
how are combined?

- Information transformation - the information associated to two
instruction A, B, which are combined into an instruction C, should
be properly transformed according to a user-specific policy.

A default policy may be "assign both information of A and B to C"
(gather-all/assign-all policy?)

Again, I would lean toward just assigning both pieces of information and
providing utilities to scrub the result if necessary. If it turns out
that other cases are common, we can add other default policies.

I agree!

- No bifurcation between "well-known"/"built-in" information and things
added later/locally

May I ask you to elaborate a bit more about this point?

Sure. The current IR metadata is bifurcated. Some pieces of
information are more "first-class" than others. For example there are
specialized metadata nodes (see the LLVM Language Reference Manual) while
other pieces of metadata are simple strings or numbers.

It would be simplest/easiest if metadata were handled uniformly.

Ok, so this boils down to a uniform usage of the metadata.

- Should not impact compile time excessively (what is "excessive?")

Probably, such estimation should be performed on

Did something get cut off here?

Oops. Yep, I removed a paragraph, but apparently I forgot to remove its
first sentence. In any case, we should discuss how to quantitatively
determine an acceptable upper bound on the compile-time overhead and
give a motivation for it. For instance, a maximum of n% overhead on the
compilation time must be guaranteed, because ** list of reasons **.

Of course, first we should identify the worst-case scenario; probably
the case where all the MIR elements are decorated with metadata, and all
the API functionalities are employed?
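
Whatever value of n we settle on, the check itself is simple arithmetic.
A minimal Python sketch follows; the function names are made up, and the
timings and budget used with them are placeholders, not measurements:

```python
def overhead_percent(baseline_s, measured_s):
    """Relative compile-time overhead of the measured build, in percent."""
    return (measured_s - baseline_s) / baseline_s * 100.0


def within_budget(baseline_s, measured_s, max_percent):
    """True when the overhead does not exceed the agreed upper bound."""
    return overhead_percent(baseline_s, measured_s) <= max_percent
```

For example, a baseline of 8.0 s and a worst-case build of 8.5 s give an
overhead of 6.25%, which would fail a 5% budget and pass a 10% one. The
hard part is choosing the worst-case workload and defending n, not the
computation.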

What about the granularity level?

- Granularity level - metadata information should be attachable at
  different levels of granularity:

  - *Coarse*: MachineFunction level
  - *Medium*: MachineBasicBlock level
  - *Fine*: MachineInstruction level

Clearly, there are other degrees of granularity and/or dimensions to be
considered (e.g., LiveInterval, MIBundles, Loops, ...).

It's probably a good idea to list at least the levels of granularity we
expect to need. I'd start with function/block/instruction as I can
imagine uses for all three. I am less sure about the other levels you
mention. We can add more capability later if needed.
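
As a rough illustration of the three levels, here is a minimal sketch in plain C++, not LLVM code. `Granularity`, `MetadataTable`, `attach` and `lookup` are hypothetical names, and identifying each function, block or instruction by a numeric id is an assumption made only to keep the example self-contained.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical levels, mirroring the function/block/instruction list.
enum class Granularity { Function, BasicBlock, Instruction };

// Hypothetical side table: metadata lives outside the MIR entities, keyed
// by (level, numeric id), so undecorated entities take no space in it.
class MetadataTable {
  std::map<std::pair<Granularity, unsigned>, std::vector<std::string>> Table;

public:
  void attach(Granularity Level, unsigned Id, const std::string &Note) {
    assert(!Note.empty() && "attaching an empty note is likely a bug");
    Table[std::make_pair(Level, Id)].push_back(Note);
  }

  // Returns the notes attached to the entity, or an empty list if none.
  const std::vector<std::string> &lookup(Granularity Level,
                                         unsigned Id) const {
    static const std::vector<std::string> Empty;
    auto It = Table.find(std::make_pair(Level, Id));
    return It == Table.end() ? Empty : It->second;
  }
};
```

Adding a further level later (for example bundles or loops) would only mean adding an enumerator in this sketch, which fits the idea of starting with three levels and extending as needed.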

Sorry for the long delay!

No problem! I know I'm extremely busy as I'm sure we all are.

Since you initially raised the topic, do you want to take the lead in
writing up an RFC? I can certainly do it too but I want to give you
right of first refusal.
-David

Uhm... actually, it wasn't me but Son Tuan, so the right of refusal
should be granted to him. I also just noticed that he wasn't included in
the CC of all our mails; I hope he was able to follow our discussion
anyway. I am adding him to this mail; let us wait and see whether he has
any critical feature or point to discuss.

Thank you, David

-- Lorenzo

David_Greene · November 4, 2020, 4:40pm

Sorry about the late reply.

Lorenzo Casalino <lorenzo.casalino93@gmail.com> writes:

- Should not impact compile time excessively (what is "excessive?")

Probably, such estimation should be performed on

Did something get cut off here?

Oops. Yes, I removed a paragraph but apparently forgot to delete its
first sentence. In any case, we should discuss how to quantitatively
determine an acceptable upper bound on the compile-time overhead and
give a motivation for it. For instance, a maximum overhead of n% on
compilation time must be guaranteed, because ** list of reasons **.

I am not sure how we'd arrive at such a number or motivate/defend it.
Do we have any sense of the impact of the existing metadata
infrastructure? If not I'm not sure we can do it for something
completely new. I think we can set a goal but we'd have to revise it as
we gain experience.

Since you initially raised the topic, do you want to take the lead in
writing up an RFC? I can certainly do it too but I want to give you
right of first refusal.
-David

Uhm... actually, it wasn't me but Son Tuan, so the right of refusal
should be granted to him. I also just noticed that he wasn't included in
the CC of all our mails; I hope he was able to follow our discussion
anyway. I am adding him to this mail; let us wait and see whether he has
any critical feature or point to discuss.

Fair enough! I have recently taken on a lot more work so unfortunately
I can't devote a lot of time to this at the moment. I've got to clear
out my pipeline first. I'd be very happy to help review text, etc.

-David

DoktorC · November 4, 2020, 5:30pm

Sorry about the late reply.

Lorenzo Casalino <lorenzo.casalino93@gmail.com> writes:

- Should not impact compile time excessively (what is "excessive?")

Probably, such estimation should be performed on

Did something get cut off here?

Oops. Yes, I removed a paragraph but apparently forgot to delete its
first sentence. In any case, we should discuss how to quantitatively
determine an acceptable upper bound on the compile-time overhead and
give a motivation for it. For instance, a maximum overhead of n% on
compilation time must be guaranteed, because ** list of reasons **.

I am not sure how we'd arrive at such a number or motivate/defend it.
Do we have any sense of the impact of the existing metadata
infrastructure? If not I'm not sure we can do it for something
completely new. I think we can set a goal but we'd have to revise it as
we gain experience.

I think that is the best approach to take.

Since you initially raised the topic, do you want to take the lead in
writing up an RFC? I can certainly do it too but I want to give you
right of first refusal.
-David

Uhm... actually, it wasn't me but Son Tuan, so the right of refusal
should be granted to him. I also just noticed that he wasn't included in
the CC of all our mails; I hope he was able to follow our discussion
anyway. I am adding him to this mail; let us wait and see whether he has
any critical feature or point to discuss.

Fair enough! I have recently taken on a lot more work so unfortunately
I can't devote a lot of time to this at the moment. I've got to clear
out my pipeline first. I'd be very happy to help review text, etc.

Do not worry, it is OK. While we wait for any feedback/input from Son,
I'll try to prepare a draft of the RFC and publish it here.

Thank you David, and have a nice day

-- Lorenzo

Son_Tuan_VU · November 8, 2020, 11:30pm

Hi,

Thank you all for keeping this going. Indeed I was not aware that the discussion was going on, I am really sorry for this late reply.

I understand Chris’ point about metadata design. Either the metadata becomes stale or is removed (if we do not teach transformations to preserve it), or we end up modifying many (if not all) transformations to keep the data intact.
Currently in the IR, I feel like the default behavior is to ignore/remove the metadata, and only a limited number of transformations know how to maintain and update it, which is a best-effort approach.
That being said, my initial thought was to adopt this approach to the MIR, so that we can at least have a minimal mechanism to communicate additional information to various transformations, or even dump it to the asm/object file.
In other words, it is the responsibility of the users who introduce/use the metadata in the MIR to teach the transformations they selected how to preserve their metadata. A common API to abstract this would definitely help, just as combineMetadata() from lib/Transforms/Utils/Local.cpp does.
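
A minimal sketch of what such a best-effort helper could look like, in plain C++ rather than real LLVM code. It is only loosely inspired by the allow-list idea and is not the actual combineMetadata() implementation: `KindMap`, `combineKnown` and the kind names used with it are hypothetical, and keeping a known kind only when both instructions agree on its value is one conservative choice among several.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical metadata of one instruction: kind name -> single value.
using KindMap = std::map<std::string, std::string>;

// Best-effort merge when instruction K survives and J is folded into it.
// Kinds outside the Known allow-list are dropped. A known kind is kept
// only if J carries the same value, so nothing stale survives.
KindMap combineKnown(const KindMap &K, const KindMap &J,
                     const std::set<std::string> &Known) {
  KindMap Result;
  for (const auto &Entry : K) {
    if (!Known.count(Entry.first))
      continue; // Unknown kind: the default is to drop it.
    auto It = J.find(Entry.first);
    if (It != J.end() && It->second == Entry.second)
      Result.insert(Entry);
  }
  assert(Result.size() <= K.size() && "merging never invents metadata");
  return Result;
}
```

With this shape, a user who introduces a new kind adds it to the allow-list of the transformations they care about, and every other transformation keeps dropping it by default.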

As for my use case, it is also security-related. However, I do not consider the metadata to be a compilation “correctness” criterion: metadata, by definition (from the LLVM IR), can be safely removed without affecting the program’s correctness.
If possible, I would like to have more details on Lorenzo’s use case in order to see how metadata would interfere with the program’s correctness.

As for the RFC, I can definitely try to write one, but this would be my first time doing so. But maybe it is better to start with Lorenzo’s proposal, as you have already been working on this? Please tell me if you prefer me to start the RFC though.

Thank you again for keeping this going.

Sincerely,

Son
