[RFC] Introduce Dump Accumulator

Introduction

I’m not a fan of keeping important data outside the IR in an analysis. If we’re planning to emit it, it should be represented directly in the IR. Is there some reason we can’t just stick the data in a global variable?

I’m not sure it’s helpful to have a generic mechanism for this; it’s not clear how this would work if multiple different features were trying to emit data into the llvm_dump section at the same time.

-Eli

I like the ability, not sure about the proposed implementation though.

Did you consider a flag that redirects `llvm::outs()` and `llvm::errs()` into sections of the object file instead? So, you'd say:

`clang ... -mllvm -debug-only=inline ... -mllvm -dump-section=.dump`

and you'd get the regular debug output nicely ordered in the `.dump` section.

I mainly want to avoid even more output code in the passes but also be able to collect at least that information. That doesn't mean we couldn't add another output stream that would always/only redirect into the sections.

~ Johannes

I’m not a fan of keeping important data outside the IR in an analysis. If we’re planning to emit it, it should be represented directly in the IR. Is there some reason we can’t just stick the data in a global variable?

The analysis in the scenarios here is external to LLVM - ML training, for example. It’s really a way to do printf, but where the data could be large (challenging in a distributed build env, where IO may be throttled), or non-textual (for instance, capture IR right before a pass). An alternative would be to produce a side-file, but then (again, distributed build), you have to collect those files and concatenate them, and modify the build system to be aware of all that.

I’m not sure it’s helpful to have a generic mechanism for this; it’s not clear how this would work if multiple different features were trying to emit data into the llvm_dump section at the same time.

You could layer the approach: the one llvm_dump section has a pluggable reader.
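The layering idea can be sketched concretely. Everything below is hypothetical illustration, not existing LLVM code: it assumes a simple record framing (length-prefixed producer tag plus length-prefixed payload) so that multiple features can share the one llvm_dump section and a pluggable reader can dispatch each record to the feature that produced it.

```python
# Hypothetical layering sketch: one llvm_dump section, many producers.
# Each record is a (tag, payload) pair; readers register per tag.
# None of these names exist in LLVM -- this only illustrates the idea.
import struct

READERS = {}

def register_reader(tag, fn):
    """Associate a producer tag with a payload decoder."""
    READERS[tag] = fn

def pack_record(tag, payload):
    """Encode one record: u32 tag length, tag, u32 payload length, payload."""
    t = tag.encode()
    return struct.pack("<I", len(t)) + t + struct.pack("<I", len(payload)) + payload

def read_section(data):
    """Walk the concatenated records, dispatching each to its reader."""
    out = []
    off = 0
    while off < len(data):
        (tlen,) = struct.unpack_from("<I", data, off); off += 4
        tag = data[off:off + tlen].decode(); off += tlen
        (plen,) = struct.unpack_from("<I", data, off); off += 4
        payload = data[off:off + plen]; off += plen
        if tag in READERS:
            out.append(READERS[tag](payload))
    return out

register_reader("ml-log", lambda p: ("ml-log", p.decode()))
section = pack_record("ml-log", b"inline depth=3") + pack_record("other", b"\x00")
print(read_section(section))  # records with unknown tags are skipped
```

The point of the framing is that a reader can skip records whose tag it does not recognize, so independent features never need to coordinate beyond agreeing on the framing itself.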

I think that we should think about the relationship between this proposed mechanism and the existing mechanism that we have for emitting and capturing optimization remarks. In some sense, I feel like we already have a lot of this capability (e.g., llc has -remarks-section).

-Hal

Does this also support having relocations in the .llvm_dump section,
like a .rela.llvm_dump? Because we (Rahman) are now prototyping
dumping function and basic block symbols in a designated section,
which seems similar to what you propose except that we need to embed
relocations in the section and let the linker do the relocation
resolution.

Thanks,
Han

I don't understand why we'd add functionality to LLVM that is
external to LLVM and not want to change the build system accordingly.

All of the needs you describe in your email can already be met using
optimization remarks and dumped into a file.

The main difference for you, in the distributed build, is that you
rely on the linker to do the concatenation, and for that you change
the compiler and the ELF files you produce.

Seems to me like a roundabout way of doing concatenation of debug
messages that could easily be done by a simple script and added to
your build system.

In the past, when I wanted something similar, I wrote a small script
that would be called as the compiler (CMAKE_C_COMPILER=myscript) and
it would do the fiddling with return values and the outputs onto local
fast storage. For a distributed build you'd need another script to
copy them into your dispatcher (or something), and that's it.

What am I missing?

cheers,
--renato

One area the proposal doesn’t cover is the ELF file format properties. What section type will this be? What flags will it have? Is the output intended to be part of the executable loadable image or not? Does it even need to be in the final executable output, or is it just being in the object sufficient? The same or similar questions arise if you are considering this for other output formats (e.g. COFF, Mach-O etc) for each of them too.

Could this be emitted as debug info? CodeView debug info (used with COFF/Windows) already supports an (almost) equivalent mechanism called annotations where the programmer can emit arbitrary data directly into the debug info. If you could piggyback off of this same infrastructure, it could avoid reinventing many parts of the same wheel. I can’t recall if clang-cl ever added support for this, but I’m pretty sure it did.

The analysis in the scenarios here is external to LLVM - ML training, for example. It’s really a way to do printf, but where the data could be large (challenging in a distributed build env, where IO may be throttled), or non-textual (for instance, capture IR right before a pass). An alternative would be to produce a side-file, but then (again, distributed build), you have to collect those files and concatenate them, and modify the build system to be aware of all that.

I don’t understand why we’d add functionality to LLVM that is
external to LLVM and not want to change the build system accordingly.

All of the needs you describe in your email can already be met using
optimization remarks and dumped into a file.

The main difference for you, in the distributed build, is that you
rely on the linker to do the concatenation, and for that you change
the compiler and the ELF files you produce.

At a high level, you are right, the two alternatives are similar, but the devil is in the details. The build system (Bazel-based) is hermetic, needs to be aware of all such extra files, and would need a separate rule to copy and concatenate them. This solution turned out to be much cleaner.

From Hal’s earlier message, it seems we already have something along the lines of what we need in the “-remarks-section” (llvm/lib/Remarks/RemarkStreamer.cpp or llvm/docs/Remarks.rst), so I think we need to investigate what, if anything, needs to be added to fit our scenarios.

Thanks - it’s an LLVM-wide feature, not llc-only, from what I can tell - which is good (llc-only wouldn’t have helped us much, we don’t build with llc :slight_smile: ).

We should take a look, indeed.

I was scanning the documentation, and maybe I missed it - so we better understand the feature, do you happen to know what the motivating scenarios were for this, and how one would read the section in the resulting binary (i.e. a custom binary reader, llvm-readobj?)

Thanks!

Cleaner to your build system is not necessarily cleaner to LLVM and
all its users (downstream and upstream) and sub-projects.

What I'm trying to avoid is a custom-built solution injecting random
sections in the ELF binary to fix a problem that a specific build
system has for a specific project. That doesn't scale.

Like Hal, I'm trying to get you to use existing solutions in LLVM to
suit your needs, as the cost of keeping dangling features (ones used
by a single project) in a code as large as LLVM usually doesn't pay
off.

Even if that makes your build system slightly worse, the benefit of
not adding bespoke features, to all LLVM users (including you), is
still very much positive.

--renato

No disagreement there. We illustrate with scenarios we have only because we can speak confidently about them, and the hope was that they might resonate with others’ (we didn’t expect them to be sufficient motivation, had they been ours alone). If it turned out that no one had similar scenarios, or the design was too narrow, we would have sought an outside-LLVM alternative, for all the reasons you mention. In this case, RemarksEmitter seems to suggest related problems were addressed already.

Awesome, thanks!

Thanks - it's an LLVM-wide feature, not llc-only, from what I can tell - which is good (llc-only wouldn't have helped us much, we don't build with llc :slight_smile: ).

Exactly. The optimization-remark infrastructure was designed to be flexible in this regard. Remarks are classes that can be directly fed to a frontend handler (for direct interpretation and/or for display to the user), can be serialized in YAML and read later, etc.

We should take a look, indeed.

I was scanning the documentation, and maybe I missed it - so we better understand the feature, do you happen to know what the motivating scenarios were for this, and how one would read the section in the resulting binary (i.e. a custom binary reader, llvm-readobj?)

My recollection is that the motivation is so that optimization remarks, in their machine-readable/serialized form, can be collected, stored in the binary, and then displayed, along with profiling information, in tools that help with performance profiling and analysis.

-Hal

Hi,

This is definitely something that can go through the remarks infrastructure. This is how it works now on darwin (mach-o) platforms:

* the remarks are streamed in a separate file
* the object file will contain a section with metadata and the path to the separate remarks file
* the linker (ld64) ignores the segment that contains the remarks section (__LLVM,__remarks)
* dsymutil (which is in charge of merging debug info from all .o's) will pick the remarks and generate a standalone merged remark file

From there dsymutil can be told to generate YAML, or we can use libRemarks to work with the binary format. The stuff in llvm/lib/Remarks should be easy enough to use to write tools for this. A simple one that I always have handy is to convert between file formats, or extract from object files to a standalone file (I should put this upstream someday).

This is all based on the fact that the mach-o linker (ld64) adds a link back to all the object files, which then dsymutil (or any other tool that needs debug info, like lldb) uses to find and merge things together.

On the ELF side, I can imagine that before we go into teaching linkers to understand the sections, we can teach the remark processing tools to understand the concatenation of all the section contents. When using the bitstream-based format, just pointing the parser at the metadata is enough to start looping over all the remarks, whether it’s a standalone file or something baked into a section. From there, it’s just a matter of adding an outer loop to that.
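That "outer loop" over concatenated remark streams can be sketched in a few lines. The sketch below assumes the textual YAML remark layout from llvm/docs/Remarks.rst, where each remark is a YAML document starting with a `--- !<Type>` marker; it is a standalone illustration, not the actual libRemarks parser.

```python
# Sketch of an outer loop over concatenated remark streams, e.g. the
# contents of several remark sections glued together.
# Assumes the YAML remark layout from llvm/docs/Remarks.rst, where each
# remark is a YAML document opened by a "--- !<Type>" marker.

def split_remarks(blob):
    """Split a concatenated YAML remark stream into per-remark chunks."""
    remarks, current = [], None
    for line in blob.splitlines():
        if line.startswith("--- !"):
            if current:
                remarks.append("\n".join(current))
            current = [line]
        elif current is not None:
            current.append(line)
    if current:
        remarks.append("\n".join(current))
    return remarks

def remark_type(doc):
    """Return the remark kind (Passed/Missed/Analysis) from its marker."""
    return doc.splitlines()[0][len("--- !"):]

stream = (
    "--- !Passed\nPass: inline\nName: Inlined\nFunction: foo\n"
    "--- !Missed\nPass: inline\nName: NoDefinition\nFunction: bar\n"
)
docs = split_remarks(stream)
print([remark_type(d) for d in docs])  # ['Passed', 'Missed']
```

A real tool would hand each chunk to a proper YAML (or bitstream) remark parser; the only new piece the concatenated-section case adds is this outer splitting loop.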

I’m happy to review any changes there!

Adding support to llvm-readobj? That would also be useful!

What’s the easiest way to read the output today? Feed the YAML file (from dsymutil, or whatever tool was used to merge them together) to llvm/tools/opt-viewer.

One thing that might be worth looking into is better LTO support, as we lose the remarks that get generated before the intermediate bitcode emission (we need to track them in the IR somehow).

— Francis