DebugInfo and Metadata Store

Hi All,

Now, in llvm trunk we support custom metadata, which can be attached
with llvm instructions. One of the user is "debugging information".
This user is special (not just because it is in llvm svn tree:)
because whenever debug info is available, it is likely that
corresponding metadata is attached with almost all instructions. In
other words, usually it is all or nothing. This means, using Metadata
Store to access debugging information attached with each llvm
instruction is expensive, in terms of compile time. If debugging
information is directly stored inside llvm::Instruction then it speeds
up compile time. In my measurements this speeds up LLVMContext.cpp
compile time (-O0 -g) by almost 10%. I do not see any strong reason to
give up this gain.

Thoughts ? I am attaching tentative patch with this email.

smd.diff (6.74 KB)

Hi Devang,

Before I get to the contents of this particular patch, I have a higher-level design question. Why is metadata stored off the LLVMContext? That seems like the wrong place to me, this data is all specific to the particular module. Specifically, a single MDNode may not cross modules, and if a module is destroyed we should eliminate all the metadata that came with it.

So, why MetadataContext at all?

Nick

Devang Patel wrote:

Hi Devang,

Before I get to the contents of this particular patch, I have a higher-level
design question. Why is metadata stored off the LLVMContext?

Why do we have FPConstants, IntConstants etc.. in LLVMContext ? :slight_smile:

That seems like
the wrong place to me, this data is all specific to the particular module.
Specifically, a single MDNode may not cross modules, and if a module is
destroyed we should eliminate all the metadata that came with it.

Metadata is not connected with a Module. Module does not own metadata.
Metadata is on the side, observing Values.

2009/11/30 Devang Patel <devang.patel@gmail.com>

Hi Devang,

Before I get to the contents of this particular patch, I have a higher-level
design question. Why is metadata stored off the LLVMContext?

Why do we have FPConstants, IntConstants etc… in LLVMContext ? :slight_smile:

So you can build up Constants and Instructions before inserting them where they’ll be used. I suppose that sounds reasonable to do for metadata too.

That seems like
the wrong place to me, this data is all specific to the particular module.
Specifically, a single MDNode may not cross modules, and if a module is
destroyed we should eliminate all the metadata that came with it.

Metadata is not connected with a Module. Module does not own metadata.
Metadata is on the side, observing Values.

But your patch changes that for debug info. You’ve added a pointer to Instruction (a WeakVH DbgInfo). What’s the memory usage penalty of that on a large program?

If we’re willing to pay this memory cost, even for programs that aren’t using debug info, why not just store pointer to a linked list of metadata info in the Instruction and remove the MetadataContext? If it’s a 10% performance win for debug info we should be able to get the exact same win across the board for all metadata with no cost beyond what your patch already proposes.

Nick

What's the memory usage penalty of that on a
large program?

I have not measured that yet.

If we're willing to pay this memory cost, even for programs that aren't
using debug info, why not just store pointer to a linked list of metadata
info in the Instruction and remove the MetadataContext? If it's a 10%
performance win for debug info we should be able to get the exact same win
across the board for all metadata with no cost beyond what your patch
already proposes.

It all depends on how metadata is used. I am myself not very thrilled
about this patch (hence requesting feedback first), but I'd try every
possible opportunity to speedup -O0 -g compile time :slight_smile:

2009/11/30 Devang Patel <devang.patel@gmail.com>

What’s the memory usage penalty of that on a
large program?

I have not measured that yet.

I think that’s the reason we chose the “on the side” design that we did. Please measure it that so we know what tradeoff we’d be making.

If we’re willing to pay this memory cost, even for programs that aren’t
using debug info, why not just store pointer to a linked list of metadata
info in the Instruction and remove the MetadataContext? If it’s a 10%
performance win for debug info we should be able to get the exact same win
across the board for all metadata with no cost beyond what your patch
already proposes.

It all depends on how metadata is used. I am myself not very thrilled
about this patch (hence requesting feedback first), but I’d try every
possible opportunity to speedup -O0 -g compile time :slight_smile:

Sure, I was just thinking it could be made more general. Sparse metadata with the MetadataContext that we have now, dense metadata with a linked-list on the Instruction. If we did this, we could require MDKinds to pick sparse or dense so we don’t have to play mix-and-match all the time.

Nick

Hi Devang,

+ WeakVH DbgInfo;

This holds an MDNode* which contains information which is used
to look up a DebugLoc? It sounds like there's an unnecessary layer
of indirection here. Would it make sense to make Instruction just
have a DebugLoc member instead? I don't know what that would mean
for the getMD API, but my understanding of the Metadata design
was that things like nsw, tail, and align could eventually be
considered Metadata, so perhaps there's a way to handle this.

Also, please mention the issue of increasing the size of
Instruction for non-debug compilation and indicate what the
plan is.

A few misc comments:

- return MDHandlerNames[Name] = Count + 1;
+ return MDHandlerNames[Name] = Count + 1 + 1;

This code is non-obvious enough to justify a comment.

- // MD Handlers are numbered from 1.
+ // MD Handlers are numbered from 1.

Please don't add trailing whitespace, especially to lines which
aren't otherwise changed.

+#include "llvm/Support/Timer.h"

...

+ Timer *MDTimer;

This isn't related to the rest of the patch.

Dan