RFC: Adding a string table to the bitcode format

Hi,

As part of PR27551 I want to add a string table to the bitcode format to allow global value and comdat names to be shared with the proposed symbol table (and, as side effects, allow comdat names to be shared with value names, make bitcode files more compressible and make bitcode easier to parse). The format of the string table would be a top-level block containing a blob containing null-terminated strings [0] similar to the string table format used in most object files.

The format of MODULE_CODE_{FUNCTION,GLOBALVAR,ALIAS,IFUNC,COMDAT} records would change so that their first operand would specify their names with a byte offset into the string table. (To allow for backwards compatibility, I would increment the bitcode version.) Here is what it would look like as bcanalyzer output:

<MODULE_BLOCK>

<COMDAT op0=0 …> ; name = foo
<FUNCTION op0=0 …> ; name = foo
<GLOBALVAR op0=4 …> ; name = bar
<ALIAS op0=8 …> ; name = baz
; function bodies, etc.

</MODULE_BLOCK>
<STRTAB_BLOCK>
<STRTAB_BLOB blob=“foo\0bar\0baz\0”>
</STRTAB_BLOCK>

Each STRTAB_BLOCK would apply to all preceding MODULE_BLOCKs. This means that bitcode files can continue to be concatenated with “llvm-cat -b”. (Normally bitcode files would contain a single string table, which in multi-module bitcode files would be shared between modules.)

This almost allows us to remove the global (top-level) VST entirely, if not for the function offset in the FNENTRY record. However, this offset is not actually required because we can scan the module’s FUNCTION_BLOCK_IDs as we were doing before http://reviews.llvm.org/D12536 (this may have a performance impact, so I’ll measure it first).

Assuming that performance looks good, does this seem reasonable to folks?

Thanks,

Can this be merged somehow with the METADATA_STRINGS record? (Perhaps, by having METADATA_STRINGS refer to this?) I suspect there is significant duplication between the two. There might be a nice space optimization here.

Note that the strings in METADATA_STRINGS can contain null characters. You’d need a different storage format.

Hi,

As part of PR27551 I want to add a string table to the bitcode format to allow global value and comdat names to be shared with the proposed symbol table (and, as side effects, allow comdat names to be shared with value names, make bitcode files more compressible and make bitcode easier to parse). The format of the string table would be a top-level block containing a blob containing null-terminated strings [0] similar to the string table format used in most object files.

I’m in favor of this, but note that currently string can be encoded with less than 8 bits / char in some cases (there might some size increase because of this).
That said we already paid this with the metadata table in the recent past for example.

The format of MODULE_CODE_{FUNCTION,GLOBALVAR,ALIAS,IFUNC,COMDAT} records would change so that their first operand would specify their names with a byte offset into the string table. (To allow for backwards compatibility, I would increment the bitcode version.)

I assume you mean the EPOCH?

Here is what it would look like as bcanalyzer output:

<MODULE_BLOCK>

<COMDAT op0=0 …> ; name = foo
<FUNCTION op0=0 …> ; name = foo
<GLOBALVAR op0=4 …> ; name = bar
<ALIAS op0=8 …> ; name = baz
; function bodies, etc.

</MODULE_BLOCK>
<STRTAB_BLOCK>
<STRTAB_BLOB blob=“foo\0bar\0baz\0”>
</STRTAB_BLOCK>

Why is the string table after the module instead of before?

Each STRTAB_BLOCK would apply to all preceding MODULE_BLOCKs. This means that bitcode files can continue to be concatenated with “llvm-cat -b”. (Normally bitcode files would contain a single string table, which in multi-module bitcode files would be shared between modules.)

This almost allows us to remove the global (top-level) VST entirely, if not for the function offset in the FNENTRY record. However, this offset is not actually required because we can scan the module’s FUNCTION_BLOCK_IDs as we were doing before http://reviews.llvm.org/D12536 (this may have a performance impact, so I’ll measure it first).

Assuming that performance looks good, does this seem reasonable to folks?

I rather seek to have a symbol table that entirely replace the VST, kee. If there is a perf impact with the FNENTRY offset, why can’t it be replicated in the symbol table?

Thanks for driving this,

Hi,

As part of PR27551 I want to add a string table to the bitcode format to
allow global value and comdat names to be shared with the proposed symbol
table (and, as side effects, allow comdat names to be shared with value
names, make bitcode files more compressible and make bitcode easier to
parse). The format of the string table would be a top-level block
containing a blob containing null-terminated strings [0] similar to the
string table format used in most object files.

I’m in favor of this, but note that currently string can be encoded with
less than 8 bits / char in some cases (there might some size increase
because of this).
That said we already paid this with the metadata table in the recent past
for example.

The format of MODULE_CODE_{FUNCTION,GLOBALVAR,ALIAS,IFUNC,COMDAT}
records would change so that their first operand would specify their names
with a byte offset into the string table. (To allow for backwards
compatibility, I would increment the bitcode version.)

I assume you mean the EPOCH?

Here is what it would look like as bcanalyzer output:

<MODULE_BLOCK>
  <VERSION op0=2>
  <COMDAT op0=0 ...> ; name = foo
  <FUNCTION op0=0 ...> ; name = foo
  <GLOBALVAR op0=4 ...> ; name = bar
  <ALIAS op0=8 ...> ; name = baz
; function bodies, etc.
</MODULE_BLOCK>
<STRTAB_BLOCK>
  <STRTAB_BLOB blob="foo\0bar\0baz\0">
</STRTAB_BLOCK>

Why is the string table after the module instead of before?

Each STRTAB_BLOCK would apply to all preceding MODULE_BLOCKs. This means
that bitcode files can continue to be concatenated with "llvm-cat -b".

Do you mean "apply to all preceding MODULE_BLOCKs that aren't followed by

an intervening STRTAB_BLOCK"? I.e. when bitcode files are concatenated you
presumably don't want to apply a STRTAB_BLOCK to a MODULE_BLOCK from a
different input bitcode file that has its own STRTAB_BLOCK.

(Normally bitcode files would contain a single string table, which in
multi-module bitcode files would be shared between modules.)

This *almost* allows us to remove the global (top-level) VST entirely, if
not for the function offset in the FNENTRY record. However, this offset is
not actually required because we can scan the module's FUNCTION_BLOCK_IDs
as we were doing before http://reviews.llvm.org/D12536 (this may have a
performance impact, so I'll measure it first).

Assuming that performance looks good, does this seem reasonable to folks?

I rather seek to have a symbol table that entirely replace the VST, kee.
If there is a perf impact with the FNENTRY offset, why can’t it be
replicated in the symbol table?

Won't the new symbol table be added before the top-level VST can be
removed, i.e. you need the linkage types etc right? In that case, can the
offset just be added to the new symbol table? That would be more analogous
to object file symbol tables which also have an offset anyway.

Thanks,
Teresa

I’m not sure I read you correctly, isn’t it what I suggested?

Hi,

As part of PR27551 I want to add a string table to the bitcode format to
allow global value and comdat names to be shared with the proposed symbol
table (and, as side effects, allow comdat names to be shared with value
names, make bitcode files more compressible and make bitcode easier to
parse). The format of the string table would be a top-level block
containing a blob containing null-terminated strings [0] similar to the
string table format used in most object files.

I’m in favor of this, but note that currently string can be encoded with
less than 8 bits / char in some cases (there might some size increase
because of this).
That said we already paid this with the metadata table in the recent past
for example.

The format of MODULE_CODE_{FUNCTION,GLOBALVAR,ALIAS,IFUNC,COMDAT}
records would change so that their first operand would specify their names
with a byte offset into the string table. (To allow for backwards
compatibility, I would increment the bitcode version.)

I assume you mean the EPOCH?

Here is what it would look like as bcanalyzer output:

<MODULE_BLOCK>
  <VERSION op0=2>
  <COMDAT op0=0 ...> ; name = foo
  <FUNCTION op0=0 ...> ; name = foo
  <GLOBALVAR op0=4 ...> ; name = bar
  <ALIAS op0=8 ...> ; name = baz
; function bodies, etc.
</MODULE_BLOCK>
<STRTAB_BLOCK>
  <STRTAB_BLOB blob="foo\0bar\0baz\0">
</STRTAB_BLOCK>

Why is the string table after the module instead of before?

Each STRTAB_BLOCK would apply to all preceding MODULE_BLOCKs. This means
that bitcode files can continue to be concatenated with "llvm-cat -b".

Do you mean "apply to all preceding MODULE_BLOCKs that aren't followed by

an intervening STRTAB_BLOCK"? I.e. when bitcode files are concatenated you
presumably don't want to apply a STRTAB_BLOCK to a MODULE_BLOCK from a
different input bitcode file that has its own STRTAB_BLOCK.

(Normally bitcode files would contain a single string table, which in
multi-module bitcode files would be shared between modules.)

This *almost* allows us to remove the global (top-level) VST entirely, if
not for the function offset in the FNENTRY record. However, this offset is
not actually required because we can scan the module's FUNCTION_BLOCK_IDs
as we were doing before http://reviews.llvm.org/D12536 (this may have a
performance impact, so I'll measure it first).

Assuming that performance looks good, does this seem reasonable to folks?

I rather seek to have a symbol table that entirely replace the VST, kee.
If there is a perf impact with the FNENTRY offset, why can’t it be
replicated in the symbol table?

Won't the new symbol table be added before the top-level VST can be
removed, i.e. you need the linkage types etc right? In that case, can the
offset just be added to the new symbol table? That would be more analogous
to object file symbol tables which also have an offset anyway.

I’m not sure I read you correctly, isn’t it what I suggested?

It is - I'm just wondering why we would consider removing the offset since
other things have to be moved from the VST to a new symbol table anyway.
I.e., confused by pcc's comment that this " *almost* allows us to remove
the global (top-level) VST entirely, if not for the function offset in the
FNENTRY record" - there are currently other things in the VST that we need
to a new symbol before it can be removed, and I'm not sure why this is any
different.

Teresa

Maybe, and I think it can be done while still allowing symbol names to be null terminated. (Basically, we would allow strings in the string table to be optionally null terminated, and allow null terminated strings to stand in for non-null-terminated strings if they are present in the string table.)

I think this is something that can be done orthogonally, though.

Peter

Hi,

As part of PR27551 I want to add a string table to the bitcode format to
allow global value and comdat names to be shared with the proposed symbol
table (and, as side effects, allow comdat names to be shared with value
names, make bitcode files more compressible and make bitcode easier to
parse). The format of the string table would be a top-level block
containing a blob containing null-terminated strings [0] similar to the
string table format used in most object files.

I’m in favor of this, but note that currently string can be encoded with
less than 8 bits / char in some cases (there might some size increase
because of this).

Sure, but I think we need to make the right tradeoff between making data
more efficient to read and using fewer bits. In this case I think the right
tradeoff is clearly in favour of being efficient to read, because accessing
it is in the critical path of a consumer (i.e. a linker), and the part that
needs to be efficient to read is a relatively small part of the data in the
bitcode file. The same logic applies to the symbol table (note that we use
support::ulittle32_t instead of a bit encoding).

That said we already paid this with the metadata table in the recent past

for example.

The format of MODULE_CODE_{FUNCTION,GLOBALVAR,ALIAS,IFUNC,COMDAT}
records would change so that their first operand would specify their names
with a byte offset into the string table. (To allow for backwards
compatibility, I would increment the bitcode version.)

I assume you mean the EPOCH?

No, the MODULE_CODE_VERSION.
http://llvm-cs.pcc.me.uk/lib/Bitcode/Writer/BitcodeWriter.cpp#3822
It isn't clear to me why we have both.

Here is what it would look like as bcanalyzer output:

<MODULE_BLOCK>
  <VERSION op0=2>
  <COMDAT op0=0 ...> ; name = foo
  <FUNCTION op0=0 ...> ; name = foo
  <GLOBALVAR op0=4 ...> ; name = bar
  <ALIAS op0=8 ...> ; name = baz
; function bodies, etc.
</MODULE_BLOCK>
<STRTAB_BLOCK>
  <STRTAB_BLOB blob="foo\0bar\0baz\0">
</STRTAB_BLOCK>

Why is the string table after the module instead of before?

For implementation simplicity. The idea is that the BitcodeWriter would
have a member of type StringTableBuilder which would accumulate strings
while writing the bitcode module(s) (and symtab in the future). At the end,
the client would call something like BitcodeWriter::writeStrtab() which
would write out the string table.

Each STRTAB_BLOCK would apply to all preceding MODULE_BLOCKs. This means
that bitcode files can continue to be concatenated with "llvm-cat -b".
(Normally bitcode files would contain a single string table, which in
multi-module bitcode files would be shared between modules.)

This *almost* allows us to remove the global (top-level) VST entirely, if
not for the function offset in the FNENTRY record. However, this offset is
not actually required because we can scan the module's FUNCTION_BLOCK_IDs
as we were doing before http://reviews.llvm.org/D12536 (this may have a
performance impact, so I'll measure it first).

Assuming that performance looks good, does this seem reasonable to folks?

I rather seek to have a symbol table that entirely replace the VST, kee.
If there is a perf impact with the FNENTRY offset, why can’t it be
replicated in the symbol table?

Sure, we could in principle store function offsets in the symbol table as
well, if that helps with performance. But I want to measure the impact and
find out whether that is actually the case first.

Thanks,

Hi,

As part of PR27551 I want to add a string table to the bitcode format to
allow global value and comdat names to be shared with the proposed symbol
table (and, as side effects, allow comdat names to be shared with value
names, make bitcode files more compressible and make bitcode easier to
parse). The format of the string table would be a top-level block
containing a blob containing null-terminated strings [0] similar to the
string table format used in most object files.

I’m in favor of this, but note that currently string can be encoded with
less than 8 bits / char in some cases (there might some size increase
because of this).
That said we already paid this with the metadata table in the recent past
for example.

The format of MODULE_CODE_{FUNCTION,GLOBALVAR,ALIAS,IFUNC,COMDAT}
records would change so that their first operand would specify their names
with a byte offset into the string table. (To allow for backwards
compatibility, I would increment the bitcode version.)

I assume you mean the EPOCH?

Here is what it would look like as bcanalyzer output:

<MODULE_BLOCK>
  <VERSION op0=2>
  <COMDAT op0=0 ...> ; name = foo
  <FUNCTION op0=0 ...> ; name = foo
  <GLOBALVAR op0=4 ...> ; name = bar
  <ALIAS op0=8 ...> ; name = baz
; function bodies, etc.
</MODULE_BLOCK>
<STRTAB_BLOCK>
  <STRTAB_BLOB blob="foo\0bar\0baz\0">
</STRTAB_BLOCK>

Why is the string table after the module instead of before?

Each STRTAB_BLOCK would apply to all preceding MODULE_BLOCKs. This means
that bitcode files can continue to be concatenated with "llvm-cat -b".

Do you mean "apply to all preceding MODULE_BLOCKs that aren't followed by

an intervening STRTAB_BLOCK"? I.e. when bitcode files are concatenated you
presumably don't want to apply a STRTAB_BLOCK to a MODULE_BLOCK from a
different input bitcode file that has its own STRTAB_BLOCK.

Yes, sorry, that is exactly what I meant.

(Normally bitcode files would contain a single string table, which in

multi-module bitcode files would be shared between modules.)

This *almost* allows us to remove the global (top-level) VST entirely, if
not for the function offset in the FNENTRY record. However, this offset is
not actually required because we can scan the module's FUNCTION_BLOCK_IDs
as we were doing before http://reviews.llvm.org/D12536 (this may have a
performance impact, so I'll measure it first).

Assuming that performance looks good, does this seem reasonable to folks?

I rather seek to have a symbol table that entirely replace the VST, kee.
If there is a perf impact with the FNENTRY offset, why can’t it be
replicated in the symbol table?

Won't the new symbol table be added before the top-level VST can be
removed, i.e. you need the linkage types etc right? In that case, can the
offset just be added to the new symbol table? That would be more analogous
to object file symbol tables which also have an offset anyway.

The VST only stores names (and function offsets). The other attributes are
stored on the MODULE_CODE_{FUNCTION,GLOBALVAR,ALIAS,IFUNC} records. So once
we move the names elsewhere, the VST isn't really storing much data at all.

As I mentioned to Mehdi, we could indeed store the function offset in the
symbol table. That would be done in a separate step to this change, which
is just about string tables.

Thanks,

The same logic might suggest storing string sizes (as a prefix) to avoid scanning for string lengths.

METADATA_STRINGS is quite large and reading it should be fast (had an impact on LTO compile time IIRC). It might be worth thinking about the design of the new symbol table up front and confirming that it’s fast enough to support METADATA_STRINGS.

Hi,

As part of PR27551 I want to add a string table to the bitcode format to
allow global value and comdat names to be shared with the proposed symbol
table (and, as side effects, allow comdat names to be shared with value
names, make bitcode files more compressible and make bitcode easier to
parse). The format of the string table would be a top-level block
containing a blob containing null-terminated strings [0] similar to the
string table format used in most object files.

I’m in favor of this, but note that currently string can be encoded with
less than 8 bits / char in some cases (there might some size increase
because of this).

Sure, but I think we need to make the right tradeoff between making data
more efficient to read and using fewer bits. In this case I think the right
tradeoff is clearly in favour of being efficient to read, because accessing
it is in the critical path of a consumer (i.e. a linker), and the part that
needs to be efficient to read is a relatively small part of the data in the
bitcode file. The same logic applies to the symbol table (note that we use
support::ulittle32_t instead of a bit encoding).

The same logic might suggest storing string sizes (as a prefix) to avoid
scanning for string lengths.

It might do, keeping in mind that reading pretty much every existing object
file format already requires scanning for string lengths. Certainly
something to try and evaluate, at least.

Peter

Hi,

As part of PR27551 I want to add a string table to the bitcode format to
allow global value and comdat names to be shared with the proposed symbol
table (and, as side effects, allow comdat names to be shared with value
names, make bitcode files more compressible and make bitcode easier to
parse). The format of the string table would be a top-level block
containing a blob containing null-terminated strings [0] similar to the
string table format used in most object files.

I’m in favor of this, but note that currently string can be encoded with
less than 8 bits / char in some cases (there might some size increase
because of this).
That said we already paid this with the metadata table in the recent
past for example.

The format of MODULE_CODE_{FUNCTION,GLOBALVAR,ALIAS,IFUNC,COMDAT}
records would change so that their first operand would specify their names
with a byte offset into the string table. (To allow for backwards
compatibility, I would increment the bitcode version.)

I assume you mean the EPOCH?

Here is what it would look like as bcanalyzer output:

<MODULE_BLOCK>
  <VERSION op0=2>
  <COMDAT op0=0 ...> ; name = foo
  <FUNCTION op0=0 ...> ; name = foo
  <GLOBALVAR op0=4 ...> ; name = bar
  <ALIAS op0=8 ...> ; name = baz
; function bodies, etc.
</MODULE_BLOCK>
<STRTAB_BLOCK>
  <STRTAB_BLOB blob="foo\0bar\0baz\0">
</STRTAB_BLOCK>

Why is the string table after the module instead of before?

Each STRTAB_BLOCK would apply to all preceding MODULE_BLOCKs. This means
that bitcode files can continue to be concatenated with "llvm-cat -b".

Do you mean "apply to all preceding MODULE_BLOCKs that aren't followed

by an intervening STRTAB_BLOCK"? I.e. when bitcode files are concatenated
you presumably don't want to apply a STRTAB_BLOCK to a MODULE_BLOCK from a
different input bitcode file that has its own STRTAB_BLOCK.

Yes, sorry, that is exactly what I meant.

(Normally bitcode files would contain a single string table, which in

multi-module bitcode files would be shared between modules.)

This *almost* allows us to remove the global (top-level) VST entirely,
if not for the function offset in the FNENTRY record. However, this offset
is not actually required because we can scan the module's
FUNCTION_BLOCK_IDs as we were doing before http://reviews.llvm.org
/D12536 (this may have a performance impact, so I'll measure it first).

Assuming that performance looks good, does this seem reasonable to folks?

I rather seek to have a symbol table that entirely replace the VST, kee.
If there is a perf impact with the FNENTRY offset, why can’t it be
replicated in the symbol table?

Won't the new symbol table be added before the top-level VST can be
removed, i.e. you need the linkage types etc right? In that case, can the
offset just be added to the new symbol table? That would be more analogous
to object file symbol tables which also have an offset anyway.

The VST only stores names (and function offsets). The other attributes are
stored on the MODULE_CODE_{FUNCTION,GLOBALVAR,ALIAS,IFUNC} records. So
once we move the names elsewhere, the VST isn't really storing much data at
all.

Ok right, that's true... We could probably benchmark the removal of the
offsets on a clang ThinLTO bootstrap. As mentioned off-list to pcc, the
theoretical benefit when I added those offsets was largely because we were
planning to do iterative importing in the ThinLTO backends, which of course
we don't do anymore.

Teresa

There is already a traversal of the module for value numbering, building the StringTable at the same time seems quite natural to me.

Hi,

As part of PR27551 I want to add a string table to the bitcode format to
allow global value and comdat names to be shared with the proposed symbol
table (and, as side effects, allow comdat names to be shared with value
names, make bitcode files more compressible and make bitcode easier to
parse). The format of the string table would be a top-level block
containing a blob containing null-terminated strings [0] similar to the
string table format used in most object files.

I’m in favor of this, but note that currently string can be encoded with
less than 8 bits / char in some cases (there might some size increase
because of this).

Sure, but I think we need to make the right tradeoff between making data
more efficient to read and using fewer bits. In this case I think the right
tradeoff is clearly in favour of being efficient to read, because accessing
it is in the critical path of a consumer (i.e. a linker), and the part that
needs to be efficient to read is a relatively small part of the data in the
bitcode file. The same logic applies to the symbol table (note that we use
support::ulittle32_t instead of a bit encoding).

That said we already paid this with the metadata table in the recent past

for example.

The format of MODULE_CODE_{FUNCTION,GLOBALVAR,ALIAS,IFUNC,COMDAT}
records would change so that their first operand would specify their names
with a byte offset into the string table. (To allow for backwards
compatibility, I would increment the bitcode version.)

I assume you mean the EPOCH?

No, the MODULE_CODE_VERSION.
http://llvm-cs.pcc.me.uk/lib/Bitcode/Writer/BitcodeWriter.cpp#3822
It isn't clear to me why we have both.

Here is what it would look like as bcanalyzer output:

<MODULE_BLOCK>
  <VERSION op0=2>
  <COMDAT op0=0 ...> ; name = foo
  <FUNCTION op0=0 ...> ; name = foo
  <GLOBALVAR op0=4 ...> ; name = bar
  <ALIAS op0=8 ...> ; name = baz
; function bodies, etc.
</MODULE_BLOCK>
<STRTAB_BLOCK>
  <STRTAB_BLOB blob="foo\0bar\0baz\0">
</STRTAB_BLOCK>

Why is the string table after the module instead of before?

For implementation simplicity. The idea is that the BitcodeWriter would
have a member of type StringTableBuilder which would accumulate strings
while writing the bitcode module(s) (and symtab in the future). At the end,
the client would call something like BitcodeWriter::writeStrtab() which
would write out the string table.

There is already a traversal of the module for value numbering, building
the StringTable at the same time seems quite natural to me.

Other modules in the same bitcode file may need to add names to the string
table, and the symbol table builder may also need to add mangled names.
Trying to impose an ordering on all of those components doesn't seem worth
it in my opinion.

Peter

I’d stick with a single table per module, to be able to preserve the ability to perform binary split of modules.

We can still extract individual modules by concatenating the module and the
string table.

Thanks,

It might do, keeping in mind that reading pretty much every existing object
file format already requires scanning for string lengths. Certainly
something to try and evaluate, at least.

In lld strlen does show up in the profile. I haven't benchmarked it
recently enough to remember exactly how much.

The string table builder code currently supports tail merging. On ELF
at least that is a very modest size saving. If the size is written as
a prefix to the string, that doesn't work. If the size is stored in
the reference (like a stringef) we would be able to merge any
substring (not sure if it is profitable).

Cheers,
Rafael

> It might do, keeping in mind that reading pretty much every existing
object
> file format already requires scanning for string lengths. Certainly
> something to try and evaluate, at least.

In lld strlen does show up in the profile. I haven't benchmarked it
recently enough to remember exactly how much.

It used to take substantial share of the total execution time (IIRC ~10%),
but since we read only global symbol names, that isn't taking that much
time now.

The string table builder code currently supports tail merging. On ELF

> It might do, keeping in mind that reading pretty much every existing
object
> file format already requires scanning for string lengths. Certainly
> something to try and evaluate, at least.

In lld strlen does show up in the profile. I haven't benchmarked it
recently enough to remember exactly how much.

So I have spent some time evaluating the various options that have been
discussed on this thread. They are:
1) adding the string table itself (this involves moving all data in the VST
to either the string table or elsewhere in the bitcode, with the exception
of function offsets)
2) representing string references as offset/size
3) removing VST function offsets (and the VST itself)

I am attaching three patches with my prototypes, where t1 implements the
string table, t2 is string table + remove function offsets and t3 is string
table + remove function offsets + offset/size. (They are all against
r299588.) I used these patches to evaluate the perf impact when linking
Chromium with ThinLTO.

Median time elapsed for a no-op incremental link (#runs = 100):
- trunk: 51.71s
- string table: 48.66s (-5.9%)
- string table + remove function offsets: 49.12s (-5.0%)
- string table + remove function offsets + offset/size: 48.10s (-7.0%)

And for a full link (#runs = 13):
- trunk: 605.24s
- string table: 597.88s (-1.3%)
- string table + remove function offsets: 598.99s (-1.1%)
- string table + remove function offsets + offset/size: 598.97s (-1.1%)

It is hard to draw any conclusions from the full ThinLTO links (partly due
to the small #runs) other than that adding the string table is a clear win.
But I think it is clear enough from the incremental results that
offset/size is a win. And as Duncan mentioned it should allow us to more
easily share strings with the metadata string table.

Function offsets do appear to be a win, so it seems likely that we will
want something like them, either in the symbol table or elsewhere. My
conclusion is that there should be no great harm in storing just the
function offsets in the VST for now because, after we add the string table,
the data in the VST will no longer be needed for correctness and any
potential future change that moves the function offsets elsewhere can
simply start ignoring the VST in the reader.

I also measured the impact on file size by taking the total sum of object
file sizes for each of the variants. They are:
- trunk: 1183510504 bytes
- string table: 1151533664 bytes (-2.7%)
- string table + remove function offsets: 1147981736 bytes (-3.0%)
- string table + remove function offsets + offset/size: 1149361632 bytes
(-2.9%)
The decrease in size vs trunk is expected because of sharing of strings
between comdats and globals as well as between modules. But the differences
in sizes between the different options does not seem as important.

So here is what I plan to do:
- Create a new patch that implements string table + offset/size and take
further measurements, to make sure that removing function offsets is not a
confounding factor for performance.
- Clean up the patch and send it for review.

The string table builder code currently supports tail merging. On ELF

at least that is a very modest size saving. If the size is written as
a prefix to the string, that doesn't work. If the size is stored in
the reference (like a stringef) we would be able to merge any
substring (not sure if it is profitable).

I agree that we should store the size in the reference. Tail merging should
help on targets that require IR name mangling because the mangled name
should almost always be a suffix of the unmangled name or vice versa.

Thanks,

t1.diff (67.2 KB)

t2.diff (66.2 KB)

t3.diff (67 KB)