[RFC] CFI for indirect calls with ThinLTO

Hi,

this is a proposal for the implementation of CFI-icall [1] with ThinLTO.

Jumptables are generated in the merged module. To generate a
jumptable, we need a list of functions with !type annotations,
including (in non-cross-dso mode) external functions. Unfortunately,
LLVM IR does not preserve unused function declarations, and we don’t
want to copy the actual function bodies to the merged module.

Indirect call targets can be represented in the following way using
named metadata:

void foo() {}
int bar() { return 0; }

# Merged module
!cfi.functions = !{!1, !3}
!1 = !{!"bar", i8 0, !2}
!2 = !{i64 0, !"_ZTSFiE"}
!3 = !{!"foo", i8 0, !4}
!4 = !{i64 0, !"_ZTSFvE"}

Each function is described by a tuple of
* Promoted name as a string
* Linkage (see below)
* Type(s)

A function can have multiple types. In the Cross-DSO mode each
function has a second “external” numeric type, and we might want to
allow “relaxed” type checking in the future where a function could
conform to multiple types. In that case the metadata would look like
this:

!4 = !{!"bar", i8 0, !5, !6}
!5 = !{i64 0, !"_ZTSFiE"}
!6 = !{i64 0, i64 751454132325070187}

“Linkage” is one of: definition, external declaration, external weak
declaration.

In the merged “merged” module, !cfi.functions may contain multiple
entries for each function. We pick one with the strongest linkage
(i.e. the definition, if it is available) in LowerTypeTests.

The LTO step emits, for a defined function named “f”:
declare void f.cfi()
.jumptable:
    …
    call f.cfi
    ...
f.cfi-jt = alias .jumptable + offset
f = alias f.cfi-jt

The same for an external (either weak or strong) declaration of a
function named “f”:
.jumptable:
    …
    call f
    ...
f.cfi-jt = alias .jumptable + offset

Weak external linkage is used in the lowering of uses of @f. This is
done both in the merged module and in ThinLTO backends. Uses of strong
definitions and declarations are replaced with f.cfi-jt. Uses of weak
external declarations a replaced with (f ? f.cfi-jt : 0) instead.

ThinLTO backends need to know which functions have jumptable entries
created for them (they will need to be RAUWed with f.cfi-jt). In the
Cross-DSO mode, external functions don’t get jumptable entries. This
information is passed back from the LTO step through combined summary.
The current idea is to add a new record, FunctionTypeResolution, which
would contain a set of function names in the jumptable.

== Alternatives

Function type information can be passed in the summary, as a list of
records (name, linkage, type(, type)*).
* Type can be either a string or a number. This complicates the encoding.
* The code in LowerTypeTests works with !type metadata in the same
format as described above. It would need to either recreate the
metadata from the summary, or deal with different input formats.
I don’t see any advantages to this encoding. Could it be more compact
than the metadata approach?

[1] https://clang.llvm.org/docs/ControlFlowIntegrity.html#indirect-function-call-checking

Thanks for sending this out. A few comments below.

Hi,

this is a proposal for the implementation of CFI-icall [1] with ThinLTO.

Jumptables are generated in the merged module. To generate a
jumptable, we need a list of functions with !type annotations,
including (in non-cross-dso mode) external functions. Unfortunately,
LLVM IR does not preserve unused function declarations, and we don’t
want to copy the actual function bodies to the merged module.

Indirect call targets can be represented in the following way using
named metadata:

void foo() {}
int bar() { return 0; }

# Merged module
!cfi.functions = !{!1, !3}
!1 = !{!"bar", i8 0, !2}
!2 = !{i64 0, !"_ZTSFiE"}
!3 = !{!"foo", i8 0, !4}
!4 = !{i64 0, !"_ZTSFvE"}

Presumably there would be no entries in !cfi.functions for functions
defined in the merged module, as the type metadata would come from the
module itself.

Each function is described by a tuple of
* Promoted name as a string

I imagine that we would only promote a function if it is address-taken.
Otherwise we could be inhibiting optimization significantly.

* Linkage (see below)

* Type(s)

A function can have multiple types. In the Cross-DSO mode each
function has a second “external” numeric type, and we might want to
allow “relaxed” type checking in the future where a function could
conform to multiple types. In that case the metadata would look like
this:

!4 = !{!"bar", i8 0, !5, !6}
!5 = !{i64 0, !"_ZTSFiE"}
!6 = !{i64 0, i64 751454132325070187}

“Linkage” is one of: definition, external declaration, external weak
declaration.

In the merged “merged” module, !cfi.functions may contain multiple
entries for each function. We pick one with the strongest linkage
(i.e. the definition, if it is available) in LowerTypeTests.

It's unfortunate that this design effectively requires that the
LowerTypeTests pass recompute the linkage for each symbol, as the linker
already knows this information (and could, in principle, provide it to the
pass). But I'm not sure if there's a better way to do it.

The LTO step emits, for a defined function named “f”:
declare void f.cfi()
.jumptable:
    …
    call f.cfi
    ...
f.cfi-jt = alias .jumptable + offset
f = alias f.cfi-jt

The same for an external (either weak or strong) declaration of a
function named “f”:
.jumptable:
    …
    call f
    ...
f.cfi-jt = alias .jumptable + offset

One thing to be careful about is summary-based dead stripping: the pass
needs to be able to query whether any specific function is still live in
order to avoid introducing undefined symbol references. I think we can do
that by adding a Live flag to GlobalValueSummaryInfo (which I think should
also let us fix a number of FIXMEs elsewhere, e.g. http://llvm-cs.pcc.me.uk/
lib/Transforms/IPO/LowerTypeTests.cpp#1447 http://llvm-cs.pc
c.me.uk/lib/Transforms/IPO/WholeProgramDevirt.cpp#1329), and have the pass
check the flag for each function.

Weak external linkage is used in the lowering of uses of @f. This is

done both in the merged module and in ThinLTO backends. Uses of strong
definitions and declarations are replaced with f.cfi-jt. Uses of weak
external declarations a replaced with (f ? f.cfi-jt : 0) instead.

ThinLTO backends need to know which functions have jumptable entries
created for them (they will need to be RAUWed with f.cfi-jt). In the
Cross-DSO mode, external functions don’t get jumptable entries. This
information is passed back from the LTO step through combined summary.
The current idea is to add a new record, FunctionTypeResolution, which
would contain a set of function names in the jumptable.

It occurred to me that this design could prevent inlining of indirect calls
via constant propagation. For example, suppose that we have a module that
looks like this:

define void @f() {
  ret void
}

define void @g() {
  %fp = call i8* @identity(i8* @f)
  call void %fp()
}

and a second module:

define i8* @identity(i8* %ptr) {
  return %ptr
}

and @identity is imported into the first module. Now I think the first
module would look like this after optimization:

define void @f.cfi() {
  ret void
}

declare void @f.cfi-jt()

define void @g.cfi() {
  call void @f.cfi-jt()
}

So we cannot inline f.cfi into g.cfi, as the optimizer does not know that
f.cfi-jt can be replaced with f.cfi. I'm not sure how likely this would be
in practice, but something to keep in mind.

Peter

Thanks for sending this out. A few comments below.

Hi,

this is a proposal for the implementation of CFI-icall [1] with ThinLTO.

Jumptables are generated in the merged module. To generate a
jumptable, we need a list of functions with !type annotations,
including (in non-cross-dso mode) external functions. Unfortunately,
LLVM IR does not preserve unused function declarations, and we don’t
want to copy the actual function bodies to the merged module.

Indirect call targets can be represented in the following way using
named metadata:

void foo() {}
int bar() { return 0; }

# Merged module
!cfi.functions = !{!1, !3}
!1 = !{!"bar", i8 0, !2}
!2 = !{i64 0, !"_ZTSFiE"}
!3 = !{!"foo", i8 0, !4}
!4 = !{i64 0, !"_ZTSFvE"}

Presumably there would be no entries in !cfi.functions for functions defined
in the merged module, as the type metadata would come from the module
itself.

Right. The same as with vtable CFI, LowerTypeTests will use
!cfi.functions in addition to the regular logic.

Each function is described by a tuple of
* Promoted name as a string

I imagine that we would only promote a function if it is address-taken.
Otherwise we could be inhibiting optimization significantly.

Yes. Cfi.functions would include all external functions +
address-taken internal functions. We could also do global analysis
(i.e. skip jumptable for hidden non-address-taken functions), but that
needs more information passed to the combined module (or summary).

* Linkage (see below)
* Type(s)

A function can have multiple types. In the Cross-DSO mode each
function has a second “external” numeric type, and we might want to
allow “relaxed” type checking in the future where a function could
conform to multiple types. In that case the metadata would look like
this:

!4 = !{!"bar", i8 0, !5, !6}
!5 = !{i64 0, !"_ZTSFiE"}
!6 = !{i64 0, i64 751454132325070187}

“Linkage” is one of: definition, external declaration, external weak
declaration.

In the merged “merged” module, !cfi.functions may contain multiple
entries for each function. We pick one with the strongest linkage
(i.e. the definition, if it is available) in LowerTypeTests.

It's unfortunate that this design effectively requires that the
LowerTypeTests pass recompute the linkage for each symbol, as the linker
already knows this information (and could, in principle, provide it to the
pass). But I'm not sure if there's a better way to do it.

The LTO step emits, for a defined function named “f”:
declare void f.cfi()
.jumptable:
    …
    call f.cfi
    ...
f.cfi-jt = alias .jumptable + offset
f = alias f.cfi-jt

The same for an external (either weak or strong) declaration of a
function named “f”:
.jumptable:
    …
    call f
    ...
f.cfi-jt = alias .jumptable + offset

One thing to be careful about is summary-based dead stripping: the pass
needs to be able to query whether any specific function is still live in
order to avoid introducing undefined symbol references. I think we can do
that by adding a Live flag to GlobalValueSummaryInfo (which I think should
also let us fix a number of FIXMEs elsewhere, e.g.
lib/Transforms/IPO/LowerTypeTests.cpp
lib/Transforms/IPO/WholeProgramDevirt.cpp),
and have the pass check the flag for each function.

Sounds good.

Thanks for sending this out. A few comments below.

Hi,

this is a proposal for the implementation of CFI-icall [1] with ThinLTO.

Jumptables are generated in the merged module. To generate a
jumptable, we need a list of functions with !type annotations,
including (in non-cross-dso mode) external functions. Unfortunately,
LLVM IR does not preserve unused function declarations, and we don’t
want to copy the actual function bodies to the merged module.

Indirect call targets can be represented in the following way using
named metadata:

void foo() {}
int bar() { return 0; }

# Merged module
!cfi.functions = !{!1, !3}
!1 = !{!"bar", i8 0, !2}
!2 = !{i64 0, !"_ZTSFiE"}
!3 = !{!"foo", i8 0, !4}
!4 = !{i64 0, !"_ZTSFvE"}

Presumably there would be no entries in !cfi.functions for functions defined
in the merged module, as the type metadata would come from the module
itself.

Right. The same as with vtable CFI, LowerTypeTests will use
!cfi.functions in addition to the regular logic.

Each function is described by a tuple of
* Promoted name as a string

I imagine that we would only promote a function if it is address-taken.
Otherwise we could be inhibiting optimization significantly.

Yes. Cfi.functions would include all external functions +
address-taken internal functions. We could also do global analysis
(i.e. skip jumptable for hidden non-address-taken functions), but that
needs more information passed to the combined module (or summary).

* Linkage (see below)
* Type(s)

A function can have multiple types. In the Cross-DSO mode each
function has a second “external” numeric type, and we might want to
allow “relaxed” type checking in the future where a function could
conform to multiple types. In that case the metadata would look like
this:

!4 = !{!"bar", i8 0, !5, !6}
!5 = !{i64 0, !"_ZTSFiE"}
!6 = !{i64 0, i64 751454132325070187}

“Linkage” is one of: definition, external declaration, external weak
declaration.

In the merged “merged” module, !cfi.functions may contain multiple
entries for each function. We pick one with the strongest linkage
(i.e. the definition, if it is available) in LowerTypeTests.

It's unfortunate that this design effectively requires that the
LowerTypeTests pass recompute the linkage for each symbol, as the linker
already knows this information (and could, in principle, provide it to the
pass). But I'm not sure if there's a better way to do it.

The LTO step emits, for a defined function named “f”:
declare void f.cfi()
.jumptable:
    …
    call f.cfi
    ...
f.cfi-jt = alias .jumptable + offset
f = alias f.cfi-jt

The same for an external (either weak or strong) declaration of a
function named “f”:
.jumptable:
    …
    call f
    ...
f.cfi-jt = alias .jumptable + offset

One thing to be careful about is summary-based dead stripping: the pass
needs to be able to query whether any specific function is still live in
order to avoid introducing undefined symbol references. I think we can do
that by adding a Live flag to GlobalValueSummaryInfo (which I think should
also let us fix a number of FIXMEs elsewhere, e.g.
lib/Transforms/IPO/LowerTypeTests.cpp
lib/Transforms/IPO/WholeProgramDevirt.cpp),
and have the pass check the flag for each function.

One thing I've noticed is that the regular LTO pipeline runs with the
merged module before summary based dead stripping. This way jumptables
generation in LowerTypeTests can not skip dead functions, which
effectively disables dead stripping of address-taken functions. This
sounds backwards. Per Peter's advice I've swapped the order with a
trivial patch, and it does not seem to break anything.

Another thing I've noticed is all the extra cfi symbols in thinlto
modules (like __typeid_ZZZ_global_addr) hang around in the final
binary as .hidden symbols in the regular (non-dynamic) symbol table.
This is bad for binary size, and also confuses the symbolizer, because
f and f.cfi-jt have the same address (unless f is undefined) and there
is basically a 50% chance to see f.cfi-jt instead of f in cfi error
messages.

>> Thanks for sending this out. A few comments below.
>>
>>>
>>> Hi,
>>>
>>> this is a proposal for the implementation of CFI-icall [1] with
ThinLTO.
>>>
>>> Jumptables are generated in the merged module. To generate a
>>> jumptable, we need a list of functions with !type annotations,
>>> including (in non-cross-dso mode) external functions. Unfortunately,
>>> LLVM IR does not preserve unused function declarations, and we don’t
>>> want to copy the actual function bodies to the merged module.
>>>
>>> Indirect call targets can be represented in the following way using
>>> named metadata:
>>>
>>> void foo() {}
>>> int bar() { return 0; }
>>>
>>> # Merged module
>>> !cfi.functions = !{!1, !3}
>>> !1 = !{!"bar", i8 0, !2}
>>> !2 = !{i64 0, !"_ZTSFiE"}
>>> !3 = !{!"foo", i8 0, !4}
>>> !4 = !{i64 0, !"_ZTSFvE"}
>>
>>
>> Presumably there would be no entries in !cfi.functions for functions
defined
>> in the merged module, as the type metadata would come from the module
>> itself.
>
> Right. The same as with vtable CFI, LowerTypeTests will use
> !cfi.functions in addition to the regular logic.
>
>>>
>>>
>>> Each function is described by a tuple of
>>> * Promoted name as a string
>>
>>
>> I imagine that we would only promote a function if it is address-taken.
>> Otherwise we could be inhibiting optimization significantly.
>
> Yes. Cfi.functions would include all external functions +
> address-taken internal functions. We could also do global analysis
> (i.e. skip jumptable for hidden non-address-taken functions), but that
> needs more information passed to the combined module (or summary).
>
>>
>>> * Linkage (see below)
>>> * Type(s)
>>>
>>>
>>> A function can have multiple types. In the Cross-DSO mode each
>>> function has a second “external” numeric type, and we might want to
>>> allow “relaxed” type checking in the future where a function could
>>> conform to multiple types. In that case the metadata would look like
>>> this:
>>>
>>> !4 = !{!"bar", i8 0, !5, !6}
>>> !5 = !{i64 0, !"_ZTSFiE"}
>>> !6 = !{i64 0, i64 751454132325070187}
>>>
>>> “Linkage” is one of: definition, external declaration, external weak
>>> declaration.
>>>
>>> In the merged “merged” module, !cfi.functions may contain multiple
>>> entries for each function. We pick one with the strongest linkage
>>> (i.e. the definition, if it is available) in LowerTypeTests.
>>
>>
>> It's unfortunate that this design effectively requires that the
>> LowerTypeTests pass recompute the linkage for each symbol, as the linker
>> already knows this information (and could, in principle, provide it to
the
>> pass). But I'm not sure if there's a better way to do it.
>>
>>>
>>>
>>> The LTO step emits, for a defined function named “f”:
>>> declare void f.cfi()
>>> .jumptable:
>>> …
>>> call f.cfi
>>> ...
>>> f.cfi-jt = alias .jumptable + offset
>>> f = alias f.cfi-jt
>>>
>>> The same for an external (either weak or strong) declaration of a
>>> function named “f”:
>>> .jumptable:
>>> …
>>> call f
>>> ...
>>> f.cfi-jt = alias .jumptable + offset
>>>
>>
>> One thing to be careful about is summary-based dead stripping: the pass
>> needs to be able to query whether any specific function is still live in
>> order to avoid introducing undefined symbol references. I think we can
do
>> that by adding a Live flag to GlobalValueSummaryInfo (which I think
should
>> also let us fix a number of FIXMEs elsewhere, e.g.
>> lib/Transforms/IPO/LowerTypeTests.cpp
>> lib/Transforms/IPO/WholeProgramDevirt.cpp
),
>> and have the pass check the flag for each function.

One thing I've noticed is that the regular LTO pipeline runs with the
merged module before summary based dead stripping. This way jumptables
generation in LowerTypeTests can not skip dead functions, which
effectively disables dead stripping of address-taken functions. This
sounds backwards. Per Peter's advice I've swapped the order with a
trivial patch, and it does not seem to break anything.

Another thing I've noticed is all the extra cfi symbols in thinlto
modules (like __typeid_ZZZ_global_addr) hang around in the final
binary as .hidden symbols in the regular (non-dynamic) symbol table.
This is bad for binary size, and also confuses the symbolizer, because
f and f.cfi-jt have the same address (unless f is undefined) and there
is basically a 50% chance to see f.cfi-jt instead of f in cfi error
messages.

Function names can also receive other suffixes, such as "$hex_digits" or
".llvm.hex_digits" for promoted local symbols, and I don't see a way around
at least those two. We may need to teach the symbolizer to strip the
suffixes.

We may be able to avoid having both the non-.cfi-jt and .cfi-jt symbol by
emitting only the .cfi-jt symbol for symbols defined locally, and only the
non-.cfi-jt symbol for symbols defined externally, and make the .cfi-jt
rewrite conditional on whether the symbol is defined externally.

Regarding the __typeid_* symbols, I don't have a good solution. At least
LLD will add all non-GC'd global symbols in the symbol table to .symtab.
There may be some semantics in bfd or gold that we can take advantage of
somehow and implement in lld as well. Or we may want to extend ELF somehow,
but I'm not sure whether that would be worth it.

Peter

>> Thanks for sending this out. A few comments below.
>>
>>>
>>> Hi,
>>>
>>> this is a proposal for the implementation of CFI-icall [1] with
ThinLTO.
>>>
>>> Jumptables are generated in the merged module. To generate a
>>> jumptable, we need a list of functions with !type annotations,
>>> including (in non-cross-dso mode) external functions. Unfortunately,
>>> LLVM IR does not preserve unused function declarations, and we don’t
>>> want to copy the actual function bodies to the merged module.
>>>
>>> Indirect call targets can be represented in the following way using
>>> named metadata:
>>>
>>> void foo() {}
>>> int bar() { return 0; }
>>>
>>> # Merged module
>>> !cfi.functions = !{!1, !3}
>>> !1 = !{!"bar", i8 0, !2}
>>> !2 = !{i64 0, !"_ZTSFiE"}
>>> !3 = !{!"foo", i8 0, !4}
>>> !4 = !{i64 0, !"_ZTSFvE"}
>>
>>
>> Presumably there would be no entries in !cfi.functions for functions
defined
>> in the merged module, as the type metadata would come from the module
>> itself.
>
> Right. The same as with vtable CFI, LowerTypeTests will use
> !cfi.functions in addition to the regular logic.
>
>>>
>>>
>>> Each function is described by a tuple of
>>> * Promoted name as a string
>>
>>
>> I imagine that we would only promote a function if it is address-taken.
>> Otherwise we could be inhibiting optimization significantly.
>
> Yes. Cfi.functions would include all external functions +
> address-taken internal functions. We could also do global analysis
> (i.e. skip jumptable for hidden non-address-taken functions), but that
> needs more information passed to the combined module (or summary).
>
>>
>>> * Linkage (see below)
>>> * Type(s)
>>>
>>>
>>> A function can have multiple types. In the Cross-DSO mode each
>>> function has a second “external” numeric type, and we might want to
>>> allow “relaxed” type checking in the future where a function could
>>> conform to multiple types. In that case the metadata would look like
>>> this:
>>>
>>> !4 = !{!"bar", i8 0, !5, !6}
>>> !5 = !{i64 0, !"_ZTSFiE"}
>>> !6 = !{i64 0, i64 751454132325070187}
>>>
>>> “Linkage” is one of: definition, external declaration, external weak
>>> declaration.
>>>
>>> In the merged “merged” module, !cfi.functions may contain multiple
>>> entries for each function. We pick one with the strongest linkage
>>> (i.e. the definition, if it is available) in LowerTypeTests.
>>
>>
>> It's unfortunate that this design effectively requires that the
>> LowerTypeTests pass recompute the linkage for each symbol, as the
linker
>> already knows this information (and could, in principle, provide it to
the
>> pass). But I'm not sure if there's a better way to do it.
>>
>>>
>>>
>>> The LTO step emits, for a defined function named “f”:
>>> declare void f.cfi()
>>> .jumptable:
>>> …
>>> call f.cfi
>>> ...
>>> f.cfi-jt = alias .jumptable + offset
>>> f = alias f.cfi-jt
>>>
>>> The same for an external (either weak or strong) declaration of a
>>> function named “f”:
>>> .jumptable:
>>> …
>>> call f
>>> ...
>>> f.cfi-jt = alias .jumptable + offset
>>>
>>
>> One thing to be careful about is summary-based dead stripping: the pass
>> needs to be able to query whether any specific function is still live
in
>> order to avoid introducing undefined symbol references. I think we can
do
>> that by adding a Live flag to GlobalValueSummaryInfo (which I think
should
>> also let us fix a number of FIXMEs elsewhere, e.g.
>> lib/Transforms/IPO/LowerTypeTests.cpp
>> Search results for lib/Transforms/IPO/WholeProgramDevi
rt.cpp#1329),
>> and have the pass check the flag for each function.

One thing I've noticed is that the regular LTO pipeline runs with the
merged module before summary based dead stripping. This way jumptables
generation in LowerTypeTests can not skip dead functions, which
effectively disables dead stripping of address-taken functions. This
sounds backwards. Per Peter's advice I've swapped the order with a
trivial patch, and it does not seem to break anything.

Another thing I've noticed is all the extra cfi symbols in thinlto
modules (like __typeid_ZZZ_global_addr) hang around in the final
binary as .hidden symbols in the regular (non-dynamic) symbol table.
This is bad for binary size, and also confuses the symbolizer, because
f and f.cfi-jt have the same address (unless f is undefined) and there
is basically a 50% chance to see f.cfi-jt instead of f in cfi error
messages.

Function names can also receive other suffixes, such as "$hex_digits" or
".llvm.hex_digits" for promoted local symbols, and I don't see a way around
at least those two. We may need to teach the symbolizer to strip the
suffixes.

We may be able to avoid having both the non-.cfi-jt and .cfi-jt symbol by
emitting only the .cfi-jt symbol for symbols defined locally, and only the
non-.cfi-jt symbol for symbols defined externally, and make the .cfi-jt
rewrite conditional on whether the symbol is defined externally.

I meant: "emitting only the .cfi-jt symbol for symbols defined externally,
and only the non-.cfi-jt symbol for symbols defined locally"

>> Thanks for sending this out. A few comments below.
>>
>>>
>>> Hi,
>>>
>>> this is a proposal for the implementation of CFI-icall [1] with
>>> ThinLTO.
>>>
>>> Jumptables are generated in the merged module. To generate a
>>> jumptable, we need a list of functions with !type annotations,
>>> including (in non-cross-dso mode) external functions. Unfortunately,
>>> LLVM IR does not preserve unused function declarations, and we don’t
>>> want to copy the actual function bodies to the merged module.
>>>
>>> Indirect call targets can be represented in the following way using
>>> named metadata:
>>>
>>> void foo() {}
>>> int bar() { return 0; }
>>>
>>> # Merged module
>>> !cfi.functions = !{!1, !3}
>>> !1 = !{!"bar", i8 0, !2}
>>> !2 = !{i64 0, !"_ZTSFiE"}
>>> !3 = !{!"foo", i8 0, !4}
>>> !4 = !{i64 0, !"_ZTSFvE"}
>>
>>
>> Presumably there would be no entries in !cfi.functions for functions
>> defined
>> in the merged module, as the type metadata would come from the module
>> itself.
>
> Right. The same as with vtable CFI, LowerTypeTests will use
> !cfi.functions in addition to the regular logic.
>
>>>
>>>
>>> Each function is described by a tuple of
>>> * Promoted name as a string
>>
>>
>> I imagine that we would only promote a function if it is
>> address-taken.
>> Otherwise we could be inhibiting optimization significantly.
>
> Yes. Cfi.functions would include all external functions +
> address-taken internal functions. We could also do global analysis
> (i.e. skip jumptable for hidden non-address-taken functions), but that
> needs more information passed to the combined module (or summary).
>
>>
>>> * Linkage (see below)
>>> * Type(s)
>>>
>>>
>>> A function can have multiple types. In the Cross-DSO mode each
>>> function has a second “external” numeric type, and we might want to
>>> allow “relaxed” type checking in the future where a function could
>>> conform to multiple types. In that case the metadata would look like
>>> this:
>>>
>>> !4 = !{!"bar", i8 0, !5, !6}
>>> !5 = !{i64 0, !"_ZTSFiE"}
>>> !6 = !{i64 0, i64 751454132325070187}
>>>
>>> “Linkage” is one of: definition, external declaration, external weak
>>> declaration.
>>>
>>> In the merged “merged” module, !cfi.functions may contain multiple
>>> entries for each function. We pick one with the strongest linkage
>>> (i.e. the definition, if it is available) in LowerTypeTests.
>>
>>
>> It's unfortunate that this design effectively requires that the
>> LowerTypeTests pass recompute the linkage for each symbol, as the
>> linker
>> already knows this information (and could, in principle, provide it to
>> the
>> pass). But I'm not sure if there's a better way to do it.
>>
>>>
>>>
>>> The LTO step emits, for a defined function named “f”:
>>> declare void f.cfi()
>>> .jumptable:
>>> …
>>> call f.cfi
>>> ...
>>> f.cfi-jt = alias .jumptable + offset
>>> f = alias f.cfi-jt
>>>
>>> The same for an external (either weak or strong) declaration of a
>>> function named “f”:
>>> .jumptable:
>>> …
>>> call f
>>> ...
>>> f.cfi-jt = alias .jumptable + offset
>>>
>>
>> One thing to be careful about is summary-based dead stripping: the
>> pass
>> needs to be able to query whether any specific function is still live
>> in
>> order to avoid introducing undefined symbol references. I think we can
>> do
>> that by adding a Live flag to GlobalValueSummaryInfo (which I think
>> should
>> also let us fix a number of FIXMEs elsewhere, e.g.
>> lib/Transforms/IPO/LowerTypeTests.cpp
>>
>> lib/Transforms/IPO/WholeProgramDevirt.cpp),
>> and have the pass check the flag for each function.

One thing I've noticed is that the regular LTO pipeline runs with the
merged module before summary based dead stripping. This way jumptables
generation in LowerTypeTests can not skip dead functions, which
effectively disables dead stripping of address-taken functions. This
sounds backwards. Per Peter's advice I've swapped the order with a
trivial patch, and it does not seem to break anything.

Another thing I've noticed is all the extra cfi symbols in thinlto
modules (like __typeid_ZZZ_global_addr) hang around in the final
binary as .hidden symbols in the regular (non-dynamic) symbol table.
This is bad for binary size, and also confuses the symbolizer, because
f and f.cfi-jt have the same address (unless f is undefined) and there
is basically a 50% chance to see f.cfi-jt instead of f in cfi error
messages.

Function names can also receive other suffixes, such as "$hex_digits" or
".llvm.hex_digits" for promoted local symbols, and I don't see a way around
at least those two. We may need to teach the symbolizer to strip the
suffixes.

In fact, ".something" is not the worst suffix. C++filt refers to such
names as [clone .something] which is only a minor annoyance. Does not
work when the suffix contains "-", or for $digits.

We may be able to avoid having both the non-.cfi-jt and .cfi-jt symbol by
emitting only the .cfi-jt symbol for symbols defined locally, and only the
non-.cfi-jt symbol for symbols defined externally, and make the .cfi-jt
rewrite conditional on whether the symbol is defined externally.

I meant: "emitting only the .cfi-jt symbol for symbols defined externally,
and only the non-.cfi-jt symbol for symbols defined locally"

Yes, and communicate the choice though combined summary. That would
fix jumptable symbolization.