"Unusual" linkage inhibits interprocedural constant propagation?

Dear LLVM developers and adopters!

$ cat ipcp-1.ll
define
;linkonce_odr
dso_local i32 @f() noinline {
    ret i32 123
}
define dso_local i32 @g()
{
    %res = call i32 @f()
    ret i32 %res
}
$ opt-10 -S -ipconstprop ipcp-1.ll
; ModuleID = 'ipcp-1.ll'
source_filename = "ipcp-1.ll"

; Function Attrs: noinline
define dso_local i32 @f() #0 {
   ret i32 123
}

define dso_local i32 @g() {
   %res = call i32 @f()
   ret i32 123 <========== note the result
}

attributes #0 = { noinline }

BUT:

$ cat ipcp-2.ll
define
linkonce_odr
dso_local i32 @f() noinline {
    ret i32 123
}
define dso_local i32 @g()
{
    %res = call i32 @f()
    ret i32 %res
}
$ opt-10 -S -ipconstprop ipcp-2.ll
; ModuleID = 'ipcp-2.ll'
source_filename = "ipcp-2.ll"

; Function Attrs: noinline
define linkonce_odr dso_local i32 @f() #0 {
   ret i32 123
}

define dso_local i32 @g() {
   %res = call i32 @f()
   ret i32 %res <========== note the (lack of) result
}

attributes #0 = { noinline }

WHY? It this a bug?

I observe the same behavior if I replace "-ipconstprop" with "-O3" or replace "linkonce_odr" with "available_externally", and if I use an equivalent testcase in C++ (compiled with the clang++ frontend). No problem with "external", "private" or "hidden" linkages. Also note that those "linkonce_odr"/"available_externally" do not inhibit, e.g., inlining (if I remove "noinline"), that is, as implied from the IR documentation.

I am using LLVM version 10.0.0.

This is a showstopper for my project (actually trying to use LLVM as an affordable static type inferer for a dynamically typed PL).

Thanks for any help

Hi Alex,

this is a "bug", as far as I can tell.

`_odr` linkage should allow inter-procedural propagation of constant returns,
though prevent other inter-procedural deductions. This is why we are a bit
cautious with these things.

I won't fix ipconstprop because we actually removed it but I will look into an
extension of the Attributor to allow this. IPSCCP can probably also be taught to
do this.

~ Johannes

Hi Johannes, thanks for reply. I suspected that ipconstprop was not active in -O3 mode, but I did not know it was deprecated at all. However, either -O3 or -ipsccp behave the same way.

BTW what other inter-procedural deductions should not apply for _odr linkage? As far as I understand, an _odr definition is quite similar to an extern definition semantically (well, according to C++'s definition of ODR rule)...

IPConstProp was not in the default optimization pipeline for a long time
and has been removed in LLVM11 (or shortly after).

Both the Attributor nor IPSCCP perform the transformations IPConstProp
did, though neither handles your case right now. The Attributor will not
propagate information inter-procedurally, the relevant code in
Attrinbutor.h (line 2190) describes the "problem" already:

  bool IsFnInterface = IRP\.isFnInterfaceKind\(\);
  const Function \*FnScope = IRP\.getAnchorScope\(\);
  // TODO: Not all attributes require an exact definition\. Find a way to
  //       enable deduction for some but not all attributes in case the
  //       definition might be changed at runtime, see also
  // http://lists.llvm.org/pipermail/llvm-dev/2018-February/121275.html.
  // TODO: We could always determine abstract attributes and if sufficient
  //       information was found we could duplicate the functions that do not
  //       have an exact definition\.
  if \(IsFnInterface &amp;&amp; \(\!FnScope || \!A\.isFunctionIPOAmendable\(\*FnScope\)\)\)
   this\-&gt;getState\(\)\.indicatePessimisticFixpoint\(\);

Note that we actually have code to do the duplication, though I need to
push some fixes for this "deep wrapper" generation I have prepared
locally.

What you cannot do is, just as a simple example, derive readnone for
a function, e.g.,
int f(int *a) { return 123; }

While it clearly doesn't read or write any memory, a less
optimized equivalent version could, e.g., the original code might have
looked like this:

int f(int *a) { return *a ? 123 : *a + 123; }

which clearly reads memory. You can play this game with various other
properties as well. However, the observed return value should never be
different between equivalent versions of the function (up to
non-deterministic choices) and I therefore think the return value can be
propagated.

If you want to get your hands dirty and teach the Attributor about it,
that would be great. I would probably go with a method in
AbstractAttribute that can be overwritten if the Attribute is OK with
_odr linkage on function interface positions. The only time we overwrite
would be in AAReturnedValues for now.

Let me know what you think.

~ Johannes

P.S. After I wrote this I wanted to make sure the information is
correct. Turns out, AAReturnedValuesImpl::initialize does not call
IRAttribute::initialize but instead basically duplicates the check. In
llvm/lib/Transforms/IPO/AttributorAttributes.cpp line 821
it says
if (!A.isFunctionIPOAmendable(*F))
indicatePessimisticFixpoint();
which is equivalent to the above because AAReturnedValues only exist for
function interface positions anyway. So maybe we can for now just look
for _odr linkage there. Or better, provide an argument to
isFunctionIPOAmendable that determines if _odr is OK or not.

> Hi Johannes, thanks for reply. I suspected that ipconstprop was not active in -O3 mode, but I did not know it was deprecated at all. However, either -O3 or -ipsccp behave the same way.
>
> BTW what other inter-procedural deductions should not apply for _odr linkage? As far as I understand, an _odr definition is quite similar to an extern definition semantically (well, according to C++'s definition of ODR rule)...
>
>> Hi Alex,
>>
>> this is a "bug", as far as I can tell.
>>
>> `_odr` linkage should allow inter-procedural propagation of constant returns,
>> though prevent other inter-procedural deductions. This is why we are a bit
>> cautious with these things.
>>
>> I won't fix ipconstprop because we actually removed it but I will look into an
>> extension of the Attributor to allow this. IPSCCP can probably also be taught to
>> do this.
>>
>> ~ Johannes
>>
>>> Dear LLVM developers and adopters!
>>>
>>> $ cat ipcp-1.ll
>>> define
>>> ;linkonce_odr
>>> dso_local i32 @f() noinline {
>>> ret i32 123
>>> }
>>> define dso_local i32 @g()
>>> {
>>> %res = call i32 @f()
>>> ret i32 %res
>>> }
>>> $ opt-10 -S -ipconstprop ipcp-1.ll
>>> ; ModuleID = 'ipcp-1.ll'
>>> source_filename = "ipcp-1.ll"
>>>
>>> ; Function Attrs: noinline
>>> define dso_local i32 @f() #0 {
>>> ret i32 123
>>> }
>>>
>>> define dso_local i32 @g() {
>>> %res = call i32 @f()
>>> ret i32 123 <========== note the result
>>> }
>>>
>>> attributes #0 = { noinline }
>>>
>>> BUT:
>>>
>>> $ cat ipcp-2.ll
>>> define
>>> linkonce_odr
>>> dso_local i32 @f() noinline {
>>> ret i32 123
>>> }
>>> define dso_local i32 @g()
>>> {
>>> %res = call i32 @f()
>>> ret i32 %res
>>> }
>>> $ opt-10 -S -ipconstprop ipcp-2.ll
>>> ; ModuleID = 'ipcp-2.ll'
>>> source_filename = "ipcp-2.ll"
>>>
>>> ; Function Attrs: noinline
>>> define linkonce_odr dso_local i32 @f() #0 {
>>> ret i32 123
>>> }
>>>
>>> define dso_local i32 @g() {
>>> %res = call i32 @f()
>>> ret i32 %res <========== note the (lack of) result
>>> }
>>>
>>> attributes #0 = { noinline }
>>>
>>> WHY? It this a bug?
>>>
>>> I observe the same behavior if I replace "-ipconstprop" with "-O3" or replace "linkonce_odr" with "available_externally", and if I use an equivalent testcase in C++ (compiled with the clang++ frontend). No problem with "external", "private" or "hidden" linkages. Also note that those "linkonce_odr"/"available_externally" do not inhibit, e.g., inlining (if I remove "noinline"), that is, as implied from the IR documentation.
>>>
>>> I am using LLVM version 10.0.0.
>>>
>>> This is a showstopper for my project (actually trying to use LLVM as an affordable static type inferer for a dynamically typed PL).

Johannes, thank you for your explanations. Now I understand why the "bug" exists in the first place.

BTW, according to your explanations, does this mean that we can/should treat the "available_externally" definitions exactly in the same way as just "external"? I understand that probably that is not specified precisely in the manual (and no standard like C++ covers the behavior in this case, unlike "_odr").

Should I now submit a bug report in order for us to proceed or you can do it yourself?

Johannes, thank you for your explanations. Now I understand why the "bug" exists in the first place.

BTW, according to your explanations, does this mean that we can/should treat the "available_externally" definitions exactly in the same way as just "external"? I understand that probably that is not specified precisely in the manual (and no standard like C++ covers the behavior in this case, unlike "_odr").

I'm not sure what you mean and why it matters. The lang ref spell out their semantics, neither is interposable as far as I can tell.

Should I now submit a bug report in order for us to proceed or you can do it yourself?

I don't need a bug report but having one to keep track doesn't hurt, especially if we don't work on it right away.
Feel free to create one.

~ Johannes

"available_externally - They exist to allow inlining and other optimizations to take place given knowledge of the definition of the global, which is known to be somewhere outside the module. Globals with available_externally linkage are allowed to be discarded at will, and allow inlining and other optimizations."

The manual does not tell here if the interposition (of a slightly different version) can happen or the ultimate definition shall be exactly the same as the "available_externally" definition seen locally.

Anyway, this is not critical for my project. I just thought that someone else could benefit from having no restricions on optimizations available in the "available_externally" case.

Oh, I think I got it. Different optimizations (levels) applied to different translation units can still lead to semantically identical but different function body for the "interposed" function (same as for "_odr"), right?

I will take it into account for the bug report.

Oh, I think I got it. Different optimizations (levels) applied to different translation units can still lead to semantically identical but different function body for the "interposed" function (same as for "_odr"), right?

The input was semantically equivalent. The result after optimizations might not be. Optimizations refine the semantics, e.g., choose behavior where there was none defined before.