[ARM] [C++ standard] Correct linkage type for string literals in extern inline functions

Hi,

I came across a behavior irregularity in LLVM's ARM backend (-target arm-linux-gnueabi) when the ARM constant promotion optimization is enabled (-arm-promote-constant=true).

I compiled the attached source files with the following command:

clang++ -target arm-linux-gnueabi A.cpp B.cpp -o a.out

The addresses returned from bar() and foo() are not the same (the string literals live in different memory locations). However, when we turn off the constant promotion optimization

clang-arm-x++ -Ofast -mllvm -arm-promote-constant=false A.cpp B.cpp -o test_case.exe -o a.out

we are getting the same addresses for string literals.
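
The attachments are not reproduced in the thread text, so here is a rough sketch of the kind of test case being described, folded into a single file. Every name except foo() and bar() is hypothetical, and this is not the actual content of A.cpp or B.cpp. In the two-file version, the inline function would sit in a header included by both translation units, with foo() defined in one and bar() in the other.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: an extern inline function whose body contains a
// string literal. In the two-file setup this would live in a shared header.
extern inline const char *get_literal() {
    return "example";
}

// Assumed to be defined in A.cpp in the real test case.
const char *foo() { return get_literal(); }

// Assumed to be defined in B.cpp in the real test case.
const char *bar() { return get_literal(); }

// The comparison the report is about: do both callers see one object?
bool same_address() { return foo() == bar(); }

// The contents must match regardless of where the bytes are placed.
void check_contents() {
    assert(std::strcmp(foo(), bar()) == 0);
}
```

With foo() and bar() split across two files, same_address() is the check that would come out false when constant promotion is enabled and true when it is disabled.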

Looking at the .ll files, the strings are emitted as "private unnamed_addr constant", so the constant promotion pass moves them into constant pools, which gives each copy a different address. Given that linkage, the promotion itself seems fine.

Is this behavior in line with the C++ standard for strings in extern inline functions? If not, what should be the correct linkage type emitted for this constant? Is this a potential clang bug?

Thank you,

B.ll (1.26 KB)

A.ll (6.32 KB)

A.cpp (256 Bytes)

B.cpp (99 Bytes)

Hi John,

The example you are listing seems to be similar to this bug: https://bugs.llvm.org/show_bug.cgi?id=20734, which is actually a clang bug.

This definitely looks like a bug to me. C++11 section 7.1.2, paragraph 4, clearly states:

“A string literal in the body of an extern inline function is the same object in different translation units.”

Data for extern inline functions should be placed in a comdat group, and it looks like we get this correct if an actual variable is used, e.g.

extern inline const char *fn()
{
    static const char str[] = "example";
    return str;
}

generates

$_Z2fnv = comdat any
$_ZZ2fnvE3str = comdat any

@_ZZ2fnvE3str = linkonce_odr constant [8 x i8] c"example\00", comdat, align 1

define linkonce_odr arm_aapcscc i8* @_Z2fnv() #1 comdat {
entry:
  ret i8* getelementptr inbounds ([8 x i8], [8 x i8]* @_ZZ2fnvE3str, i32 0, i32 0)
}

so I think string literals should be handled similarly.
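
Until the literal case is handled that way, one possible source-level workaround (a sketch only, not something confirmed as a recommended fix) is to give the string a name, since the named static local already gets linkonce_odr linkage and a comdat. The function and caller names below are hypothetical.

```cpp
#include <cassert>

// Named static local instead of a bare string literal. A static local in an
// extern inline function is a single object across translation units.
extern inline const char *fn_named() {
    static const char str[] = "example";
    return str;
}

// Stand-ins for callers that would live in different translation units.
const char *first_caller() { return fn_named(); }
const char *second_caller() { return fn_named(); }

// Both callers must observe the same object.
void check_named_storage() {
    assert(first_caller() == second_caller());
}
```

This single-file version only shows the pattern. The cross-translation-unit guarantee comes from the linkage shown in the IR above, so the same comparison should also hold when the callers are compiled separately.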

John