Usually it is because nobody has noticed the problem or nobody is
motivated enough to fix it, not that anyone intentionally leaves
a problem open :) I took some time to look at the problem and concluded
that Clang should do nothing about this. Actually, with the Clang behavior,
you can discard “Unused” if you use LLD. Read on.
Sorry if I misspoke, I was not suggesting that the bug was known and deliberately left unfixed out of laziness ;-). I was sure there was a valid reason and wanted to know about it. As you explained, it appears that LLVM relies on LLD to do this instead of enforcing it in the middle-end, which is a different approach from GCC's.
In GCC, -O turns on -fmerge-constants. Clang does not implement this
option, but implements the level 2 option -fmerge-all-constants, which is non-conforming (“Languages like C or C++
require each variable, including multiple instances of the same variable
in recursive calls, to have distinct locations, so using this option
results in non-conforming behavior.”).
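To make the conformance point in the quoted GCC documentation concrete, here is a minimal C sketch. The names `first`, `second`, and `have_distinct_addresses` are made up for this illustration and do not come from the thread.

```c
/* Two distinct objects with identical contents. The names are hypothetical. */
static const int first[]  = {1, 2, 3};
static const int second[] = {1, 2, 3};

/* Returns 1 when the two objects occupy different storage.
   A conforming implementation must give distinct objects distinct
   addresses; -fmerge-all-constants is allowed to fold them into one
   object, in which case this could return 0. */
int have_distinct_addresses(void) {
    return (const void *)first != (const void *)second;
}
```

Whether the merge is actually observable depends on the compiler version and optimization level, so treat this as a way to probe the behavior rather than a guaranteed reproducer.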
Non-conforming in the sense of the C/C++ standards? How is it related to the -fdata-sections implementation?
With (-fmerge-constants or -fmerge-all-constants) and -fdata-sections, GCC places string literals in .rodata.xxx.str1.1 sections:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192#c16
This is, however, suboptimal because the cost of a section header
(sizeof(Elf64_Shdr)=64) + a section name (".rodata.xxx.str1.1") is quite large.
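As a rough illustration of that cost, the sketch below adds up the fixed per-section overhead. The struct mirrors the ELF64 section header field layout with fixed-width types so that it does not depend on `<elf.h>`; the names `shdr64` and `per_section_overhead` are made up for this example, and the estimate ignores alignment padding and any suffix sharing in the section name string table.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field layout of an ELF64 section header (same fields as Elf64_Shdr). */
struct shdr64 {
    uint32_t sh_name;      /* offset of the name in the string table */
    uint32_t sh_type;
    uint64_t sh_flags;
    uint64_t sh_addr;
    uint64_t sh_offset;
    uint64_t sh_size;
    uint32_t sh_link;
    uint32_t sh_info;
    uint64_t sh_addralign;
    uint64_t sh_entsize;
};

/* Section header plus the NUL-terminated section name. */
size_t per_section_overhead(const char *section_name) {
    return sizeof(struct shdr64) + strlen(section_name) + 1;
}
```

For the name `".rodata.xxx.str1.1"` this gives 64 + 19 = 83 bytes, which can easily exceed the size of a short string literal itself.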
I have replied at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192#c19 and
created a GNU ld feature request
(https://sourceware.org/bugzilla/show_bug.cgi?id=26622).
In my example, LLVM/Clang already puts the pointers “test” and “unused” in different data sections because of -fdata-sections, as seen below.
; Segment unnamed segment
; Range: [0x5c; 0x64[ (8 bytes)
; File offset : [144; 152[ (8 bytes)
; Permissions: -
; Section .data.test
; Range: [0x5c; 0x60[ (4 bytes)
; File offset : [144; 148[ (4 bytes)
; Flags: 0x3
; SHT_PROGBITS
; SHF_WRITE
; SHF_ALLOC
test:
0000005c dd 0x00000063
; Section .data.unused
; Range: [0x60; 0x64[ (4 bytes)
; File offset : [148; 152[ (4 bytes)
; Flags: 0x3
; SHT_PROGBITS
; SHF_WRITE
; SHF_ALLOC
unused:
00000060 dd 0x00000070
So I am not sure I understand the point about sub-optimality here, since this is already the case for the .data section, where each variable implies the same cost in terms of a section header. How are C-string-like data different? I mean, the concept of -fdata-sections/-ffunction-sections (“one section for each data item/function”) should be the same for every kind of data, no?
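For readers following along, a source file of roughly this shape would give the layout in the dump above. This is only a guess at the test case, which is not shown in this part of the thread; the string contents and the accessor `get_test` are assumptions.

```c
/* Hypothetical reconstruction; the original test case is not shown here. */

/* With -fdata-sections each pointer variable gets its own section,
   matching .data.test and .data.unused in the dump. */
char *test = "test";
char *unused = "Unused";

/* The string literals are a separate question: the discussion is about
   which section(s) the bytes "test" and "Unused" end up in, and whether
   the linker can discard the unreferenced one. */
const char *get_test(void) {
    return test;
}
```

Compiling this with `-c -fdata-sections` and listing the sections with `readelf -S` or `objdump -h` should show where the pointers and the string literals land for a given compiler.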