Hi,
I have an issue during the LTO phase of the LLVM compiler, which is as follows:
File t3.c
My guess is that it is due to lld change r360841 on that date (Introduce CommonSymbol). +Rui for comments.
Looks like this is indeed related to r360841.
In C, there are distinctions between declarations, definitions and tentative definitions. Global variables declared with “extern” are declarations. Global variables that don’t have “extern” and have initializers are definitions. If global variables have neither “extern” nor initializers, they are called tentative definitions.
Common symbols represent tentative definitions.
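The three cases above can be sketched in a single C file. This is only an illustration: the Node layout and the variable names other than head are assumptions, since the original t3.c is not shown here.

```c
#include <stddef.h>

/* Hypothetical node type; the layout is assumed for illustration. */
struct Node {
    int value;
    struct Node *next;
};

extern int declared_elsewhere; /* declaration: storage lives in another file */
int initialized_counter = 42;  /* definition: has an initializer */
struct Node *head;             /* tentative definition: no extern, no initializer */
struct Node *head;             /* repeating a tentative definition in one file is legal C */

/* Returns 1 when the tentative definition was zero-initialized at startup. */
int head_starts_null(void) {
    return head == NULL;
}

/* Returns the value supplied by the initializer. */
int read_initialized_counter(void) {
    return initialized_counter;
}
```

Whether the compiler actually emits a common symbol for head depends on its flags: recent GCC and Clang releases default to -fno-common, so -fcommon may be needed to observe one.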
Tentative definitions get special treatment in the linker. Usually, if you define the same symbol in two object files, the linker reports an error. However, common symbols are allowed to be duplicated. Two or more common symbols are merged and then placed in the .bss section, so that they will be zero-initialized at runtime.
So, a global variable defined as struct Node* head
is actually a common symbol.
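The merging rule described above can be modeled with a small toy resolver. This is a sketch, not lld's code: the ToySymbol type and toy_resolve function are hypothetical, and the choices to keep the larger size when merging and to let a real definition override a common symbol are assumptions based on the usual ELF linker convention.

```c
#include <stddef.h>

/* Toy symbol kinds; these are not lld's real data structures. */
enum SymKind { SYM_UNDEFINED, SYM_COMMON, SYM_DEFINED };

struct ToySymbol {
    enum SymKind kind;
    size_t size;      /* size in bytes */
    size_t alignment; /* required alignment in bytes */
};

/*
 * Resolve `incoming` against `existing`, updating `existing` in place.
 * Returns 0 on success and -1 on a duplicate-definition error.
 */
int toy_resolve(struct ToySymbol *existing, const struct ToySymbol *incoming) {
    if (incoming->kind == SYM_UNDEFINED)
        return 0; /* a plain reference changes nothing */
    if (existing->kind == SYM_UNDEFINED) {
        *existing = *incoming; /* first definition of any kind wins */
        return 0;
    }
    if (existing->kind == SYM_DEFINED && incoming->kind == SYM_DEFINED)
        return -1; /* two real definitions: duplicate symbol error */
    if (existing->kind == SYM_COMMON && incoming->kind == SYM_COMMON) {
        /* Two common symbols merge; keep the larger size and alignment. */
        if (incoming->size > existing->size)
            existing->size = incoming->size;
        if (incoming->alignment > existing->alignment)
            existing->alignment = incoming->alignment;
        return 0;
    }
    /* One defined, one common: the real definition takes over. */
    if (incoming->kind == SYM_DEFINED)
        *existing = *incoming;
    return 0;
}
```

In this model, two object files that both contain a tentative definition of head resolve to one common symbol, while two initialized definitions produce an error.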
I’m not sure why LTO cannot internalize common symbols though. Teresa, is this expected?
LTO can, but it is linker driven. I confirmed that when it is a common symbol the resolution indicates that the symbol is exported, and when I add an initializer so that it is a def we no longer think it is exported and are able to internalize. So this seems to be due to a change in what the linker is telling LTO. I would have to dig in the debugger to confirm, but perhaps lld is now indicating that it might be used by a regular obj? I.e. in BitcodeCompiler::add.
Teresa
Hi Teresa,
Can you please let me know if there is any update on this issue?
Thanks
M Suresh
It seems that, at the very least, we do not set IsUsedInRegularObj for common symbols passed to LTO.
I haven’t had a chance to look, but as mentioned, the linker resolution for the symbol is exported, which explains the LTO side behavior. Someone from the linker side will probably need to see what changed in the symbol info they are giving LTO after that patch. If you want, you can debug lld’s BitcodeCompiler::add to see what info is different in the Resols array for that symbol that gets passed to LTO, or what else is different in the Sym used to generate the resolution. Both of those are examined in LTO::addModuleToGlobalRes when we note that the symbol is external.
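As a rough illustration of why the linker's answer decides the outcome, here is a toy model of the handoff. It is not the real LTO or lld API: the struct and function names are hypothetical, and the fields only loosely echo the IsUsedInRegularObj flag and the resolution info mentioned in this thread.

```c
#include <stdbool.h>

/* Toy view of what the linker knows about a symbol (hypothetical). */
struct ToyLinkerSymbol {
    bool is_used_in_regular_obj; /* referenced from a native object file */
};

/* Toy view of the per-symbol resolution handed to LTO (hypothetical). */
struct ToyResolution {
    bool visible_to_regular_obj; /* LTO must keep the symbol external */
};

/* The linker builds the resolution from its own symbol information. */
struct ToyResolution toy_make_resolution(const struct ToyLinkerSymbol *sym) {
    struct ToyResolution res;
    res.visible_to_regular_obj = sym->is_used_in_regular_obj;
    return res;
}

/* LTO may internalize only symbols the linker did not report as visible. */
bool toy_can_internalize(const struct ToyResolution *res) {
    return !res->visible_to_regular_obj;
}
```

In this model, if the linker wrongly reports a symbol as visible outside the bitcode, LTO has no way to internalize it, which matches the behavior described above.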
Teresa
I believe this issue was already reported a month ago as https://bugs.llvm.org/show_bug.cgi?id=41978. I assume that dependent changes have landed and it is too late to revert now, but IMO this is a regression, and would normally be enough cause to revert the change to LLD (r360841), so please prioritize investigating this.
Thanks for the info Teresa,
Regards
M Suresh
Let me investigate.
The direct cause of this issue is that lld previously converted common symbols to defined symbols before passing input files to LTO, whereas after r360841 they are passed to LTO as common symbols. Making lld work as before is easy, since we can convert common symbols to defined symbols as we did previously. Here is a patch to do that, and I confirmed that it restores the original behavior for the reported issue.
The question is why LTO cannot internalize common symbols under some conditions. It looks like LTO can internalize them if there are no files other than bitcode files, but it cannot if there is a DSO file as well, even if the DSO doesn’t contain any symbols. I don’t fully understand what is going on yet. I’ll try to investigate tomorrow.
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 008a6cd7954…d9deddbf357 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1789,6 +1789,11 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
if (!Config->Relocatable)
Symtab->scanVersionScript();
Sure Rui, Thanks for the update and investigation.
Regards
M Suresh
LTO doesn’t do anything special for common symbols when detecting the symbol resolution. It looks like this got fixed in D63752/r364273, which is different than the below patch. From that patch description it seems as though LLD was incorrectly marking these symbols as VisibleToRegularObj? That would explain the LTO behavior.
Teresa
Hi Teresa and Rui,
Rui earlier submitted a fix in https://reviews.llvm.org/rL364273, which resolves two of the issues reported in https://bugs.llvm.org/show_bug.cgi?id=41978.