(Not) instrumenting global string literals that end up in .cstrings on Mac

(forgot to CC llvmdev)


On Darwin, the "__cstring" section (really, any section with type S_CSTRING_LITERALS) is defined to contain zero-terminated strings of bytes that the linker can merge and reorder. If you want pad bytes before and after the string, you need to put the strings in a different section (e.g. __TEXT,__const).
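As a rough illustration of placing a string outside "__cstring", here is a minimal sketch using the Clang/GCC section attribute. The macro name PADDABLE_CONST and the accessor functions are made up for this example; the attribute is only applied when building for an Apple target and expands to nothing elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper macro (not from any real header): on Apple targets it
// asks the compiler to place the variable in __TEXT,__const rather than
// letting it be treated as a mergeable C string. Elsewhere it does nothing.
#if defined(__APPLE__)
#define PADDABLE_CONST __attribute__((section("__TEXT,__const")))
#else
#define PADDABLE_CONST
#endif

// A named constant array with an explicit section. Because it is not in a
// cstring-literal section, the linker should not split or reorder it at the
// NUL terminator, so surrounding pad bytes would stay next to it.
PADDABLE_CONST static const char kGreeting[] = "hello";

// Accessors so the constant is actually referenced.
std::string greeting() { return kGreeting; }
std::size_t greetingSize() { return sizeof(kGreeting); }
```

On a Mac, inspecting the object file's sections (for example with otool) should show whether the bytes ended up in __const as intended; I have not verified this for every compiler version.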

But CF/NSString literals will be problematic. The compiler emits a static NS/CFString object into a data section. That object contains a pointer to its "backing" UTF-8 or UTF-16 string literal. The linker coalesces the NS/CFString objects (so that two translation units that define @"hello" will wind up using the same object). But to tell whether two CF/NSString objects are the same, the linker must compare the string literals they point to, and that check contains an assertion that the string is in a __cstring or __ustring (UTF-16) section. So putting the backing string for a CF/NSString into another section will trigger a linker assertion.



I think finding a superset of the globals that will end up in the “__cstring” section and not adding red zones to them is reasonable. You might be able to factor out the part of the code that makes this decision but does not involve TargetMachine (e.g. some of TargetLoweringObjectFile::getKindForGlobal). These are all constants anyway, so we are only losing checks for invalid reads, not invalid writes.
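To make the "superset" idea concrete, here is a standalone sketch of such a predicate. It does not use the LLVM API: GlobalInfo, mayLandInCString and shouldAddRedzone are hypothetical names, and the rule (constant, byte array, NUL-terminated) is my assumption of a deliberately loose test, not the actual logic in getKindForGlobal.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of what an instrumentation pass knows about
// a global variable. Not an LLVM type.
struct GlobalInfo {
  bool isConstant;                  // marked constant (read-only)
  bool isByteArray;                 // initializer is an array of 8-bit integers
  std::vector<std::uint8_t> bytes;  // initializer contents, if a byte array
};

// Conservative over-approximation: returns true for anything that *might*
// be placed in __cstring. It ignores alignment, linkage and other details on
// purpose, so it may say "yes" for globals that end up elsewhere.
bool mayLandInCString(const GlobalInfo &g) {
  if (!g.isConstant || !g.isByteArray)
    return false;
  // Must be NUL-terminated to be treated as a C string.
  return !g.bytes.empty() && g.bytes.back() == 0;
}

// Red zones are only added to globals that cannot land in __cstring.
bool shouldAddRedzone(const GlobalInfo &g) { return !mayLandInCString(g); }
```

Erring on the side of "yes" only costs some read checks on constants, which matches the trade-off described above.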

There might be other, better solutions; I am not sure…