Is this a bug or just illegal code

I’m seeing memory corruption when I use a literal string and std::string in a ternary (?:slight_smile: expression. If I change ss to ss.c_str() on line 5, the corruption goes away. [1] is an AST diff, which might help explain it.

Although clang doesn’t give me an error or warning, is this code legal, or is this a bug?

thanks…
don

$ grep -n “.” x.cpp
1:#include “llvm/ADT/StringRef.h”
3:int main() {
4: const std::string ss;
5: llvm::StringRef Ref = true ? “noexcept” : ss;
6: std::string s = Ref;
7: return 0;
8:}

$ …/…/4.0.0/build/Release/bin/clang++ x.cpp -O1 -g -fsanitize=address -fno-omit-frame-pointer -std=c++11 -I …/…/llvm/include/ -I include

$ ./a.out

I'm seeing memory corruption when I use a literal string and std::string
in a ternary (?:slight_smile: expression. If I change ss to ss.c_str() on line 5, the
corruption goes away. [1] is an AST diff, which might help explain it.

Although clang doesn't give me an error or warning, is this code legal, or
is this a bug?

thanks...
don

$ grep -n "." x.cpp
1:#include "llvm/ADT/StringRef.h"
3:int main() {
4: const std::string ss;
5: llvm::StringRef Ref = true ? "noexcept" : ss;

This creates a std::string temporary, leaving the StringRef pointing to an
object whose lifetime ends at the end of the full expression.
Maybe the StringRef interface can be adjusted to cause this to fail at
compile time.

I'm seeing memory corruption when I use a literal string and std::string in a ternary (?:slight_smile: expression. If I change ss to ss.c_str() on line 5, the corruption goes away. [1] is an AST diff, which might help explain it.

Although clang doesn't give me an error or warning, is this code legal, or is this a bug?

No, this is a bug in x.cpp. the common type between "noexcept" and std::string is std::string, which means that the conditional expression materializes a temporary string to hold the "noexcept" and binds it to the StringRef when the true branch is taken. Clang doesn't warn here because it isn't smart enough to know what StringRef is doing. You can try rewriting this as:

llvm::StringRef Ref = "noexcept";
if (true || whatever)
   Ref = some_std_string;

I'm seeing memory corruption when I use a literal string and std::string
in a ternary (?:slight_smile: expression. If I change ss to ss.c_str() on line 5, the
corruption goes away. [1] is an AST diff, which might help explain it.

Although clang doesn't give me an error or warning, is this code legal,
or is this a bug?

thanks...
don

$ grep -n "." x.cpp
1:#include "llvm/ADT/StringRef.h"
3:int main() {
4: const std::string ss;
5: llvm::StringRef Ref = true ? "noexcept" : ss;

This creates a std::string temporary, leaving the StringRef pointing to an
object whose lifetime ends at the end of the full expression.
Maybe the StringRef interface can be adjusted to cause this to fail at
compile time.

It's a nice idea, but IIRC last time I tried this it didn't work out due to
code like this relying on an implicit std::string -> StringRef conversion:

std::string idiomatically_return_a_string();
void idiomatically_take_a_string(StringRef);

void f() {
  idiomatically_take_a_string(idiomatically_return_a_string());
}

6: std::string s = Ref;

I would expect that you are getting a temporary string object and gets
destroyed immediately, so Ref points to garbage. The AST seems to agree
with that.

Joerg

don hinton via cfe-dev <cfe-dev@lists.llvm.org> writes:

I'm seeing memory corruption when I use a literal string and std::string in
a ternary (?:slight_smile: expression. If I change ss to ss.c_str() on line 5, the
corruption goes away. [1] is an AST diff, which might help explain it.

Although clang doesn't give me an error or warning, is this code legal, or
is this a bug?

This code is illegal. What's happening is that both results of the ?:
expression have to have the same type, and "const char *" is implicitly
convertible to std::string (but not the other way around). Because of
this, you get a temporary std::string for "noexcept", the StringRef
binds to that temporary, and it immediately goes out of scope.

I'm seeing memory corruption when I use a literal string and std::string
in a ternary (?:slight_smile: expression. If I change ss to ss.c_str() on line 5, the
corruption goes away. [1] is an AST diff, which might help explain it.

Although clang doesn't give me an error or warning, is this code legal,
or is this a bug?

thanks...
don

$ grep -n "." x.cpp
1:#include "llvm/ADT/StringRef.h"
3:int main() {
4: const std::string ss;
5: llvm::StringRef Ref = true ? "noexcept" : ss;

This creates a std::string temporary, leaving the StringRef pointing to
an object whose lifetime ends at the end of the full expression.
Maybe the StringRef interface can be adjusted to cause this to fail at
compile time.

It's a nice idea, but IIRC last time I tried this it didn't work out due
to code like this relying on an implicit std::string -> StringRef
conversion:

std::string idiomatically_return_a_string();
void idiomatically_take_a_string(StringRef);

void f() {
  idiomatically_take_a_string(idiomatically_return_a_string());
}

Seems unfortunate that the g() and h() cases below are not very different
for overload resolution:
void f(llvm::StringRef);
void g() { f(std::string()); }
void h() { llvm::StringRef ref = std::string(); }

Thanks for the confirmation. My solution was to use c_str().

I wish there was a way for clang to warn about this, but as you mentioned, it’s difficult to know what StringRef will do with the result.

thanks again…
don

I think you should prefer assigning from a std::string, using c_str() will force StringRef call strlen().