recognizing DTORs and vptr updates in LLVM.

Hello,

While instrumenting LLVM IR in ThreadSanitizer (race detector), I need to distinguish between a store to vtable pointer (vptr) and any other regular store.
This special treatment should be limited to class DTORs, so I should also know when a function is a DTOR.
Rationale: need to distinguish benign and harmful races on vptr (http://code.google.com/p/data-race-test/wiki/PopularDataRaces#Data_race_on_vptr).

Currently, I can figure out when a function is a DTOR and when a store touches vptr by analyzing mangled names.
_ZN1BD1Ev==“B::~B()”
_ZTV1B==“vtable for B”

define linkonce_odr void @_ZN1BD1Ev(%struct.B* %this) unnamed_addr nounwind uwtable align 2 {
entry:

store i32 (…)** bitcast (i8** getelementptr inbounds ([5 x i8*]* @_ZTV1B, i64 0, i64 2) to i32 (…)), i32 (…)* %0, align 8

However, this does not sound right.
What would be the right way to pass this information from clang to LLVM?
Will using metadata for this purpose be a right solution?
(insn-level metadata for vptr store and module-level metadata for DTORs)

Thanks,

–kcc

It's worth pointing out the according to the abstract LLVM IR model,
your "benign" races are in fact undefined behavior. The only reason
it appears to work is that in practice non-atomic loads and stores
usually result in the same generated code as "relaxed" atomic loads
and stores. If we are in fact supposed to guarantee some sort of
behavior here, we should generate an atomic store. If we aren't, I'm
not sure why AddressSanitizer needs to distinguish between "usually
appears to work" and "almost always appears to work".

-Eli

Hello,

While instrumenting LLVM IR in ThreadSanitizer (race detector), I need
to distinguish between a store to vtable pointer (vptr) and any other
regular store.
This special treatment should be limited to class DTORs, so I should also
know when a function is a DTOR.
Rationale: need to distinguish benign and harmful races on vptr
(http://code.google.com/p/data-race-test/wiki/PopularDataRaces#Data_race_on_vptr).

Currently, I can figure out when a function is a DTOR and when a store
touches vptr by analyzing mangled names.
_ZN1BD1Ev==“B::~B()”
_ZTV1B==“vtable for B”

define linkonce_odr void @_ZN1BD1Ev(%struct.B* %this) unnamed_addr nounwind
uwtable align 2 {
entry:

store i32 (…)** bitcast (i8** getelementptr inbounds ([5 x i8*]*
@_ZTV1B, i64 0, i64 2) to i32 (…)), i32 (…)* %0, align 8

However, this does not sound right.
What would be the right way to pass this information from clang to LLVM?
Will using metadata for this purpose be a right solution?
(insn-level metadata for vptr store and module-level metadata for DTORs)

It’s worth pointing out the according to the abstract LLVM IR model,
your “benign” races are in fact undefined behavior.

Oh yes. According to C++11 too, I believe.
But C++98 did not define threads, so we are in the grey area here.

The only reason
it appears to work is that in practice non-atomic loads and stores
usually result in the same generated code as “relaxed” atomic loads
and stores. If we are in fact supposed to guarantee some sort of
behavior here, we should generate an atomic store. If we aren’t, I’m
not sure why AddressSanitizer

s/AddressSanitizer/ThreadSanitizer/

needs to distinguish between “usually
appears to work” and “almost always appears to work”.

This is more like “almost always works in practice” and “certainly broken”.
We run ThreadSanitizer (the old valgrind-based one) on millions lines of legacy code (not always our own code) and we see a lot of those “benign” vptr races.
Ignoring those (while still detecting really harmful ones) has been #1 feature request from our users until we implemented it in valgrind-based ThreadSanitizer.

–kcc

Using instruction level metadata for this would be appropriate. However, I also don’t understand why a race on this is truly benign. I’m also concerned that you’re adding even more knobs to clang and IR for special case situations. How many more special cases like this are you going to require?

-Chris

It isn't, really; calling it "benign" is deceptive. It's just that
storing a pointer which is equal to the existing pointer stored at a
given address almost always makes the optimizer/codegen generate code
which can't trigger the race in a way which visibly misbehaves.
Therefore, as a heuristic users apparently want ThreadSanitizer to
ignore (or list separately) such races.

Given that, I'm not sure I really see the issue with just
special-casing any store where the value stored is a pointer to a
global... but it could be argued either way, I guess.

-Eli

I users expect this to “just work”, why not extend the language and make it just work?

We could, as an implementation, decide to emit these as relaxed atomic stores, making the code well defined without changing the semantics (or optimization) in any meaningful way, right?

Hello,

While instrumenting LLVM IR in ThreadSanitizer (race detector), I need
to distinguish between a store to vtable pointer (vptr) and any other
regular store.
This special treatment should be limited to class DTORs, so I should also
know when a function is a DTOR.
Rationale: need to distinguish benign and harmful races on vptr
(http://code.google.com/p/data-race-test/wiki/PopularDataRaces#Data_race_on_vptr).

Currently, I can figure out when a function is a DTOR and when a store
touches vptr by analyzing mangled names.
_ZN1BD1Ev==“B::~B()”
_ZTV1B==“vtable for B”

define linkonce_odr void @_ZN1BD1Ev(%struct.B* %this) unnamed_addr nounwind
uwtable align 2 {
entry:

store i32 (…)** bitcast (i8** getelementptr inbounds ([5 x i8*]*
@_ZTV1B, i64 0, i64 2) to i32 (…)), i32 (…)* %0, align 8

However, this does not sound right.
What would be the right way to pass this information from clang to LLVM?
Will using metadata for this purpose be a right solution?
(insn-level metadata for vptr store and module-level metadata for DTORs)

Using instruction level metadata for this would be appropriate. However, I
also don’t understand why a race on this is truly benign.

It isn’t, really; calling it “benign” is deceptive.

Well, yes. Generally, I agree with you here.
But then there are tsan users who have all that legacy code and want to find races that will harm them for sure and don’t want to see “noise”.
These vptr races are hard to suppress w/o risking to hide some other races.

It’s just that
storing a pointer which is equal to the existing pointer stored at a
given address almost always makes the optimizer/codegen generate code
which can’t trigger the race in a way which visibly misbehaves.
Therefore, as a heuristic users apparently want ThreadSanitizer to
ignore (or list separately) such races.

Yep.

Given that, I’m not sure I really see the issue with just
special-casing any store where the value stored is a pointer to a
global… but it could be argued either way, I guess.

That will hide too many real races, I afraid.
Including those “harmful” vptr races.

I’m also concerned that you’re adding even more knobs to clang and IR for special case situations. How many more special cases like this are you going to require?

I don’t remember more special cases off the top of my head. valgrind-based variant has this special case and nothing else, I believe.
We’ve run our race detector unit tests (http://code.google.com/p/data-race-test/source/browse/trunk/unittest/racecheck_unittest.cc)
under the current LLVM-TSAN and this is the only thing we found so far.
But we did not run anything heavy under LLVM-TSAN yet, so something else may be hiding from us.

–kcc

Hello,

While instrumenting LLVM IR in ThreadSanitizer (race detector), I need to distinguish between a store to vtable pointer (vptr) and any other regular store.
This special treatment should be limited to class DTORs, so I should also know when a function is a DTOR.
Rationale: need to distinguish benign and harmful races on vptr (http://code.google.com/p/data-race-test/wiki/PopularDataRaces#Data_race_on_vptr).

Currently, I can figure out when a function is a DTOR and when a store touches vptr by analyzing mangled names.
_ZN1BD1Ev==“B::~B()”
_ZTV1B==“vtable for B”

define linkonce_odr void @_ZN1BD1Ev(%struct.B* %this) unnamed_addr nounwind uwtable align 2 {
entry:

store i32 (…)** bitcast (i8** getelementptr inbounds ([5 x i8*]* @_ZTV1B, i64 0, i64 2) to i32 (…)), i32 (…)* %0, align 8

However, this does not sound right.
What would be the right way to pass this information from clang to LLVM?
Will using metadata for this purpose be a right solution?
(insn-level metadata for vptr store and module-level metadata for DTORs)

Using instruction level metadata for this would be appropriate. However, I also don’t understand why a race on this is truly benign. I’m also concerned that you’re adding even more knobs to clang and IR for special case situations.

As for “more knobs”, Chandler mentioned to me recently that being able to identify vptr accesses will help
virtual function call inlining, so we may still need this knob for other purposes.

–kcc

Given that, I'm not sure I really see the issue with just
special-casing any store where the value stored is a pointer to a
global... but it could be argued either way, I guess.

I users expect this to "just work", why not extend the language and make it
just work?

I'm not sure anyone really expects this to "just work", just that they
did it by accident. Making cross-thread unsynchronized virtual calls
on an object which is being destroyed strikes me as a construct nobody
would intentionally write.

We could, as an implementation, decide to emit these as relaxed atomic
stores, making the code well defined without changing the semantics (or
optimization) in any meaningful way, right?

Making all vptr loads and stores atomic would block some optimizations
(specifically, we can't perform certain optimizations involving
memcpy, and IIRC some optimizers have incomplete atomics handling).
Not sure if it would have much practical impact, though.

Specifically just making vptr stores in destructors "unordered", and
making unordered stores which don't change the stored value
effectively no-ops in the memory model, could work too; the potential
impact on optimization is much less, and I don't think the model
changes would lead to any optimizer changes.

-Eli

Given that, I’m not sure I really see the issue with just
special-casing any store where the value stored is a pointer to a
global… but it could be argued either way, I guess.

I users expect this to “just work”, why not extend the language and make it
just work?

I’m not sure anyone really expects this to “just work”, just that they
did it by accident. Making cross-thread unsynchronized virtual calls
on an object which is being destroyed strikes me as a construct nobody
would intentionally write.

We could, as an implementation, decide to emit these as relaxed atomic
stores, making the code well defined without changing the semantics (or
optimization) in any meaningful way, right?

Making all vptr loads and stores atomic would block some optimizations

… and will not solve my problem – I still need to distinguish between
“benign-for-practical-purposes” and “definitely-harmful” vptr races.
The only difference between those two cases lies outside of the instrumented function.
(it depends on the dynamic type of the object being destroyed).

–kcc

I see. So essentially, this is purely a QoI issue, and it just happens to be so common that we can’t get everyone to fix their code.

Instruction-level metadata sounds better and better. =[ It’s not great, but at least you have evidence that this won’t be an unending series of QoI issues.

Right, but this would have to be designed properly. If someone wants to put forward a concrete use case and a design that would enable virtual call inlining that is also useful for tsan, that would certainly be interesting.

-Chris

Using instruction level metadata for this would be appropriate. However, I
also don't understand why a race on this is truly benign.

It isn't, really; calling it "benign" is deceptive. It's just that
storing a pointer which is equal to the existing pointer stored at a
given address almost always makes the optimizer/codegen generate code
which can't trigger the race in a way which visibly misbehaves.
Therefore, as a heuristic users apparently want ThreadSanitizer to
ignore (or list separately) such races.

The gcc Ada front-end does this too, in quite a range of situations. For
example multiple threads racily initialize a pointer variable, but they all
initialize to the same value. The various valgrind based race detection
tools all complain about this, which makes them much less useful than they
might be for Ada.

Ciao, Duncan.

FWIW, after thinking about this for awhile, I realize that we already have the tools to handle this: TBAA.

It would be general goodness for clang to emit VTable loads and stores in their with their own TBAA type class (one that does not even alias "char*"). This would give us improved code quality, is straight-forward to reason about, is not "another knob" and would be a really easy for ASAN to use.

One issue is that TBAA is disabled in -O0 builds: I'd just make vtable TBAA information be produced when the optimizer is enabled or if ASAN is enabled.

-Chris

Using instruction level metadata for this would be appropriate. However, I
also don’t understand why a race on this is truly benign.

It isn’t, really; calling it “benign” is deceptive. It’s just that
storing a pointer which is equal to the existing pointer stored at a
given address almost always makes the optimizer/codegen generate code
which can’t trigger the race in a way which visibly misbehaves.
Therefore, as a heuristic users apparently want ThreadSanitizer to
ignore (or list separately) such races.

The gcc Ada front-end does this too, in quite a range of situations. For
example multiple threads racily initialize a pointer variable, but they all
initialize to the same value. The various valgrind based race detection
tools all complain about this, which makes them much less useful than they
might be for Ada.

FWIW, after thinking about this for awhile, I realize that we already have the tools to handle this: TBAA.

It would be general goodness for clang to emit VTable loads and stores in their with their own TBAA type class (one that does not even alias “char*”).

Indeed, sounds very nice.
I’ll try to make a patch that adds TBAA metadata to VTable loads (unless someone else knows how to do it off the top of his head).

Thanks!

–kcc

Sounds great, thanks Kostya,

-Chris

Chris, is this how the tbaa for vtable loads/stores should look like?

Metadata:

!0 = metadata !{metadata !“vtable pointer”, metadata !1}
!1 = metadata !{metadata !“omnipotent char”, metadata !2}
!2 = metadata !{metadata !“Simple C/C++ TBAA”, null}

Load:

%0 = bitcast %struct.A* %a to void (%struct.A*)***
%vtable = load void (%struct.A*)*** %0, align 8, !tbaa !0

Store:
%0 = getelementptr inbounds %struct.B* %this, i64 0, i32 0, i32 0
store i32 (…)** bitcast (i8** getelementptr inbounds ([5 x i8*]* @_ZTV1B, i64 0, i64 2) to i32 (…)), i32 (…)* %0, align 8, !tbaa !0

–kcc

Chris, is this how the tbaa for vtable loads/stores should look like?

Metadata:

!0 = metadata !{metadata !“vtable pointer”, metadata !1}
!1 = metadata !{metadata !“omnipotent char”, metadata !2}
!2 = metadata !{metadata !“Simple C/C++ TBAA”, null}

char*'s can’t point to vtables, so I think that “Simple C/C++ TBAA” should be the parent of vtables. Otherwise, looks great.

-Chris