[Analyzer] Tips on how to fix PR22954 ? (false positive memory leak warning)

Dear All,

I have been looking into PR22954 which has been kindly raised by krzystof at https://llvm.org/bugs/show_bug.cgi?id=22954 and being new to this area of Clang I would appreciate any tips on how to fix it.

To me the root of the issue seems to originate from the CString checker as it performs invalidation of the destination buffer.
Given the snippet below:

Ping !
Adding analyzer experts to cc.

Regards,

Pierre Gousseau
SN Systems - Sony Computer Entertainment

Ping !
Adding analyzer experts to cc.

Did you add them to bcc? I can not see them in CC. (So added them. Sorry in
case it is redundant.)

Thanks, no worries I did not add them to Bcc, dont know what went wrong …

Regards,

Pierre Gousseau
SN Systems - Sony Computer Entertainment

Perhaps the ideal solution is to model "memcpy" properly (similar to assignment?). As you correctly noticed currently we just invalidate the destination buffer in the CString checker and the invalidation worker invalidates the whole struct (do not exactly know the reason for this).
While experimented wrote a patch (attached) that fixes the issue by adding a trait for the struct region that prevents the region from being invalidated. The solution does not cause regressions in the clang testsuite. Do not know if the solution is acceptable as a short term fix. Please review!

PR22954.patch (1.44 KB)

Thanks Anton for looking at the issue, modelling “memcpy” seems like the right solution, are you saying that code written to model “assignment” could be reused to model “memcpy” ?

The patch you provided definitely fixes the issue thanks, on the other hand if I understand correctly, it seems to me it can lead to new false positives, as the analyzer would wrongly assume that ‘a.data’'s content remains unchanged after the memcpy call ?

I am thinking it might be possible to modify the region store invalidation worker to restrict the invalidation in the case of arrays by adding a new flag like you did, for example TK_InvalidateArrayOnly ?

Thanks for your help,

Pierre Gousseau
SN Systems - Sony Computer Entertainment Group

Thanks Anton for looking at the issue, modelling “memcpy” seems like the right solution, are you saying that code written to model “assignment” could be reused to model “memcpy” ?

Just an idea, the memcpy action looks similar to assignment at first glance.
Here is a comment from CString checker relevant to the topic:
// Invalidate the destination (regular invalidation without pointer-escaping
// the address of the top-level region).
// FIXME: Even if we can’t perfectly model the copy, we should see if we
// can use LazyCompoundVals to copy the source values into the destination.
// This would probably remove any existing bindings past the end of the
// copied region, but that’s still an improvement over blank invalidation.

The patch you provided definitely fixes the issue thanks, on the other hand if I understand correctly, it seems to me it can lead to new false positives, as the analyzer would wrongly assume that ‘a.data’'s content remains unchanged after the memcpy call ?

I am thinking it might be possible to modify the region store invalidation worker to restrict the invalidation in the case of arrays by adding a new flag like you did, for example TK_InvalidateArrayOnly ?

Here an issue is related to the struct, not array. However arrays are also in the risk group.
I thought of something like TK_DoNotInvalidateSuperRegion. Worker’s AddToWorkList() automatically adds superregion to worklist:
bool AddToWorkList(const MemRegion *R) {
const MemRegion *BaseR = R->getBaseRegion();
return AddToWorkList(WorkListElement(BaseR), getCluster(BaseR));
}
With TK_DoNotInvalidateSuperRegion we could selectively localize invalidation to a particular subregion, not the whole superregion. This could work both for classes and arrays. Perhaps additional coding in invalidateRegionsWorker::VisitCluster() is required.
What do you think?

Thanks for your help,

Pierre Gousseau
SN Systems - Sony Computer Entertainment Group

Hi!

Thanks for the thoughts! Definetely TK_DoNotInvalidateSuperRegion should be attached selectively (struct fields, array elements, maybe something else?). Perhaps casts may need to be handled somewhere in the CString checker buffers processing to be more adequate, additional testing with mamcpy and casts required. The case with pointers seem to be unrelated to the issue and might be a good idea for a separate enhancement.

Thanks Anton/Gabor for the ideas ! I will be out of the office until the 29th of July unfortunately but I will try to propose a patch when back if the issue is still open.

Regards,

Pierre Gousseau
SN Systems - Sony Computer Entertainment Group