Prevent RetainCountChecker from Analyzing Function Bodies when They Have Certain Annotate Attribute

Dear all,

I wish to prevent the RetainCountChecker from analyzing function bodies if these functions have certain annotate attributes. Consider the following example to get a better idea of why I wish to do that.

Below is a small snippet from the Integer Set Library (ISL).

typedef struct {
int ref;
} isl_basic_map;

attribute((cf_returns_retained))
isl_basic_map *isl_basic_map_copy(isl_basic_map *bmap);

attribute((cf_returns_retained))
isl_basic_map *isl_basic_map_cow
(attribute((cf_consumed)) isl_basic_map *bmap);

void free(void *);

attribute((annotate(“rc_ownership_trusted_implementation”))) isl_basic_map *isl_basic_map_free
(attribute((cf_consumed)) isl_basic_map *bmap) {
if (!bmap)
return NULL;

if (–bmap->ref > 0)
return NULL;

free(bmap);
return NULL;
}

attribute((cf_returns_retained))
isl_basic_map *foo
(attribute((cf_consumed)) isl_basic_map *bmap) {
// After this call, ‘temp’ has a +1 reference count.
isl_basic_map *temp = isl_basic_map_copy(bmap);
// After this call, ‘bmap’ has a +1 reference count.
bmap = isl_basic_map_cow(bmap);
// After this call, assuming the predicate of the second if branch to be true, ‘bmap’ has a +1 reference count.
isl_basic_map_free(bmap);
return temp; // Object leaked: ‘bmap’
}

While running the RetainCountChecker on the above example, it raises a leak warning for ‘bmap’ in function ‘foo’. This warning is a true positive from the checker’s perspective in the sense that the reference count of ‘bmap’ obtained from ‘isl_basic_map_cow’ is not decremented(from the checker’s perspective) in ‘isl_basic_map_free’ even though it takes the argument ‘bmap’ as ‘attribute((cf_consumed))’.

Actually, ‘–bmap->ref’ does decrement the reference count (from ISL’s perspective). Hence, to prevent such false positives (from ISL’s perspective) to be raised, I wish to prevent the RetainCountChecker to analyze the bodies of the functions having ‘rc_ownership_trusted_implementation’ annotate attribute. I want the checker to just look at the declaration of such functions (and not go inside their bodies) to get the necessary information about reference counting.

Could someone suggest me a way to achieve my objective?

Thank you.

Regards,
Malhar Thakkar

For now RetainCountChecker avoids inlining CoreFoundation's CFRetain() by looking at its annotation and performing evalCall(). evalCall is the checker callback that allows the checker to completely model the function's effects; if the function is evalCall-ed by the checker, the analyzer core doesn't try to inline it. I think this is exactly what you need to do: the function you're modeling is *the* release, and the analyzer doesn't need to know anything about how it is implemented.

However, by evalCall-ing based on annotation, you essentially say that all functions that wear this annotation are to be modeled similarly. If some of them have additional side effects, you may fail to model them. For example, if the analyzer has seen the malloc() that corresponds to the free() in isl_basic_map_free(), it'd make MallocChecker unhappy - unlike the case when you neither evalCall nor inline but evaluate conservatively, causing the malloc-ed pointer to "escape" into this function. If this becomes a problem, you may need to look into our body farms - the mechanism that synthesizes ASTs of various functions for analysis purposes. I imagine farming a function that only does releases, and then farming more functions that call this function and do additional side effects.

I’ve been thinking a lot lately and I feel evalCall-ing is not the right way to proceed (even though it works as desired) as I feel I won’t be able to generalize it to work for different kinds of codebases written in C. For example, it won’t work for the following case (as you pointed out before).

// evalCall-ing based on annotation

#include <stdlib.h>
typedef struct
{
int ref_count;
} rc_struct;
void free(void *);

attribute((annotate(“rc_ownership_trusted_implementation”))) rc_struct *bar(rc_struct *r) {
if (!r)
return NULL;

if (–r->ref_count > 0)
return NULL;

free(r);
return NULL;
}

void foo() {
rc_struct *r = (rc_struct *)malloc(sizeof(rc_struct));
bar(r);
} // Leak warning raised for ‘r’

Is it possible to check if the RetainCountChecker entered a function containing a certain annotation by walking through the diagnostic path just before emitting a bug report?

Hi Malhar,

I think the idea of trusted implementation is to trust blindly that the implementation respects the release/acquire annotation without inlining nor looking inside the function. That means you need those functions to be properly annotated.

Your bar function do not have any annotations to “trust”. Hence the problem.

Dear Dr. Alexandre,

The leak warning is raised by MallocChecker and not the RetainCountChecker (which performs reference counting).
Adding ‘rc_ownership_trusted_implementation’ annotation to ‘bar’ and returning true (successfully evaluating ‘bar’) from evalCall() callback in RetainCountChecker prevents the analyzer from analyzing bar’s body. This causes problems for the MallocChecker as it is unable to find a call to ‘free’ for the call to ‘malloc’ in foo().

Using evalCall(), we’ll be able to suppress leak false positives raised in ISL due to obj_free(), obj_cow() and obj_copy(). Now, we want our solution (to suppress such false positives) to be as generalized as possible. By generalized, I mean using the same “trusted” annotation across different codebases.
However, this approach (using evalCall()) can’t be generalized for other codebases in C as it’s possible for some codebase to have code similar to the one mentioned in my previous email. Such a test-case will result in the MallocChecker raising false positives.

Dear Dr. Alexandre,

The leak warning is raised by MallocChecker and not the RetainCountChecker (which performs reference counting).
Adding 'rc_ownership_trusted_implementation' annotation to 'bar' and returning true (successfully evaluating 'bar') from evalCall() callback in RetainCountChecker prevents the analyzer from analyzing bar's body. This causes problems for the MallocChecker as it is unable to find a call to 'free' for the call to 'malloc' in foo().

How does allocation work in ISL? Are data structures allocated through library-provided functions that create an instance of the data structure? Or does client code call malloc() directly? If the first, the library-provided allocations functions could also be annotated a trusted. Then, if these are annotated they won’t be inlined and the malloc checker won’t see the malloc site — so there won’t be a diagnostic about a leak.

Using evalCall(), we'll be able to suppress leak false positives raised in ISL due to obj_free(), obj_cow() and obj_copy(). Now, we want our solution (to suppress such false positives) to be as generalized as possible. By generalized, I mean using the same "trusted" annotation across different codebases.
However, this approach (using evalCall()) can't be generalized for other codebases in C as it's possible for some codebase to have code similar to the one mentioned in my previous email. Such a test-case will result in the MallocChecker raising false positives.

Can you give a specific example of the problem you are envisioning with evalCall()? Are you worried about other checkers wanting to also evaluate the call?

Devin

Dear Dr. Alexandre,

The leak warning is raised by MallocChecker and not the RetainCountChecker (which performs reference counting).
Adding ‘rc_ownership_trusted_implementation’ annotation to ‘bar’ and returning true (successfully evaluating ‘bar’) from evalCall() callback in RetainCountChecker prevents the analyzer from analyzing bar’s body. This causes problems for the MallocChecker as it is unable to find a call to ‘free’ for the call to ‘malloc’ in foo().

How does allocation work in ISL? Are data structures allocated through library-provided functions that create an instance of the data structure? Or does client code call malloc() directly? If the first, the library-provided allocations functions could also be annotated a trusted. Then, if these are annotated they won’t be inlined and the malloc checker won’t see the malloc site — so there won’t be a diagnostic about a leak.

Using evalCall(), we’ll be able to suppress leak false positives raised in ISL due to obj_free(), obj_cow() and obj_copy(). Now, we want our solution (to suppress such false positives) to be as generalized as possible. By generalized, I mean using the same “trusted” annotation across different codebases.
However, this approach (using evalCall()) can’t be generalized for other codebases in C as it’s possible for some codebase to have code similar to the one mentioned in my previous email. Such a test-case will result in the MallocChecker raising false positives.

Can you give a specific example of the problem you are envisioning with evalCall()? Are you worried about other checkers wanting to also evaluate the call?

Devin

I think it's reasonable to assume that frameworks that shield off
free for reference counting, would also shield off malloc
in order to initialize the reference counting.
Of course, this may just be a lack of imagination on my part.
Do you have any examples of frameworks that could use your
annotations where this is not the case?

skimo

There are some frameworks/idioms that allow a transfer of ownership from a raw malloc’d pointer to a referenced-counted opaque pointer container (this is common in C++ where you use new to allocate and initialize an object). However, I think the ‘trusted implementation’ annotation could be extended to handle this as well. It would require the function that transfers ownership to the reference-counted implementation to be annotated, but I don’t think that is a high burden. That said, I wouldn’t worry about raw malloc’d pointer case for now.

I suspect that many of these side effects would have high-level invariants that the analyzer on its own wouldn’t be able to discover (for example, that a set created with a single item in it is not empty) — so custom modeling would be needed in any event, even if the calls were inlined.

I don’t understand what is going on in the above. Can you please simplify the example and explain what is going on and what you expected the analyzer to do? What call is the analyzer complaining about? Is it inlining the call? Please present the simplest example that illustrates the problem.

The default assumption (when no annotation is present) is the equivalent of __isl_keep.

Devin

The analyzer shouldn’t be doing this when the foo is not inlined. If foo is inlined it should only do this if foo() returns its single parameter. Is foo() being inlined, or not? If so, what is its body? Is this when evalCall is used, or not?

It would be good to determine whether this is actually the reason why. You will probably need to do some debugging here. You might find it helpful to visualize the exploded node graph. There are docs on how to do this at <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands>.

Devin

Hum… I am confused.

Note that you didn’t put any attribute on the return value, which should forbid any use / analysis.

Maybe we should look into what the reference counter count.

After some reflexion, I came to the conclusion that it should never count to more than 1 as it should not know that obj_copy returns it’s argument. And that is fine!

I think I understand what you’ve been describing. If you are using the evalCall currently in the retain count checker, it is modeling CFRetain(), which returns its argument. For your purposes you will not want to return the argument but instead conjure a symbol when the called function has the trusted annotation. This essentially tells the analyzer that it can’t assume anything about the return value. There is code in RetainCountChecker::evalCall to do this:

RetVal = SVB.conjureSymbolVal(nullptr, CE, LCtx, ResultTy, C.blockCount());

but it only kicks in when the first argument is unknown. You’ll want to extend it to conjure a symbol whenever the evaluated call has the trusted annotation and has a return type that isn’t void.

Hope this helps,

Devin

Yes, this is exactly what I needed. I added that functionality and it seems to be working now. So, just to make sure everything’s working as expected, I am currently running the static analyzer again on the ISL codebase after adding trusted implementation annotations to obj_free(), obj_cow() and obj_copy().

I would include obj_alloc_* too.

More exactly, any function interacting with reference counters:

$ rc -o -R isl_map::ref
isl_map_copy

isl_map_cow

isl_map_alloc_space

isl_map_free

evalCall-ing a function based on annotations worked for the most part except for the following scenario.

If the definition of the annotated function is after it is called somewhere in the code, the RetainCountChecker is unable to “see” the annotation on this function (I checked this by printing debug messages in evalCall to see if the RetainCountChecker is able to find the annotation).

Consider the following examples.

evalCall-ing a function based on annotations worked for the most part except for the following scenario.

If the definition of the annotated function is after it is called somewhere in the code, the RetainCountChecker is unable to “see” the annotation on this function (I checked this by printing debug messages in evalCall to see if the RetainCountChecker is able to find the annotation).

Consider the following examples.

**// Definition of annotated function after it is called.**
#define NULL 0
typedef struct
{
	int ref;
} isl_basic_map;

void free(void *);
__isl_give isl_basic_map *foo(__isl_take isl_basic_map *bmap);
isl_basic_map *isl_basic_map_free(__isl_take isl_basic_map *bmap);

__isl_give isl_basic_map *bar(__isl_take isl_basic_map *bmap) {
	bmap = foo(bmap);
	return isl_basic_map_free(bmap); **// Leak warning for 'bmap' raised here.**
}

__attribute__((annotate("rc_ownership_trusted_implementation"))) isl_basic_map *isl_basic_map_free(__isl_take isl_basic_map *bmap)
{
	if (!bmap)
		return NULL;

	if (--bmap->ref > 0)
		return NULL;

	free(bmap);
	return NULL;
}

I don’t know. One thing to note is that unlike most annotations, this one seems like it belongs on the definition of the function and not its interface, so you probably want to use getDefinition() to get the FuncDecl corresponding to the definition of the function body.

It would be helpful if you explained what the obj_alloc_* functions do. Do they malloc memory directly and return as __isl_give? Do they manipulate reference counts directly?

What does obj_dup() do?

Devin