Question about node collapse

Hi guys,

I'm working on a project using DSA to mark type-unsafe store operations. The example code is below:

#include <assert.h>
#include <stdlib.h>

int main() {
    int *a = (int*)malloc(sizeof(int));

    *a = 256;
    *((char *)a) = 1;
    assert(*a == 257); /* holds on little-endian targets */

    free(a);

    return 0;
}

Based on my understanding of DSA, *((char *)a) = 1 should cause the node that "a" points to to be collapsed, because I think there is a type inconsistency here: a is declared as int* and used as int* when *a = 256 happens, but is used as char* afterwards. However, it seems that no node is collapsed when the analysis is finished. I was wondering whether my understanding of DSA is correct. Suggestions from you guys are really appreciated.

Best,
Shaobo

Based on my understanding of DSA, *((char *)a) = 1 should cause the node
that "a" points to to be collapsed, because I think there is a type
inconsistency here: a is declared as int* and used as int* when *a = 256
happens, but is used as char* afterwards.

I'm not familiar with DSA (ds-aa?) so I can't say for sure what it
does, but char (both signed and unsigned) is special in the C and C++
aliasing rules. You're allowed to access any object via a pointer to
char (e.g. C99 6.5p7).
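
To make the aliasing point concrete, here is a minimal sketch of accessing an int through a character type, which C99 6.5p7 permits. The helper names set_byte and get_byte are made up for illustration; they are not part of DSA or any library.

```c
#include <stddef.h>

/* Hypothetical helper: overwrite byte `index` of the int that `p`
   points to. The store goes through an unsigned char lvalue, which
   C99 6.5p7 allows for any object, so the aliasing rules are not
   violated. The caller must keep index < sizeof(int). */
void set_byte(int *p, size_t index, unsigned char value) {
    unsigned char *bytes = (unsigned char *)p;
    bytes[index] = value;
}

/* Hypothetical helper: read byte `index` of the int's object
   representation, again through a character type. */
unsigned char get_byte(const int *p, size_t index) {
    const unsigned char *bytes = (const unsigned char *)p;
    return bytes[index];
}
```

Calling set_byte(&v, 0, 1) on an int v holding 256 leaves 257 in v on a little-endian target, as in the original example; on a big-endian target the resulting value differs, because byte 0 is then the most significant byte.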

Cheers.

Tim.

First, which DSA pass are you using?

Second, what does the LLVM IR for the program look like?

DSA can now track multiple types per offset (this feature was added after the DSA paper). In this case, it might track the fact that you're storing a 4-byte int at offset zero and a 1-byte int at offset zero. As the integer doesn't overlap a pointer field, DSA does not need to collapse the DSNode for the pointer. That's my guess as to why you're not seeing the node collapse.
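
The guess above can be illustrated with a toy model. This is only a sketch of the described behaviour, not DSA's implementation: struct toy_node, struct field, record_access, and the fixed-size array are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FIELDS 16

/* One recorded access: `size` bytes starting at byte `offset`. */
struct field {
    size_t offset;
    size_t size;
    bool is_pointer;
};

/* Toy stand-in for a node that remembers every (offset, size) pair. */
struct toy_node {
    struct field fields[MAX_FIELDS];
    size_t count;
    bool collapsed;
};

/* True when the half-open byte ranges of a and b share a byte. */
bool ranges_intersect(const struct field *a, const struct field *b) {
    return a->offset < b->offset + b->size &&
           b->offset < a->offset + a->size;
}

/* Record an access. In this toy model the node collapses only when a
   differing access shares bytes with a pointer field; overlapping
   non-pointer scalars are simply kept side by side. */
void record_access(struct toy_node *n, size_t offset, size_t size,
                   bool is_pointer) {
    struct field f = { offset, size, is_pointer };
    for (size_t i = 0; i < n->count; i++) {
        const struct field *g = &n->fields[i];
        if (g->offset == offset && g->size == size &&
            g->is_pointer == is_pointer)
            return; /* identical access already recorded */
        if ((g->is_pointer || is_pointer) && ranges_intersect(g, &f))
            n->collapsed = true;
    }
    if (n->count < MAX_FIELDS)
        n->fields[n->count++] = f;
    else
        n->collapsed = true; /* toy model gives up when full */
}
```

In this model, recording a 4-byte integer and then a 1-byte integer at offset zero leaves the node with two fields and no collapse, which matches the behaviour reported for the example program.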

Regards,

John Criswell

Hi John, all,

Thanks for your responses everybody.

This is actually helpful and I think I now better understand what is
going on here. Unless there is a pointer involved, DSA will not
collapse nodes. That makes sense...

What we would like to leverage DSA for is essentially detecting
type-unsafe memory accesses, such as the example where code writes a
byte into the 0th byte of an integer. Another example would be where a
short is written over an integer. Or an integer is written starting
from the 2nd byte of another integer. And so on...

Now, after I read your answer below, it seems that DSA could still
provide us with such conservative information - for each DS node, we
should be able to iterate over its offsets and determine whether some
of the above listed type-unsafe accesses are happening on the node. Am
I getting this about right?

If you have time to point us at some API functions to get us started
with the above idea, that would be great. If not, then don't worry,
hopefully we'll figure it out on our own.

Thanks!

Best,
-- Zvonimir

Now, after I read your answer below, it seems that DSA could still
provide us with such conservative information - for each DS node, we
should be able to iterate over its offsets and determine whether some
of the above listed type-unsafe accesses are happening on the node. Am
I getting this about right?

Correct.

If you have time to point us at some API functions to get us started
with the above idea, that would be great. If not, then don't worry,
hopefully we'll figure it out on our own.

There is a TypeSafety analysis pass that you can use. The lib/OptimizeChecks/SafeLoadStoreOpts.cpp code in SAFECode has an example of how to use it. Quickly looking over the code, it looks like it searches for overlapping fields in the DSNode; it also handles issues with the casting flags, Incomplete Flag, and Unknown Flag.
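
As a rough picture of that kind of check, the sketch below treats a node as type-safe only if none of the conservative flags is set and no field runs into the next one. Everything here is an assumption for illustration: struct node_summary and its members are invented, and the real pass works on DSNode objects and may treat the flags differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of one node; not the DSA API. */
struct node_summary {
    bool collapsed;
    bool incomplete;
    bool unknown;
    const size_t *offsets; /* field start offsets, strictly ascending */
    const size_t *sizes;   /* size in bytes of the field at each offset */
    size_t count;
};

/* With strictly ascending offsets it is enough to compare each field
   with its successor: a field that reaches a later field also reaches
   the one directly after it. */
bool fields_overlap(const struct node_summary *n) {
    for (size_t i = 0; i + 1 < n->count; i++) {
        if (n->offsets[i] + n->sizes[i] > n->offsets[i + 1])
            return true;
    }
    return false;
}

/* Conservative verdict (an assumption of this sketch): any flag that
   signals missing information makes the node unsafe. */
bool node_is_type_safe(const struct node_summary *n) {
    if (n->collapsed || n->incomplete || n->unknown)
        return false;
    return !fields_overlap(n);
}
```

Because offsets are strictly ascending in this model, it has exactly one size per offset, so it cannot represent two differently sized types recorded at the same offset.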

Regards,

John Criswell

Thanks John!

You would not believe this :), but literally just 5 minutes ago I saw
the TypeSafety pass and it seems to be exactly what we need. So we'll
try to leverage that...

Best,
-- Zvonimir

Hi John,

I have a follow up question about TypeSafety and would appreciate your help.

So we've been studying its implementation, and in particular the
function typeFieldsOverlap. As it turns out, the current
implementation of that function does not catch an overlap of fields if
they start at the same offset. For example, this would not be caught
as a field overlap since the int and char accesses are at the same
offset (even though they are of different types and type sizes):
int x = 5;
char *y = (char *)&x;
*y = 3;

On the other hand, this would be caught since the two accesses are not
at the same offset:
int x = 5;
char *y = (char *)&x;
*(y+1) = 3;

Do you maybe know if such behavior is a feature of TypeSafety or a bug?

In my mind even the first code snippet breaks type safety (at least
the way we define it), but the current implementation of TypeSafety is
not catching it.
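
To spell out the definition we have in mind, here is a minimal sketch of a pairwise check that flags both snippets. It is not the typeFieldsOverlap code; struct typed_field, same_field, and fields_conflict are names invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of one typed access to a memory object. */
struct typed_field {
    size_t offset;         /* byte offset of the access */
    size_t size;           /* size of the accessed type in bytes */
    const char *type_name; /* illustrative label, e.g. "int" */
};

/* Two records describe the same field only if offset, size, and type
   label all agree. */
bool same_field(const struct typed_field *a, const struct typed_field *b) {
    return a->offset == b->offset && a->size == b->size &&
           strcmp(a->type_name, b->type_name) == 0;
}

/* Distinct fields conflict when their byte ranges share at least one
   byte. This covers a shared start offset as well as partial overlap. */
bool fields_conflict(const struct typed_field *a,
                     const struct typed_field *b) {
    if (same_field(a, b))
        return false;
    return a->offset < b->offset + b->size &&
           b->offset < a->offset + a->size;
}
```

Under this definition an int at offset 0 conflicts with a char at offset 0 and with a char at offset 1, while two ints at offsets 0 and 4 do not conflict.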

Thanks,
-- Zvonimir

I think that it is a bug. Please feel free to file a bug report for it.

Are you able to fix the code yourself, or do you need assistance? I'm a bit swamped with end-of-the-semester work.

Regards,

John Criswell

We can probably do it ourselves and email you our patch. We'll keep
you posted. Thanks for your help!