[StaticAnalyzer] Changing the loop widening functionality & measurements

Hello everyone,

I am working on loop modeling improvements in the Static Analyzer and recently sent some patches on loop widening (D36690).
There is already a widening solution (the widen-loops config option) which invalidates every MemRegion in the process. My version only widens loops that meet specific requirements (e.g. they do not contain any pointers). It is hidden behind a new flag, ‘widen-loops-conservative’. (It is conservative in the sense that it only widens specific loops.)

In the general case, the difference between the two versions is:

| Option | Effect |
| - | - |
| widen-loops | Invalidate everything |
| widen-loops-conservative | Only invalidate modified variables |

But there is a huge difference when there are pointers (more precisely, when the analysis encounters a statement which can modify a variable but is not handled yet):

| Option | Effect |
| - | - |
| widen-loops | Invalidate everything |
| widen-loops-conservative | Invalidate nothing (don’t widen) |
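
To make the two tables concrete, here is a minimal toy model of the two strategies. It is only a sketch of the behaviour described above, not the analyzer's implementation: `Loop`, `widen_all`, `widen_conservative` and `UNKNOWN` are hypothetical names, and the "store" is a plain dictionary standing in for the values bound to MemRegions.

```python
from dataclasses import dataclass, field

UNKNOWN = "unknown"  # placeholder for a value the analysis no longer knows


@dataclass
class Loop:
    # Hypothetical summary of a loop body; not a Clang data structure.
    modified: set = field(default_factory=set)  # variables assigned in the body
    has_unhandled_stmt: bool = False            # e.g. a write through a pointer


def widen_all(store, loop):
    """Toy model of 'widen-loops': forget every tracked value."""
    return {name: UNKNOWN for name in store}


def widen_conservative(store, loop):
    """Toy model of 'widen-loops-conservative' as described in this post."""
    if loop.has_unhandled_stmt:
        # Give up on widening: nothing is invalidated.
        return dict(store)
    # Otherwise forget only the variables the loop body modifies.
    return {name: UNKNOWN if name in loop.modified else value
            for name, value in store.items()}


store = {"i": 0, "sum": 0, "limit": 100}
simple_loop = Loop(modified={"i", "sum"})
pointer_loop = Loop(modified={"i"}, has_unhandled_stmt=True)
```

With these assumptions, `widen_conservative(store, simple_loop)` keeps `limit` at 100 while `widen_all` loses it, and `widen_conservative(store, pointer_loop)` returns the store unchanged.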

I thought the new flag was necessary since these two options are very different. However, Sean (who created the current version of widening in D12358) suggested replacing the ‘widen-loops’ option with the new functionality. To help decide, I collected the relevant statistics (note: the ‘widen-loops’ config encountered some crashes on C++ projects, so the statistics are only collected on the files which passed):

**Widen** stands for the current implementation, **Widen v2** stands for the new (D36690) implementation.

Coverage measurements (% of reachable basic blocks):


| | curl | libpng | vim | bitcoin | ffmpeg | xerces |
| - | - | - | - | - | - | - |
| Normal | 58.05 | 55.07 | 51.12 | 69.37 | 49.78 | 74.86 |
| Widen | 75.8 | 56.69 | 51.54 | 72.06 | 65.63 | 76.06 |
| Widen v2 | 70.7 | 55.92 | 51.77 | 69.67 | 59.53 | 75.04 |
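
For readers who want the coverage change relative to the Normal run, the following sketch computes it directly from the table above. The numbers are copied from the table; the helper name `gain_over_normal` is made up for illustration.

```python
projects = ["curl", "libpng", "vim", "bitcoin", "ffmpeg", "xerces"]

# Percentages of reachable basic blocks, copied from the table above.
coverage = {
    "Normal":   [58.05, 55.07, 51.12, 69.37, 49.78, 74.86],
    "Widen":    [75.8, 56.69, 51.54, 72.06, 65.63, 76.06],
    "Widen v2": [70.7, 55.92, 51.77, 69.67, 59.53, 75.04],
}


def gain_over_normal(config):
    """Coverage gain in percentage points of `config` over the Normal run."""
    return {
        project: round(value - base, 2)
        for project, value, base in zip(projects, coverage[config], coverage["Normal"])
    }


if __name__ == "__main__":
    for config in ("Widen", "Widen v2"):
        print(config, gain_over_normal(config))
```

On these figures both configurations cover more than Normal on every project, with the largest gains on curl and ffmpeg.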

The number of bugs found:


| | curl | libpng | vim | bitcoin | ffmpeg | xerces |
| - | - | - | - | - | - | - |
| Normal | 35 | 32 | 81 | 9 | 375 | 52 |
| Widen | 35 | 36 | 94 | 12 | 456 | 57 |
| Widen v2 | 27 | 34 | 84 | 10 | 369 | 51 |
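
The same kind of comparison can be done for the report counts. This is a small sketch over the numbers in the table above; `delta_vs_normal` is a hypothetical helper name, and the script only counts reports, it says nothing about which ones are true positives.

```python
projects = ["curl", "libpng", "vim", "bitcoin", "ffmpeg", "xerces"]

# Number of reported bugs per project, copied from the table above.
bugs = {
    "Normal":   [35, 32, 81, 9, 375, 52],
    "Widen":    [35, 36, 94, 12, 456, 57],
    "Widen v2": [27, 34, 84, 10, 369, 51],
}


def delta_vs_normal(config):
    """Per-project change in report count of `config` relative to Normal."""
    return {
        project: count - base
        for project, count, base in zip(projects, bugs[config], bugs["Normal"])
    }


totals = {config: sum(counts) for config, counts in bugs.items()}

if __name__ == "__main__":
    for config in ("Widen", "Widen v2"):
        print(config, delta_vs_normal(config), "total:", totals[config])
```

Summed over the six projects, Widen reports 106 more findings than Normal, while Widen v2 reports 9 fewer.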

The new findings mostly result from the invalidation process and not from the coverage increase. So in general they are false positives which were found because we invalidated the information (on MemRegions) we had.

Although there is a small observable coverage loss compared to the current implementation, it is offset by the number of false positives that are no longer reported.

In conclusion, it would be beneficial to replace the current implementation of the ‘widen-loops’ option with the new one.

What do you think?

Cheers,

Peter

Hi Péter!

I do not think the choice of flag matters much at this point, as both features are very experimental. (If Sean is fine with you reusing the flag, that likely means no one else is using it.) The next step for this project is to design and implement primitives for loop modeling in the CFG. That will give us solid ground for this feature. Could you send out a design proposal of how loops will be modeled in the CFG to support widening?

Would be happy to help you if you have questions!
Anna