Special case list files; a bug and a slowness issue

Hi,

I need to be able to use a special case list file containing thousands
of entries (namely, a list of libc symbols, for use when running
DFSan with an uninstrumented libc). Initially I built the symbol
list like this:

fun:sym1=uninstrumented
fun:sym2=uninstrumented
fun:sym3=uninstrumented
...
fun:sym6000=uninstrumented

What I found was that, contrary to what the documentation says [1,2],
the symbol names are matched as substrings. The root cause is that
the regular expressions built by the SpecialCaseList class do not
contain anchors. The attached unit test demonstrates the problem.
If I modify my symbol list to contain anchors:

fun:^sym1$=uninstrumented
fun:^sym2$=uninstrumented
fun:^sym3$=uninstrumented
...
fun:^sym6000$=uninstrumented

the behaviour is as expected, but compiler run time is slow (on the
order of seconds), presumably because our regex library doesn't cope
with anchors very efficiently.
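
To make the difference between the two lists concrete, here is a minimal
sketch. It uses std::regex as a stand-in for LLVM's own regex class (an
assumption made so the example is self-contained), and the helper names
matchesUnanchored and checkExamples are hypothetical, not part of
SpecialCaseList.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns true if `pattern` matches anywhere inside
// `name`. An unanchored search like this gives substring semantics.
bool matchesUnanchored(const std::string &pattern, const std::string &name) {
  return std::regex_search(name, std::regex(pattern));
}

// Usage: the unanchored entry also hits longer symbol names, while the
// explicitly anchored entry only hits the exact symbol.
void checkExamples() {
  assert(matchesUnanchored("sym1", "sym10"));     // substring match
  assert(!matchesUnanchored("^sym1$", "sym10"));  // anchors reject it
  assert(matchesUnanchored("^sym1$", "sym1"));    // exact name still matches
}
```

With 6000 entries, every unanchored symbol name can match any longer name
that happens to contain it, which is the behaviour described above.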

I intend to resolve the substring bug and the slow run time issue
by using a StringSet for symbol patterns which do not contain regex
metacharacters. There would still be a regex for any other patterns,
which would have anchors added automatically.

Thoughts?

Thanks,

scl.patch (842 Bytes)

Hi Peter!

No, I need the (documented) whole string semantics in dfsan.

Thanks,