Basic Coverage

Hi

Given a binary and a corresponding input, I want to know which IR-level basic blocks (BBs) are covered. I need the detailed information, that is, the set of all covered BBs rather than just a count.

Are there any tools that support this requirement? If not, I think instrumentation might help, but I do not know much about it. Any suggestions or ideas are welcome. Thank you so much.

Regards
Muhui

This is a classic problem: how to obtain complete coverage of all potential execution paths. The path taken depends not only on the input but also on many external factors, such as the OS environment and external resources like the network, storage, and memory. Complete coverage is normally computationally infeasible, but if you let a fuzzing library such as libFuzzer do the work of finding random inputs and environments, then a “most probabilistically likely coverage” can be constructed.

Hi Peter

I think you may have misunderstood my question. I have a binary and a single input, and I run the binary with that input. Can I find out which IR basic blocks are covered? I want to know not only the number of covered BBs but also their labels/IDs. Many thanks.

Regards
Muhui

Peter Teoh <htmldeveloper@gmail.com> wrote on Monday, September 3, 2018 at 6:06 PM:

Hi Peter

Yeah, I know that BB IDs are virtual addresses. One method I can think of is to use the debugging information, so that I can distinguish the different BBs and also map each of them back to its IR BB.

Maybe I need to do instrumentation? Or could I just print out some debugging information so that I know which BBs are visited? Do you think that would work? I just want to distinguish the BBs. Maybe I can tailor some existing tools, but I don’t know where to start. Many thanks.

Regards
Muhui

Peter Teoh <htmldeveloper@gmail.com> wrote on Monday, September 3, 2018 at 10:15 PM:

Yes, instrumentation is needed. For example, you can use clang to compile and instrument the program for basic-block tracing (-fsanitize-coverage=trace-bb), or for function-level or edge-level tracing (-fsanitize-coverage=[func,edge]). See this for more details:

https://bcain-llvm.readthedocs.io/projects/clang/en/release_39/SanitizerCoverage/
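
To make the instrumentation idea concrete, here is a minimal sketch of a coverage runtime that prints each instrumented point the first time it is reached. It rests on several assumptions that are not from the thread: it uses the trace-pc-guard callbacks from newer SanitizerCoverage documentation rather than the trace-bb mode mentioned above (which the linked 3.9 documentation marks as experimental), the file name cov_runtime.c and the helper covered_count() are hypothetical, and the IDs it prints are numbers assigned by this sketch, not LLVM IR block labels. Mapping a printed PC back to an IR basic block would still need debug information, for example via a symbolizer.

```c
/* cov_runtime.c (hypothetical name): a sketch of a SanitizerCoverage
 * runtime. Compile this file WITHOUT -fsanitize-coverage, otherwise the
 * callbacks themselves would be instrumented. */
#include <stdint.h>
#include <stdio.h>

static uint32_t total_guards = 0;   /* guards numbered so far, all modules */
static uint32_t covered_guards = 0; /* distinct guards hit during this run */

/* Called by the instrumented code once per module at startup.
 * [start, stop) is that module's array of guard variables. */
void __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop) {
    if (start == stop || *start != 0)
        return; /* empty, or already initialised */
    for (uint32_t *g = start; g < stop; g++)
        *g = ++total_guards; /* IDs 1..N are this sketch's own numbering */
}

/* Called at every instrumented point (block or edge, depending on flags). */
void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
    if (*guard == 0)
        return; /* already reported */
    void *pc = __builtin_return_address(0); /* address inside the covered code */
    fprintf(stderr, "covered guard %u at pc %p\n", (unsigned)*guard, pc);
    covered_guards++;
    *guard = 0; /* report each point only once */
}

/* Hypothetical helper: number of distinct guards reached so far. */
uint32_t covered_count(void) {
    return covered_guards;
}
```

Assuming a target source file named target.c (also hypothetical), something like "clang -g -fsanitize-coverage=bb,trace-pc-guard -c target.c", then "clang -c cov_runtime.c", then "clang target.o cov_runtime.o -o target" should produce a binary that prints one line per covered point on stderr when run with the input of interest. Building with -g keeps the debug information needed to relate the printed PCs to source locations.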