I am a Master’s student at the ECE department of the University of Florida, USA. For my research project, supervised by Prof. Mark Tehranipoor and Prof. Farimah Farahmandi, I need to use Clang LLVM DataflowSanitizer library in KLEE. However, I have faced some difficulties (explained below) while interfacing this library with KLEE and I am seeking your help to solve it.
For our research purpose, we are using LLVM compiler and LLVM DataFlowSanitizer library. We instrumented a C code using LLVM DataFlowSanitizer library (from dfsan_interface.h) along with the insertion of some KLEE assertion (from klee.h) and converted it to LLVM Bit-code. Then we provided this LLVM Bit-code to KLEE (software verification tool) to run with the archive library of DataFlowSanitizer tool, libclang_rt.dfsan-x86_64.a. But, KLEE showed the following error.
However, to figure out the problem, I have built a sample custom archive library from the object files generated by Clang/LLVM (using cmd llvm-ar) and used this library in a C code along with KLEE. In this case, KLEE gave the same error as above. Then I generated the same sample custom archive library from the LLVM bit-code files and this time KLEE executed successfully. Hence, I am afraid whether KLEE requires the bit-code version of libclang_rt.dfsan-x86_64.a or there is any other solution.
In this circumstance, it would be great if you could direct me to resolve this issue. Please let me know if you need any other information.
I’m not entirely familiar with KLEE nor with the dataflow sanitizer, but I think there are two issues with your approach.
First, I would expect KLEE to require for the whole program to be provided together with its bitcode (modulo known library functions) – you can see how they achieve that in the tutorials section on their website (e.g., for coreutils ). From the high-level point of view, KLEE uses a technique called dynamic symbolic execution (DSE) that when interpreting LLVM bitcode maintains symbolic execution state alongside the (usual) concrete state , and while it’s possible to use it on binaries, KLEE works on the level of LLVM IR. They have a custom libc implementation that KLEE can reason about (the files with .bca extension on your screenshot). I think that unless you link in whatever binaries you use (such that you end up with a single bitcode file), use on of the provided libraries like libc, KLEE won’t be able to come up with a symbolic state for library code.
The second issue I see is that you want to use the dataflow sanitizer together with KLEE. Typically, the mainstream sanitizers provide cleaver encoding that makes checking some property of interest fast for runtime execution. For example, a sanitizer can maintain a mapping from normal program memory to shadow memory, and instrumentation that maintains this mapping and asserts some invariants over both the memory regions. This encoding/mapping use fast (runtime) operations like shifts, bit operations, hashing, that can be very difficult to reason about for an underlying constraint solver in a DSE tool. Software verification tools usually use custom encodings/instrumentation that is easier to reason about and can use tricks that don’t work for runtime verification (e.g., use of nondeterminism in ). Things you want to avoid are complicated math function (hashing), large arrays and dependencies across them, etc.
My suggestion would be to try to first manually instrument your C/C++ source with an encoding friendly for DSE, and once you are happy with it create a custom transformation pass over LLVM IR that emits the instrumentation.
Thank you very much for your detail explanations and valuable suggestions. I will follow your suggestions.
I completely agree with you that KLEE requires libraries with the .bca extension. In fact, I have already tried to generate the archived bit-code version of libclang_rt.dfsan-x86_64.a using WLLVM tool. But it could not generate the .bca file successfully (please, see the attached screenshot).
Would you please let me know how I can get the archived bit-code version of the LLVM DataFlowSanitizer library? Is there any repository that I may not be aware of? If there is none, could you please guide me on how I can generate it? I really appreciate any suggestion you may have. Thank you very much.