Hello,
I am trying to run a program under DFSan, but I do not control the compilation of all of its source files, so I can only instrument a subset of them. As a result, at link time some of the object files have DFSan support (i.e., use the “instrumented” ABI) and some don’t (i.e., use the “native” ABI). Although my entire program isn’t instrumented, I am hoping I can still use DFSan to analyze the instrumented part.
However, my program makes function calls across the native/instrumented boundary. Calls from an instrumented function to a native function are fine, since I can simply add the native function to DFSan’s ABI list, but calls from a native function to an instrumented function cause problems. Specifically, at link time I get undefined reference errors because the native function tries to call the instrumented function by its original name, i.e., without the `dfs$` prefix. For example, the native function tries to call `foo`, but `foo` has been renamed `dfs$foo`. I can work around these linker errors by adding `foo` to the ABI list, but then calls to `foo` from an instrumented function no longer automatically propagate taint into and out of `foo` as they should; they follow the ABI list’s rules instead.
I can demonstrate what I mean with an example. Suppose my program has two object files: `instrumented.o` (which is instrumented by DFSan) and `native.o` (which is not instrumented by DFSan). `main` (from `instrumented.o`) calls `add3` (from `native.o`) to compute the sum of three numbers, (x+y+z). To compute this sum, `add3` makes two calls to `add2` (from `instrumented.o`), to compute ((x+y)+z). `main` then performs two tests: (i) it checks whether a call to `add2` maintains accurate label information, and (ii) it checks whether a call to `add3` maintains accurate label information (according to the ABI list’s rule). The source files are below.
Additionally, I have the following ABI list:
fun:add3=uninstrumented
fun:add3=functional
Together, this gives the following linker error:
native.o: In function `add3':
native.c:(.text+0x18): undefined reference to `add2'
This is because `add3` is attempting to call `add2`, but `add2` has been replaced by `dfs$add2`.
I can work around this linker error by adding `add2` to the ABI list:
fun:add3=uninstrumented
fun:add3=functional
fun:add2=uninstrumented
fun:add2=discard
With this change the program links successfully, but DFSan no longer tracks taint into or out of `add2`. Running the program gives the following output:
INST-->INST label test...
x label (1) == x_test label (0)? FALSE
INST-->NATIVE label test...
sum label has x label? TRUE
In the output above, `x_test` is the result of `add2(x,0)`, so it should have the same label as `x`; however, because `add2` is in the ABI list as `discard`, its return value is unlabelled, so `x`'s label does not match `x_test`'s label. `add3` preserves taint correctly because it is listed as `functional` in the ABI list.
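The difference between the two categories can be illustrated with a toy model in plain C. This is only a sketch and does not call into the DFSan runtime: `labeled_int`, `make_labeled`, `model_add2_discard`, `model_add3_functional`, and `has_label` are hypothetical names, and labels are assumed to be simple bitmasks.

```c
/* Toy model only: a value paired with a bitmask label.
   This is NOT the DFSan runtime; it just mimics how the two ABI-list
   categories used above treat the label of a return value. */
typedef struct {
    int value;
    unsigned label;
} labeled_int;

/* Convenience constructor for a labelled value. */
labeled_int make_labeled(int value, unsigned label) {
    labeled_int r = { value, label };
    return r;
}

/* Models add2 listed as "discard": the result is always unlabelled,
   whatever labels the arguments carry. */
labeled_int model_add2_discard(labeled_int a, labeled_int b) {
    return make_labeled(a.value + b.value, 0u);
}

/* Models add3 listed as "functional": the result label is the union
   of the argument labels. */
labeled_int model_add3_functional(labeled_int a, labeled_int b, labeled_int c) {
    return make_labeled(a.value + b.value + c.value,
                        a.label | b.label | c.label);
}

/* Nonzero when every bit of `elem` is present in `label`. */
int has_label(unsigned label, unsigned elem) {
    return (label & elem) == elem;
}
```

In this model, if `x` carries label 1, then `model_add2_discard(x, zero)` comes back with label 0 (the FALSE line in the output), while `has_label` on the result of `model_add3_functional` still finds label 1 (the TRUE line).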
Is there a way to maintain accurate label information for instrumented->instrumented function calls while also permitting native->instrumented function calls to the same callee? Maybe I’m missing something obvious, but I only see the following workarounds here:
- Add each instrumented function to the ABI list correctly. In my example, this would mean setting `add2` as a `functional` or `custom` function. However, this does not scale well for large applications, and defeats the purpose of DFSan’s automatic taint propagation.
- Go through the instrumented object files and replace, e.g., `dfs$foo` with `foo`. However, this would probably produce some sort of undefined behavior, as mentioned in the DFSan design document.
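For the first workaround, a `custom` entry (`fun:add2=custom`) would need a hand-written wrapper per function, roughly like the sketch below. It follows the `__dfsw_` wrapper convention described in the DFSan documentation (one extra label parameter per argument, plus a pointer for the return label). Several things here are assumptions: `dfsan_label` is declared locally as a stand-in so the sketch compiles on its own, whereas a real build would take it from `<sanitizer/dfsan_interface.h>`; labels are assumed to be bitmasks, so the union is a bitwise OR; and the body of `add2` is a guess at the example's function.

```c
#include <stdint.h>

/* Stand-in so the sketch compiles without the sanitizer headers.
   In a real build this type comes from <sanitizer/dfsan_interface.h>. */
typedef uint8_t dfsan_label;

/* Assumed body of the example's add2, which would live in instrumented.o. */
int add2(int a, int b) {
    return a + b;
}

/* Wrapper that instrumented callers would be redirected to when the ABI
   list contains fun:add2=custom. The labels of both arguments arrive as
   extra parameters, and the return label is written through ret_label. */
int __dfsw_add2(int a, int b,
                dfsan_label a_label, dfsan_label b_label,
                dfsan_label *ret_label) {
    /* Assumption: bitmask labels, so the union is a bitwise OR. */
    *ret_label = (dfsan_label)(a_label | b_label);
    return add2(a, b);
}
```

Writing one such wrapper for every instrumented function that native code may call is what makes this approach scale poorly.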
This seems like it would be a common use case for DFSan: one where there are circular dependencies between native and instrumented compilation units. I would appreciate any feedback.
Thanks,
Brian