Hello everybody,
I’ve run into some strange behavior with memory sanitizer that I can’t explain, and I hope somebody with more knowledge of the implementation can help me out or at least point me in the right direction.
For background, I’m using memory sanitizer to check Julia (julialang.org), which uses (or at least will, once I track down a few bugs) MCJIT for code compilation. So far I have rebuilt the runtime and all dependencies (including LLVM, libcxx, etc.) with memory sanitizer enabled and added the instrumentation pass in the appropriate place in the Julia code generator.
I’m now going through the usual bootstrap, which basically loads the standard library and compiles it, does inference, etc. This works fine for several hours. (The bootstrap is usually much faster, by which I mean several hundred times faster. I suspect MCJIT has to process far more relocations and code and is inefficient at it, but I can’t prove that.) That’s not the issue, however. Eventually, I get
==17150== WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x7f417cea3189 in bitvector_any1 /home/kfischer/julia-san/src/support/bitvector.c:177
[ snip ]
Uninitialized value was created by a heap allocation
#0 0x7f41815de543 in __interceptor_malloc /home/kfischer/julia-san/deps/llvm-svn/projects/compiler-rt/lib/msan/msan_interceptors.cc:854
#1 0x7f417cc7d7f1 in alloc_big /home/kfischer/julia-san/src/gc.c:355
[snip]
Now, by going through it in the debugger, I see
(gdb) f 3
#3 0x00007f417cea318a in bitvector_any1 (b=0x60c000607240, b@entry=, offs=0, offs@entry=, nbits=256, nbits@entry=)
at bitvector.c:177
177 if ((b[0] & mask) != 0) return 1;
(gdb) p __msan_print_shadow(&b,8)
ff ff ff ff ff ff ff ff
o: 3f0010a6 o: 80007666
which seems to indicate that the local variable b has uninitialized data. I’m having a hard time believing that though, since if I look at the functions before it, the place where it’s coming from is initialized:
#4 0x00007f41755208a8 in julia_isempty248 ()
#5 0x00007f417c163e3d in jl_apply (f=0x606000984d60, f@entry=, args=0x7fff9132da20, args@entry=, nargs=1,
nargs@entry=) at ./julia.h:1043
(here’s the code of that julia function for reference)
isempty(s::IntSet) =
!s.fill1s && ccall(:bitvector_any1, Uint32, (Ptr{Uint32}, Uint64, Uint64), s.bits, 0, s.limit)==0
Looking at where that value is coming from:
(gdb) f 5
#5 0x00007f417c163e3d in jl_apply (f=0x606000984d60, f@entry=, args=0x7fff9132da20, args@entry=, nargs=1,
nargs@entry=) at ./julia.h:1043
1043 return f->fptr((jl_value_t*)f, args, nargs);
(gdb) p ((jl_array_t*)((void**)args[0])[1])->data
$43 = (void *) 0x60c000607240
(gdb) p __msan_print_shadow(((jl_array_t*)((void**)args[0])[1]),0x30)
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496
There are no uninitialized values to be seen anywhere, and the value of b isn’t touched before that line, so I’m a little stumped.
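For anyone who wants to reproduce the two checks outside of gdb, here is a minimal C sketch of the same distinction: the shadow of the pointer variable itself (what `&b` inspects above) versus the shadow of the buffer it points to. `first_poisoned`, `check_arg` and `struct shadow_report` are hypothetical names made up for this illustration, not part of Julia or MSan. The only real MSan call used is `__msan_test_shadow` from `sanitizer/msan_interface.h`, and the sketch assumes a clang build with `-fsanitize=memory`; without it, the helper simply reports "nothing poisoned".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only pull in the MSan interface when the compiler is actually
   instrumenting this translation unit. */
#if defined(__has_feature)
#  if __has_feature(memory_sanitizer)
#    include <sanitizer/msan_interface.h>
#    define HAVE_MSAN 1
#  endif
#endif

/* Hypothetical helper: offset of the first uninitialized byte in
   [p, p + size), or -1 if every byte is initialized.
   Without MSan this always returns -1. */
static long first_poisoned(const volatile void *p, size_t size) {
#ifdef HAVE_MSAN
    return (long)__msan_test_shadow(p, size);
#else
    (void)p;
    (void)size;
    return -1;
#endif
}

/* Hypothetical result type: one entry per thing being checked. */
struct shadow_report {
    long pointer_slot; /* shadow of the variable b itself (like &b in gdb) */
    long pointee;      /* shadow of the memory b points to */
};

/* Hypothetical stand-in for a callee such as bitvector_any1: it checks
   its own argument before touching the data. */
static struct shadow_report check_arg(const uint32_t *b, size_t nbytes) {
    struct shadow_report r;
    assert(b != NULL);
    r.pointer_slot = first_poisoned(&b, sizeof b);
    r.pointee = first_poisoned(b, nbytes);
    return r;
}
```

If `pointer_slot` is not -1 while `pointee` is -1, the data is fine and it is the argument's own shadow that is marked uninitialized, which matches what the gdb session above seems to show.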
One note I should make is that I did have to implement TLS support myself in MCJIT for this to work (I’ll upstream the patch soon), so I may have made a mistake, but I haven’t found anything wrong yet. If nothing looks unusual, I’d also appreciate pointers on what to look for in the TLS variables.
Thank you for your help,
Keno