Docs for leak checker (and other sanitizers)?

Greg_Stark · November 7, 2015, 7:55pm

I'm having trouble finding any documentation of the sanitizers,
especially the leak checker -- apparently because of the move from
code.google.com to github. Every doc has a link for "more information"
which goes to code.google.com which just redirects to github and
points to a placeholder doc that says lots of links are broken. I
haven't found the docs using any google-fu or greps over the source
tree so I'm hoping I'm just missing something.

In particular I'm looking for a few specific pieces of API:

1) I have a server that runs in a loop but I want to use the leak
checker to check that each trip through the loop (or more likely every
1,000 trips or so) hasn't leaked memory. There are obviously
allocations prior to the main loop that are "leaked" and for that
matter I want to skip the first few thousand trips or else various
data structures that grow gradually will show up as leaks. So I need
to be able to control when the leak checker actually starts looking at
allocations and which it checks. Ideally I would want to print the
reports and only abort if it's growing consistently or by more than
some amount but that may not be necessary if I can track down where
the growth is currently.

2) I want to use the poison/unpoison macros and function calls for
asan and msan. However I haven't been able to find any documentation
on this api. In particular msan and asan seem to have different
thoughts on whether these functions should be called directly or
through macros so I'm not sure which is the stable api. Also I have no
idea when to call __msan_allocated_memory() and company but I suspect
it's relevant for giving msan enough information about a pool
allocator to find leaks...

Kostya_Serebryany · November 9, 2015, 7:20pm

I'm having trouble finding any documentation of the sanitizers,
especially the leak checker -- apparently because of the move from
code.google.com to github.

Yea. We hope to clean these up in Dec.
Help is welcome!

Every doc has a link for "more information"
which goes to code.google.com which just redirects to github and
points to a placeholder doc that says lots of links are broken. I
haven't found the docs using any google-fu or greps over the source
tree so I'm hoping I'm just missing something.

Most likely, you need
AddressSanitizerLeakSanitizer · google/sanitizers Wiki · GitHub

In particular I'm looking for a few specific pieces of API:

1) I have a server that runs in a loop but I want to use the leak
checker to check that each trip through the loop (or more likely every
1,000 trips or so) hasn't leaked memory. There are obviously
allocations prior to the main loop that are "leaked" and for that
matter I want to skip the first few thousand trips or else various
data structures that grow gradually will show up as leaks. So I need
to be able to control when the leak checker actually starts looking at
allocations and which it checks. Ideally I would want to print the
reports and only abort if it's growing consistently or by more than
some amount but that may not be necessary if I can track down where
the growth is currently.

I don't think lsan supports this mode directly,
but why do you think that the init-time allocations are going to be
"leaked"?
If there is some object still pointing to those allocations lsan will not
complain.

2) I want to use the poison/unpoison macros and function calls for
asan and msan. However I haven't been able to find any documentation
on this api. In particular msan and asan seem to have different
thoughts on whether these functions should be called directly or
through macros so I'm not sure which is the stable api.

The best documentation you can find is the headers, [am]san_interface.h.
The macros are provided for convenience; you don't have to use them.
The API is pretty stable.
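
As an illustration of the poison/unpoison calls declared in asan_interface.h, here is a minimal sketch of a bump allocator that keeps its unused memory poisoned. The pool type, the pool_* functions, and the POOL_* wrapper macros are hypothetical names made up for this example (this is not the Postgres allocator); only __asan_poison_memory_region() and __asan_unpoison_memory_region() come from the header. The wrappers compile to nothing when the build does not use -fsanitize=address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Detect an AddressSanitizer build: GCC defines __SANITIZE_ADDRESS__,
 * Clang answers __has_feature(address_sanitizer). */
#if defined(__SANITIZE_ADDRESS__)
#  define POOL_HAVE_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define POOL_HAVE_ASAN 1
#  endif
#endif

#ifdef POOL_HAVE_ASAN
#  include <sanitizer/asan_interface.h>
#  define POOL_POISON(p, n)   __asan_poison_memory_region((p), (n))
#  define POOL_UNPOISON(p, n) __asan_unpoison_memory_region((p), (n))
#else
/* Without asan the calls compile away. */
#  define POOL_POISON(p, n)   ((void)(p), (void)(n))
#  define POOL_UNPOISON(p, n) ((void)(p), (void)(n))
#endif

/* Hypothetical bump allocator used only for this illustration. */
typedef struct {
    unsigned char *base;  /* backing block, obtained once from malloc */
    size_t size;          /* total bytes in the backing block */
    size_t used;          /* bytes handed out so far */
} pool_t;

/* Allocate the backing block and poison all of it: nothing is live yet. */
static int pool_init(pool_t *pool, size_t size) {
    pool->base = malloc(size);
    if (pool->base == NULL) return -1;
    pool->size = size;
    pool->used = 0;
    POOL_POISON(pool->base, pool->size);
    return 0;
}

/* Hand out a chunk and unpoison only the bytes the caller asked for,
 * so an overrun into the alignment padding is still reported. */
static void *pool_alloc(pool_t *pool, size_t n) {
    assert(pool->used <= pool->size);
    if (n > SIZE_MAX - 7u) return NULL;
    size_t rounded = (n + 7u) & ~(size_t)7u;  /* keep chunks 8-byte aligned */
    if (rounded > pool->size - pool->used) return NULL;
    void *p = pool->base + pool->used;
    pool->used += rounded;
    POOL_UNPOISON(p, n);
    return p;
}

/* Recycle the pool: every old chunk becomes poisoned (stale) again. */
static void pool_reset(pool_t *pool) {
    pool->used = 0;
    POOL_POISON(pool->base, pool->size);
}

/* Unpoison before returning the block to the system allocator. */
static void pool_destroy(pool_t *pool) {
    POOL_UNPOISON(pool->base, pool->size);
    free(pool->base);
    pool->base = NULL;
    pool->size = 0;
    pool->used = 0;
}
```

In an asan build, touching a chunk after pool_reset() should then produce a use-after-poison report; in a normal build the allocator behaves the same, just without the checks.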

Also I have no
idea when to call __msan_allocated_memory() and company but I suspect
it's relevant for giving msan enough information about a pool
allocator to find leaks...

msan and lsan do not work together yet, so there is no reason to call the msan
API if you want to increase lsan sensitivity.
asan does work with lsan, and marking some stale memory as poisoned in asan
mode gives some advantages:
E.g. see

(at the bottom)

hth,

--kcc

Greg_Stark · November 10, 2015, 5:43pm

Most likely, you need
AddressSanitizerLeakSanitizer · google/sanitizers Wiki · GitHub

Thanks!

I don't think lsan supports this mode directly,
but why do you think that the init-time allocations are going to be
"leaked"?
If there is some object still pointing to those allocations lsan will not
complain.

Hm. That's a good point. I'm not exactly sure how the ripcord
allocator being used in Postgres combined with the asan
poison/unpoison calls would actually appear to lsan. But even aside
from that I believe there are allocations that are made early in
server initialization that are never freed and it's considered not
worth the book-keeping to free them due to the one-off nature of the
initialization.

Greg_Stark · November 11, 2015, 11:00pm

Most likely, you need
AddressSanitizerLeakSanitizer · google/sanitizers Wiki · GitHub

Thanks!

This was really helpful.

I have used __asan_poison_memory_region(), but the call stack reported for
the allocation site is based on when malloc was called. This is a
random earlier allocation that allocated the pool that the chunk was
returned from. Is there an API to specify that the block was just allocated
and to associate it with a new call stack? It looks like msan does
have this: __msan_allocated_memory().

I'm also struggling with an asan report I can't explain but I think
I'll write that up as a separate e-mail.

I don't think lsan supports this mode directly,
but why do you think that the init-time allocations are going to be
"leaked"?
If there is some object still pointing to those allocations lsan will not
complain.

Hm. That's a good point. I'm not exactly sure how the ripcord
allocator being used in Postgres combined with the asan
poison/unpoison calls would actually appear to lsan. But even aside
from that I believe there are allocations that are made early in
server initialization that are never freed and it's considered not
worth the book-keeping to free them due to the one-off nature of the
initialization.

Yeah. Further investigation shows that a lot of command-line parsing
logic in both the main server and various other tools leaks a lot of
small strings here and there. There are also a number of other
assorted small leaks during setup such as the SUS environ global which
can't be instrumented and so on.

The short story though is that in the fuzzer I just want to check that
I'm not leaking memory during the fuzzer loop. I don't want to embark
on the quest of eliminating memory leaks everywhere, especially during
system startup and utilities. That's probably the same workflow if I
wanted to use it to find leaks in the server processing loop too. I
would be pretty happy if there was a way to just clear the list of
allocations being tracked entirely and then start from that clean
slate.

Kostya_Serebryany · November 11, 2015, 11:56pm

>> Most likely, you need
>> AddressSanitizerLeakSanitizer · google/sanitizers Wiki · GitHub
>
> Thanks!

This was really helpful.

I have used __asan_poison_memory_region(), but the call stack reported for
the allocation site is based on when malloc was called. This is a
random earlier allocation that allocated the pool that the chunk was
returned from. Is there an API to specify the block was just allocated
and to associate it with a new call stack?

No, we never implemented that.
Feel free to file a bug (I don't think we have one), but no promises here.

It looks like msan does
have this, __msan_allocated_memory()

Correct.
msan has a significantly different metadata layout, so it was easier to
implement there (and more important).
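
For the msan side, a minimal sketch of where __msan_allocated_memory() could be called in a custom allocator. The arena_* names and the ARENA_MARK_ALLOCATED wrapper are hypothetical and exist only for this example; __msan_allocated_memory() is the function declared in msan_interface.h, which marks the range as uninitialized with its origin at the call site. The wrapper is a no-op unless the build uses -fsanitize=memory.

```c
#include <assert.h>
#include <stddef.h>

/* MemorySanitizer is Clang-only, so __has_feature is the way to detect it. */
#if defined(__has_feature)
#  if __has_feature(memory_sanitizer)
#    define ARENA_HAVE_MSAN 1
#  endif
#endif

#ifdef ARENA_HAVE_MSAN
#  include <sanitizer/msan_interface.h>
#  define ARENA_MARK_ALLOCATED(p, n) __msan_allocated_memory((p), (n))
#else
#  define ARENA_MARK_ALLOCATED(p, n) ((void)(p), (void)(n))
#endif

/* Hypothetical fixed-size arena used only for this illustration. */
enum { ARENA_SIZE = 256, ARENA_ALIGN = 16 };
static _Alignas(16) unsigned char arena_storage[ARENA_SIZE];
static size_t arena_used;

/* Hand out a chunk and tell msan it is freshly allocated, so reads of
 * stale bytes left by a previous user of the arena count as uninitialized. */
static void *arena_alloc(size_t n) {
    assert(arena_used <= ARENA_SIZE);
    if (n > ARENA_SIZE) return NULL;
    size_t rounded = (n + (ARENA_ALIGN - 1)) & ~(size_t)(ARENA_ALIGN - 1);
    if (rounded > ARENA_SIZE - arena_used) return NULL;
    void *p = arena_storage + arena_used;
    arena_used += rounded;
    ARENA_MARK_ALLOCATED(p, n);
    return p;
}

/* Recycle the whole arena; the next arena_alloc() re-marks each chunk. */
static void arena_reset(void) {
    arena_used = 0;
}
```

The call sits in the allocation path rather than the reset path, so each chunk gets the stack of the code that actually requested it.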

I'm also struggling with an asan report I can't explain but I think
I'll write that up as a separate e-mail.

>> I don't think lsan supports this mode directly,
>> but why do you think that the init-time allocations are going to be
>> "leaked"?
>> If there is some object still pointing to those allocations lsan will not
>> complain.
>
> Hm. That's a good point. I'm not exactly sure how the ripcord
> allocator being used in Postgres combined with the asan
> poison/unpoison calls would actually appear to lsan. But even aside
> from that I believe there are allocations that are made early in
> server initialization that are never freed and it's considered not
> worth the book-keeping to free them due to the one-off nature of the
> initialization.

Yeah. Further investigation shows that a lot of command-line parsing
logic in both the main server and various other tools leaks a lot of
small strings here and there. There are also a number of other
assorted small leaks during setup such as the SUS environ global which
can't be instrumented and so on.

The short story though is that in the fuzzer I just want to check that
I'm not leaking memory during the fuzzer loop. I don't want to embark
on the quest of eliminating memory leaks everywhere, especially during
system startup and utilities. That's probably the same workflow if I
wanted to use it to find leaks in the server processing loop too. I
would be pretty happy if there was a way to just clear the list of
allocations being tracked entirely and then start from that clean
slate.

Maybe you can just use suppressions for that?
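
One way to ship suppressions with the binary is the optional __lsan_default_suppressions() hook from lsan_interface.h, sketched below. The two function names in the patterns are hypothetical stand-ins for known startup leaks; each line has the form leak:<pattern>. The same lines can instead go in a file passed through LSAN_OPTIONS=suppressions=<path>.

```c
/* Optional user-provided hook: LeakSanitizer calls it, if present, to get
 * built-in suppressions. Each line is "leak:<pattern>", and a leak is
 * suppressed when the pattern matches a frame in its allocation stack.
 * The function names below are hypothetical examples, not real symbols. */
const char *__lsan_default_suppressions(void) {
    return "leak:parse_command_line\n"
           "leak:setup_environment\n";
}
```

In a build without the sanitizer runtime this is just an ordinary unused function, so it should be harmless to leave in.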

You can also try to call __lsan_disable() very early at start up and then
__lsan_enable() before fuzzing.
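
A minimal sketch of that disable/enable arrangement, assuming a program shaped like "leaky startup, then a loop that should not leak". parse_args(), process_input(), run(), and the LEAKCHECK_* wrappers are hypothetical names for this example; __lsan_disable() and __lsan_enable() are the functions from lsan_interface.h, and allocations made between the two calls are treated as non-leaks. The wrappers are no-ops when the build does not use -fsanitize=address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* LeakSanitizer is part of the asan runtime, so detect an asan build. */
#if defined(__SANITIZE_ADDRESS__)
#  define LOOP_HAVE_LSAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define LOOP_HAVE_LSAN 1
#  endif
#endif

#ifdef LOOP_HAVE_LSAN
#  include <sanitizer/lsan_interface.h>
#  define LEAKCHECK_DISABLE() __lsan_disable()
#  define LEAKCHECK_ENABLE()  __lsan_enable()
#else
#  define LEAKCHECK_DISABLE() ((void)0)
#  define LEAKCHECK_ENABLE()  ((void)0)
#endif

/* Hypothetical startup code: copies its argument and never frees the copy,
 * like the small command-line parsing leaks described above. */
static size_t parse_args(const char *arg) {
    size_t len = strlen(arg);
    char *copy = malloc(len + 1);
    assert(copy != NULL);
    memcpy(copy, arg, len + 1);
    return len;  /* the pointer is dropped here on purpose */
}

/* Hypothetical per-iteration work: allocates, uses, and frees a buffer. */
static unsigned process_input(const unsigned char *data, size_t size) {
    unsigned char *buf = malloc(size ? size : 1);
    assert(buf != NULL);
    memcpy(buf, data, size);
    unsigned sum = 0;
    for (size_t i = 0; i < size; i++) sum += buf[i];
    free(buf);
    return sum;
}

/* Startup runs with leak tracking disabled; the loop runs with it enabled,
 * so only allocations made inside the loop can be reported as leaks. */
static unsigned run(const char *arg, size_t iterations) {
    LEAKCHECK_DISABLE();
    size_t len = parse_args(arg);
    LEAKCHECK_ENABLE();

    unsigned total = 0;
    for (size_t i = 0; i < iterations; i++)
        total += process_input((const unsigned char *)arg, len);
    return total;
}
```

With this shape the leak report is still produced at process exit; the disable/enable pair only changes which allocations are eligible to appear in it.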

hth,

--kcc
