Optimised-code debugging experience Round Table

Hi all,

I haven’t seen a proposal for an optimised-code debugging experience Round Table yet so here goes!

Please let me know if you are interested by emailing me at:

orlando.hyams@sony.com

Below is a non-exhaustive list of possible topics. Feel free to include any preferences and suggestions in your response.

a. Line tables:

  1. Can we fix is_stmt?

  2. Is prologue_end reliable?

  3. General stepping behaviour/quality.

b. Variable locations:

  1. The state of DW_OP_entry_values in llvm.

  2. The state of the instruction-referencing DBG_VALUE work.

  3. The state of multi-register DWARF expressions in llvm.

  4. The possibility of salvaging out-of-liveness values using the 3 projects mentioned above.

  5. Floating point debug-info quality in llvm.

  6. Loop induction variable locations.

c. Testing debug-info:

  1. Variable correctness testing tools.

  2. Location coverage testing tools.

d. The state of -Og.

Please respond before Friday (25th) if you are interested, as that is the submission deadline.

Thanks,

Orlando

+LLDB Dev

I’ll sign up. :slight_smile:

My particular interests are:

  • Og (and O1 as Og)
  • Correctness testing tools

Past that, the rest of your list seems quite specific, but the overall “line tables” and “variable locations” areas are important.

Relatedly, we have a number of DWARF committee members in llvm, and another possible discussion area could be: “what extensions do debug-info consumers think should happen to make dwarf a better input into debugging?”

Thanks.

-eric

Hi Eric & Orlando,

It’s great to see interest in a lot of different aspects of debug info. At the same time, I’m concerned there’s a risk in making the topic so broad that we don’t have time to get through all the things people want to get through. I’m thinking there’s a different way to slice the topics, hopefully without much overlap, that will allow a bit more focus. No doubt a lot of the same people would be interested in multiple slices, but by limiting the scope of each conversation I’m hoping we’ll get more accomplished. I daresay a lot of people interested in debug-info quality in general might totally tune out a DWARF-nerd discussion :blush:

The slicing could be something like this:

Hi Paul, Eric, lists,

FWIW I agree with Paul here. Given the limited time available for the discussions I think it makes sense to split up the conversations to keep them focused. Though, it’d be good to coordinate non-overlapping time slots. As you say, it is likely that people (including me) would want to attend more than one of these.

That said, I’ve only had 3 people outside of Sony express an interest in the Round Table that I proposed. At this rate we may not have the requisite numbers to split. Of course, if that number is indicative of actual turnout then we won’t need to split anyway, but I suspect that there will be more attendees on the day.

We must also remember that the Round Table submission deadline is tomorrow (tonight?). Unless more people express an interest very soon, I think we might need to fall back on a single Round Table.

Thanks,

Orlando

Forwarding to lldb-dev now that I’ve signed up.


Does anyone have an explanation for this weird run of 'valgrind --tool=drd'?

==2715== drd, a thread error detector
==2715== Copyright (C) 2006-2017, and GNU GPL'd, by Bart Van Assche.
==2715== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==2715== Command: /home/antipov/.local/llvm-12.0.0/bin/lldb
==2715== Parent PID: 1702

In LLDB, do 'process attach --pid [PID of running Firefox]', then:

==2715== Thread 5:
==2715== The impossible happened: mutex is locked simultaneously by two threads: mutex 0xe907d10, recursion count 1, owner 1.
==2715== at 0x4841015: pthread_mutex_lock_intercept (drd_pthread_intercepts.c:893)
==2715== by 0x4841015: pthread_mutex_lock (drd_pthread_intercepts.c:903)
==2715== by 0x504FBEE: __gthread_mutex_lock (gthr-default.h:749)
==2715== by 0x504FBEE: lock (std_mutex.h:100)
==2715== by 0x504FBEE: lock_guard (std_mutex.h:159)
==2715== by 0x504FBEE: SetValue (Predicate.h:91)
==2715== by 0x504FBEE: lldb_private::EventDataReceipt::DoOnRemoval(lldb_private::Event*) (Event.h:121)
==2715== by 0x5113644: lldb_private::Listener::FindNextEventInternal(std::unique_lock<std::mutex>&, lldb_private::Broadcaster*, lldb_private::ConstString const*, unsigned int, unsigned int, std::shared_ptr<lldb_private::Event>&, bool) (Listener.cpp:309)
==2715== by 0x5113DD1: lldb_private::Listener::GetEventInternal(lldb_private::Timeout<std::ratio<1l, 1000000l> > const&, lldb_private::Broadcaster*, lldb_private::ConstString const*, unsigned int, unsigned int, std::shared_ptr<lldb_private::Event>&) (Listener.cpp:357)
==2715== by 0x5113F4A: lldb_private::Listener::GetEventForBroadcaster(lldb_private::Broadcaster*, std::shared_ptr<lldb_private::Event>&, lldb_private::Timeout<std::ratio<1l, 1000000l> > const&) (Listener.cpp:395)
==2715== by 0x506ADD4: lldb_private::Process::RunPrivateStateThread(bool) (Process.cpp:3872)
==2715== by 0x506B3F5: lldb_private::Process::PrivateStateThread(void*) (Process.cpp:3857)
==2715== by 0x483DB9A: vgDrd_thread_wrapper (drd_pthread_intercepts.c:449)
==2715== by 0x488B3F8: start_thread (in /usr/lib64/libpthread-2.32.so)
==2715== by 0xDFCEA92: clone (in /usr/lib64/libc-2.32.so)
==2715== mutex 0xe907d10 was first observed at:
==2715== at 0x4840F55: pthread_mutex_lock_intercept (drd_pthread_intercepts.c:890)
==2715== by 0x4840F55: pthread_mutex_lock (drd_pthread_intercepts.c:903)
==2715== by 0x5058502: __gthread_mutex_lock (gthr-default.h:749)
==2715== by 0x5058502: lock (std_mutex.h:100)
==2715== by 0x5058502: lock (unique_lock.h:138)
==2715== by 0x5058502: unique_lock (unique_lock.h:68)
==2715== by 0x5058502: WaitFor<lldb_private::Predicate<T>::WaitForValueEqualTo<bool>::<lambda(bool)> > (Predicate.h:123)
==2715== by 0x5058502: WaitForValueEqualTo (Predicate.h:157)
==2715== by 0x5058502: WaitForEventReceived (Event.h:114)
==2715== by 0x5058502: lldb_private::Process::ControlPrivateStateThread(unsigned int) (Process.cpp:3698)
==2715== by 0x505BC61: lldb_private::Process::StartPrivateStateThread(bool) (Process.cpp:3647)
==2715== by 0x5065B96: lldb_private::Process::Attach(lldb_private::ProcessAttachInfo&) (Process.cpp:2961)
==2715== by 0x544DBB8: PlatformPOSIX::Attach(lldb_private::ProcessAttachInfo&, lldb_private::Debugger&, lldb_private::Target*, lldb_private::Status&) (PlatformPOSIX.cpp:401)
==2715== by 0x509F531: lldb_private::Target::Attach(lldb_private::ProcessAttachInfo&, lldb_private::Stream*) (Target.cpp:3008)
==2715== by 0x54C3F17: CommandObjectProcessAttach::DoExecute(lldb_private::Args&, lldb_private::CommandReturnObject&) (CommandObjectProcess.cpp:386)
==2715== by 0x4FC0ACD: lldb_private::CommandObjectParsed::Execute(char const*, lldb_private::CommandReturnObject&) (CommandObject.cpp:993)
==2715== by 0x4FBCBD7: lldb_private::CommandInterpreter::HandleCommand(char const*, lldb_private::LazyBool, lldb_private::CommandReturnObject&, lldb_private::ExecutionContext*, bool, bool) (CommandInterpreter.cpp:1803)
==2715== by 0x4FBDB96: lldb_private::CommandInterpreter::IOHandlerInputComplete(lldb_private::IOHandler&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (CommandInterpreter.cpp:2838)
==2715== by 0x4EF21C0: lldb_private::IOHandlerEditline::Run() (IOHandler.cpp:579)
==2715== by 0x4ED02B0: lldb_private::Debugger::RunIOHandlers() (Debugger.cpp:861)

Hopefully this is an issue with valgrind and not lldb. But I'm still curious whether someone else can reproduce something similar.

Dmitry

Tom Weaver will answer the council’s call!

Count me in; can’t wait to have a good chin wag (talk, for our non-Brit brethren) about debug info and its many-faceted forms.

I somewhat agree with Paul’s concern about discussing everything, but we can make a judgement call on the day about what we wish to focus on the most.

Thanks for sorting this out, Orlando; looking forward to it.

Tom W

This must be a valgrind issue; there would be major problems if the OS weren't able to lock mutex objects correctly ("mutex is locked simultaneously by two threads"). Is it getting confused by a recursive mutex? LLDB uses recursive mutexes.

Hi Paul,

I took it rather as a set of suggested topics depending on who is interested rather than a proposed agenda.

-eric

LLDB's Predicate.h uses a plain std::mutex, which is not recursive, with std::lock_guard/std::unique_lock on top of it.

This needs more digging into, because the latest Valgrind snapshot reports the same "impossible" condition.

Dmitry

For anyone interested, ThreadSanitizer reports nearly the same thing:

WARNING: ThreadSanitizer: double lock of a mutex (pid=2049545)
     #0 pthread_mutex_lock <null> (libtsan.so.0+0x528ac)
     #1 __gthread_mutex_lock /usr/include/c++/10/x86_64-redhat-linux/bits/gthr-default.h:749 (liblldb.so.12git+0xd725c0)
     #2 std::mutex::lock() /usr/include/c++/10/bits/std_mutex.h:100 (liblldb.so.12git+0xd725c0)
     #3 std::lock_guard<std::mutex>::lock_guard(std::mutex&) /usr/include/c++/10/bits/std_mutex.h:159 (liblldb.so.12git+0xd725c0)
     #4 lldb_private::Predicate<bool>::SetValue(bool, lldb_private::PredicateBroadcastType) /home/antipov/llvm/source/lldb/include/lldb/Utility/Predicate.h:91 (liblldb.so.12git+0xd725c0)
     #5 lldb_private::EventDataReceipt::DoOnRemoval(lldb_private::Event*) /home/antipov/llvm/source/lldb/include/lldb/Utility/Event.h:121 (liblldb.so.12git+0xd725c0)
     #6 lldb_private::Event::DoOnRemoval() /home/antipov/llvm/source/lldb/source/Utility/Event.cpp:82 (liblldb.so.12git+0xedb7da)
     #7 lldb_private::Listener::FindNextEventInternal(std::unique_lock<std::mutex>&, lldb_private::Broadcaster*, lldb_private::ConstString const*, unsigned int, unsigned int, std::shared_ptr<lldb_private::Event>&, bool) /home/antipov/llvm/source/lldb/source/Utility/Listener.cpp:309 (liblldb.so.12git+0xee6099)
     #8 lldb_private::Listener::GetEventInternal(lldb_private::Timeout<std::ratio<1l, 1000000l> > const&, lldb_private::Broadcaster*, lldb_private::ConstString const*, unsigned int, unsigned int, std::shared_ptr<lldb_private::Event>&) /home/antipov/llvm/source/lldb/source/Utility/Listener.cpp:357 (liblldb.so.12git+0xee6b63)
     #9 lldb_private::Listener::GetEventForBroadcaster(lldb_private::Broadcaster*, std::shared_ptr<lldb_private::Event>&, lldb_private::Timeout<std::ratio<1l, 1000000l> > const&) /home/antipov/llvm/source/lldb/source/Utility/Listener.cpp:395 (liblldb.so.12git+0xee6dea)
     #10 lldb_private::Process::GetEventsPrivate(std::shared_ptr<lldb_private::Event>&, lldb_private::Timeout<std::ratio<1l, 1000000l> > const&, bool) /home/antipov/llvm/source/lldb/source/Target/Process.cpp:1139 (liblldb.so.12git+0xd7931d)
     #11 lldb_private::Process::RunPrivateStateThread(bool) /home/antipov/llvm/source/lldb/source/Target/Process.cpp:3872 (liblldb.so.12git+0xda3648)
     #12 lldb_private::Process::PrivateStateThread(void*) /home/antipov/llvm/source/lldb/source/Target/Process.cpp:3857 (liblldb.so.12git+0xda3f87)
     #13 lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*) /home/antipov/llvm/source/lldb/source/Host/common/HostNativeThreadBase.cpp:68 (liblldb.so.12git+0xc2c0ea)
     #14 <null> <null> (libtsan.so.0+0x2d33f)

Again, lldb_private::Predicate uses plain std::mutex, which is not recursive.

Dmitry

How could LLDB even function then? We are using the standard std::mutex + std::condition_variable workflow here. Not sure how LLDB could even function if its locking was not working as expected.

Doing a quick web search, this seems to be due to a mismatched libc++ and libstdc++:

https://github.com/google/sanitizers/issues/1259

Greg

How could LLDB even function then? We are using the standard std::mutex + std::condition_variable workflow here. Not sure how LLDB could even function if its locking was not working as expected.

Well, obviously this is an issue (and probably the same one) with debugging tools.

Doing a quick web search, this seems to be due to a mismatched libc++ and libstdc++:

https://github.com/google/sanitizers/issues/1259

Nice. So if your libstdc++ is new enough to use pthread_cond_clockwait(), both TSan and valgrind produce weird results, because they can only handle pthread_cond_timedwait().

Dmitry

Glad we know why at least! Thanks for bringing this to our attention.

Greg