[Bug 15301] New: LLDB prints incorrect size of libstdc++ vector<bool> containers (when inferior built with GCC on Linux)

Bug ID 15301
Summary LLDB prints incorrect size of libstdc++ vector containers (when inferior built with GCC on Linux)
Product lldb
Version unspecified
Hardware PC
OS Linux
Status NEW
Severity enhancement
Priority P
Component All Bugs
Assignee lldb-dev@cs.uiuc.edu
Reporter daniel.malea@intel.com
Classification Unclassified

In the TestDataFormatterStdVBool, instead of printing the correct size (49)
lldb prints -1 for the size of the std::vector<bool> container (libstdc++).

To reproduce, run:

python dotest.py
functionalities/data-formatter/data-formatter-stl/libstdcpp/vbool -C gcc

Hi,

When the inferior is built with GCC 4.7.2, the DWARF contains duplicate
DW_AT_template_type_param DIEs for std::vector (and others). There is a
reported GCC bug about it (54410 – [4.6/4.7/4.8 Regression] doubled DW_TAG_template_type_param).
Because of this, lldb parses the type
"std::vector<bool, std::allocator<bool> >" as
"std::vector<bool, std::allocator<bool, bool>, bool, std::allocator<bool,

>".

Later, the FormatManager fails to match this type with the appropriate
SyntheticChildrenFrontEnd for std::vector<bool>. The printable
representation produced is then the incorrect:
(lldb) frame variable vBool
(std::vector<bool, std::allocator<bool, bool>, bool, std::allocator<bool,

>) vBool = size=0 {}

I've attached 3 patches (against r181819) for review:
- An attempt at a workaround for the GCC bug
- A change in the FormatManager to allow "std::vector<bool,
  std:allocator<bool> >" to be matched to the vector<bool> Synthetic.
- The re-enabling of the test case

Thanks,
Yacine

0001-SymbolFile-DWARF-Ignore-duplicate-template-parameter.patch (4.9 KB)

0002-FormatManager-Match-std-vector-bool-std-allocator-bo.patch (3.16 KB)

0003-Tests-Re-enable-TestDataFormatterStdVBool.py.patch (1.27 KB)

What are the chances that these patches could affect other correctly emitted DWARF for templates?

Unless this patch will _absolutely_ _never_ affect any correct debug info that is emitted by GCC, I would rather not fix this in LLDB, but just get GCC fixed.

Greg

Yacine,

In addition to what Greg said, I am also slightly confused by your FormatManager changes.Two points here:

  1. a regular expression is not really necessary. If I understand the scope of the GCC bug correctly, all we need to do is match the two exact strings
    std::vector<std::allocator> and std::vector<bool, std::allocator, bool, std::allocator >
    to catch all cases
    Regexp matching is slower and it is easy to write a regexp that over-reaches and matches more types than we would like to
    If this is a workaround for a bug rather than a necessity, I would rather much keep it as confined as possible, so I would definitely rework in terms of exact matching

  2. It looks like you are removing the summary for a vector of bools entirely. Why?
    You would just need to add the same summary with a new typename to the category, instead your patch at lines 42 and 43 is removing the creation of the summary for std::vector and I can’t see the replacement

This is mostly a minor cosmetic issue, and it’s not even enforced consistently in the code, but I tend to add new formatters using AddFormatter calls instead of directly playing with the FormatNavigator objects
It’s kind of laying groundwork for potentially reworking the way built-in formatters are added to an easier to maintain coding style. I have a couple ideas there, just not enough time to code them :slight_smile:

Thanks for clarifying these data points.

Enrico Granata
:envelope_with_arrow: egranata@.com
:phone: 27683

Hi Greg,

What are the chances that these patches could affect other correctly emitted DWARF for templates?

Unless this patch will _absolutely_ _never_ affect any correct debug info that is emitted by GCC, I would rather not fix this in LLDB, but just get GCC fixed.

Greg

I understand your concerns. I wasn't sure if a workaround was wanted in
lldb. I attached one in case it was, and to have some code to discuss about.

The patch is based on the assumption that a class or a function DIE
shouldn't contain multiple identical template paramter DIEs. That template
parameters DIEs should at least differ by name if everything else is
identical. It looks reasonably safe to me, but I may be missing something.

As I understand it, the choice is to avoid the workaround in lldb, and let
GCC fix the problem. According to the GCC bug report, it's "Fixed for 4.8".
So the test case will get from XFAIL to XPASS by itself, someday :slight_smile:

Yacine

Hi Enrico,

Yacine,

In addition to what Greg said, I am also slightly confused by your
FormatManager changes.
Two points here:

1) a regular expression is not really necessary. If I understand the scope
of the GCC bug correctly, all we need to do is match the two *exact* strings
std::vector<std::allocator<bool>> and std::vector<bool,
std::allocator<bool>, bool, std::allocator<bool> >
to catch all cases
Regexp matching is slower and it is easy to write a regexp that over-reaches
and matches more types than we would like to
If this is a workaround for a bug rather than a necessity, I would rather
much keep it as confined as possible, so I would definitely rework in terms
of exact matching

Adding an exact string to match "std::vector<bool, std::allocator<bool>,
bool, std::allocator<bool> >" would indeed serve as a workaround, but only
for the specific case of vector<bool> + the GCC version with the bug +
"frame variable". It will not be enough to match the "std::vector<bool,
std:allocator<bool> >" that GCC initially intended to produce for
"std::vector<bool>". It will not avoid problems with the erroneous parsing
of other types that have templates parameters that will get duplicated. So I
think the first patch was important to have a complete workaround.

As I needed to match two similar types:
"std::vector<bool, std:allocator<bool> >" (produced by GCC) and
"std::vector<std:allocator<bool> >" (produced by Clang), I thought it was
more appropriate to factor the two into the regex
"^std::vector<(bool,( )?)?std::allocator<bool> >$".

2) It looks like you are removing the summary for a vector of bools
entirely. Why?
You would just need to add the same summary with a new typename to the
category, instead your patch at lines 42 and 43 is removing the creation of
the summary for std::vector<bool> and I can’t see the replacement

I removed it because it looked redundant, as it's already included in the
regex summary for other vectors, on line 572.

This is mostly a minor cosmetic issue, and it’s not even enforced
consistently in the code, but I tend to add new formatters using
Add/Formatter /calls instead of directly playing with the FormatNavigator
objects
It’s kind of laying groundwork for potentially reworking the way built-in
formatters are added to an easier to maintain coding style. I have a couple
ideas there, just not enough time to code them :slight_smile:

Oh yes, I should have used that. It's a lot cleaner.

Thanks for you review.
Yacine

Hi,
thanks for addressing these concerns and for your contribution to LLDB.
Please find replies inlined.

Enrico Granata
:envelope_with_arrow: egranata@.com
:phone: 27683

Hi Enrico,

Yacine,

In addition to what Greg said, I am also slightly confused by your
FormatManager changes.
Two points here:

  1. a regular expression is not really necessary. If I understand the scope
    of the GCC bug correctly, all we need to do is match the two exact strings
    std::vector<std::allocator> and std::vector<bool,
    std::allocator, bool, std::allocator >
    to catch all cases
    Regexp matching is slower and it is easy to write a regexp that over-reaches
    and matches more types than we would like to
    If this is a workaround for a bug rather than a necessity, I would rather
    much keep it as confined as possible, so I would definitely rework in terms
    of exact matching

Adding an exact string to match “std::vector<bool, std::allocator,
bool, std::allocator >” would indeed serve as a workaround, but only
for the specific case of vector + the GCC version with the bug +
“frame variable”. It will not be enough to match the “std::vector<bool,
std:allocator >” that GCC initially intended to produce for
"std::vector”.

We can have exact matches for all of these things:

  • the original thing that LLDB matches
  • the buggy GCC thing
  • the correct GCC thing

Then we know exactly what we are matching and why.

It will not avoid problems with the erroneous parsing
of other types that have templates parameters that will get duplicated. So I
think the first patch was important to have a complete workaround.

What I meant was more along the lines of “we could erroneously match other versions of std::vector<> that should not match”.

As I needed to match two similar types:
“std::vector<bool, std:allocator >” (produced by GCC) and
“std::vector<std:allocator >” (produced by Clang), I thought it was
more appropriate to factor the two into the regex
"^std::vector<(bool,( )?)?std::allocator >$”.

I personally tend to avoid regexes unless you really need to match a broad and potentially unbounded range of types.
In this case, it seems we have two or three different names, and it would also be good to track that one of these is just a workaround for a bug, rather than a real incarnation of the class.
It is not critical because we do formatters caching (so we would look at the regexes once, figure out the match for vector and then reuse it until the user changes anything in the formatters) but I feel it makes things more obvious to use regex matches only when a regex is really needed (I can’t anticipate anything a user can want to put in a vector, can I?) vs. a small number of exact matches (all of which I can actually anticipate) - small being key here, between 50 new exact entries and a well tested regex, definitely go for the regex!

  1. It looks like you are removing the summary for a vector of bools
    entirely. Why?
    You would just need to add the same summary with a new typename to the
    category, instead your patch at lines 42 and 43 is removing the creation of
    the summary for std::vector and I can’t see the replacement

I removed it because it looked redundant, as it’s already included in the
regex summary for other vectors, on line 572.

Fair. I still think we should have exact matches whenever possible, but not a big deal.

Enrico Granata changed bug 15301

What | Removed | Added |

Comment # 4 on bug 15301 from Enrico Granata

Author: enrico
New Revision: 184265

URL: [http://llvm.org/viewvc/llvm-project?rev=184265&view=rev](http://llvm.org/viewvc/llvm-project?rev=184265&view=rev)
Log:
<rdar://problem/14086503>

Hardening the libstdc++ std::map test case against line table changes

Let me know if this fixes the bug.