Problem with ClangExpressionDeclMap

Was looking into this bug on Windows, and I traced it down to an issue in ClangExpressionDeclMap. ClangExpressionDeclMap::FindExternalVisibleDecls has a piece of code that looks like this:

if (namespace_decl && module_sp)
{
const bool include_symbols = false;

module_sp->FindFunctions(name,
&namespace_decl,
eFunctionNameTypeBase,
include_symbols,
include_inlines,
append,
sc_list);
}
else if (target && !namespace_decl)
{
const bool include_symbols = true;

// TODO Fix FindFunctions so that it doesn’t return
// instance methods for eFunctionNameTypeBase.

target->GetImages().FindFunctions(name,
eFunctionNameTypeFull,
include_symbols,
include_inlines,
append,
sc_list);
}

So it’s only searching for the base name if the function is in a namespace, and if it’s at global scope it’s searching for the full name.

I’m not sure why this works on other platforms, but on Windows it doesn’t work because if I have this code:

int foo(int x)
{
}

then the full name of this function is ?foo@@YAHH@Z

If I change eFunctionNameTypeFull to eFunctionNameTypeBase then everything works, and “p foo” finds the function. So I’ve got a couple of questions:

  1. Why all this complicated logic? I would expect that if I type “p foo” then it would just print everything whose base name is foo and is visible from within the current scope? Am I underthinking this? There’s special cases for namespaces, global scope, instance methods, variables, functions, and it looks for things in certain orders, etc. Is there any reason why it can’t just find everything with a basename of foo visible within the current stack frame?

  2. I’m not sure what this comment means:

// TODO Fix FindFunctions so that it doesn’t return
// instance methods for eFunctionNameTypeBase.

I changed it to eFunctionNameTypeBase and it fixes the original bug on Windows. I inserted a class with an instance method named foo() and also a global method named foo(), and running “p foo” doesn’t find the instance method, only the global method. Is it possible this was somehow fixed, and it was forgotten to change this back to eFunctionNameTypeBase?

  1. Why does this work on other platforms? full names are mangled on other platforms too, so I don’t know how “p foo” doesn’t run into this same issue on other platforms. Any ideas?

Currently the best fix I have for this bug on Windows is to change eFunctionNameTypeFull to eFunctionNameTypeBase. but I want to get some thoughts on that worrisome comment before I go forward with that patch.

In SymbolFileDWARF::FindFunctions (), there is code that runs only for Linux and BSD. That finds the function in global namespace.

Regards,

Abid

But then why does it work on Mac?

And why is this workaround needed at all, when you can just have it check the basename in the first place (e.g. in FindExternalVisibleDecls)?

I guess a higher level, more general question - Why is name lookup so complicated? Clang already knows everything, why does LLDB need to modify the default behavior of Clang’s name lookup? I know LLDB can insert its own decls into the AST, but it seems like we can just get avoid returning those to the user, and still use clang for everything else.

But then why does it work on Mac?

And why is this workaround needed at all, when you can just have it check the basename in the first place (e.g. in FindExternalVisibleDecls)?

I guess a higher level, more general question - Why is name lookup so complicated? Clang already knows everything, why does LLDB need to modify the default behavior of Clang's name lookup? I know LLDB can insert its own decls into the AST, but it seems like we can just get avoid returning those to the user, and still use clang for everything else.

I don't get what you mean by this comment. Name lookup in the debugger is pretty different from name lookup as it would be if we were sitting in the compiler trying to compile the expression as if it had been inserted in the current function at the PC. The compiler can be quite strict about what is and isn't visible, whereas the debugger has to be as lax as possible.

For instance, suppose I am in a frame of a function that has a pointer to Foo, but that function is in a compile unit (and maybe even a shared library) that only contains a forward reference to Foo. If you had another shared library that contains a definition of Foo, you would certainly want to be able to print *Foo. Of course, from a strict compiler standpoint, this would be illegal, but telling debugger users that would not be a popular move.

Similarly, folks want to be able to use global variables (and eventually function statics when we work out a syntax for that) that aren't visible in the current function.

So the debugger has to get into the business of broadening name lookup to meet these needs.

Jim

But then why does it work on Mac?

MAC can be a different code path. There is quite a bit of code in that function that run for m_using_apple_tables.

Regards,

Abid

Before I write out this email, I just want to state that my concern comes mostly from the fact that the name lookup code seems to have many codepaths that depend on not only the OS flavor, but also the mangling scheme. It makes it very hard to maintain, improve, or even verify that things are working correctly. For example, see this bug. Most of this could be solved by just saying “clang, what is this thing?” BTW, I CC’ed David and Chandler, because they know quite a lot about how name lookup works in Clang and can probably provide quite a lot of edge cases that would break name lookup in LLDB currently, similar to the aforementioned bug.

That being said, to respond to your point, since you have the forward declaration, when you do “p *foo”, the compiler should be able to tell you "The type of this expression is ". Then you look in the debug info ignoring scope of any kind for the definition of . The compiler only does the name lookup based on the current scope, but once it knows the type it shouldn’t have to worry about visibility anymore when checking the symbol file. There may be cases where the debug info file contains multiple definitions of in different source files, or modules. In the case where it’s not ambiguous, there’s nothing to worry about. In the case where it is, I don’t think LLDB should try to be smart about it anyway, because the user probably knows best.

So ambiguous types could be handled the same way as something that isn’t visible from the current function (another use case you said LLDB tries to support) – in particular there could be a syntax for specifying the scope. Off the top of my head, something like this, [,,]expression. Any of the 3 fields in the braces could be unspecified. So, for example, suppose you have a file named foo.cpp. This file is compiled into two separate libraries, foo1.so and foo2.so, and looks like this:

// foo.cpp
int g_global = 0;

void func()
{
static int g_static = 1;
}

void func2()
{
int local = 2;
}

If the debugger is stopped in func() from foo1, you could do this:

(lldb) p g_global ; equivalent to “p [foo1, foo.cpp, ]g_global”
(lldb) p g_static ; equivalent to “p [foo1, foo.cpp, func]g_static”
(lldb) p local ; error, name not found

But you could also do this:

(lldb) p [foo2, , ]g_global ; No matter what module you’re in, look for all names in foo2.so that are called g_global, in any file. If you find more than 1, it’s ambiguous and needs to be refined by the user.
(lldb) p [foo2, , func]g_static ; No matter what module you’re in, find function local statics in module ‘foo’, in functions named ‘func’, declared in any file. If you find more than 1 it’s ambiguous.

But the default behavior,

(lldb) p g_global

Just searches based on the current stack frame and doesn’t do anything fancy, which is usually what you want.

(For the record, I know you guys have thought about this harder than I have, so I’m just trying to understand. I still think delegating all the name lookup to clang is more robust, but I still might be missing some important use cases)

Yes, I thought along the lines you suggest at first, but what we were told is that at least for C languages, at the point clang needs to know "what could the token 'Foo'" mean, it doesn't yet know what kind of entity the name represents. Since you can have a type, a local variable, and an ivar that all share the same name, it needs to know what the name could possibly be before it can tell which one of these it actually is (or if the statement can't possibly make sense.) So it asks us about the name, and we have to provide all the entities it could possibly be. Then clang will sort out which one makes sense. As I understand it the model where lldb only needs to answer "find me a class named Foo", etc. is not viable.

I started chasing down a bug similar to the one that you refer to and it looks like there are places where clang assumes that it knows from the context already provided so it doesn't ask us if there might be another entity of the same name. There may be more than one problem of this sort, but the proper solution is to figure out the other places in clang that need to call out to the debugger's ClangASTSource.

As for forcing the user to specify globals, etc, by fully writing them out if they weren't in the current shared library, etc, you can try to propose that, but you will find yourself a very unpopular person...

Jim

Hmm. Well I will defer to the clang people that I CC’ed on this thread. Admittedly this isn’t an area I know much about. I was told by them that in theory that they thought it would work, but then again they don’t know much about LLDB either. We need someone who knows a lot about both :slight_smile:

Anyway, going back to the original topic unless Chandler or David wants to chime in about name lookup.

Is Sean the only person that can reasonably answer my question about changing the full name lookup to a base name lookup? It fixes this bug on Windows. I might be able to fix it by, as Abid pointed out earlier, adding Windows to the list of platforms that use this “workaround” similar to how Linux and FreeBSD are doing. But since it manually inspects the mangled name, I’d kind of like to avoid this and do a proper fix, and I actually think changing the full name lookup to a base name lookup would allow me to remove that workaround on Linux and FreeBSD.

It broke 2 tests on Linux, but I fixed them in the actual tests. The breakage occurred in cases where someone has casted something to a class type in the debugger and (I think) a synthetic type had been created in the debugger with the same name. So the fix I employed in the test was instead of casting to, for example, (Foo *)variable, cast it like (class Foo *)variable.

Hmm. Well I will defer to the clang people that I CC'ed on this thread. Admittedly this isn't an area I know much about. I was told by them that in theory that they thought it would work, but then again they don't know much about LLDB either. We need someone who knows a lot about both :slight_smile:

Anyway, going back to the original topic unless Chandler or David wants to chime in about name lookup.

Is Sean the only person that can reasonably answer my question about changing the full name lookup to a base name lookup? It fixes this bug on Windows. I might be able to fix it by, as Abid pointed out earlier, adding Windows to the list of platforms that use this "workaround" similar to how Linux and FreeBSD are doing. But since it manually inspects the mangled name, I'd kind of like to avoid this and do a proper fix, and I actually think changing the full name lookup to a base name lookup would allow me to remove that workaround on Linux and FreeBSD.

It broke 2 tests on Linux, but I fixed them in the actual tests. The breakage occurred in cases where someone has casted something to a class type in the debugger and (I think) a synthetic type had been created in the debugger with the same name. So the fix I employed in the test was instead of casting to, for example, (Foo *)variable, cast it like (class Foo *)variable.

Let me think about it a bit. I'm filling in for Sean, but I'm only part way up to speed on this so it will take me a little longer to figure out what I think about this. If you want to do something for the short term, and the name lookup workaround put in for Linux & FreeBSD will work, lets do that for now.

Jim

I’m fine leaving this broken on windows for the short term. Fixing this bug doesn’t actually fix any tests for me, so at this point its just a nice to have (although when i do fix it i will add a test for global function name lookup)

Please also note that changing look from Full to Base changes the behaviour somewhat.

namespace testabc

{

int foo(int y)

{

return y+1;

}

}

int foo (int x)

{

return x + 1;

}

int main (void)

{

int y = 0;

while (1)

{

y = foo(y);

y += testabc::foo (y);

}

return 0;

}

While stopped at the start of main,

Currently:

(lldb) p foo

(int (*)(int)) $2 = 0x00000000004004fc (expr`foo(int) at expr_test.cpp:11)

(lldb) p testabc::foo

(int (*)(int)) $3 = 0x00000000004004ed (expr`testabc::foo(int) at expr_test.cpp:4)

(lldb) b foo

Breakpoint 4: 2 locations.

After proposed change:

(lldb) p foo

(int (*)(int)) $0 = 0x00000000004004ed (expr`testabc::foo(int) at expr_test.cpp:4)

(lldb) p ::foo

(int (*)(int)) $1 = 0x00000000004004ed (expr`testabc::foo(int) at expr_test.cpp:4)

(lldb) b foo

Breakpoint 2: 2 locations.

So after the change, lldb will find ‘foo’ from the namespace instead of the global one. It also shows the currently lldb is not handling the :: operator correctly too so there is no way to access the global one.

Regards,

Abid

Wanted to ping this thread so it doesn’t get lost. Jim, is Sean back yet? No worries if not as I’ve been sidetracked with other things, but at some point I’d like to discuss this with him.

Not yet.

Jim