lldb-server tests

Hello all,

In case you haven't noticed it, I'd like to draw your attention to
D32930, where we're proposing a new test framework for lldb-server
tests. The discussion has so far been about low-level implementation
details, so you don't have to read through it if you don't feel like
it (but I do encourage it).

However, I'd like to explain some of the high-level motivations which
led to our proposed design and open a discussion on them here. I'll do
this in FAQ form :)

- why a new framework?:

Lldb-server tests were never really suited for the dotest.py model
anyway (for example, they end up creating an SBDebugger, only to
completely ignore it and open a socket connection to lldb-server
directly). Perhaps for this reason, they are also harder to write than
usual lldb tests, and a lot more messy internally (e.g., after Chris's
ipv6 patch, which caused all lldb-server connections in the test to
fail, I've learned that the test harness will attempt the lldb-server
connection 400(!!!) times before conceding defeat). The test suite
operation is also very illogical when it comes to doing remote tests:
to test lldb-server on a remote target, you first have to build
lldb-server for the target, THEN you have to build lldb for the HOST,
and THEN you run dotest.py in the HOST build folder while passing
funny dotest.py arguments.

- why lldb-server ?:
We'd like this to be a first step in porting the existing tests off of
the dotest.py test runner. Unlike the full test suite, the number of
lldb-server tests is not that big, so porting them is a task
achievable in a reasonably short timeframe, and it can serve as a proof of
concept when considering further steps. Also, lldb-server already
performs a relatively well-defined and simple task, which means it
fits the llvm model of testing isolated components of functionality
without the need for a massive refactor.

- why c++ (aka, if the existing test suite is broken, why not just fix it) ?:
There are two fundamental issues with the current test suite which
cannot be easily "fixed". The first one is the remote execution (which
is where a large part of the test harness complexity comes from). By
writing the test in c++ we can run the test *and* lldb-server remotely
(***), avoiding the network communication and the flakiness that comes
with it. The other issue is the fact that it needs to have a
completely independent reimplementation of the gdb-remote protocol.
Sure, some duplication is expected from tests, but it does not have to
be that big. If we write the test in c++ we can reuse parts of the
gdb-remote client code (thereby increasing test coverage of that as
well), and only resort to manual packet parsing when we really want to
(e.g., when testing the server response to a malformed packet or
similar).
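
To make the "manual packet parsing" case concrete, here is a minimal
sketch of gdb-remote packet framing: a payload is sent as
$<payload>#<checksum>, where the checksum is the sum of the payload bytes
modulo 256, written as two hex digits. The function names below are
hypothetical and are not taken from the lldb sources; escaping and
run-length encoding are deliberately left out.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Checksum used by the gdb-remote protocol: sum of payload bytes mod 256.
unsigned char PacketChecksum(const std::string &payload) {
  unsigned sum = 0;
  for (unsigned char c : payload)
    sum += c;
  return static_cast<unsigned char>(sum & 0xff);
}

// Wrap a payload as "$<payload>#<two hex digits>".
// Hypothetical helper; does not escape '$', '#' or '}' inside the payload.
std::string FramePacket(const std::string &payload) {
  char suffix[4];
  std::snprintf(suffix, sizeof(suffix), "#%02x",
                static_cast<unsigned>(PacketChecksum(payload)));
  return "$" + payload + suffix;
}

// Value of one hex digit, or -1 if the character is not a hex digit.
int HexDigit(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Extract the payload from a framed packet. Returns false if the frame is
// malformed or the checksum does not match the payload.
bool UnframePacket(const std::string &frame, std::string &payload) {
  // The smallest valid frame is "$#00".
  if (frame.size() < 4 || frame[0] != '$' || frame[frame.size() - 3] != '#')
    return false;
  const int hi = HexDigit(frame[frame.size() - 2]);
  const int lo = HexDigit(frame[frame.size() - 1]);
  if (hi < 0 || lo < 0)
    return false;
  const std::string body = frame.substr(1, frame.size() - 4);
  if (PacketChecksum(body) != hi * 16 + lo)
    return false;
  payload = body;
  return true;
}
```

A test for malformed input could, for instance, take the output of
FramePacket, corrupt the checksum digits, and check how the server reacts.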

- ok, but why not have the test described by a text file, and have a
c++ FileCheck-like utility which interprets it?:
Due to the unpredictable (e.g. we cannot control the addresses of
objects in the inferior) and interactive nature of the tests, we
believe they would be easier to write imperatively, instead of in a more
declarative FileCheck style. E.g. for one test you need to send
qRegisterInfoN for N=0,1,... until you find the pc register, then pluck a
field with that number from a stop-reply packet, and compare that to the
result of another packet, while possibly reversing endianness. To
describe something like this in a text file, you will either need
primitives to describe loops, conditionals, etc (which will then tend
towards implementing a full scripting language), or have a very
high-level primitive operation which does exactly this, which will
tend towards having many specialized primitive operations.
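
As a rough illustration of the imperative style, the pc example above
could be sketched like this. Everything here is an assumption made for
the sake of the example: SendFn stands in for whatever client the
framework ends up providing, the helper names are made up, register
numbers are taken to be zero-based hex, the stop-reply is assumed to use
two-digit lowercase hex register keys, and the target is assumed to be
little-endian.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <map>
#include <string>

// Hypothetical transport: takes a packet payload, returns the reply payload.
using SendFn = std::function<std::string(const std::string &)>;

// Split "key:value;key:value;" into a map; items without ':' are skipped.
std::map<std::string, std::string> ParseKeyValues(const std::string &text) {
  std::map<std::string, std::string> result;
  std::size_t pos = 0;
  while (pos < text.size()) {
    std::size_t end = text.find(';', pos);
    if (end == std::string::npos)
      end = text.size();
    const std::string item = text.substr(pos, end - pos);
    const std::size_t colon = item.find(':');
    if (colon != std::string::npos)
      result[item.substr(0, colon)] = item.substr(colon + 1);
    pos = end + 1;
  }
  return result;
}

// Send qRegisterInfoN for N=0,1,... until a reply is tagged "generic:pc".
// Returns the register number, or -1 if the server reports an error first.
int FindPcRegister(const SendFn &send) {
  for (int n = 0; n < 256; ++n) { // 256 is an arbitrary safety bound
    char packet[32];
    std::snprintf(packet, sizeof(packet), "qRegisterInfo%x", n);
    const std::string reply = send(packet);
    if (reply.empty() || reply[0] == 'E') // "Exx" means no such register
      return -1;
    const auto fields = ParseKeyValues(reply);
    const auto it = fields.find("generic");
    if (it != fields.end() && it->second == "pc")
      return n;
  }
  return -1;
}

// Pluck the raw hex bytes of one register out of a "Txx..." stop-reply.
bool PluckRegister(const std::string &stop_reply, int regnum,
                   std::string &hex_bytes) {
  if (stop_reply.size() < 3 || stop_reply[0] != 'T')
    return false;
  char key[16];
  std::snprintf(key, sizeof(key), "%02x", regnum); // assumed key format
  const auto fields = ParseKeyValues(stop_reply.substr(3));
  const auto it = fields.find(key);
  if (it == fields.end())
    return false;
  hex_bytes = it->second;
  return true;
}

// Decode hex bytes given in little-endian order into an integer.
std::uint64_t DecodeLittleEndianHex(const std::string &hex) {
  assert(hex.size() % 2 == 0 && hex.size() <= 16);
  std::uint64_t value = 0;
  for (std::size_t i = hex.size(); i >= 2; i -= 2) // last byte pair first
    value = (value << 8) | std::stoul(hex.substr(i - 2, 2), nullptr, 16);
  return value;
}

// The whole flow: find pc, ask for the stop reason, compare the value.
bool PcMatches(const SendFn &send, std::uint64_t expected_pc) {
  const int pc_reg = FindPcRegister(send);
  std::string hex_bytes;
  if (pc_reg < 0 || !PluckRegister(send("?"), pc_reg, hex_bytes))
    return false;
  return DecodeLittleEndianHex(hex_bytes) == expected_pc;
}
```

The loop, the early exit and the byte swap are each a line or two of
ordinary code here, which is the part that would need dedicated
primitives in a text-file format.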

regards,
pavel

(***) To achieve this, we want to propose adding the ability to
execute tests remotely to llvm-lit, which we hope will be useful to
more people. I'll write more about this in a separate email with
llvm-dev included.

Thank you for this work.

I will add the usual requirement - please make sure that "check-lldb-unit"
works in standalone builds.

One thing about lit that most people either don’t understand or forget about is that FileCheck has nothing to do with lit. You can have lit tests without FileCheck. It’s more work because you would have to define an LLDBServerTestFormat and invent some DSL that isn’t just a bunch of run lines and check statements. You could then write a c++ program like lldb-server-test, which gives you all the benefits of code reuse and packet parsing that you’re talking about, and have your test consist of something like:

check-lldb-server --prefix=TEST_ERROR_RESPONSE < %s
check-lldb-server --prefix=TEST_SUCCESS_RESPONSE < %s

TEST_ERROR_RESPONSE: SEND: <some hex packet string>
TEST_ERROR_RESPONSE: RECV-ERROR: 62

TEST_SUCCESS_RESPONSE: SEND: <some hex packet string>
TEST_SUCCESS_RESPONSE: RECV-SUCCESS: eax = 7

I think this would make tests both easier to write and easier to understand than what is being proposed here.

That said, what is being proposed here can’t exactly be called a lateral move, because I do agree it’s better. So because of that, I’m willing to let it go in. But I’m 100% confident that a better solution can be devised in lit with some thought. Unfortunately, given that I don’t work on lldb-server, all I can really do is offer some high level ideas, and it will be up to you guys to figure out the details.

I wasn't forgetting about this -- this is what I meant by a
"FileCheck-like utility" in the original email.

The thing I am afraid of there is the complexity hidden in the "<some
hex packet string>" in your example. The thing is this will rarely be
a fixed string. For example, to set a breakpoint, you need to send
other packets to find the address of a suitable place to put the
breakpoint. So now you need to have the ability to refer to the
previous packets (and their parts, e.g. to pluck out a field of a JSON
response) and do at least some basic arithmetic on them. I mean, it
can be done, I just feel it would be a significant amount of code to
basically implement a scripting language with a set of integer and
string operations to be able to construct and parse packets. I think
that's a bit of a wasted effort, as we already have a language that
supports all those operations and that everyone is familiar with. I am
still aiming to make the syntax of the test as readable as possible,
ideally a sequence of assert statements:
ASSERT_FOO(SendPacket(<any expression for constructing a packet>));
ASSERT_FOO(ReceiveOK()); and so on.
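
To give a feel for what "any expression for constructing a packet" might
look like, here is a small sketch for the breakpoint case. TestClient,
MakeBreakpointPacket and SetBreakpointAfterEntry are hypothetical names
invented for this example (loosely following the pseudocode above), not
the API from the patch, and plain bool returns are used in place of gtest
assertions so that the snippet has no dependencies.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <string>
#include <utility>

// Hypothetical transport: takes a packet payload, returns the reply payload.
using SendFn = std::function<std::string(const std::string &)>;

// Hypothetical client wrapper that remembers the most recent reply.
class TestClient {
public:
  explicit TestClient(SendFn send) : send_(std::move(send)) {}

  // Sends a payload; an empty reply means the packet is unsupported.
  bool SendPacket(const std::string &payload) {
    last_reply_ = send_(payload);
    return !last_reply_.empty();
  }

  // True if the most recent reply was the plain "OK" response.
  bool ReceiveOK() const { return last_reply_ == "OK"; }

private:
  SendFn send_;
  std::string last_reply_;
};

// Build a "Z0,<addr>,<kind>" packet (insert software breakpoint).
std::string MakeBreakpointPacket(std::uint64_t address, unsigned kind) {
  char buf[48];
  std::snprintf(buf, sizeof(buf), "Z0,%llx,%x",
                static_cast<unsigned long long>(address), kind);
  return buf;
}

// The address is computed from values learned earlier in the test, which
// is the kind of arithmetic a text-based format would have to support.
bool SetBreakpointAfterEntry(TestClient &client, std::uint64_t entry,
                             std::uint64_t offset) {
  return client.SendPacket(MakeBreakpointPacket(entry + offset, 1)) &&
         client.ReceiveOK();
}
```

In a gtest-based test, the two calls in SetBreakpointAfterEntry would
simply each be wrapped in an assertion macro.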

Of course, it could be I am overestimating the difficulty of this, and
I have to admit I am biased towards gtest.

However, maybe these don't have to actually be incompatible goals.
Most of the patch in question is about writing a simple client capable
of sending and receiving messages and presenting their contents to the
test. You would need something like this even if you went for the
check-lldb-server utility, which could then use this client as a back
end, coupled with an additional library for parsing the contents of
the test scripts. So if someone ever writes that, hopefully they will
find this code useful. And we could even have the two frameworks
coexisting side-by-side -- the scripted one could be used for simple
tests, and the gtest one when you need advanced flow control or data
manipulation.
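
A very rough sketch of how a scripted front end might sit on top of the
same client: the interpreter below only understands SEND and RECV-ERROR
lines for a single prefix, loosely following the example earlier in the
thread. The directive semantics, RunScript and SendFn are all assumptions
made up for this illustration; RECV-SUCCESS is left out because it would
need reply-decoding logic that has not been specified.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical transport: takes a packet payload, returns the reply payload.
// In practice this would be backed by the shared client code.
using SendFn = std::function<std::string(const std::string &)>;

// If `text` starts with `head`, strip it and return true.
bool ConsumePrefix(std::string &text, const std::string &head) {
  if (text.compare(0, head.size(), head) != 0)
    return false;
  text.erase(0, head.size());
  return true;
}

// Run the lines of `script` that carry `prefix`. Returns false on the first
// failed expectation or unknown directive, true otherwise.
bool RunScript(const std::string &script, const std::string &prefix,
               const SendFn &send) {
  std::istringstream lines(script);
  std::string line, last_reply;
  while (std::getline(lines, line)) {
    if (!ConsumePrefix(line, prefix + ": "))
      continue; // blank line, or a line that belongs to another prefix
    if (ConsumePrefix(line, "SEND: ")) {
      last_reply = send(line); // the rest of the line is the payload
    } else if (ConsumePrefix(line, "RECV-ERROR: ")) {
      if (last_reply != "E" + line) // expect an "Exx" error reply
        return false;
    } else {
      return false; // directive not supported by this sketch
    }
  }
  return true;
}
```

The same SendFn could be handed to a gtest-based test, so the two styles
would differ only in how the sequence of packets is described.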

Note to Apple folks: The lldb-server test suite is also used for
testing debugserver. I'm not sure how much you guys care about these
tests (the commit history shows only Todd has touched them), but we would
be interested in hearing your thoughts on this.