Port LLDB to IBM AIX

Hi Community Members,

We have implemented the code changes required to run LLDB on AIX.

Here is some background:
AIX is IBM’s Unix-based operating system, designed to run on IBM Power enterprise servers.
AIX is known for its robustness and scalability, which make it a popular choice for workloads that require
a reliable, high-performance computing environment.
AIX has traditionally used its own xlc compiler; however, in recent years there has been increasing support for
and adoption of LLVM-based compilers. By leveraging LLVM, AIX can benefit from modern open-source compiler technology that
continues to evolve and is also supported on other targets.

Using LLDB on AIX lets developers work with advanced LLVM toolchains, which aligns AIX with the open-source community
and makes it easy to integrate with LLVM-based tools.
The goal of this work is to make AIX a supported platform for LLDB.

We are excited to report that we have successfully ported the LLDB project to AIX with the code changes in this PR:

The porting process involved adding multiple plugins for the required support on AIX, such as:
AIX-DYLD, an object container for AIX, XCOFF support, platform-related support, lldb-server plugins for AIX, and a few other things.
Other than some key differences related to PowerPC and XCOFF, many of the changes are derived from the Linux and POSIX code.
More about these can be read here:

  1. IBM Power AIX
  2. https://www.ibm.com/docs/en/aix/7.3?topic=formats-xcoff-object-file-format
  3. https://www.ibm.com/docs/en/aix/7.3?topic=storage-power-family-powerpc-architecture-overview
  4. IBM Documentation
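
Purely as an illustration of what the XCOFF and object container plugins need to recognise, here is a minimal sketch that classifies a file by its leading bytes. It is not code from the PR: the function name is hypothetical, and the magic values (0x01DF for 32-bit XCOFF, 0x01F7 for 64-bit XCOFF, and the "<bigaf>\n" signature for big archives) are our reading of the AIX file format documentation, so please check them against the references above.

```python
import struct

# Assumed magic values, taken from the AIX XCOFF and ar format documentation.
XCOFF32_MAGIC = 0x01DF            # 32-bit XCOFF object
XCOFF64_MAGIC = 0x01F7            # 64-bit XCOFF object
BIG_ARCHIVE_MAGIC = b"<bigaf>\n"  # AIX "big" archive format


def classify_aix_file(header: bytes) -> str:
    """Classify a file from its first bytes (hypothetical helper)."""
    if header.startswith(BIG_ARCHIVE_MAGIC):
        return "big-archive"
    if len(header) >= 2:
        # XCOFF headers are big-endian; the magic is the first 16-bit field.
        (magic,) = struct.unpack(">H", header[:2])
        if magic == XCOFF32_MAGIC:
            return "xcoff32"
        if magic == XCOFF64_MAGIC:
            return "xcoff64"
    return "unknown"


if __name__ == "__main__":
    print(classify_aix_file(b"\x01\xf7" + bytes(22)))   # 64-bit object header
    print(classify_aix_file(b"<bigaf>\n" + bytes(120)))  # big archive header
```

An object container plugin would then walk the archive members and hand each XCOFF member to the object file plugin; that part is omitted here.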

We have been running sanity tests on many of the basic functionalities/commands such as breakpoint, run, disassembly, registers, etc.,
and look forward to working with the community to fix the issues we encounter in the process
and to continue evolving LLDB on AIX.

Here are some people who will be the code owners on behalf of IBM:

1. dhruv.srivastava@ibm.com (Dhruv Srivastava)
2. lakshmi.kovvuri@ibm.com (Lakshmi Kovvuri)
3. suresana@in.ibm.com (Suresh A Kumar)

We request your input and feedback on our code changes. We would like our code to be part of the next release of the LLDB project,
and we would like to be actively involved in maintaining LLDB on AIX and in the community.

Thanks!

Can you make statements with regard to testing infrastructure for this? It’s especially important given the proprietary nature of AIX. If you don’t maintain a buildbot that makes sure LLDB works on AIX, no one else will.

Currently, we have set up an internal CI, which we run daily.
We can plan to set up a CI for the community as well in the future.

We have buildbots that do builds daily, so this wouldn’t be unprecedented. But it’s worth pointing out that the monorepo has quite a high rate of commits (600-800 per week), so daily builds inevitably lead to lengthy blamelists when something breaks. It would be beneficial for everyone if you could provide a more powerful builder.
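
To put that commit rate in perspective, here is a rough back-of-the-envelope sketch. It assumes commits are spread evenly across the week, which is a simplification, and the function name is made up for illustration.

```python
def commits_per_build(commits_per_week: int, builds_per_day: float) -> float:
    """Average number of commits landing between two consecutive builds.

    Assumes commits arrive at a uniform rate over the whole week.
    """
    builds_per_week = builds_per_day * 7
    return commits_per_week / builds_per_week


if __name__ == "__main__":
    # Rate quoted above: 600-800 commits per week.
    for weekly in (600, 800):
        daily = commits_per_build(weekly, builds_per_day=1)
        hourly = commits_per_build(weekly, builds_per_day=24)
        print(f"{weekly}/week: ~{daily:.0f} commits per daily build, "
              f"~{hourly:.1f} per hourly build")
```

So a single daily build has a blamelist on the order of a hundred commits, while a builder fast enough to run every hour would see only a handful.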

I find this statement rather vague. My opinion is that AIX support in LLDB should come together with testing infrastructure, and should be kept downstream until then.

And what architectures are those? I see you’ve added support for AIX on PPC64, is that the exclusive platform or just the first one to get LLDB changes?

Do you have a high-level list of those differences? Presumably you were able to work with or around them, because you got it working, but I’m most interested in the ones that currently need the greatest level of hack or volume of changes to make work.

What is the state of the test suite right now then? With the linked PR, is it possible to run check-lldb, and what results do you get?

(I’m not saying you need to have 100% passes to be making this post or asking for feedback, it’s just a test suite we are all familiar with to get a baseline of where the port is at)

I wonder if the test building helpers like the makefiles need some work with the switch to xcoff or if that all “just works” by clang’s magic.

Great to see folks stepping up.

However, my question here is along the same lines as Endill’s. What is the “contract” going to be between the community dealing with the publicly tested configurations, and yourselves maintaining AIX?

We have accepted ports in the past where there is no public builder. Another IBM architecture, s390x is supported for Linux and is upstream without a public builder. However, I’ve never had a problem with the related code, and never had to fix an s390x issue myself.

So I assume the deal there was that it would be tested somewhere in IBM and it was on the IBM contributors to maintain it. That or there is a red build somewhere no one has looked at for years :slight_smile:

(and FWIW, I do like having a big endian target around, but that’s beside the point)

We also have FreeBSD/NetBSD and they again have their own builders over in their infrastructure but they’ve never got in the way and do pop up from time to time to fix their own problems.

Though if this AIX support is PPC64 only, there are PPC64 buildbots already I think, so a public builder wouldn’t be as big of an ask, I assume?

There is a clang builder already - Buildbot - is that also you folks running that?

Also, what is the licensing situation with AIX? Say I wanted to make a big change and test it beforehand on AIX, is it something that I can boot in qemu, or does IBM provide some sort of access for software developers? (iirc there was/is some program like this for s390x)

Actually there are public s390x build bots, but not one for lldb.

Looks like we have a single AIX PPC64 bot testing Clang (clang-ppc64-aix), and a number of s390x buildbots running Linux (I guess LinuxONE). Notably absent here are s390x buildbots running AIX.

After looking at the changes I am also wondering if you (@Dhruv-Srivastava-IBM ) know the status of the Linux PPC64 lldb port. Obviously AIX on PPC64 can share code with this, but I am worried about having a lot of “if AIX then ...” if we don’t know that the Linux side still works correctly (or rather, is no worse than it was before your changes).

Also I don’t see any support for core files so I assume that’s a later goal. Which is reasonable given that it’s a lot of the same sort of copy paste code for core files. I will note though that having core file support is very useful for developers without access to hardware, so keep it on the TODO list for sure.

AIX only runs on PPC. s390x is the architecture for IBM’s Z series mainframes.

True. I think I mixed that up with the lack of z/OS builders.

Hi David and Endill,
Great to see the input and questions. Here are a few points for you:

Mostly the IBM POWER series. AIX systems are big-endian.
Also, we did test the changes on Linux PPC64, and I think it is already well supported. AIX on PPC64 will be the next one to follow; as of now, the plan is to port only for AIX on PowerPC.

Here is a brief overview of the major changes/differences (with respect to Linux):

  1. XCOFF object file support (the object format used on AIX) and support in dependent libs.
  2. Support for AIX’s object container: big archive, and parsing of objects inside the archive’s module format.
  3. Added AIX’s DYLD (due to dependencies on ldx_info).
  4. Support for LDX Info.
  5. Keeping liblldb STATIC.
  6. Modifications to the Host module, including a new AIX Host component to access info about the AIX host.
  7. Implementation of dladdr, strlcat and strlcpy (likely needs to change due to a licensing issue).
  8. Handling for AIX on PPC64 in various common lldb components.
  9. A lot of the code has been added only for 64-bit support right now (using if AIX…).
  10. ptrace usage (again, we are only supporting 64-bit right now).
  11. Emulation for PPC branch instructions: b, ba, bla, bc, bca, bclr, bcctr and bctar.
  12. lldb-server components like NativeProcessAIX/NativeThreadAIX etc. have only taken the necessary changes from AIX and not everything. They also have additions for handling AIX/PPC64 registers.
  13. Some minor hacks in the DWARF components to match AIX’s object file handling (in both llvm and lldb code).

Accordingly, multiple new libs get created as Plugins for AIX.
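
To give a feel for the branch emulation item in the list above, here is a minimal sketch of decoding the unconditional I-form branches (b, ba, bl, bla) and computing their targets. It is an illustration only, not the emulation code from the PR: the function name is hypothetical, and the conditional forms (bc, bca, bclr, bcctr, bctar) are left out because they also need the condition register, CTR, LR and TAR.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def decode_i_form_branch(instr: int, cia: int):
    """Return (mnemonic, target) for a 32-bit I-form branch, else None.

    instr is the instruction word; cia is the address of the instruction.
    """
    if (instr >> 26) & 0x3F != 18:   # primary opcode 18 covers b/ba/bl/bla
        return None
    disp = instr & 0x03FFFFFC        # LI field with two zero bits appended
    if disp & 0x02000000:            # sign-extend the 26-bit displacement
        disp -= 0x04000000
    aa = (instr >> 1) & 1            # AA=1: target is an absolute address
    lk = instr & 1                   # LK=1: link register is updated
    target = (disp if aa else cia + disp) & MASK64
    mnemonic = "b" + ("l" if lk else "") + ("a" if aa else "")
    return mnemonic, target


if __name__ == "__main__":
    for word in (0x48000008, 0x4BFFFFFD, 0x48000102):
        print(hex(word), decode_i_form_branch(word, cia=0x1000))
```

For single-stepping, the emulator uses the computed target to decide where the next breakpoint has to go; for bl and bla it would also record the return address in LR.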

The latest test suite results are as follows:

Total Discovered Tests: 2893
  Skipped          :    3 (0.10%)
  Unsupported      : 1513 (52.30%)
  Passed           : 1258 (43.48%)
  Expectedly Failed:    4 (0.14%)
  Failed           :  115 (3.98%)

Giving some more detail:

check-lldb-unit
Total Discovered Tests: 1152
  Skipped:    3 (0.26%)
  Passed : 1100 (95.49%)
  Failed :   49 (4.25%)
check-lldb-shell
Total Discovered Tests: 545
  Unsupported      : 317 (58.17%)
  Passed           : 158 (28.99%)
  Expectedly Failed:   4 (0.73%)
  Failed           :  66 (12.11%)
check-lldb-api
Total Discovered Tests: 1196
  Unsupported: 1196 (100.00%)
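
As a quick cross-check that the per-suite numbers add up to the combined totals, here is a small sketch using only the counts listed above (the variable names are just for illustration):

```python
from collections import Counter

# Per-suite counts copied from the breakdown above.
suites = {
    "check-lldb-unit": {"Skipped": 3, "Passed": 1100, "Failed": 49},
    "check-lldb-shell": {
        "Unsupported": 317,
        "Passed": 158,
        "Expectedly Failed": 4,
        "Failed": 66,
    },
    "check-lldb-api": {"Unsupported": 1196},
}

# Sum each category across the three suites.
totals = Counter()
for counts in suites.values():
    totals.update(counts)

discovered = sum(totals.values())

if __name__ == "__main__":
    print(f"Total Discovered Tests: {discovered}")
    for category, count in totals.items():
        print(f"  {category:<18}: {count:5} ({100 * count / discovered:.2f}%)")
```

The sums come out to 2893 discovered tests, matching the combined figures, and they also show that every currently unsupported test beyond the 317 shell tests comes from check-lldb-api.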

We have compared the results with Linux PPC as well, and the results are decent.

We will be maintaining the required changes and fixing the failures for AIX support as llvm-project evolves.
We would only request that if a new addition to a generic/common piece of code breaks AIX, the contributor also handles that code for AIX.

No, we are not using it right now for this support.

Yes, we plan to rework the changes so that there are fewer “if AIX then ...” checks in the code (the only reason for adding them this way was to first make it work on AIX).

Yes, as we proceed, we are working in parallel on adding support for the other functionalities that have not been ported to AIX yet, including core file support.

Hope these inputs help. I know I have not answered all of your questions, especially those related to the testing infrastructure, but we will get back to you on them soon.

Cool.

One convention in the codebase is to name files with a suffix of the architecture they apply to if they’re native files. E.g. NativeRegisterContextLinux_arm64.cpp, so please stick to that form regardless of AIX being PowerPC exclusive (and I assume little endian, or at least lldb’s support for it).

It can be useful to look up *_ppc64le.cpp files for example.

Thanks!

I don’t see anything there you haven’t already flagged as risky, these things should all be able to be debated in their individual PRs.

Caveat: I know resources are always limited, but this statement is from me with my Code Owner hat on.

Ideally we would get to a point where there was a public build bot for Linux on ppc64le, with the tests brought up to date. From there AIX could be added with some assurance that Linux was not regressing.

An eventual point where there is a Linux and AIX builder would enable us to work out whether it’s a Linux / AIX difference or a ppc64 property that’s causing any given issue.

But then again, we’ve had the ppc64 support upstream for a long time and I presume you folks did that work too (IBM at least). So I don’t think I can push too hard for this dream scenario.

If you want to reduce the manual maintenance work your team will have to do, this is what I’d aim for anyway.

These situations are always a spectrum between “author fixes it themselves”, “author fixes it with help from code owner” and “code owner fixes it themselves”. At least for Linaro, we want to empower authors but ultimately, we will do the work ourselves if we need to. Especially for anything esoteric, which is usually 32 bit Arm.

So this won’t be a situation like for example the GN build system being in llvm-project. Where if I break that I have zero requirement to fix it, that’s for Google to do.

(and you’re not suggesting it will be, just using that as a contrasting example)

So I’m ok with having authors fix their own PRs if they can, but they may need your expertise to be able to do so. That could be answering questions or testing patches on their behalf. If work on other platforms is blocked for extended periods trying to fix AIX issues from a distance, I’d be inclined to just leave the AIX build broken for you folks to sort out when you get around to it.

For new features, authors will do as much as they have the time or enthusiasm to do, but if AIX is sufficiently weird in some respect, I as a reviewer would not have a problem with them saying that feature X is not supported on AIX, as long as there is proper user feedback and all the tests are skipped on AIX.

For example, watchpoints are not supported on Windows on Arm because we (Linaro) haven’t done that work yet.

This works the other way too of course, you can ask of authors what their intent was and if the approach taken seriously complicates AIX support, we should certainly discuss changing that.

Those are my expectations of a platform’s code owners. I think that fits what you wrote but let me know if anything differs.

We don’t have a written policy for this, so I’m falling back to writing out examples we can agree on.

Sure! One question though: since AIX PPC64 is exclusively big-endian,
is it okay, according to the convention, to name these new files “_ppc64be.cpp” for categorisation? If not, please suggest which alternative suits best.

As of now, we do not have a buildbot for LLDB on Linux PPC64, and it will be tough to set one up for both AIX and Linux, since AIX is big-endian and the Linux setup would have to be switched to little-endian.
But for starters, we are working on setting up a buildbot for this AIX PPC64 development. We will update you once things are finalised on that end.

For now, as a reference for test suite comparison, I can provide the results on Linux PPC64 with and without our AIX code changes:

LINUX check-lldb llvm-project

Total Discovered Tests: 2958
  Unsupported      : 1410 (47.67%)
  Passed           : 1533 (51.83%)
  Expectedly Failed:    4 (0.14%)
  Failed           :   11 (0.37%)

LINUX check-lldb lldb-for-aix

Total Discovered Tests: 2958
  Unsupported      : 1410 (47.67%)
  Passed           : 1533 (51.83%)
  Expectedly Failed:    4 (0.14%)
  Failed           :   11 (0.37%)

Just for reference, we have run the suite on Mac as well:

MAC check-lldb llvm-project

Total Discovered Tests: 2972
  Unsupported      : 1427 (48.01%)
  Passed           : 1536 (51.68%)
  Expectedly Failed:    7 (0.24%)
  Failed           :    2 (0.07%)

MAC check-lldb lldb-for-aix

Total Discovered Tests: 2972
  Unsupported      : 1427 (48.01%)
  Passed           : 1538 (51.75%)
  Expectedly Failed:    7 (0.24%)
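
To make the with/without comparison explicit, here is a small sketch that diffs the four result sets above. The helper name is hypothetical. Note that the last Mac run lists no Failed line; since its three counts already sum to 2972, the sketch treats the missing category as 0.

```python
def diff_results(before: dict, after: dict) -> dict:
    """Return {category: after - before} for every category whose count changed.

    A category missing from one side is treated as a count of 0.
    """
    categories = sorted(set(before) | set(after))
    deltas = {c: after.get(c, 0) - before.get(c, 0) for c in categories}
    return {c: d for c, d in deltas.items() if d != 0}


# Counts copied from the results above.
linux_upstream = {"Unsupported": 1410, "Passed": 1533, "Expectedly Failed": 4, "Failed": 11}
linux_aix = {"Unsupported": 1410, "Passed": 1533, "Expectedly Failed": 4, "Failed": 11}
mac_upstream = {"Unsupported": 1427, "Passed": 1536, "Expectedly Failed": 7, "Failed": 2}
mac_aix = {"Unsupported": 1427, "Passed": 1538, "Expectedly Failed": 7}

if __name__ == "__main__":
    print("Linux:", diff_results(linux_upstream, linux_aix) or "no change")
    print("Mac:  ", diff_results(mac_upstream, mac_aix) or "no change")
```

On Linux PPC64 the two runs are identical, and on Mac the only difference is two tests moving from Failed to Passed. The counts alone do not say which tests changed or why.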

Please suggest any other kind of testing we can add to our TODO list to make sure we cause no regressions on other platforms such as Linux PPC64.

Yes, I agree. We will be actively involved in fixing issues on our end and/or guiding the contributors, depending on which side of the spectrum the problem lies, and will definitely try to handle AIX support from our end in such scenarios.

That’s good to hear. Thanks for that! :slight_smile:

We do have a qemu setup, but it is still a work in progress, so as of now it won’t be of much help. As I said, though, we are working on a buildbot setup; hopefully that will be helpful?

Certainly, that hint will be very useful. If you have code that can be shared between le and be, ppc64 is fine though.

I didn’t expect it to be in the same build if that’s what you mean but I see your point. It probably can’t be 2 containers on the same kernel or similar.

A buildbot for AIX ppc64be would be an improvement over the current support so that’s fine with me.

Really it’s just “don’t regress”; the other platforms have build bots or folks looking at them once in a while, so they’ll report if something breaks.

Which reminds me, I think Fedora packages lldb for ppc64 Linux. So there has been someone checking it. If I recall correctly there were some issues but nothing major.

It’ll be good for anything we can reproduce with just the compiler, which is a lot of things.

Though if your bot uses the AIX system compiler instead of clang, you may have more incidents where you have to test patches. Then again, you may want to verify it builds with the system compiler, up to you really.

Ok great!! Thanks for all the insightful answers and feedback.

I will get back on the bot setup once we have made some progress on our end and we have more information to share.

In the meantime, we’d appreciate your suggestions on how we can get started with the upstreaming process. Are there any specific steps or considerations we should address before we move forward?

I understand that splitting up the original PR will be necessary, but if you have any particular suggestions or recommendations, I would love to hear them.

Anything that can be landed even without the context of AIX support is the obvious first step: refactoring, and any fixes that also apply to ppc64 Linux. Then anything that fixes compilation of lldb on AIX, like those extra namespace markers.

As I’d say it’s legitimate to compile lldb (the client) on AIX to debug some other platform remotely, so as long as the fix is pretty simple we’d accept it.

Anything where you need to make a change outside of lldb then use it in lldb, you can do a “stacked PR” of sorts. Where the first one is the outside lldb change and the second one is both, but with a note that only the second change in that PR is new.

This sidesteps the usual “please split this” request, but keeps the context of why you’re making the changes in the first place.

Once you get into the AIX specific support patches, it would be very useful to have a public tree of the (roughly) split patches just in case you need to explain why an earlier change is being made.

If you can also split patches in a way that makes it obvious what was copied code and what is AIX specific tweaks, that would help.

So PR 1 copies a bunch of boiler plate (we have so much :slight_smile: ) and maybe is not correct for AIX. Then PR 2 tweaks a few values to make it correct for AIX.

Or refactoring so you can share existing ppc64 code in one PR, then add the AIX paths in the next.

As long as the code builds on all existing platforms it doesn’t matter if AIX support isn’t correct at all points. It will be when it’s all landed.

Okay cool. That’s a sound plan.

So, I’ll start with some straightforward changes such as:
replacing tid_t with lldb::tid_t and making other minor adjustments in the common code.

From there, we can work on the broader changes that aren’t tied to AIX-specific details and may apply to Linux PPC64 as well.

It might take a couple of commits before LLDB actually fully compiles on AIX, but I hope that’s manageable.

Once the base is set, we can then focus on AIX specific enhancements, following your advice, in the form of stacked PRs.

That is, by incorporating the Linux derived code and then introducing AIX specific tweaks, while proceeding to make it compile on AIX. We will initially target the LLDB client and eventually both the client and server.

Hope I’ve covered the plan as per your guidance.

I will proceed with raising the first PR.

Thanks!!

I knew I’d seen this somewhere before; take a look at llvm/llvm-project issue #98754, “request lldb cmake config for static build”. Maybe you want the same thing.

Yes, that’s helpful. Thanks!