[RFC] Backend for Motorola 68000 series CPU (M68k)

Hi All,

We would like to contribute our support for the Motorola 68000 series CPU (also known as M68k or M680x0) to LLVM, and we would like to hear your feedback.

Here is some background on M68k: the Motorola 68000 series was one of the most popular CPU families in personal computers of the '80s, including some of the earliest Apple Macintosh models. Fast-forwarding to the present day, M68k is still popular among retrocomputing communities - people doing cool stuff, mostly porting modern software and systems, on old computers. For example, Planet m68k (http://m68k.info/) is a portal and bulletin board where many communities that focus on specific M68k computer models (Amiga, Atari, and Mac68k, to name a few) share their news. Major operating systems like Debian [1] (Adrian in the CC list can back me up on the Debian part) and NetBSD [2] also support M68k. Long story short, there is a big community and a large number of developers in this ecosystem.

Some of you might remember that an LLVM backend for M68k has been brought up on the mailing list several times. The latest one was in 2018 [3]. Though those attempts never went through, we learned precious lessons: It’s important to show who’s behind this backend, how sustainable they are, and how we can make these changes easy to review.

As I mentioned earlier, the majority of participants in the M68k community are hobbyists and non-profit groups. So are the people behind this backend: currently there are three core members (CC’ed): Adrian, Artyom, and me. All of us participate in this project as individual contributors. I know that not being supported (financially) by any institution or organization puts us at a disadvantage when it comes to reliability. However, the quality of the backend has improved quite a lot since the last discussion. We’ve also settled on the code owner / primary maintainer, and we’ve been working closely with the rest of the M68k community to improve the testing. On the financial side, we’re trying to open up a donation campaign (e.g. Patreon), though that involves many other practical issues, so we’re still discussing it. LLVM is an open and inclusive community accepting contributions from talented people all over the world, regardless of their backgrounds. I believe this virtue can still be seen in the support of hardware backends, where each target is judged by its code quality, maintenance, and user base rather than by which company supports it.

Last but not least, on the technical side, we’ve ported the code base onto ToT and split all the changes into 8 patches, organized by function. I’ll put them on Phabricator later. Meanwhile, you can check out the exact set of patches in our out-of-tree repo:

  1. TableGen related changes. https://github.com/M680x0/M680x0-mono-repo/commit/5b7d0ef709355f86039a799374867ba59385a79e

  2. Target-independent codegen changes. https://github.com/M680x0/M680x0-mono-repo/commit/70a6baed6afaf5fc0f5137804d130b891d54a255

  3. Infrastructure for the M680x0 backend.
    https://github.com/M680x0/M680x0-mono-repo/commit/f75435c9b34e7a6f2e4091b8d024b5cc391ee755

  4. M680x0 target information and spec.
    https://github.com/M680x0/M680x0-mono-repo/commit/9535d3dd55acb892b45f83f85906b8a6a5545f6f

  5. M680x0 target lowering.
    https://github.com/M680x0/M680x0-mono-repo/commit/253af82aa396ac5ea928fa2c9a6e31da70448313

  6. M680x0 support for MC.
    https://github.com/M680x0/M680x0-mono-repo/commit/d42bc0e355e4911c6aab6468ae12dc9e21072285

  7. Basic M680x0 support in Clang
    https://github.com/M680x0/M680x0-mono-repo/commit/636893780575912973130972cb5fc73153e9cbee

  8. M680x0 driver support in Clang
    https://github.com/M680x0/M680x0-mono-repo/commit/c5834ffbda019df8c94c669c658d804cb9c19af3

As you can see, some of the patches also touch some target-independent parts like TableGen. We tried to minimize their scope, make sure they’re optional and won’t break any existing code. I’ll justify them in their Phabricator pages, or even open up new threads here on the mailing list.

Please feel free to leave any feedback!

Thank you for your time,
Min

[1] https://www.debian.org/ports/m68k/
[2] http://wiki.netbsd.org/ports/mac68k/
[3] https://lists.llvm.org/pipermail/llvm-dev/2018-August/125317.html

Thank you to everyone who worked on this! It looks like it's coming along well.

What are your plans for handling the variety of ABIs used on 68000 systems? ELF for Linux/NetBSD/SVR4 is certainly a reasonable starting point, but there was quite a bit of variety in that area.

  -- Chris

Hi Chris,

> Thank you to everyone who worked on this! It looks like it's coming along well.

Thank you :)

> What are your plans for handling the variety of ABIs used on 68000 systems? ELF for Linux/NetBSD/SVR4 is certainly a reasonable starting point, but there was quite a bit of variety in that area.

Currently we only support ELF for Linux. Our short-term goal is to fully support Debian for M68K, followed by NetBSD support. So it’s likely that we will not consider other ABIs any time soon.

Thank you
Min

I know it's really early in the project's life, but another question I had: How does the generated 68K code perform, at least compared to modern GCC?

  -- Chris

Hello Chris!

hello.c (74 Bytes)

hello.llvm.S (7.87 KB)

hello.gcc.S (7.52 KB)

Hello!

On 9/25/20 1:31 AM, Min-Yih Hsu wrote:
> Here is some background for M68k: Motorola 68000 series CPU was one of the most popular CPUs used
> by personal computers in the '80s, including some of the earliest Apple Macintosh. Fast-forwarding
> to modern days, M68k is still popular among retrocomputing communities - a bunch of people doing
> cool stuff, mostly porting modern software and systems, on old computers. For example, Planet m68k
> (http://m68k.info/) is a portal and a bulletin board for many communities that
> focus on specific M68k computer models, Amiga, Atari, Mac68k to name a few, to share their news.
> Major operating systems like Debian [1] (Adrian in the CC list can back me up on the Debian part)
> and NetBSD [2] also support M68k. Long story short, there is a big community and a huge amount
> of developers in this ecosystem.

Adding to this: Despite its age, the Motorola 68000 is still a very popular architecture due to the
fact that the CPU was used by a wide range of hardware from the 80s throughout the 2000s. It is
used in the Amiga, Atari, Classic Macintosh, Sega MegaDrive, Atari Jaguar, SHARP X68000, various
Unix workstations (Sun 2 and 3, Sony NeWS, NeXT, HP300), many arcade systems (Capcom CPS and CPS-2)
and more [1].

As many of these classic systems still have very active communities, especially the Amiga community,
development efforts are still very strong. For example, despite being the oldest port of the Linux
kernel, the m68k port still has multiple active kernel maintainers and is regularly gaining new
features and drivers. Companies such as Individual Computers are still developing new hardware
around the CPU, such as network cards, graphics adapters or even completely new systems like
the Vampire [2].

Since we would like to be able to keep up with modern software development on m68k, we need a modern
toolchain which necessarily includes LLVM due to the fact that it's needed for modern languages
like Rust. In particular, several projects like GNOME have started rewriting parts of their
codebases in Rust which is why any architecture that is supposed to run modern versions of
GNOME and related projects needs Rust support and therefore a working LLVM backend.

But Rust is naturally not the only reason why having an LLVM backend is useful for the m68k
architecture. Another very compelling argument is that NetBSD can use a modern C/C++ compiler with
a permissive license for their m68k port, and the retro-computing and video gaming community gets
a compiler with built-in cross-compiling capabilities, which is incredibly valuable for anyone
developing new software for retro-computing platforms like the Amiga, Atari or Sega MegaDrive.

> Some of you might remember that LLVM backend for M68k has been brought up in the mailing list several
> times. The latest one was in 2018 [3]. Though those attempts never went through, we learned precious
> lessons: It’s important to show who’s behind this backend, how sustainable they are, and how we can
> make these changes easy to review.
>
> As I illustrated earlier, majorities of the participants in the M68k community are hobbyists and
> non-profit groups. So do the people behind this backend: Currently there are three core members
> (CC’ed): Adrian, Artyom, and me. All of us participate in this project as individual contributors.
> I know the fact that we’re not supported (financially) by any institution or organization will put
> us in a lower hand when it comes to reliability. However, the quality of the backend has improved
> quite a lot since the last discussion. We’ve also settled down the code owner / primary maintainer.
> Not to mention we’ve been working closely with the rest of the M68k community to help us improve
> the testing. On the financial side, we’re trying to open up a donation campaign (e.g. Patreon).
> Though that involves many other practical issues so we’re still discussing that. LLVM is an open
> and inclusive community accepting contributions from talented people all over the world, regardless
> of their backgrounds. I believe this virtue can still be seen in the support of hardware backends,
> where each of the targets is judged by its code quality, maintenance, and user base. Rather than
> which company supports it.

Very well said. I would like to add here that LLVM can be considered to be one of the most important
open source projects currently in existence and having one's architecture supported by LLVM means that
the language support for that architecture dramatically improves. So, with LLVM supporting m68k, the
architecture will get a significant boost allowing it to reach even wider communities, especially the
Rust community.

This means there will be new software being written for the architecture as we're attracting new
developers. For example, there might be Rust developers interested in developing new games for the
Sega MegaDrive or new software for the Amiga. While this certainly doesn't have much of a big
commercial factor, it definitely has a huge community factor due to the fact the m68k architecture
is so popular among hobbyists.

As for the maintainership: As Min explained, we're going to make sure the architecture has a dedicated
maintainer once it's in LLVM so it doesn't bitrot. Our idea is that multiple people join a
Patreon to pay Min a small sponsorship fee every month to support him with the maintenance effort
so he doesn't have to do the work for free. So while that won't be as professional as someone being
hired to work on LLVM by one of the big players, at least we will have a dedicated maintainer who
is being paid to do the maintenance work.

> As you can see, some of the patches also touch some target-independent parts like TableGen.
> We tried to minimize their scope, make sure they’re optional and won’t break any existing code.
> I’ll justify them in their Phabricator pages, or even open up new threads here on the mailing list.

Much appreciated, Min. I can't thank you enough for your hard work and I'm really excited to
see this going forward. I hope that we will soon be able to get the first bits for m68k merged
the same way the first changes for C-Sky are being merged.

I'm keeping my fingers crossed and I hope that our long-time efforts will be rewarded by LLVM upstream :).

Adrian

I’m irrationally thrilled that you’re doing this. The 68K is, to this day, my favorite CPU to code for, probably because 68K-based Macs were bare-metal machines where application programs ran in full supervisor mode, and the “programmer’s switch” that came with each Mac allowed you to interrupt the CPU no matter what it was doing - user code or system code - and dropped you into the machine’s debugger right at that point.

Thank you so much for taking this on.

I took a very quick look at your 68K code for main:

#include <stdio.h>

int main () {
    printf("Hello World!\n");
    return 0;
}

gcc:
8000044c <main>:
8000044c:  4e56 0000       linkw %fp,#0
80000450:  4879 8000 04fc  pea 800004fc <_IO_stdin_used+0x4>
80000456:  4eb9 8000 0330  jsr 80000330 <puts@plt>
8000045c:  588f            addql #4,%sp
8000045e:  4280            clrl %d0
80000460:  4e5e            unlk %fp
80000462:  4e75            rts

Your llvm:
8000042c <main>:
8000042c:  2f0e                 movel %fp,%sp@-
8000042e:  2c4f                 moveal %sp,%fp
80000430:  9ffc 0000 0010       subal #16,%sp
80000436:  2d7c 0000 0000 fffc  movel #0,%fp@(-4)
8000043e:  41fb 0170 0000 00bc  lea %pc@(800004fc <_IO_stdin_used+0x4>),%a0
80000446:  224f                 moveal %sp,%a1
80000448:  2288                 movel %a0,%a1@
8000044a:  4eb9 8000 0310       jsr 80000310 <printf@plt>
80000450:  7200                 moveq #0,%d1
80000452:  48ee 0001 fff8       moveml %d0,%fp@(-8)
80000458:  2001                 movel %d1,%d0
8000045a:  dffc 0000 0010       addal #16,%sp
80000460:  2c5f                 moveal %sp@+,%fp
80000462:  4e75                 rts

This immediately makes me wonder what you mean by “default optimization settings”…

But more importantly, I share Chris’ concern about ABI. In particular, in the Mac world, there were two ABIs in common use:

  1. For code written in Pascal, including the Mac OS itself, parameters were passed on the stack: the caller reserves space for the return value, then pushes the parameters in order, and the called function cleans up the stack before returning. If you were to try to generate code for Mac applications, you would need some way of using this convention on a per-call basis, because you’d need it to make system calls. Early compilers dodged this requirement with glue libraries that took C conventions and then made the appropriate Pascal calls using assembly, so it’s not necessary for your initial check-in. Just something to keep in mind.

  2. For code written in C, parameters were passed on the stack: the caller pushes the parameters in reverse order, and the called function puts the return value in register d0. It’s the caller’s responsibility to clean up the stack, in order to support variadic functions such as printf. As Chris says, there were variations on this, even on the Mac, because Apple’s language of choice was originally Pascal, so each compiler vendor had a different idea of what “C” conventions were. In particular, early on, many vendors chose a 16-bit size for int, which had important implications for the ABI since it changes how you call printf("%d\n", 2); Apple’s MPW (Macintosh Programmer’s Workshop) chose a 32-bit size for int.
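
To make the difference between the two conventions concrete, here is a minimal C sketch that models only the order of the pushes and the size of the argument area. The Stack type and all function names are hypothetical, chosen for illustration; nothing here comes from the backend or from any Mac toolchain, and the byte count assumes each argument occupies exactly its own size on the stack.

```c
#include <stddef.h>

/* Illustrative model only: these names are hypothetical and are not taken
 * from the M68k backend or from any Mac compiler. */
enum { MAX_SLOTS = 8 };

typedef struct {
    const char *slot[MAX_SLOTS]; /* slot[0] is the first item pushed */
    size_t depth;
} Stack;

static void push(Stack *s, const char *what) {
    if (s->depth < MAX_SLOTS)
        s->slot[s->depth++] = what;
}

/* Pascal convention: reserve a result slot, then push the arguments in
 * source order. The callee removes the arguments before returning. */
void pascal_call_layout(Stack *s, const char *const args[], size_t n) {
    s->depth = 0;
    push(s, "result");
    for (size_t i = 0; i < n; i++)
        push(s, args[i]);
}

/* C convention: push the arguments in reverse order, so the first argument
 * ends up nearest the top of the stack. The result comes back in d0 and the
 * caller removes the arguments. */
void c_call_layout(Stack *s, const char *const args[], size_t n) {
    s->depth = 0;
    for (size_t i = n; i > 0; i--)
        push(s, args[i - 1]);
}

/* Bytes pushed for printf("%d\n", 2): one 4-byte pointer plus one int,
 * assuming each argument occupies exactly its own size on the stack. */
size_t printf_arg_bytes(size_t int_size) {
    return 4 + int_size;
}
```

Under these assumptions, printf_arg_bytes(2) gives 6 and printf_arg_bytes(4) gives 8, which is why a caller built for 16-bit int and a printf built for 32-bit int would disagree about where the arguments are.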

As I look at this generated code, however, I can’t see what conventions it’s trying to follow.

The area on the stack prior to the call to printf is:

??? ??? ??? ??? ??? ??? 0000 0000 fpfpfpfp
(where “fpfpfpfp” is the old 32-bit value of the frame pointer, and the current frame pointer points to it.)

And the parameter to printf is in both a0 and a1.

The code seems to think a return value from printf is in d0, judging by the moveml instruction.

This all is fine, albeit woefully inefficient, if your calling convention sometimes passes function arguments in registers. Does it?

Separately, please consider using the link and unlk instructions, always. All the production compilers used these instructions, even when optimization was off; the advanced compilers could avoid them in certain cases such as “leaf” routines, but that was rare. They’re efficient and make code shorter and easier to read.
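
As a rough illustration of the point about link and unlk, the sketch below models both prologue styles on a toy machine state and shows that they leave the stack and frame pointers in the same place. The Cpu struct and the function names are made up for this example. In the listings above, gcc spends one 4-byte linkw and one 2-byte unlk on frame setup and teardown, while the LLVM output spends 10 bytes (movel, moveal, subal) and 8 bytes (addal, moveal).

```c
#include <stdint.h>

/* Toy machine state for illustration only; not code from the backend. */
typedef struct {
    uint32_t sp;
    uint32_t fp;
    uint32_t mem[64]; /* stack memory, indexed by address / 4 */
} Cpu;

static void push32(Cpu *c, uint32_t v) {
    c->sp -= 4;
    c->mem[c->sp / 4] = v;
}

static uint32_t pop32(Cpu *c) {
    uint32_t v = c->mem[c->sp / 4];
    c->sp += 4;
    return v;
}

/* link %fp,#disp: push fp, copy sp to fp, then add the (normally negative)
 * displacement to sp. */
void link_fp(Cpu *c, int32_t disp) {
    push32(c, c->fp);
    c->fp = c->sp;
    c->sp += (uint32_t)disp; /* unsigned wraparound gives sp - |disp| */
}

/* unlk %fp: copy fp to sp, then pop the saved fp. */
void unlk_fp(Cpu *c) {
    c->sp = c->fp;
    c->fp = pop32(c);
}

/* Three-instruction prologue as in the LLVM listing:
 * movel %fp,%sp@- ; moveal %sp,%fp ; subal #frame,%sp */
void manual_prologue(Cpu *c, uint32_t frame) {
    push32(c, c->fp);
    c->fp = c->sp;
    c->sp -= frame;
}

/* Two-instruction epilogue: addal #frame,%sp ; moveal %sp@+,%fp */
void manual_epilogue(Cpu *c, uint32_t frame) {
    c->sp += frame;
    c->fp = pop32(c);
}
```

One difference the model makes visible: unlk_fp restores sp from fp, so it still works if sp has moved inside the function, whereas manual_epilogue relies on sp being exactly frame bytes below the saved frame pointer.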

And again, thanks for doing this!

– Jorg

It seems the split into multiple commits isn't correct. For example,
the "patch 3" commit doesn't compile, because it uses TargetInfo,
which is added in patch 4.

Does m68k have multiple assembly syntaxes? Which one does this backend
support? I tried compiling newlib and it fails on the assembly code
with syntax like "moveal %sp@(4),%a0".
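
For context on the question: the newlib line above is written in what the GNU assembler calls MIT syntax, and the same operand in Motorola syntax would be written (4,%sp). The sketch below rewrites just that one operand form. The function name is hypothetical, and it says nothing about which syntax the backend actually accepts.

```c
#include <stdio.h>
#include <string.h>

/* Rewrites one MIT-syntax operand of the form "%reg@(disp)" into the
 * Motorola-style "(disp,%reg)". Any other operand is copied unchanged.
 * Returns 0 on success, -1 if the output buffer is too small.
 * Hypothetical helper for illustration; it handles only this one form. */
int mit_to_motorola(const char *operand, char *out, size_t out_size) {
    const char *at = strstr(operand, "@(");
    size_t len = strlen(operand);
    int n;

    if (at != NULL && at != operand && operand[len - 1] == ')') {
        int reg_len = (int)(at - operand);
        /* Skip the two characters of "@(" and the closing ")". */
        int disp_len = (int)(len - (size_t)reg_len - 3);
        n = snprintf(out, out_size, "(%.*s,%.*s)",
                     disp_len, at + 2, reg_len, operand);
    } else {
        n = snprintf(out, out_size, "%s", operand);
    }
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Called with "%sp@(4)", the helper writes "(4,%sp)" into the buffer; an operand such as "%a0" passes through untouched.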

Hi All,

I just posted the Phabricator reviews. Here are all the patches:

  1. [TableGen][M68K] (Patch 1/8) Utilities for complex instruction addressing modes: CodeBeads and logical operand helper functions (https://reviews.llvm.org/D88385)
  2. [MIR][M68K] (Patch 2/8): Changes on Target-independent MIR part (https://reviews.llvm.org/D88386)
  3. [M68K] (Patch 3/8) Basic infrastructures and changes on object file encoding (https://reviews.llvm.org/D88389)
  4. [M68K] (Patch 4/8) Target information (https://reviews.llvm.org/D88390)
  5. [M68K] (Patch 5/8) Target lowering (https://reviews.llvm.org/D88391)
  6. [M68K] (Patch 6/8) MC supports and assembly (https://reviews.llvm.org/D88392)
  7. [cfe][M68K] (Patch 7/8) Basic Clang support (https://reviews.llvm.org/D88393)
  8. [Driver][M68K] (Patch 8/8) Add driver support for M68K (https://reviews.llvm.org/D88394)

As Nicolas mentioned, you currently need to apply all 8 patches together to compile, which I think is not ideal from the reviewers’ point of view. So I’m wondering: what is the recommended way to split a new target backend?

Thank you!
Min

One of the hardest things to do is to build a community of maintainers around a target. I used to love those architectures, and I still run emulators for them, but I never contributed back with code (mainly because it’s not my area of expertise).

But the Motorola 68k is still an iconic chip and still has a large pool of maintainers in other projects. I think it’s reasonably safe to assume we’ll attract some of them into LLVM for the foreseeable future. That would be a big win for us.

I think this and the other items in the requirements list [1] are covered for the 68k.

For now, codegen quality and ABI conformance probably won’t be on par with the target’s requirements (the discussion on Pascal vs C, for example), but that’s a solvable problem. If the code follows the LLVM policies and the maintainers have a clear list of points to address, introducing it as experimental would be a reasonably trivial thing to do.

In theory, a target can remain in “experimental” mode for a while. But the longer it does, the harder it gets to keep it working. Basically, the cost of doing that falls almost entirely on the target’s own community while it is experimental.

It’s not until the target is built by default (leaves experimental status) that the other buildbots start building and testing it, and developers start building it locally and fixing issues before submitting the review.

But the quality has to be “production” by then, so the 68k community in LLVM should really have a plan to remove the experimental tag soon. Maintenance after that reduces to continuous development (new features) and bug fixing and is much more manageable.

Has anyone compiled a list of features that will be added and what’s the timeframe for them? What’s to be done during the experimental phase and afterwards?

cheers,
–renato

[1] http://llvm.org/docs/DeveloperPolicy.html#adding-a-new-target

Hi Renato!

> One of the hardest things to do is to build a community of maintainers
> around. I used to love those architectures, and I still ran emulators on
> them, but I never contributed back with code (mainly because it's not my
> area of expertise).
>
> But the motorola 68k still is an iconic chip and still has a large breadth
> of maintainers in other projects. I think it's reasonably safe to assume
> we'll attract some of them into LLVM for the foreseeable future. That would
> be a big win for us.

Yes, I fully agree. To underline how big the community is, let me just share
a short anecdote. Previously, the GCC m68k backend had to be converted from
the CC0 register representation to MODE_CC. We collected donations within the
Amiga and Atari communities, and within just a few weeks we had raised over
$6000, which allowed an experienced GCC engineer with an m68k background to finish
that very extensive work in just a few weeks.

So, I think in case there was a problem with the backend in LLVM, the community
would have enough momentum to work towards solving this issue.

> For now, codegen quality and ABI conformance probably won't be on par with
> the target's requirements (discussion on pascal vs C for example), but
> that's a solvable problem. If the code follows the LLVM policies and the
> maintainers have a clear list of points to address, introducing it as
> experimental would be a reasonably trivial thing to do.

I'm very happy to hear that :-).

> In theory, a target can remain in "experimental" mode for a while. But the
> more it does, the harder it gets to keep it working. Basically, the cost of
> doing that falls almost entirely on the local target's community while
> experimental.

I agree. But we will enable the target in Debian the moment it becomes usable
and we will expose it to as much testing as possible to uncover bugs or remaining
features and report them upstream.

We have had very good experiences in Debian Ports with using experimental software
in production in order to iron out bugs. This way we smashed dozens of bugs in
QEMU, for example, which has very good m68k emulation these days.

> It's not until the target is built by default (leaves experimental status)
> that the other buildbots start building and testing them, and developers
> start building it locally and fixing issues before submitting the review.

I understand, thanks for the clarification. That's why I'll make sure the
backend gets used in Debian as early as possible.

> But the quality has to be "production" by then, so the 68k community in
> LLVM should really have a plan to remove the experimental tag soon.
> Maintenance after that reduces to continuous development (new features) and
> bug fixing and is much more amenable.

Gotcha.

> Has anyone compiled a list of features that will be added and what's the
> timeframe for them? What's to be done during the experimental phase and
> afterwards?

We have a rough list of remaining issues in [1] and [2]. Min also gave a talk
where he laid out the TODO list and plans for the backend [3]. The talk
is also available on YouTube [4].

Adrian

This is problematic for regression testing and for bisecting to find a broken commit. The guideline is to split the patch into multiple commits, but each commit must build on its own.

They don’t need to have tests in between and be fully functional, but they need to compile and not to break anything.

Usually (and very generally) people break new targets into a few steps:

  1. Adds the new directory, base table-gen files, hooks on CMake, etc.
  2. Adds target description in table-gen (registers, etc) and operations. [this could be split in two if large]
  3. Adds codegen hooks from MIR and gets some initial “hello world” compiled, add tests for the features added

Steps 1 and 2 should not break anything else. Step 3’s tests should all pass and not break other tests. There can be more than one commit per step, as long as no commit breaks the build or the tests or depends on a later patch.

You should also have a buildbot with the target enabled (there’s a CMake flag for that) to show that it’s green most of the time.

Optional but nice, you could have some kind of testing on “hardware” (which could be an emulator). Does QEMU have user emulation for 68k?

You don’t need to add all hardware features nor all operations in the first patch-set, but it’s nice if the first wave can compile a hello world program.

That’s where the plan comes in. We usually ask the new community to provide a roadmap of what features will come in which order, so that we can help with the plan and be prepared to review the code. It also gives us more confidence that the community is serious about adding a production target, not just a toy target.

cheers,
–renato

> So, I think in case there was a problem with the backend in LLVM, the community
> would have enough momentum to work towards solving this issue.

Great!

> I agree. But we will enable the target in Debian the moment it becomes usable
> and we will expose it to as much testing as possible to uncover bugs or remaining
> features and report them upstream.

That’s good to hear. The Debian project has helped us do extensive tests on other hardware, and it gave us confidence that what we build actually works in a real-world context.

> We have a rough list of remaining issues in [1] and [2]. Min also gave a talk
> in [3] where he drafted out the TODO and plans for the backend [3]. The talk
> is also available on Youtube [4].

So, IIUC, the current implementation is reasonably complete. You’re able to compile C programs and run them on real hardware. The main effort now is to upstream what you have, and continue the development.

This would make the “plan” easier. Getting the current state upstream would make for a nice experimental target. Getting Debian packages compiled and tested with QEMU would demonstrate production quality.

cheers,
–renato

Hi Renato,

Thanks for all your feedback; it was extremely helpful, especially the guidelines on splitting the patches. I think in my case patches 3 to 6 are the most problematic, and I will rework them shortly.

And most importantly, I’ll present a roadmap regarding blockers we need to clear and milestones to reach before graduating from experimental targets. We’ll also try to prepare the buildbot, as well as testing bots running m68k QEMU (IIUC, we need to prepare our own server…right?)

Best,
Min

> Thanks for all your feedback, those were extremely helpful, especially the guidelines to split the patches. I think in my case, patch 3 ~ 6 are the most problematic, I will rework them shortly.

Perfect, thanks!

> And most importantly, I’ll present a roadmap regarding blockers we need to clear and milestones to reach before graduating from experimental targets.

Great!

This would be slightly different than the Github issues, and should be focused on the two milestones: adding the backend as experimental (ie, what you have today) and moving to production. Both should have a list of features that are expected to exist / be implemented and some rough time frame to get there.

Given you already have what looks like a working back-end (maybe not production quality yet), I’m expecting the experimental target to be in better shape than some others we had recently.

You should, however, expect more reviews on policy (style, patterns, containers), due to the previous separate nature of the development. This is to make sure the code goes in in a way that everybody else understands and can change. Don’t take this as a measure of code quality, only style adjustment to the new community.

> We’ll also try to prepare the buildbot, as well as testing bots running m68k QEMU (IIUC, we need to prepare our own server…right?)

Right.

During the experimental phase, none of the other buildbots will be building your target, so you must make sure at least one is. This can be running on any hardware (x86, arm, etc) but needs to be building the experimental back-end and running all its tests (check-all).

Once the target moves on to production, most other bots will be building it and testing, so you don’t need that simple builder any more. But you can have builders with different configurations that aren’t built by any other builder out there, to increase coverage and trust in your back-end.

You should also eventually try to run the LLVM test-suite on the target. I’m not sure it’s possible to run all, due to the aged nature of the 68k, or if the tests will produce the same results (we had some trouble on Arm vs x86, for example). But trying to run them and understanding the issues gives you a lot of good information about your back-end.

The Debian tests are similar to the test-suite. Their builds will be slower than a standard bot cycle and so will end up queueing a lot of commits. It will be harder to investigate the issues, you’ll need longer bisects, codegen tests, etc.

All buildbots should initially be in the “silent master” (ie no email warnings when they break), and your community should monitor them closely.

Once the target leaves experimental, and we expect all bots to be mostly green by then, you can move the builders to the “loud master”, which will start nagging people when they break the build.

The expectation is that, by then, things will be stable enough (one of the criteria to move to production) that the noise will be no more annoying than any other buildbot, and consisting mostly of true positives.

cheers,
–renato

Hi Renato!

> This is problematic for regression testing and bisecting to find a broken
> commit. The guideline is to split the patch into multiple commits, but that
> each commit builds on their own.
>
> They don't need to have tests in between and be fully functional, but they
> need to compile and not to break anything.

I fully agree.

> Usually (and very generally) people break new targets into a few steps:
> 1. Adds the new directory, base table-gen files, hooks on CMake, etc.
> 2. Adds target description in table-gen (registers, etc) and operations.
> [this could be split in two if large]
> 3. Adds codegen hooks from MIR and gets some initial "hello world"
> compiled, add tests for the features added
>
> Steps 1 and 2 should not break anything else. Step 3's tests should all
> pass and not break other tests. There could be more commits than 1 per
> step, but not breaking build/tests nor depending on future patches.

Thanks for the guide. I guess that should be easy to follow, and I agree,
there shouldn't be single commits that break the build.

> You should also have a buildbot with the target enabled (there's a CMake
> flag for that) to show that it's green most of the time.
>
> Optional but nice, you could have some kind of testing on "hardware" (which
> could be an emulator). Does QEMU have user emulation for 68k?

Yes, there is qemu-user support and we're heavily relying on it within Debian
for building packages. It has some very minor issues with atomics (which aren't
present in qemu-system) and FPU support could be better, but overall it's absolutely
battle-tested.

> You don't need to add all hardware features nor all operations in the first
> patch-set, but it's nice if the first wave can compile a hello world
> program.

That already works fine, see:

Re: LLVM for m68k completed (but not merged)

That's where the plan comes in. We usually ask the new community to provide
a roadmap of what features will come in which order, so that we can help
with the plan and be prepared to review the code. It also gives us more
confidence that the community is serious about adding a production target,
not just a toy target.

We're definitely serious about this, as otherwise we wouldn't have tried so
hard for the past years to move the development forward and get the
backend accepted upstream.

Thanks so much for your advice, much appreciated!

Adrian

That's good to hear. The Debian project has helped us do extensive tests in
other hardware and it provided us with confidence that what we build
actually works in some real world context.

And, FWIW, please let us know whenever you need help with testing LLVM on
less common targets. I have access to every architecture supported by
Debian and own machines with most of these targets.

We also try to provide as many different architectures as possible through the
GCC compile farm, which anyone developing open source software can use:

CompileFarm - GCC Wiki

So, whenever you need assistance with architectures like SPARC, please let
us know :).

We have a rough list of remaining issues in [1] and [2]. Min also gave a talk
in [3] where he drafted out the TODO and plans for the backend [3]. The talk
is also available on YouTube [4].

So, IIUC, the current implementation is reasonably complete. You're able to
compile C programs and run them on real hardware. The main effort now is to
upstream what you have, and continue the development.

Yes, that would be awesome.

This would make the "plan" easier. Getting the current state upstream would
make for a nice experimental target. Getting Debian packages compiled and
tested with QEMU would demonstrate production quality.

Sounds good.

Adrian

Hi Renato!

During the experimental phase, none of the other buildbots will be building
your target, so you must make sure at least one is. This can be running on
any hardware (x86, arm, etc) but needs to be building the experimental
back-end and running all its tests (check-all).
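
As a rough sketch of what such a builder would run, the snippet below assembles the configure and test commands for a tree with one experimental target enabled. `LLVM_TARGETS_TO_BUILD` and `LLVM_EXPERIMENTAL_TARGETS_TO_BUILD` are existing LLVM CMake options and `check-all` is the existing test target; the function names, the target name `M68k`, the host target and the source path are assumptions made for illustration only.

```python
from typing import List


def configure_command(
    source_dir: str,
    experimental_target: str,
    host_targets: str = "X86",
) -> List[str]:
    """Build the CMake command line for a bot that enables one
    experimental target next to the regular host target(s)."""
    return [
        "cmake", "-G", "Ninja",
        "-DCMAKE_BUILD_TYPE=Release",
        "-DLLVM_ENABLE_ASSERTIONS=ON",
        f"-DLLVM_TARGETS_TO_BUILD={host_targets}",
        # Experimental targets are not built by default; they must be
        # requested explicitly with this option.
        f"-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD={experimental_target}",
        source_dir,
    ]


def test_command() -> List[str]:
    """Build everything and run all regression tests."""
    return ["ninja", "check-all"]


if __name__ == "__main__":
    # "M68k" and "../llvm" are placeholders, not names taken from the thread.
    print(" ".join(configure_command("../llvm", "M68k")))
    print(" ".join(test_command()))
```

The builder can run on any host; only the experimental target option matters for coverage.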

Once the target moves on to production, most other bots will be building it
and testing, so you don't need that simple builder any more. But you can
have builders with different configurations that aren't built by any other
builder out there, to increase coverage and trust in your back-end.

So, shall we set up a server for that or is there some existing infrastructure
from LLVM that is used in this case?

You should also eventually try to run the LLVM test-suite on the target.
I'm not sure it's possible to run all, due to the aged nature of the 68k,
or if the tests will produce the same results (we had some trouble on Arm
vs x86, for example). But trying to run them and understanding the issues
gives you a lot of good information about your back-end.

As far as I understand, 90% of the tests already pass (according to Min).

The Debian tests are similar to the test-suite. Their builds will be slower
than a standard bot cycle and so will end up queueing a lot of commits. It
will be harder to investigate the issues, you'll need longer bisects,
codegen tests, etc.

All buildbots should initially be in the "silent master" (ie no email
warnings when they break), and your community should monitor them closely.

Once the target leaves experimental, and we expect all bots to be mostly
green by then, you can move the builders to the "loud master", which will
start nagging people when they break the build.

The expectation is that, by then, things will be stable enough (one of the
criteria to move to production) that the noise will be no more annoying
than any other buildbot, and consisting mostly of true positives.

Great! Thanks again for all the very valuable input. We'll move forward getting
infrastructure up and running.

Adrian

So, shall we set up a server for that or is there some existing infrastructure
from LLVM that is used in this case?

Unfortunately, we don’t have a centralised infrastructure like GCC. Each target community is responsible for maintaining their own buildbots.

All we provide is the “build master”, which aggregates all builds, sends email when there are regressions, etc.

http://llvm.org/docs/HowToAddABuilder.html

It should be trivial to copy & paste an existing bot config and change to add the experimental hardware (once it’s in the tree).

Pay attention to the version of buildbot you install, as using an even slightly different version can cause weird errors.

(I’m sure you understand how many times we tried to move on for the past 10 years… :-) )

As far as I understand, 90% of the tests already pass (according to Min).

Awesome! We have buildbots with test-suite running, you can copy those, too. There were some QEMU bots in the past, I’m not sure they’re around, but the infra to run them should still be there.

cheers,
–renato

Hi All,

I’ve composed a draft roadmap for this new target. I’ve decided to try GitHub’s “Projects” feature, as it provides a clearer view of all the blockers and action items, IMAO. Here is the link:
https://github.com/M680x0/M680x0-mono-repo/projects

Currently I have only created two major milestones: becoming an experimental target and becoming an official target. For each milestone, I’ve listed the expected features, an estimated time frame (though I’m not really confident about that), and most importantly, the blockers for the milestone.
Fortunately, all of our essential features are complete (e.g. ISel, MC), so there will be more house cleaning tasks and bug fixing than adding new features in the second milestone.

In addition to the aforementioned two milestones, I’ve added “be able to run toy programs” as a separate milestone to accommodate some more urgent tasks right now. More specifically, driver problems that make Clang unable to leverage all our components.

Thank you!
Min