I passed your patches to the group. I see a good relationship potential here. It is nice to have more folks that are genuinely interested and concerned about the Mips development on LLVM.
It is clear that areas like the standalone assembler and tablegen need to be reworked to work with an old architecture like Mips. Decisions on how we implemented things were often based on how others had done their ports. This got us to the point of having a working Mips compiler, but a review of the architecture from a higher level needs to be redone.
I’m keen to get the bugfixes you mention upstreamed. Some of my colleagues have had a look at your git repo and tell me that some of the bugfixes lack testcases but should otherwise be ok to upstream. It also sounds like you have working MIPS-IV support. If this is the case then I’m keen to get that upstreamed as well since it will go some way towards properly supporting the older ISA’s (MIPS-II and MIPS-III in particular) in use by some Linux distro’s, OpenBSD, etc.
Some have test cases, but the lack is the main reason why I haven’t pushed them upstreamed yet. We have to demo our platform to DARPA in early January, so my current priority is getting it working enough for the demo code to build, even if that requires some quite embarrassing hacks. After that deadline is passed, I intend to tidy up the code and separate out all of the generic-MIPS parts from the parts specific to our CPU.
We do have MIPS IV working, as our base processor (BERI, which we’ve almost finished the paperwork required to open source and should be pushing our very soon) is a MIPS IV implementation. I’ve not poked at MIPS III, but I now have a Loongson 2F to play with, so I may have a poke at some point.
This would have been easier if a more systematic approach had been taken to the instruction definitions, starting with an ISA reference and defining them, along with their minimum version, and then working on patterns once the assembler was working.
Maybe our codegen folks can comment on this. Review and refactoring of code is good. Especially if it makes it easier and cleaner to extend going forward.
Yes, all patches are reviewed either before or after commit. A wider review of the backend would be a sensible thing to do at some point in the near future. However, I have to balance the desirability of a perfect design and implementation with impracticality of achieving it and the business needs of our company.
Some design decisions, such as the way 64-bit registers are treated as 64-bit registers with 32-bit subregisters rather than as registers capable of containing 64-bit or 32-bit subvalues have complicated things a lot. Some of this is simplified, but I still find I have to duplicate patterns or add some hacky code because SelectionDAG decides that something should be an i32 and then it’s regarded as a different register to something that takes an i64.
Guilty. Ideas on making it better?
Lots of things look like they’re work-arounds for TableGen or SelectionDAG limitations, but don’t document in the code what these limitations are.
Yes, TableGen needs to be reworked. Who is going to do it?
I will therefore need to work this around existing schedules and deadlines without disrupting them. At the moment, my aim is to free up developer-time from style issues in patches. The time saved in this area is likely to be significant since our team is split between three/four timezones and as a result discussions about patches can take multiple days.
I’m also keen to hear specific criticisms from outside our team. Could you elaborate on the issues you mention?
There are almost no comments in the code. Things like the expansion of JALR rely on some magic and some documentation would be very helpful.
Guilty, guilty, guilty. It is getting better under Daniel’s benevolent gaze, but much of the existing code is under commented.
There have been some very odd implementation choices, for example deciding to special-case hardware register 29, when it is actually less code to support all 31 hardware registers.
There are loads of test cases for [d]la with immediate arguments, but [d]la is almost always used with a symbol as the argument and this case wasn’t working.
No effort has been made over the course of development to ensure that we get the same output via the assembler as we do via direct object code emission. This is a very simple and obvious sanity check, but the back end can’t consume its own output, which means that bugs are very easy to slip in.
This is partially because the base AsmParser along with its autogenerated code did not fit the Mips models of macros and register types. This has been complicated by the desire of the llvm for us to not have llc (integrated assembler) direct object tests. The insistence has been to test it with llvm-mc (standalone assembler) which as you pointed out sadly lags behind the integrated assembler.
Do we stop development of a product that can be used (integrated assembler) in order to rebuild the standalone assembler? This is a business decision. I think your project required some expediency as does ours. In the long or medium run we need both, but initially we needed something that works or else you could not have used it in the first place.