The description size is often much more compact, but I haven’t measured the actual generated code/data size relative to TableGen. TableGen generates a lot of tables, and so does the MDL compiler. In general, I believe the MDL “database” is simpler, but I don’t actually know at this point whether it’s bigger or smaller.
Regarding schedule quality: the explicit goal was to be able to handle more complex architectures, and we really haven’t done anything to improve the schedulers themselves. For the current targets, the goal is for scheduling behavior to be the same, with a few caveats:
We’re starting with the same information tablegen has, so in effect we’re just presenting similar information to the schedulers, albeit in perhaps a different order.
The forwarding information in TableGen isn’t exactly complete. The MDL has a general language for describing forwarding, and it’s difficult to convert TableGen’s incomplete description to our generated description. So currently we’re mostly ignoring forwarding information when scraping. (This doesn’t affect most current targets.)
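For context, here is a hedged sketch of how forwarding is typically expressed in TableGen today, via `SchedReadAdvance`, which lets a consumer see an operand a few cycles early when it was produced by specific writes. The resource and record names (`MyALU`, `WriteALU`, `ReadALUFwd`) are illustrative, not from a real target:

```tablegen
// Illustrative names only; not from an upstream target.
def MyALU : ProcResource<2>;

// An ALU op whose result is nominally available after 3 cycles.
def WriteALU : SchedWriteRes<[MyALU]> {
  let Latency = 3;
}

// A consumer using this read sees WriteALU results 1 cycle early,
// modeling a forwarding path (effectively a 2-cycle latency).
def ReadALUFwd : SchedReadAdvance<1, [WriteALU]>;
```

Because forwarding in TableGen exists only as these pairwise read/write adjustments, and targets often specify them partially, there isn’t always a complete picture for TdScan to translate into MDL’s more general forwarding description.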
So, presenting complex heuristics with roughly the same information in a slightly different order inevitably leads to minor scheduling differences. That said, the majority of the lit tests (around 51,000) pass, and we’re still investigating the others. The performance-test differences are below the noise threshold. This is pretty much what we expected.
MDL generates descriptions of targets that can be 1/10 of the size of equivalent TableGen descriptions
Is there a reason why we cannot improve TableGen in the areas that MDL is better? For example, I wonder whether we could look to MDL as inspiration to improve TableGen to generate smaller descriptions?
Schedules and Itineraries are simply not sufficient to model everything.
Schedules and Itineraries are built using TableGen, and it sounds to me like we may need more constructs, or even need to start with a new set of constructs, to model everything. I am curious to learn more about the shortcomings of the expressiveness of TableGen compared to MDL. Is the problem that TableGen cannot express everything that needs to be modeled, or that we haven’t used TableGen to model everything that needs to be modeled?
First I should clarify my first statement: MDL descriptions of scheduling information can be 1/10 the size of equivalent TableGen descriptions. We’re definitely not trying to replace all of TableGen.
That could probably be done, and I certainly considered it early on, but I didn’t see a way to do it that’s easy to understand, expand, and reuse. But here’s an example of a problem that would be hard to describe in TableGen: instructions that have different behaviors depending on which functional unit (or functional-unit cluster) they issue on: different latencies, different reserved resources, different register constraints, different issue rules. This is actually quite common on VLIW processors. The only way I’ve seen to model this in LLVM is to create an instance of the instruction for each unique behavior.
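To make that workaround concrete, here is a hedged TableGen sketch (all names, including `UnitA`, `UnitB`, `MyInst`, and the writes, are hypothetical) of cloning one logical instruction so that each clone can carry the behavior of one functional unit:

```tablegen
// Hypothetical sketch: "add" takes 1 cycle on unit A but 2 cycles on
// unit B. TableGen ties one schedule to one opcode, so the usual
// workaround is one opcode per behavior. All names are invented.
def UnitA : ProcResource<1>;
def UnitB : ProcResource<1>;

def WriteAddA : SchedWriteRes<[UnitA]> { let Latency = 1; }
def WriteAddB : SchedWriteRes<[UnitB]> { let Latency = 2; }

// Two opcodes for what is architecturally a single instruction:
def ADD_A : MyInst<"add">, Sched<[WriteAddA]>;
def ADD_B : MyInst<"add">, Sched<[WriteAddB]>;
```

The backend then has to choose between `ADD_A` and `ADD_B` itself, which is exactly the kind of duplication that per-functional-unit behaviors in MDL are meant to avoid.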
If you haven’t seen it, check out llvm/docs/Mdl/MachineDescriptionNotes.md. It documents the language and a lot of the design decisions that went into it. The tool that generates MDL from TableGen (called TdScan) only uses a subset of the language for the upstream targets, so while it’s useful to look at those (they’re built in build/lib/Target/*/*.mdl), they don’t necessarily demonstrate all the power of the language. Happy to answer any questions you have!
@reidtatge I am trying to write an LLVM backend and am actually stuck at modelling the (custom) architecture with TableGen. It is an accelerator kind of architecture. Would you suggest that the entire backend can be prepared using MDL?
So first a question: are there particular things that you can’t model in Schedules/Itineraries? Is your architecture public?
MDL can express anything that Schedules and Itineraries can express, but it is really a superset of both (so it models things you can’t model in the others). So if you cannot express something in Schedules/Itineraries, you may be able to describe it in MDL. You might check out the documentation in the repo (still in Phabricator at D158790, “[MDL] First full integration of MDL with LLVM”) under llvm/docs/Mdl/.
That said, the current llvm backend is pretty limited in the things it can do for accelerator architectures, so it may not be quite ready to support some of your architecture’s features. But hopefully you can express them and they can be reflected in the MDL database. If not, we can always extend the language. Then you can write back-end passes using the description to guide things.
The architecture is proprietary, but I can tell you this: it is much smaller and quite different from what would be called a CPU architecture, hence I called it an accelerator architecture. I am currently having trouble modelling all its aspects in the TableGen files. So, can MDL help me model most aspects? Since you have used it for the TPU, which I believe is much more complicated, I guess it can be used (without involving TableGen)?
@reidtatge Have you looked at modelling frontend decode/dispatch or opcache/dispatch at all? Most x86 cpus have various/diverse restrictions, such as limits on the variable length instruction decoding, only decoding >8-byte instructions in the first slot, limits on what instructions/ops/uops can fit into an opcache line etc.
We provide all the facilities TableGen provides for issue management (BeginGroup, EndGroup, SingleIssue, RetireOOO). This allows us to easily convert TableGen to our language and feed that information back to existing passes unchanged. It’s interesting that X86 doesn’t use any of these attributes.
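For reference, a hedged sketch of where those four attributes live on the TableGen side: they are boolean fields on a `SchedWriteRes`. The resource and write names (`MyPort0`, `WriteMicrocoded`) are illustrative; the four bits are the real issue-management fields:

```tablegen
// Illustrative names; the four "let" bits are the real issue-management
// fields available on SchedWriteRes.
def MyPort0 : ProcResource<1>;

def WriteMicrocoded : SchedWriteRes<[MyPort0]> {
  let Latency     = 10;
  let NumMicroOps = 5;
  let BeginGroup  = 1;  // must start a new dispatch group
  let EndGroup    = 1;  // must end its dispatch group
  let SingleIssue = 1;  // issues alone in its cycle
  let RetireOOO   = 1;  // may retire out of order
}
```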
In general, we tried to avoid defining language features tied to specific kinds of processor restrictions like those you mention. Instead, we wanted to provide language features that let you describe those restrictions in declarative terms. For issue-slot management, in MDL you can:
Tie instructions to specific functional units (as in TableGen),
Explicitly define issue slots (and give them names), and
Restrict functional units to specific issue slots.
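A sketch of what those three capabilities can look like together. Note this is loosely paraphrased pseudocode, not verbatim MDL syntax (see MachineDescriptionNotes.md for the actual grammar), and every name here is invented:

```
// Pseudocode in the spirit of MDL, not exact syntax; all names invented.
cpu my_vliw {
  issue slot0, slot1, slot2;          // explicitly named issue slots

  func_unit ALU alu0(slot0, slot1);   // ALU ops may issue in slot0 or slot1
  func_unit MUL mul0(slot2);          // multiplies restricted to slot2
}
```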
We don’t currently model cache behaviors, and the LLVM backend doesn’t seem to model them either, but this is “on the agenda” for future versions of the language.
Variable-length encoding is another area that LLVM doesn’t seem to directly model today. I agree it’s something worth knowing in the back-end. I tend to think of it more as an “instruction modeling” issue rather than a micro-architectural feature (which is what MDL is focused on), but clearly it would have implications in the back-end.
Aah, precisely what I was thinking about asking! But if MDL doesn’t handle that, and the LLVM backend doesn’t explicitly handle it, then how do Intel backends account for the L1i caches etc. in the generated code?