[GlobalISel] A Proposal for global instruction selection

Hi,

With this email, I would like to kick off the development of the next instruction selector that I described during the last LLVM Dev Meeting.
For the motivations, see Jakob’s proposal (http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-August/064727.html) and for the proposal, see the slides (Keynote: http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2015-10/slides/Colombet-GlobalInstructionSelection.key?view=co or PDF: http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2015-10/slides/Colombet-GlobalInstructionSelection.pdf?revision=252430&view=co) or the talk (https://www.youtube.com/watch?v=F6GGbYtae3g&list=PL_R5A0lGi1AA4Lv2bBFSwhgDaHvvpVU21&index=2).

TL;DR This is happening now, feedback invited!

*** Context ***

During the last LLVM Dev Meeting, I presented a proposal for the next instruction selector, GlobalISel. The proposal is basically summarized in “High Level Prototype Design” and “Roadmap”. (If you want further details, feel free to reach out to me.)

The first step of the development plan is to prototype the new framework in open source. The idea is to start prototyping now(!) and have the discussion going in parallel. The reason for this approach is to have code that can be used to inform those discussions, e.g., by collecting data and trying different design approaches. Regarding the discussion, I have listed a few points where your feedback would be particularly appreciated (see “Feedback Invite”).

Also, as I mentioned in my talk, some issues are controversial, but I expect them to be resolved during prototype development. Specifically, these concern aspects of legalization (should parts of it be done at the LLVM IR level or all at the MI level?) and code reuse for the instruction combiner. Please feel free to bring up your specific concerns as I move along with the development plan.

I expect the design to evolve with our experimental findings and your feedback and contributions.
Nonetheless, we expect to nail down some design decisions once and for all as the prototype progresses. I have highlighted them with the following pattern [final].

*** Feedback Invite ***

If you follow and support this work you need to be aware of three things and I am eager to hear your feedback and thoughts about them: the overall goals of Global ISel, the goals of the prototype, and the impact of the prototype work on backend design.

In the section “Goals", I defined (repeated for people that saw the talk) the goals for the Global ISel design.

  • Do you see anything missing?
  • Do you see something that should not be there?

The prototype will answer critical design questions (see “Design Questions the Prototype Addresses at the End of M1” for examples) before the actual design of Global ISel is finalized, but it cannot cover everything.
Specifically, we will not look into improving TableGen or reusing InstCombine (see “Proposed Approach” for the rationale). Please let me know if you see any issue with that.

There is also basic ground work needed to prepare for Global ISel and I need to extend the core MachineInstr-level APIs as explained during the talk. For this, I prepared sketches of patches to illustrate them and describe the details in the “Implications” section below. Please have a look at the patches to have a better idea of the expected impact.

If there is anything else you want to discuss related to Global ISel, feel free to reach out to me. In particular, several people expressed interest in contributing to the project during the LLVM Dev Meeting. Let me know what your area of interest is, so that we can coordinate our efforts.
Anyhow, please add [GlobalISel] to the subject line to help categorize the emails.

*** Goals ***

The high level goals of the new instruction selector are:

  • Global instruction selector.
  • Fast instruction selector.
  • Shared code path for fast and good instruction selection.
  • IR that represents ISA concepts better.
  • More flexible instruction selector.
  • Easier to maintain/understand framework, in particular legalization.
  • Self contained machine representation, no back links to LLVM IR.
  • No change to LLVM IR.

Note: The goals are common to all targets. In particular, we do not intend to work on target-specific features for the prototype.
The bottom line is: please make sure those goals are compatible with what you want to achieve for your target, even if your requirement is not listed here.

*** Proposed Approach ***

In this section, I describe the approach I plan to pursue in the prototype and the roadmap to get there. The final design will flow out of it.

For this prototype, we purposely exclude any work to improve or use TableGen or InstCombine [final]. We will keep in mind however, that some of the C++ code we write will be table-generated at some point.
The rationale is that we do not want to lay down a new TableGen/InstCombine infrastructure before being able to work on the ISel framework itself.

The prototype vehicle will be AArch64. None of the changes for GlobalISel will negatively impact the existing ISel.

** High Level Prototype Design **

As shown in the talk, the expected pipeline for the prototype is:
LLVM IR → IRTranslator → Generic (G) MachineInstr → Legalizer → RegBankSelect → Select → MachineInstr

Where:

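  • The intermediate representations (set in bold in the original email) are LLVM IR, Generic (G) MachineInstr, and MachineInstr; the other terms are passes.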
  • Generic MachineInstrs are machine instructions with a generic opcode, e.g., ADD, COPY.
  • IRTranslator: Translate LLVM IR to (G) MachineInstr.
  • Legalizer: Legalize illegal (G) MachineInstr to legal (G) MachineInstr.
  • RegBankSelect: Map virtual registers that have only a size to virtual registers that have a register bank.
  • Select: Translate the remaining (G) MachineInstr to MachineInstr.
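
To make the staging concrete, here is a minimal Python sketch of the pipeline above. It is purely illustrative: the Instr class, the pass functions, the "widen to 32 bits" rule, the single GPR bank and the target opcode names in the table are all assumptions made up for this example, not the proposed LLVM APIs.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    opcode: str                 # generic opcode ("ADD") or target opcode
    bits: int                   # width of the operation in bits
    bank: Optional[str] = None  # register bank, filled in by reg_bank_select


def ir_translator(ir_ops: List[Tuple[str, int]]) -> List[Instr]:
    # LLVM IR -> (G) MachineInstr: one generic instruction per IR operation.
    return [Instr(name.upper(), bits) for name, bits in ir_ops]


def legalizer(instrs: List[Instr]) -> List[Instr]:
    # Illegal (G) MachineInstr -> legal (G) MachineInstr.
    # Toy rule (assumption): anything narrower than 32 bits is widened.
    return [replace(i, bits=max(i.bits, 32)) for i in instrs]


def reg_bank_select(instrs: List[Instr]) -> List[Instr]:
    # Sized virtual registers -> virtual registers with a register bank.
    # Toy rule (assumption): everything lives on one bank called "GPR".
    return [replace(i, bank="GPR") for i in instrs]


def select(instrs: List[Instr]) -> List[Instr]:
    # Remaining (G) MachineInstr -> target MachineInstr.
    # The target opcode names here are illustrative only.
    table = {("ADD", 32): "ADDWrr", ("ADD", 64): "ADDXrr"}
    return [replace(i, opcode=table[(i.opcode, i.bits)]) for i in instrs]


def global_isel(ir_ops: List[Tuple[str, int]]) -> List[Instr]:
    instrs = ir_translator(ir_ops)
    for run_pass in (legalizer, reg_bank_select, select):
        instrs = run_pass(instrs)
    return instrs
```

For instance, global_isel([("add", 8), ("add", 64)]) sends an 8-bit add through the widening rule before selection, while the 64-bit add passes through the legalizer unchanged.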

** Implications **

As part of the bring-up of the prototype, we need to extend some of the core MachineInstr-level APIs:

  • Need to remember FastMath flags for each MachineInstr.
  • Need to know the type of each MachineInstr. We don’t want ADD8, ADD16, etc.
  • Extend the MachineRegisterInfo to support size as well as register classes for virtual registers.
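
As a rough illustration of these three extensions (and not of the attached patches), the following Python sketch models an instruction that carries its own type and fast-math flags, and a register-info table where a virtual register can have a size before it has a register class. All class and method names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class GenericInstr:
    opcode: str                              # a single generic opcode, e.g. "ADD"
    type: str                                # type carried by the instruction, e.g. "i8"
    fast_math: FrozenSet[str] = frozenset()  # e.g. {"nnan"}; empty if none


@dataclass
class VRegInfo:
    size_bits: int                   # known as soon as the register is created
    reg_class: Optional[str] = None  # may only be assigned later


class RegisterInfo:
    """Toy stand-in for a register-info table (hypothetical API)."""

    def __init__(self) -> None:
        self._vregs: Dict[int, VRegInfo] = {}

    def create_generic_vreg(self, size_bits: int) -> int:
        vreg = len(self._vregs)
        self._vregs[vreg] = VRegInfo(size_bits)
        return vreg

    def set_reg_class(self, vreg: int, reg_class: str) -> None:
        self._vregs[vreg].reg_class = reg_class

    def info(self, vreg: int) -> VRegInfo:
        return self._vregs[vreg]
```

With the type on the instruction, GenericInstr("ADD", "i8") and GenericInstr("ADD", "i16") share one opcode instead of needing ADD8 and ADD16.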

I have sketched the changes in the attached patches to help picturing how the changes would impact the existing APIs.

Note: I do not intend to commit those changes as they are. They will go the usual review process in due time.

The patches contain “// ***”-like comments that give a rough explanation of why those changes are needed w.r.t. the goals.
The order of the patches could be modified since the dependencies between those are not sequential. Anyhow, here are the patches:

  1. Introduce (some of) the generic opcodes.
  2. Make MachineFunction more independent of LLVM IR to eventually be able to delete the LLVM IR instance from the memory.
  3. Extend MachineInstr to represent additional information attached to generic opcode.
  4. Teach MachineRegisterInfo about size for virtual registers.
  5. Introduce a helper class to build MachineInstr related objects.
  6. Add new target hooks to lower the ABI directly to MachineInstr.
  7. Introduce the IRTranslator pass.

** Roadmap for the Prototype **

We plan to split the prototype in three main milestones:

  1. Translation: LLVM IR to (G) MachineInstr translation.
  2. Basic selector: Legal LLVM IR to target specific MachineInstr.
  3. Simple legalization: Support scalar type legalization and some vector instructions.

Notes:

  • For #1, we will not support any fancy instructions like landing pad or switch.
  • Each milestone should take about 3-4 months.
  • At the end of #2, we would have a FastISel like selector.

Each milestone will be detailed right before starting it. The rationale is that we want to take into account what we discovered with the prototype when planning the next milestone. In other words, in this email, I only describe the first milestone in detail, and I will give more details on each following milestone shortly before we start it. For your information, here is the remainder of the intended roadmap for the full project:

  4. Productization: Clean up implementation, stabilize the APIs.
  5. Complex legalization: Extend legalization support to everything missing.
  6. Completeness: Fill the blanks, e.g., landing pad.
  7. Clean-up and performance: Add the necessary bits to be at parity or beat SelectionDAG generated code.
  8. Transition: Document how to switch, provide tools to help.

** Milestone 1 **

The first phase is focused on the IRTranslator pass.

The IRTranslator is responsible for translating the LLVM IR into Generic MachineInstr. The IRTranslator pass uses some target hooks to perform the ABI lowering. We can either define a new API for them, e.g., ABILoweringInfo, or extend the existing TargetLowering.
Moreover, the prototype will focus on simple instructions, i.e., we will not support switch or landing pad for this iteration.

At the end of M1, the prototype will not be able to produce code, since we would only have the beginning of the Global ISel pipeline. Instead, we will test the IRTranslator on the generic output that is produced from the tested IR.

* Design Decisions *

  • The IRTranslator is a final class. Its purpose is to move away from LLVM IR to MachineInstr world [final].
  • Lower the ABI as part of the translation process [final].

* Design Questions the Prototype Addresses at the End of M1 *

  • Handling of aggregate types during the translation.
  • Lowering of switches.
  • What about Module pass for Machine pass?
  • Introduce new APIs to have a clearer separation between:
      • Legalization (setOperationAction, etc.)
      • Cost/Combine related (isXXXFree, etc.)
      • Lowering related (LowerFormal, etc.)
  • What is the contract with the backends? Is it still “should be able to select any valid LLVM IR”?

Thanks,

-Quentin

0001-Extend-generic-opcodes-to-be-able-to-represent-the-i.patch (3.67 KB)

0002-Pull-more-of-the-LLVM-IR-function-representation-int.patch (1.81 KB)

0003-Extend-MachineInstr-to-supply-more-information-regar.patch (2.21 KB)

0004-Teach-MachineRegisterInfo-about-size-for-virtual-reg.patch (3.05 KB)

0005-Introduce-a-MachineIRBuilder-to-gather-all-the-Machi.patch (2.54 KB)

0006-Add-new-target-hooks-to-be-able-to-lower-the-ABI-rig.patch (2.77 KB)

0007-Introduce-the-IRtranslator-pass-for-GlobalISel.patch (7.67 KB)

Hi Quentin,

I’m really excited to see this happening!

My major question is over the testing story for this. How are we going to write unit tests for GIR? Are you intending to leverage the LIR lowering that no one is using yet? Will you be using unit/LIT tests right from the start, or adding them in later?

Cheers,

James

Hi James,

Hi Quentin,

I’m really excited to see this happening!

My major question is over the testing story for this. How are we going to write unit tests for GIR?

Thanks for bringing that up!
That is a very good question and also one that will require a lot of work to address properly.

Ultimately, I’d like us to be able to write unit tests directly in the MachineInstr representation. Part of the goal of making the IR self-contained, i.e., with no back links to LLVM IR, is to make testing easier.

Now, to answer the question on how we do that, I have a pragmatic answer, though I am not proud of it:
We are going to write unit tests with LLVM IR as input and check the MI output of the pass, e.g., with print-after=IRTranslator.

That’s not great, but at least we can test now!
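
To illustrate the idea of checking the dumped output (this is only a toy model of ordered CHECK lines, not FileCheck itself), here is a small Python sketch. The dump text is a made-up example; the real printer format may look quite different.

```python
from typing import Iterable


def check_in_order(output: str, patterns: Iterable[str]) -> bool:
    """Return True if every pattern occurs in `output`, in the given order."""
    pos = 0
    for pattern in patterns:
        found = output.find(pattern, pos)
        if found < 0:
            return False
        # Continue searching after the match so that order is enforced.
        pos = found + len(pattern)
    return True


# Hypothetical dump of a function after the IRTranslator pass.
dump = """\
%0 = COPY %w0
%1 = COPY %w1
%2 = ADD i32 %0, %1
"""
```

A test would then assert something like check_in_order(dump, ["COPY", "COPY", "ADD i32"]).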

Are you intending to leverage the LIR lowering that no one is using yet?

That’s a tricky question because I do not intend to work on this in the prototype timeframe and I am not fond of the way this testing works.
However, yes, I believe that we need to redevelop or leverage the LIR lowering for this purpose. Actually, I was looking for volunteers to work on that during the prototype timeframe, so that we have everything we need when we productize the new framework.

Interested? :P

Note: My main concern is that it uses a YAML format, i.e., we cannot dump the output of a machine function and feed it back in.

Will you be using unit/LIT tests right from the start, or adding them in later?

Definitely right from the start, with the “output” method I mentioned.
The hope is that a “LIR lowering” like mechanism will be developed along the way and we can migrate tests to the new format when it is ready. If we carefully design this "LIR lowering” format, we may just have to change the RUN line :).

Thanks,
-Quentin

We do have the .mir dumping and reading. To me that code looks like it basically works and just might need a bug fix here and there. Should be the right thing to use when starting a new project like this, shouldn't it?

- Matthias

Thanks Quentin for the effort in putting this together!!

I’m super excited to see this going forward, and I’m looking forward to helping bring GlobalISel up as much as I can, as it is very promising for our targets!
It also caught my eye that you mentioned the possibility of having Module-level Machine passes.
Having that would simplify some parts of our pipeline, I believe, as we currently use some hacks to obtain basically the same result at the very end of the pipeline.
This would be useful at least for us!

Marcello

Hi James,

Hi Quentin,

I’m really excited to see this happening!

My major question is over the testing story for this. How are we going to write unit tests for GIR?

Thanks for bringing that up!
That is a very good question and also one that will require a lot of work to address properly.

Ultimately, I’d like us to be able to write unit tests directly in the MachineInstr representation. Part of the goal of making the IR self-contained, i.e., with no back links to LLVM IR, is to make testing easier.

Now, to answer the question on how we do that, I have a pragmatic answer, though I am not proud of it:
We are going to write unit tests with LLVM IR as input and check the MI output of the pass, e.g., with print-after=IRTranslator.

That’s not great, but at least we can test now!

We do have the .mir dumping and reading. To me that code looks like it basically works and just might need a bug fix here and there. Should be the right thing to use when starting a new project like this, shouldn’t it?

That’s the thing, I don’t want to spend time writing mir code. I basically want to be able to take the machine code I expect and add CHECK lines without going through the yaml formatting business.

For just the translator, the story might be different though, because it’s LLVM IR to MI, so maybe that’s already well supported.
Then again, I haven’t followed the mir stuff closely enough to know whether or not it would work with “a bug fix here and there”. For instance, I don’t know how it gets the opcodes for the parser, i.e., how automatic/easy it is to add the generic opcodes, plus we will have to teach it how to deal with types on instructions and so on.

My impression was that it was sufficiently lacking usability so that a fresh start would make sense (basically I see it as a prototype), but you may well be right. Again, I don’t plan to look into it for the prototype timeframe, but if you are, by all means!

Cheers,
-Quentin

Hi David,

Hi Quentin,

In the section “Goals", I defined (repeated for people that saw the talk) the goals for the Global ISel design.
- Do you see anything missing?
- Do you see something that should not be there?

I really like the design that you outlined. I have one very small request:

Please maintain pointers as a distinct type from integers for as long as possible. We currently have some patches in SelectionDAG to add some pointer-specific operations, as in our architecture the operations valid on pointers are not the same as those valid on integers (and pointers are not the same size as integers). Your proposed model looks like it would be *much* easier for us to use as long as that constraint is kept. Various systems with different integer and address registers hit the same problem as us.

I understand the problem, but I feel like Jakob back in the day:
http://lists.llvm.org/pipermail/llvm-dev/2013-August/064734.html
http://lists.llvm.org/pipermail/llvm-dev/2013-August/064760.html

To summarize in my own words:
To me the pointer/integer distinction is a way for you to specify the register classes you want. This is something the RegBankSelect pass will do for you and this distinction should not be necessary to produce efficient or correct code.

If that doesn’t work, you should be able to have a target-specific pass to select what you want directly after the translation or with a custom translation. One can envision some kind of IRTranslationKit that has all the generic translation built in to help you in such a case.

Anyway, the good thing about the prototype is that we will be able to experiment with these things :).

Given the way that you’re proposing to do legalisation, this seems like it should be easy (for most architectures, assigning pointers to the same register bank as integers will be a simple choice and then all of the later selection should be the same).

On a related note, keeping pointer address spaces around in the machine IR would make things easier for us and, I think, some of the GPU folks.

Good point, this is also something that the MachineInstr should expose as part of making the IR self-contained.

Thanks,
-Quentin

Hi Marcello,

Thanks Quentin for the effort in putting this together!!

I’m super excited to see this going forward, and I’m looking forward to helping bring GlobalISel up as much as I can, as it is very promising for our targets!
It also caught my eye that you mentioned the possibility of having Module-level Machine passes.
Having that would simplify some parts of our pipeline, I believe, as we currently use some hacks to obtain basically the same result at the very end of the pipeline.
This would be useful at least for us!

Good to know!
Right now, I was considering it for the “LLVM IR → MachineInstr” translation because if we want to lower everything all the way down to MachineInstr, we need to lower the global variables as well, and this conceptually does not fit into a function-level pass.
Knowing that there are other users for it, it sounds like it would indeed be good to have!

Thanks for your feedback!

Cheers,
-Quentin

As long as the pointer vs integer distinction is preserved until the RegBankSelect stage, then that will work for us. The problem with the current SelectionDAG ordering is that ‘what integer type do you use to represent pointers?’ is the first question that the generic CodeGen infrastructure asks the back end during type legalisation, and the information is then gone (unless you add new MVTs, as we’ve had to do). If the initial lowering preserves the difference between pointer-in-address-space-X and i64, and we are allowed register banks that overlap for final register allocation (which is almost certainly needed for less exotic use cases anyway), then this scheme would be *a lot* easier for us to work with than the existing CodeGen infrastructure.
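
A minimal Python sketch of the requirement, under the assumption that each value keeps a "pointer in address space N" marker until bank selection. The ValueType class, the two policy functions and the bank names GPR/PTR are hypothetical and only serve to illustrate the point.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValueType:
    bits: int
    addr_space: Optional[int] = None  # None means a plain integer

    @property
    def is_pointer(self) -> bool:
        # Address space 0 is still a pointer, so compare against None.
        return self.addr_space is not None


def select_bank_conventional(ty: ValueType) -> str:
    # Most targets: pointers and integers share one bank.
    return "GPR"


def select_bank_separate_pointers(ty: ValueType) -> str:
    # A target where pointers are not integers needs the distinction
    # to still be available when the bank is chosen.
    return "PTR" if ty.is_pointer else "GPR"
```

If the translation had already turned every pointer into ValueType(64), the second policy could no longer tell the two kinds of values apart.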

David

From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of David Chisnall via llvm-dev

[..]

As long as the pointer vs integer distinction is preserved until the RegBankSelect stage, then that will work for us.
The problem with the...

[..]

+1

Greetings,

Jeroen Dobbelaere

Hi David,

To summarize in my own words:
To me the pointer/integer distinction is a way for you to specify the register classes you want. This is something the RegBankSelect pass will do for you and this distinction should not be necessary to produce efficient or correct code.

If that doesn’t work, you should be able to have a target-specific pass to select what you want directly after the translation or with a custom translation. One can envision some kind of IRTranslationKit that has all the generic translation built in to help you in such a case.

Anyway, the good thing about the prototype is that we will be able to experiment with these things :).

As long as the pointer vs integer distinction is preserved until the RegBankSelect stage, then that will work for us. The problem with the current SelectionDAG ordering is that ‘what integer type do you use to represent pointers?’ is the first question that the generic CodeGen infrastructure asks the back end during type legalisation, and the information is then gone (unless you add new MVTs, as we’ve had to do). If the initial lowering preserves the difference between pointer-in-address-space-X and i64, and we are allowed register banks that overlap for final register allocation (which is almost certainly needed for less exotic use cases anyway), then this scheme would be *a lot* easier for us to work with than the existing CodeGen infrastructure.

I must be missing something, but I don’t see what the problem is with lowering pointers to actual integers.
As far as I can tell, what you want is to perform some operations on some integers. Whether those are used as pointers or integers is orthogonal IMO.
What you really want is to make the best use of your instruction set, meaning that if computing some pointer operations on the integer ISA is more efficient, or vice versa, this is what we want to do.

The address space information is only relevant when you actually access the address, i.e., on memory operation, right?

What am I missing?

Cheers,
-Quentin

Pointers, in our architecture, are not integers.

David

I must be missing something, but I don’t see what the problem is with lowering pointers to actual integers.

Pointers, in our architecture, are not integers.

Thanks for the clarifications.

So what you’re saying is that an inttoptr instruction is not a no-op on your architecture, is that right?
Or can it be a no-op only if the consumers of the pointer value can operate on the pointer register bank?

Don’t know if that helps, but note that the registers are not typed, they just have size. The operations are typed.

I am trying to understand the constraint to see how that would fit in the framework. That being said, anything that you could do in SDag should be possible as well in the new framework.

Cheers,
-Quentin

I must be missing something, but I don’t see what the problem is with lowering pointers to actual integers.

Pointers, in our architecture, are not integers.

Thanks for the clarifications.

So what you’re saying is that an inttoptr instruction is not a no-op on your architecture, is that right?

Correct.

Or can it be a no-op only if the consumers of the pointer value can operate on the pointer register bank?

Yes (in some compilation models, we support 64-bit integers as pointers and 256-bit / 128-bit fat pointers, with the integer values being implicitly checked against a large region identified by one of the fat pointer registers, giving a coarse-grained sandbox that can communicate with the outside world via bounded pointers).

We currently have entirely separate fat pointer and integer register banks, though we’re investigating a mode where we’ll overlay the two on the same register file (though they’ll likely treat some things as sub-registers).

It also means that address space casts are not a no-op for us, which I believe is something that we share with some GPU ISAs (e.g. a 32-bit [or 16-bit] local pointer cast to a 64-bit global address space is not a simple sign/zero extension and so must be handled differently to an i32 -> i64 translation)

Don’t know if that helps, but note that the registers are not typed, they just have size. The operations are typed.

That’s fine, once you’ve assigned values to register banks. The issue is ensuring that we’re not throwing away the information that we need to do that assignment in the translation from LLVM IR to the new machine IR (i.e. which values are pointers, and which address space they are in).

I am trying to understand the constraint to see how that would fit in the framework. That being said, anything that you could do in SDag should be possible as well in the new framework.

We currently add several new nodes to SDag: INTTOPTR, PTRTOINT, and PTRADD, and a new iFATPTR MVT. The last is somewhat problematic, as we really want to have iFATPTR128 and iFATPTR256 (and, potentially, iFATPTR64 for an IoT/embedded variant).

David

Having a distinction between integers and pointers preserved into MI would be quite useful for garbage collection as well. We currently have a lowering phase (RewriteStatepointsForGC) which effectively rewrites operations on references (i.e. managed pointers) so that they can be treated as integers throughout the rest of the pipeline. If we could retain the distinction further back through the backend, it would both simplify a lot of code and likely let us generate better code (spilling, etc.) around safepoints.

Philip

It’s also important for doing a load of CFI things correctly (see: https://www.ics.uci.edu/~perl/ccs15_stackdefiler.pdf)

David

Hi Philip and David,

Thanks for the input; we will see how the design can accommodate that when we prototype. Having some inttoptr-like MachineInstr with additional information such as the address space sounds reasonable and should fit your constraints.

Anyway, I may ping you to check whether the translation is flexible enough and does what you want, but of course, I invite you to actively review all the incoming patches related to GISel :).

Cheers,
-Quentin

Hi Quentin,

*** Goals ***

The high level goals of the new instruction selector are:

  • Global instruction selector.
  • Fast instruction selector.

Are these separate or the same? It reads like two instruction selectors at the moment.

  • Shared code path for fast and good instruction selection.

But then I’m not sure starting here.

  • IR that represents ISA concepts better.
  • More flexible instruction selector.

Some definitions here would be good.

  • Easier to maintain/understand framework, in particular legalization.
  • Self contained machine representation, no back links to LLVM IR.
  • No change to LLVM IR.

These sound great. Would be good to get the assumptions of the legalization pass written down more explicitly as you go through this.

*** Proposed Approach ***

In this section, I describe the approach I plan to pursue in the prototype and the roadmap to get there. The final design will flow out of it.

For this prototype, we purposely exclude any work to improve or use TableGen or

I’m getting the idea that you really don’t want to work on TableGen? ;)

** Implications **

As part of the bring-up of the prototype, we need to extend some of the core MachineInstr-level APIs:

  • Need to remember FastMath flags for each MachineInstr.

Not orthogonal to this proposal? I don’t mind lumping it in as being able to do this is probably a good goal for the prototype at least, but it seems like being able to do this is something that could be done incrementally as a separate project?

At the end of M1, the prototype will not be able to produce code, since we would only have the beginning of the Global ISel pipeline. Instead, we will test the IRTranslator on the generic output that is produced from the tested IR.

So this would be targeting Generic MachineInstr? (Better name perhaps?). Which means that it should be serializable and testable in isolation yes?

* Design Decisions *

  • The IRTranslator is a final class. Its purpose is to move away from LLVM IR to MachineInstr world [final].
  • Lower the ABI as part of the translation process [final].

* Design Questions the Prototype Addresses at the End of M1 *

  • Handling of aggregate types during the translation.
  • Lowering of switches.
  • What about Module pass for Machine pass?

Could you elaborate a bit more here?

  • Introduce new APIs to have a clearer separation between:
      • Legalization (setOperationAction, etc.)
      • Cost/Combine related (isXXXFree, etc.)
      • Lowering related (LowerFormal, etc.)
  • What is the contract with the backends? Is it still “should be able to select any valid LLVM IR”?

Probably :)

As far as the prototype I think you also need to address a few additional things:

a) Calls
Calls are probably the most important part of any new instruction selector and lowering machinery and I think that the design of the call lowering infrastructure is going to be a critical part of evaluating the prototype. You might have meant this earlier when you said Lowering related, but I wanted to make sure to call it out explicitly.

b) Testing
It’s been covered a bit before, but being able to serialize and use for testing the various IR constructs is important. In particular, I worry about the existing MIR code as I and a few others have tried to use it for testcases and failed. I’m very interested in whatever ideas you have here, all of mine are much more invasive than I think we’d like.

Thanks for tackling this project and being willing to put this out there for discussion and feedback. I’m looking forward to the code and future design.

-eric

Hi Eric,

Hi Quentin,

*** Goals ***

The high level goals of the new instruction selector are:

  • Global instruction selector.
  • Fast instruction selector.

Are these separate or the same? It reads like two instruction selectors at the moment.

They are the same, sorry for the confusion. This should read: we want a global and fast instruction selector, where producing code fast and producing good code exercise the same basic path in the framework. I.e., producing code fast is a trimmed-down version of producing good code. E.g., for fast, analyses are less precise, fewer passes are run, etc.
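
As a rough sketch of what "same path, trimmed down" could mean, here is a small Python example. The pass names follow the pipeline in the proposal, except for "Combiner", which is a hypothetical optional pass added only to show the difference between the two modes.

```python
from typing import List


def build_pipeline(optimize: bool) -> List[str]:
    """Build the list of passes for the fast or the optimizing mode."""
    passes = ["IRTranslator"]
    if optimize:
        # Hypothetical optimization-only pass; skipped when compiling fast.
        passes.append("Combiner")
    passes += ["Legalizer", "RegBankSelect", "Select"]
    return passes
```

The fast pipeline is then a subsequence of the optimizing one rather than a separate selector.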

  • Shared code path for fast and good instruction selection.

But then I’m not sure starting here.

  • IR that represents ISA concepts better.
  • More flexible instruction selector.

Some definitions here would be good.

For IR that represents ISA concepts better, this is in contrast to SDISel or LLVM IR. In other words, the target should be able to insert target-specific code (e.g., instructions, physical registers) at any time without needing some extra crust to express that (e.g., intrinsics or custom SDNodes).

By more flexible we mean that targets should be able to inject target specific passes between the generic passes or replace those passes by their own.
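
A minimal sketch of that kind of flexibility, in Python. The Pipeline class and the "MyTarget..." pass names are hypothetical and only illustrate injecting and replacing passes; they are not a proposed API.

```python
from typing import Iterable, List


class Pipeline:
    """Toy ordered list of pass names that a target can customize."""

    def __init__(self, passes: Iterable[str]) -> None:
        self.passes: List[str] = list(passes)

    def insert_after(self, anchor: str, new_pass: str) -> None:
        # Inject a target-specific pass right after a generic one.
        self.passes.insert(self.passes.index(anchor) + 1, new_pass)

    def replace(self, old: str, new: str) -> None:
        # Swap a generic pass for the target's own implementation.
        self.passes[self.passes.index(old)] = new


pipeline = Pipeline(["IRTranslator", "Legalizer", "RegBankSelect", "Select"])
pipeline.insert_after("IRTranslator", "MyTargetPreLegalize")
pipeline.replace("RegBankSelect", "MyTargetRegBankSelect")
```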

  • Easier to maintain/understand framework, in particular legalization.
  • Self contained machine representation, no back links to LLVM IR.
  • No change to LLVM IR.

These sound great. Would be good to get the assumptions of the legalization pass written down more explicitly as you go through this.

Agree.
For now, the assumption is that there are no illegal types, just illegal pairs of operation and type. But yeah, we may need to refine that when we get to the legalization.
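
To illustrate that assumption, here is a small Python sketch in which legality is keyed on the (operation, type) pair rather than on the type alone. The LegalizerInfo class, the action names, and the choice to treat unlisted pairs as legal are all assumptions for this example.

```python
from enum import Enum
from typing import Dict, Tuple


class Action(Enum):
    LEGAL = "legal"
    WIDEN = "widen"  # hypothetical action: perform the operation on a wider type


class LegalizerInfo:
    """Toy legality table keyed on (opcode, type)."""

    def __init__(self) -> None:
        self._actions: Dict[Tuple[str, str], Action] = {}

    def set_action(self, opcode: str, type_: str, action: Action) -> None:
        self._actions[(opcode, type_)] = action

    def get_action(self, opcode: str, type_: str) -> Action:
        # Assumption for this sketch: pairs that were never listed are legal.
        return self._actions.get((opcode, type_), Action.LEGAL)


info = LegalizerInfo()
info.set_action("MUL", "i8", Action.WIDEN)
```

Here i8 itself is never "illegal": an ADD on i8 stays legal while a MUL on i8 has to be widened.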

*** Proposed Approach ***

In this section, I describe the approach I plan to pursue in the prototype and the roadmap to get there. The final design will flow out of it.

For this prototype, we purposely exclude any work to improve or use TableGen or

I’m getting the idea that you really don’t want to work on TableGen? ;)

Heh, that’s more a pragmatic approach. I don’t want us to spend months improving TableGen before we start working on GlobalISel.
That being said, I think we should push as many things as possible into TableGen when we are done with prototyping.

** Implications **

As part of the bring-up of the prototype, we need to extend some of the core MachineInstr-level APIs:

  • Need to remember FastMath flags for each MachineInstr.

Not orthogonal to this proposal? I don’t mind lumping it in as being able to do this is probably a good goal for the prototype at least, but it seems like being able to do this is something that could be done incrementally as a separate project?

That’s a good point and yes, it could be done as a separate project. It is listed here because, if we want to experiment with combines and such in the prototype, this is the kind of information we would need.

At the end of M1, the prototype will not be able to produce code, since we would only have the beginning of the Global ISel pipeline. Instead, we will test the IRTranslator on the generic output that is produced from the tested IR.

So this would be targeting Generic MachineInstr?

Yes.

(Better name perhaps?).

Suggestions welcome :).

Which means that it should be serializable and testable in isolation yes?

Partly. The lowering of the body of the function will be generic, but the ABI lowering will be target specific and unless we create some kind of fake target, the tests need to be bound to one target.
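
To illustrate why the tests end up bound to a target, here is a toy sketch of the split I have in mind: the body is translated generically, while argument lowering goes through a target hook. All names (ToyABILowering, ToyRegisterABI, ToyIRTranslator, the GENERIC_ADD opcode and the textual output format) are invented for this example and are not part of any existing code.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical target hook: where does the N-th formal argument live?
struct ToyABILowering {
  virtual ~ToyABILowering() = default;
  virtual std::string argLocation(unsigned ArgNo) const = 0;
};

// Made-up target that passes arguments in <Prefix>0, <Prefix>1, ...
class ToyRegisterABI final : public ToyABILowering {
  std::string Prefix;

public:
  explicit ToyRegisterABI(std::string Prefix) : Prefix(std::move(Prefix)) {}
  std::string argLocation(unsigned ArgNo) const override {
    return Prefix + std::to_string(ArgNo);
  }
};

class ToyIRTranslator final {
  const ToyABILowering &ABI;

public:
  explicit ToyIRTranslator(const ToyABILowering &ABI) : ABI(ABI) {}

  // Translates the toy function "f(a, b) = a + b" into textual instructions.
  std::vector<std::string> translateAddFunction() const {
    std::vector<std::string> Out;
    // Target-specific part: copy the arguments out of their ABI locations.
    for (unsigned I = 0; I < 2; ++I)
      Out.push_back("%v" + std::to_string(I) + " = COPY " +
                    ABI.argLocation(I));
    // Generic part: identical for every target.
    Out.push_back("%v2 = GENERIC_ADD %v0, %v1");
    return Out;
  }
};

// Small self-check: two targets, same body, different argument copies.
void demoTranslation() {
  ToyRegisterABI TargetA("r"), TargetB("x");
  auto A = ToyIRTranslator(TargetA).translateAddFunction();
  auto B = ToyIRTranslator(TargetB).translateAddFunction();
  assert(A[0] == "%v0 = COPY r0" && B[0] == "%v0 = COPY x0");
  assert(A[2] == B[2]); // the generic body does not depend on the target
}
```

The expected output of a test contains the COPY lines, so it differs per target even though the interesting part of the translation is generic.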

* Design Decisions *

  • The IRTranslator is a final class. Its purpose is to move away from LLVM IR to MachineInstr world [final].
  • Lower the ABI as part of the translation process [final].

* Design Questions the Prototype Addresses at the End of M1 *

  • Handling of aggregate types during the translation.
  • Lowering of switches.
  • What about Module pass for Machine pass?

Could you elaborate a bit more here?

I briefly mentioned in my reply to Marcello why this may be interesting. Let me rephrase my answer here.
Basically, we would like the MachineInstr representation to be self-contained, i.e., to get rid of those back links to LLVM IR. This implies that we would need to lower globals (maybe directly to MC) as part of the translation process. Globals are attached to the module, not to a function, so it seems to make sense to introduce a concept of MachineModulePass.
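
Very roughly, and only as a sketch of the concept, a module-level machine pass would see the globals and all the functions at once, whereas a function-level pass only sees one function. None of the names below (ToyMachineModule, ToyMachineModulePass, ToyGlobalLayout, ...) exist today, and the layout logic is deliberately naive (no alignment, no sections).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, self-contained machine-level containers.
struct ToyGlobal {
  std::string Name;
  unsigned SizeInBytes;
};
struct ToyMachineFunction {
  std::string Name;
};
struct ToyMachineModule {
  std::vector<ToyGlobal> Globals;
  std::vector<ToyMachineFunction> Functions;
};

// A function pass only ever sees one function, so it cannot own globals.
struct ToyMachineFunctionPass {
  virtual ~ToyMachineFunctionPass() = default;
  virtual bool runOnMachineFunction(ToyMachineFunction &MF) = 0;
};

// A module pass sees the whole module, globals included.
struct ToyMachineModulePass {
  virtual ~ToyMachineModulePass() = default;
  virtual bool runOnMachineModule(ToyMachineModule &M) = 0;
};

// Example module pass: assigns each global an offset in one data blob.
struct ToyGlobalLayout final : ToyMachineModulePass {
  std::map<std::string, unsigned> Offsets;

  bool runOnMachineModule(ToyMachineModule &M) override {
    unsigned Offset = 0;
    for (const ToyGlobal &G : M.Globals) {
      Offsets[G.Name] = Offset;
      Offset += G.SizeInBytes; // naive packing, alignment ignored
    }
    return !M.Globals.empty(); // "changed" if there was anything to lay out
  }
};

// Small self-check of the layout pass.
void demoGlobalLayout() {
  ToyMachineModule M;
  M.Globals = {{"a", 4}, {"b", 8}, {"c", 1}};
  ToyGlobalLayout Layout;
  assert(Layout.runOnMachineModule(M));
  assert(Layout.Offsets["a"] == 0);
  assert(Layout.Offsets["b"] == 4);
  assert(Layout.Offsets["c"] == 12);
}
```

Whether this ends up being a real pass kind or something handled directly at the MC level is one of the questions the prototype should help answer.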

  • Introduce new APIs to have a clearer separation between:
      • Legalization (setOperationAction, etc.)
      • Cost/Combine related (isXXXFree, etc.)
      • Lowering related (LowerFormal, etc.)
  • What is the contract with the backends? Is it still “should be able to select any valid LLVM IR”?

Probably :)

As far as the prototype I think you also need to address a few additional things:

a) Calls
Calls are probably the most important part of any new instruction selector and lowering machinery and I think that the design of the call lowering infrastructure is going to be a critical part of evaluating the prototype. You might have meant this earlier when you said Lowering related, but I wanted to make sure to call it out explicitly.

Yes, lowering of calls is definitely going to be evaluated in the prototype for this first milestone and the “lowering related” stuff was about that :).
(You’re good at deciphering messages ;)).

b) Testing
It’s been covered a bit before, but being able to serialize and use for testing the various IR constructs is important. In particular, I worry about the existing MIR code as I and a few others have tried to use it for testcases and failed. I’m very interested in whatever ideas you have here, all of mine are much more invasive than I think we’d like.

Honestly, I haven’t used the MIR testing infrastructure yet, but yes, my impression was that it is not really… mature. I would love to have some serialization mechanism for MI that really works so that we can write those testcases more easily.
For now, I haven’t looked into it, so I cannot share any ideas. I’ve discussed it a bit with Matthias and he thinks that we might not be that far away from having MIR testing usable, modulo bug fixes.

It would be helpful if you could file PRs for the cases where MIR was not working for you so that we can look into them at some point.

My hope is that someone could look into it before we actually need proper MI testing in place.

(Hidden message: If you are willing to work on the MIR testing or any other mechanism that would allow us to do MI serialization/deserialization, please come forward, we need you!! :D)

Indeed, for the translation part the MIR testing is not critical since we do have the LLVM IR around.
Then, if we get rid of the LLVM IR back links, serialization should become easier and maybe MIR testing could be leveraged. That being said, it is possible that we will need to start from scratch, while taking into account what we learnt from the MIR testing.

Thanks for the feedback,
-Quentin