Upstreaming PNaCl's IR simplification passes

The PNaCl project has implemented various IR simplification passes that simplify LLVM IR by lowering complex features to simpler features. We’d like to upstream some of these IR passes to LLVM. We’d like to explore whether this is acceptable, and if so, how we should go about doing this.

The immediate reason is that Emscripten is reusing PNaCl’s IR passes for its new “fastcomp” backend [1]. It would be really useful if PNaCl and Emscripten could collaborate via upstream LLVM rather than a branch.

Some background: There are two related use cases for these IR simplification passes:

  1. Simplifying the task of writing a new LLVM backend. This is Emscripten’s use case. The IR simplification passes reduce the number of cases a backend has to handle, so they would be useful for anyone else creating a new backend.

  2. Using a subset of LLVM IR as a stable distribution format for portable executables. This is PNaCl’s use case. PNaCl’s IR subset omits various complex IR features, which we lower using the IR simplification passes [2]. Renderscript is an example of another project that uses IR as a stable distribution format, though I think currently Renderscript is not subsetting IR much.

Some examples of PNaCl’s IR simplification passes are:

  • Calling conventions lowering: ExpandVarArgs and ExpandByVal lower varargs and by-value argument passing respectively. They would be useful for any backend that doesn’t want to implement varargs or by-value calling conventions.

  • Instruction-level lowering:

      • ExpandStructRegs splits up struct values into scalars, removing the “insertvalue” and “extractvalue” instructions.

      • PromoteIntegers legalizes integer types (e.g. i30 is converted to i32).

  • Module-level lowering: This implements, at the IR level, functionality that is traditionally provided by “ld”. e.g. ExpandCtors lowers llvm.global_ctors to the __init_array_start and __init_array_end symbols that are used by C libraries at startup.
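To make the PromoteIntegers example concrete, here is a minimal Python sketch of the idea of carrying an i30 value in an i32 and clearing the upper bits after arithmetic. It is a toy model of the arithmetic only, not the pass itself: the helper names (`mask`, `add_i30_reference`, `add_i30_in_i32`) are invented for illustration, and the real pass may decide differently when the upper bits need to be cleared.

```python
# Toy model: emulate an i30 add using only 32-bit operations.
# All names here are hypothetical; this is not PNaCl's implementation.

I30_BITS = 30
I32_BITS = 32


def mask(value, bits):
    """Keep only the low `bits` bits, like an unsigned integer of that width."""
    return value & ((1 << bits) - 1)


def add_i30_reference(a, b):
    """What `add i30 %a, %b` computes: addition modulo 2**30."""
    return mask(a + b, I30_BITS)


def add_i30_in_i32(a, b):
    """The same result using a 32-bit add followed by clearing the top 2 bits."""
    wide = mask(a + b, I32_BITS)   # the 32-bit add a backend already supports
    return mask(wide, I30_BITS)    # discard the bits an i30 would not have
```

The point of the sketch is that a backend which only understands i32 can still produce results identical to the odd-width operation, at the cost of an extra mask.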

PNaCl’s IR simplification passes are modular – most are independent of each other – so they allow projects to pick and choose which IR features to support and which to pre-lower. The modularity of these passes makes them low-maintenance and easy to write targeted tests for.

The code for these passes can be found here:
https://chromium.googlesource.com/native_client/pnacl-llvm/+/master/lib/Transforms/NaCl/

There seems to be plenty of precedent for IR-to-IR lowering passes – LLVM already contains passes such as LowerInvoke, LowerSwitch and LowerAtomic.

The PNaCl team (which I’m a member of) is happy to take on the work of maintaining this code, such as updating it as LLVM IR evolves and doing code reviews. We would upstream this gradually, pass by pass, so the changes would be manageable.

Cheers,
Mark

[1] https://github.com/kripken/emscripten/wiki/LLVM-Backend
[2] https://groups.google.com/forum/#!topic/llvm-dev/lk6dZzwW0ls - PNaCl Bitcode reference manual

To add to what Mark mentioned about Emscripten’s new backend [1] using the PNaCl passes: It made writing the backend much easier than it otherwise would have been, given our requirements - we are an ‘odd’ target in that we want to transform LLVM IR into JavaScript, then run it through our existing external JavaScript optimizer tool, which does very JavaScript-specific optimizations (on a JavaScript AST which is the natural form for us), and for that reason we don’t use the common backend codegen path. Basically the PNaCl simplification passes convert LLVM IR into a smaller and simpler subset of LLVM IR, which makes writing a backend that processes LLVM IR more convenient.

I think there are other use cases as well that could benefit from these passes being upstream. While typically a backend would want to use the common codegen to get register allocation and so forth, there are situations where you just want to transform LLVM IR into something else. For example in a university course you could teach people compiler optimizations using LLVM IR, then have them write a tiny backend that compiles that IR into a familiar language (Python, Java, anything else that they already know) to execute it (lli also works of course, but this might feel more “concrete” for the students, and they would learn more I suspect). Writing that backend in a way that processes LLVM IR means you only need them to understand LLVM IR and not anything about the selection DAG etc. Also, there are situations where performance is really not a concern, like someone writing a backend for a little VM they invented for fun and just want to execute small amounts of C code on it - for example this happened with the DCPU-16 spec, and people made an LLVM backend for it.
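As a rough illustration of the "tiny backend" idea, the following Python sketch translates a made-up, straight-line instruction list into Python source and runs it. The instruction format, the `emit_python` function, and the operation table are all invented for this example; it is not LLVM IR and not how Emscripten's backend works, only a hint at how small such a translator can be once the input has been reduced to simple operations.

```python
# Toy "backend": turn a tiny straight-line, IR-like program into Python source.
# The instruction format is invented for this sketch; it is not LLVM IR.

BINARY_OPS = {"add": "+", "sub": "-", "mul": "*"}


def emit_python(name, params, body):
    """Return Python source for a function built from (dest, op, lhs, rhs) tuples.

    Operands are either parameter/temporary names (str) or integer constants.
    A ("ret", value) tuple ends the function.
    """
    lines = [f"def {name}({', '.join(params)}):"]
    for inst in body:
        if inst[0] == "ret":
            lines.append(f"    return {inst[1]}")
        else:
            dest, op, lhs, rhs = inst
            lines.append(f"    {dest} = {lhs} {BINARY_OPS[op]} {rhs}")
    return "\n".join(lines) + "\n"


# (x + y) * 2
program = [
    ("t0", "add", "x", "y"),
    ("t1", "mul", "t0", 2),
    ("ret", "t1"),
]

source = emit_python("f", ["x", "y"], program)
namespace = {}
exec(source, namespace)          # run the generated code
result = namespace["f"](3, 4)    # (3 + 4) * 2
```

Every feature the input language keeps (structs, varargs, odd integer widths) would need another case in a translator like this, which is the argument for lowering those features away first.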

In summary, I think the shared thing in these examples is that LLVM IR is very nice to work with, and there are some situations where you’re using it and you have a reason to convert it into something else, and you want to do that in as simple a way as possible as opposed to generating the most optimal results. The PNaCl IR simplification passes are in my opinion a big help there.

- Alon

[1] https://github.com/kripken/emscripten/wiki/LLVM-Backend

The PNaCl project has implemented various IR simplification passes that
simplify LLVM IR by lowering complex features to simpler features. We'd
like to upstream some of these IR passes to LLVM. We'd like to explore if
this is acceptable, and if so, how we should go about doing this.

The immediate reason is that Emscripten is reusing PNaCl's IR passes for
its new "fastcomp" backend [1]. It would be really useful if PNaCl and
Emscripten could collaborate via upstream LLVM rather than a branch.

Some background: There are two related use cases for these IR
simplification passes:

1) Simplifying the task of writing a new LLVM backend. This is
Emscripten's use case. The IR simplification passes reduce the number of
cases a backend has to handle, so they would be useful for anyone else
creating a new backend.

FWIW, this sounds to me like a sufficiently compelling use case to support
getting this in-tree.

-- Sean Silva

The PNaCl project has implemented various IR simplification passes that
simplify LLVM IR by lowering complex features to simpler features. We'd
like to upstream some of these IR passes to LLVM. We'd like to explore if
this is acceptable, and if so, how we should go about doing this.

My question is somewhat different. I'm not questioning whether these are
acceptable, I'm questioning why these are interesting and important for the
LLVM project.

Neither PNaCl nor Emscripten open source projects have extensive developer
overlap with the LLVM community, and the developers have not (so far)
become super active maintainers of LLVM, although your recent patches to
fix some bugs uncovered by PNaCl have been much appreciated. These lowering
passes are likely to have few (most likely, zero) in-tree users for the
foreseeable future. I'm not enthusiastic about the community taking on the
maintenance, update, and code review burden of these.

I would point you at the several emails I have written to folks adding new
significant features to LLVM about how to offset this by contributing
maintenance and improvements to the core infrastructure, fixing bugs and
generally making things better sufficient to offset the ongoing complexity
cost of the new features. Fortunately, the PNaCl passes seem somewhat less
complex than (for instance) the x32 backend, but they seem likely to still
add a reasonable amount of complexity. They will certainly be challenging
to review and get the design into an acceptable state across the community.
At this point, I'm not really optimistic about there being a large enough
body of community members excited about getting these passes in to offset
these costs. I'm happy to be proven wrong of course, and would also be
happy to see you, other PNaCl developers, or Emscripten developers become
more active in the community in order to build this trust and establish a
good basis for these to go into LLVM.

The immediate reason is that Emscripten is reusing PNaCl's IR passes for
its new "fastcomp" backend [1]. It would be really useful if PNaCl and
Emscripten could collaborate via upstream LLVM rather than a branch.

While this does seem like a useful thing for your two projects, it isn't
clear why this benefits the LLVM community. Perhaps it does, but I'd like
to see that clarified.

Some background: There are two related use cases for these IR
simplification passes:

1) Simplifying the task of writing a new LLVM backend. This is
Emscripten's use case. The IR simplification passes reduce the number of
cases a backend has to handle, so they would be useful for anyone else
creating a new backend.

If these simplify writing a backend, why wouldn't the patches include
commensurate simplifications to LLVM's backends? That would both give them
an in-tree customer, and more immediate value to the community and project
as a whole.

2) Using a subset of LLVM IR as a stable distribution format for portable
executables. This is PNaCl's use case. PNaCl's IR subset omits various
complex IR features, which we lower using the IR simplification passes [2].
Renderscript is an example of another project that uses IR as a stable
distribution format, though I think currently Renderscript is not
subsetting IR much.

Given that the bitcode is stable, I don't understand why this is important.
What technical problems are you solving other than making the IR match some
predetermined form chosen by PNaCl?

Some examples of PNaCl's IR simplification passes are:

I have a bunch of questions about the specific passes you mention. Perhaps
these questions are better answered in the review thread for the patches,
but they are at least things that I would think about and try to address if
and when you send out the code review.

* Calling conventions lowering: ExpandVarArgs and ExpandByVal lower
varargs and by-value argument passing respectively. They would be useful
for any backend that doesn't want to implement varargs or by-value calling
conventions.

Why wouldn't these be applicable to existing backends? What is hard about
the existing representations?

* Instruction-level lowering:
    * ExpandStructRegs splits up struct values into scalars, removing the
"insertvalue" and "extractvalue" instructions.

There are already passes that do this outside of function arguments and
return values. Why is a new one needed? How do you handle the
overflow-detecting operations?

    * PromoteIntegers legalizes integer types (e.g. i30 is converted to
i32).

Does it split up too-wide integers? Do we really want another integer
legalization framework in LLVM? I am actually interested in doing (partial)
legalization in the IR during lowering (codegenprep time) in order to
simplify the backend, but I don't think we should develop such a framework
independently of the legalization currently used in the backends.
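The question about too-wide integers refers to the other half of integer legalization: expressing, say, a 64-bit add with 32-bit operations. A minimal Python sketch of that arithmetic follows. It illustrates the general technique only; whether PromoteIntegers does anything like this is exactly what is being asked, and the function names are hypothetical.

```python
# Toy model of "expanding" a 64-bit add into 32-bit pieces.
# Hypothetical helper names; not taken from any LLVM or PNaCl code.

MASK32 = 0xFFFFFFFF


def split64(value):
    """Split a 64-bit unsigned value into (low, high) 32-bit halves."""
    return value & MASK32, (value >> 32) & MASK32


def add64_with_32bit_ops(a, b):
    """Add two 64-bit values using only 32-bit adds and a carry."""
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)

    lo = (a_lo + b_lo) & MASK32
    carry = 1 if lo < a_lo else 0          # unsigned overflow of the low half
    hi = (a_hi + b_hi + carry) & MASK32    # wraps modulo 2**32, like an i32 add

    return (hi << 32) | lo
```

Promotion (i30 to i32) only widens a value, while this kind of expansion changes one operation into several, which is part of why the two are usually treated as separate legalization actions.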

* Module-level lowering: This implements, at the IR level, functionality
that is traditionally provided by "ld". e.g. ExpandCtors lowers
llvm.global_ctors to the __init_array_start and __init_array_end symbols
that are used by C libraries at startup.

This doesn't make any sense to me. The IR representation is strictly
simpler. It is trivially lowered in a backend. I don't understand what this
would benefit.

There seems to be plenty of precedent for IR-to-IR lowering passes -- LLVM
already contains passes such as LowerInvoke, LowerSwitch and LowerAtomic.

Note that these are quite different -- they lower from a front-end
convenient form toward the canonical IR form. You are talking about
something totally different that deals with target-oriented lowering. The
correct place to look for analogies is CodeGenPrep.

The PNaCl team (which I'm a member of) is happy to take on the work of
maintaining this code, such as updating it as LLVM IR evolves and doing
code reviews. We would upstream this gradually, pass by pass, so the
changes would be manageable.

While this is appreciated, the PNaCl team should work to much more actively
contribute to the core of LLVM if it wants to be trusted to maintain this
code.

All of that said, while I have a lot of concerns, I do want to clarify
something: I actually think that this is the correct fundamental direction
for LLVM. I *want* to see PNaCl and Emscripten both be significantly more
involved in the community, and I think that using lowering to simplify
backends is a Very Good Thing. However, I think that unless there is a
significant consensus amongst the active LLVM developers that they are OK
accepting and maintaining these patches (currently, I'm not), I think that
the community engagement needs to happen first.

-Chandler

Just in case it gets lost in my longer reply, I want to emphasize that if
these will be used to simplify the in-tree backends and those backend
maintainers are on board, then I am *totally* in favor of this going into
the tree. My concerns are heavily based on the fact that as proposed, none
of that seems likely to happen.

I like this and would love to see it in the tree. I think it’s broadly useful to projects that want to take IR as input and then do interesting things with it.

-Fil

The PNaCl project has implemented various IR simplification passes that
simplify LLVM IR by lowering complex features to simpler features. We'd
like to upstream some of these IR passes to LLVM. We'd like to explore if
this is acceptable, and if so, how we should go about doing this.

My question is somewhat different. I'm not questioning whether these are
acceptable, I'm questioning why these are interesting and important for the
LLVM project.

Neither PNaCl nor Emscripten open source projects have extensive developer
overlap with the LLVM community, and the developers have not (so far)
become super active maintainers of LLVM, although your recent patches to
fix some bugs uncovered by PNaCl have been much appreciated. These lowering
passes are likely to have few (most likely, zero) in-tree users for the
foreseeable future. I'm not enthusiastic about the community taking on the
maintenance, update, and code review burden of these.

I would point you at the several emails I have written to folks adding new
significant features to LLVM about how to offset this by contributing
maintenance and improvements to the core infrastructure, fixing bugs and
generally making things better sufficient to offset the ongoing complexity
cost of the new features. Fortunately, the PNaCl passes seem somewhat less
complex than (for instance) the x32 backend, but they seem likely to still
add a reasonable amount of complexity. They will certainly be challenging
to review and get the design into an acceptable state across the community.
At this point, I'm not really optimistic about there being a large enough
body of community members excited about getting these passes in to offset
these costs. I'm happy to be proven wrong of course, and would also be
happy to see you, other PNaCl developers, or Emscripten developers become
more active in the community in order to build this trust and establish a
good basis for these to go into LLVM.

The immediate reason is that Emscripten is reusing PNaCl's IR passes for
its new "fastcomp" backend [1]. It would be really useful if PNaCl and
Emscripten could collaborate via upstream LLVM rather than a branch.

While this does seem like a useful thing for your two projects, it isn't
clear why this benefits the LLVM community. Perhaps it does, but I'd like
to see that clarified.

I think Alon's point about easing the task for students/people learning (or
playing with) LLVM is pretty strong. People playing around with LLVM today
are tomorrow's contributors. If we can get them to that feeling of "win"
faster, they are more likely to stick with the project.

Some background: There are two related use cases for these IR
simplification passes:

1) Simplifying the task of writing a new LLVM backend. This is
Emscripten's use case. The IR simplification passes reduce the number of
cases a backend has to handle, so they would be useful for anyone else
creating a new backend.

If these simplify writing a backend, why wouldn't the patches include
commensurate simplifications to LLVM's backends? That would both give them
an in-tree customer, and more immediate value to the community and project
as a whole.

I'd also like to add:
If these simplify writing a backend, should there be commensurate changes
to any relevant documentation for getting started writing backends? (we
don't have much such documentation though...)

(such documentation could also be construed as an in-tree customer if
indeed this would simplify it).

2) Using a subset of LLVM IR as a stable distribution format for
portable executables. This is PNaCl's use case. PNaCl's IR subset omits
various complex IR features, which we lower using the IR simplification
passes [2]. Renderscript is an example of another project that uses IR as
a stable distribution format, though I think currently Renderscript is not
subsetting IR much.

Given that the bitcode is stable, I don't understand why this is
important. What technical problems are you solving other than making the IR
match some predetermined form chosen by PNaCl?

Some examples of PNaCl's IR simplification passes are:

I have a bunch of questions about the specific passes you mention. Perhaps
these questions are better answered in the review thread for the patches,
but they are at least things that I would think about and try to address if
and when you send out the code review.

* Calling conventions lowering: ExpandVarArgs and ExpandByVal lower
varargs and by-value argument passing respectively. They would be useful
for any backend that doesn't want to implement varargs or by-value calling
conventions.

Why wouldn't these be applicable to existing backends? What is hard about
the existing representations?

* Instruction-level lowering:
    * ExpandStructRegs splits up struct values into scalars, removing the
"insertvalue" and "extractvalue" instructions.

There are already passes that do this outside of function arguments and
return values. Why is a new one needed? How do you handle the
overflow-detecting operations?

    * PromoteIntegers legalizes integer types (e.g. i30 is converted to
i32).

Does it split up too-wide integers? Do we really want another integer
legalization framework in LLVM? I am actually interested in doing (partial)
legalization in the IR during lowering (codegenprep time) in order to
simplify the backend, but I don't think we should develop such a framework
independently of the legalization currently used in the backends.

* Module-level lowering: This implements, at the IR level,
functionality that is traditionally provided by "ld". e.g. ExpandCtors
lowers llvm.global_ctors to the __init_array_start and __init_array_end
symbols that are used by C libraries at startup.

This doesn't make any sense to me. The IR representation is strictly
simpler. It is trivially lowered in a backend. I don't understand what this
would benefit.

It might be simpler to do in the backend, but I think that the point is
that it is a recurring cost in every backend; in particular for backends
written by people starting out/playing around with LLVM (i.e. potential
future contributors), where any potential performance loss is acceptable
for the sake of simplifying things.

There seems to be plenty of precedent for IR-to-IR lowering passes --
LLVM already contains passes such as LowerInvoke, LowerSwitch and
LowerAtomic.

Note that these are quite different -- they lower from a front-end
convenient form toward the canonical IR form. You are talking about
something totally different that deals with target-oriented lowering. The
correct place to look for analogies is CodeGenPrep.

The PNaCl team (which I'm a member of) is happy to take on the work of
maintaining this code, such as updating it as LLVM IR evolves and doing
code reviews. We would upstream this gradually, pass by pass, so the
changes would be manageable.

While this is appreciated, the PNaCl team should work to much more
actively contribute to the core of LLVM if it wants to be trusted to
maintain this code.

Is eliben still on the PNaCl team? (e.g.
<http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-June/063010.html>)

I'd also like to point out that IR-level passes are pretty much LLVM's
strongest point of decoupling and modularization, so of all code changes to
have no in-tree users (if indeed there are none), this is probably a
best-case scenario from a maintainability perspective (especially if it
becomes the point of collaboration for Emscripten and PNaCl).

-- Sean Silva

The PNaCl project has implemented various IR simplification passes that
simplify LLVM IR by lowering complex features to simpler features. We'd
like to upstream some of these IR passes to LLVM. We'd like to explore if
this is acceptable, and if so, how we should go about doing this.

My question is somewhat different. I'm not questioning whether these are
acceptable, I'm questioning why these are interesting and important for the
LLVM project.

Neither PNaCl nor Emscripten open source projects have extensive
developer overlap with the LLVM community, and the developers have not (so
far) become super active maintainers of LLVM, although your recent patches
to fix some bugs uncovered by PNaCl have been much appreciated. These
lowering passes are likely to have few (most likely, zero) in-tree users
for the foreseeable future. I'm not enthusiastic about the community taking
on the maintenance, update, and code review burden of these.

I would point you at the several emails I have written to folks adding
new significant features to LLVM about how to offset this by contributing
maintenance and improvements to the core infrastructure, fixing bugs and
generally making things better sufficient to offset the ongoing complexity
cost of the new features. Fortunately, the PNaCl passes seem somewhat less
complex than (for instance) the x32 backend, but they seem likely to still
add a reasonable amount of complexity. They will certainly be challenging
to review and get the design into an acceptable state across the community.
At this point, I'm not really optimistic about there being a large enough
body of community members excited about getting these passes in to offset
these costs. I'm happy to be proven wrong of course, and would also be
happy to see you, other PNaCl developers, or Emscripten developers become
more active in the community in order to build this trust and establish a
good basis for these to go into LLVM.

The immediate reason is that Emscripten is reusing PNaCl's IR passes for
its new "fastcomp" backend [1]. It would be really useful if PNaCl and
Emscripten could collaborate via upstream LLVM rather than a branch.

While this does seem like a useful thing for your two projects, it isn't
clear why this benefits the LLVM community. Perhaps it does, but I'd like
to see that clarified.

I think Alon's point about easing the task for students/people learning
(or playing with) LLVM is pretty strong. People playing around with LLVM
today are tomorrow's contributors. If we can get them to that feeling of
"win" faster, they are more likely to stick with the project.

Sure, but I don't think this direction is a necessary step there, or even a
very significant one. I don't think any part of this is going to make it
easier to get up and rolling with LLVM for newcomers.

Some background: There are two related use cases for these IR
simplification passes:

1) Simplifying the task of writing a new LLVM backend. This is
Emscripten's use case. The IR simplification passes reduce the number of
cases a backend has to handle, so they would be useful for anyone else
creating a new backend.

If these simplify writing a backend, why wouldn't the patches include
commensurate simplifications to LLVM's backends? That would both give them
an in-tree customer, and more immediate value to the community and project
as a whole.

I'd also like to add:
If these simplify writing a backend, should there be commensurate changes
to any relevant documentation for getting started writing backends? (we
don't have much such documentation though...)

Very much so, yes.

(such documentation could also be construed as an in-tree customer if
indeed this would simplify it).

I won't go that far. It won't keep it well tested or correct.

2) Using a subset of LLVM IR as a stable distribution format for
portable executables. This is PNaCl's use case. PNaCl's IR subset omits
various complex IR features, which we lower using the IR simplification
passes [2]. Renderscript is an example of another project that uses IR as
a stable distribution format, though I think currently Renderscript is not
subsetting IR much.

Given that the bitcode is stable, I don't understand why this is
important. What technical problems are you solving other than making the IR
match some predetermined form chosen by PNaCl?

Some examples of PNaCl's IR simplification passes are:

I have a bunch of questions about the specific passes you mention.
Perhaps these questions are better answered in the review thread for the
patches, but they are at least things that I would think about and try to
address if and when you send out the code review.

* Calling conventions lowering: ExpandVarArgs and ExpandByVal lower
varargs and by-value argument passing respectively. They would be useful
for any backend that doesn't want to implement varargs or by-value calling
conventions.

Why wouldn't these be applicable to existing backends? What is hard about
the existing representations?

* Instruction-level lowering:
    * ExpandStructRegs splits up struct values into scalars, removing
the "insertvalue" and "extractvalue" instructions.

There are already passes that do this outside of function arguments and
return values. Why is a new one needed? How do you handle the
overflow-detecting operations?

    * PromoteIntegers legalizes integer types (e.g. i30 is converted to
i32).

Does it split up too-wide integers? Do we really want another integer
legalization framework in LLVM? I am actually interested in doing (partial)
legalization in the IR during lowering (codegenprep time) in order to
simplify the backend, but I don't think we should develop such a framework
independently of the legalization currently used in the backends.

* Module-level lowering: This implements, at the IR level,
functionality that is traditionally provided by "ld". e.g. ExpandCtors
lowers llvm.global_ctors to the __init_array_start and __init_array_end
symbols that are used by C libraries at startup.

This doesn't make any sense to me. The IR representation is strictly
simpler. It is trivially lowered in a backend. I don't understand what this
would benefit.

It might be simpler to do in the backend, but I think that the point is
that it is a recurring cost in every backend; in particular for backends
written by people starting out/playing around with LLVM (i.e. potential
future contributors), where any potential performance loss is acceptable
for the sake of simplifying things.

I don't understand this at all.

We have a *target independent* backend. There is only one, so there should
be no recurring cost.

If people are writing a totally independent backend, then the cost of
handling this very trivial construct is ... completely unimportant compared
to the challenge of a new backend.

Also, I don't think this is about performance at all. Today, we have a
clear declarative construct that marks a special "on startup" thing with a
clear spec in the langref. With this patch we'll have an ad-hoc implicit
contract with an implementation detail of some systems libc ABIs. I don't
see how the latter is easier on any level.
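For readers following this disagreement, the two forms can be modelled roughly as below: a declarative list of (priority, function) entries, as in llvm.global_ctors, versus a flat array that startup code walks from start to end. This is a toy Python model with invented names (`lower_ctors`, `run_init_array`), assuming the LangRef rule that lower priorities run first; it makes no claim about what ExpandCtors actually emits.

```python
# Toy model of the two representations being compared.
# Hypothetical names; this is not the ExpandCtors pass.

def lower_ctors(global_ctors):
    """Turn (priority, function) entries into a flat, ordered function array.

    Python's sort is stable, so entries with equal priority keep their
    original relative order.
    """
    ordered = sorted(global_ctors, key=lambda entry: entry[0])
    return [func for _priority, func in ordered]


def run_init_array(init_array):
    """What startup code does with the flat array: call each entry in order."""
    for func in init_array:
        func()


log = []
global_ctors = [
    (65535, lambda: log.append("default")),
    (101, lambda: log.append("early")),
]

init_array = lower_ctors(global_ctors)
run_init_array(init_array)
```

In the flat form the priorities are gone and the ordering is baked in, so the consumer only needs the start and end of the array; the cost is that the contract with the startup code becomes implicit, which is the objection raised above.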

There seems to be plenty of precedent for IR-to-IR lowering passes --
LLVM already contains passes such as LowerInvoke, LowerSwitch and
LowerAtomic.

Note that these are quite different -- they lower from a front-end
convenient form toward the canonical IR form. You are talking about
something totally different that deals with target-oriented lowering. The
correct place to look for analogies is CodeGenPrep.

The PNaCl team (which I'm a member of) is happy to take on the work of
maintaining this code, such as updating it as LLVM IR evolves and doing
code reviews. We would upstream this gradually, pass by pass, so the
changes would be manageable.

While this is appreciated, the PNaCl team should work to much more
actively contribute to the core of LLVM if it wants to be trusted to
maintain this code.

Is eliben still on the PNaCl team? (e.g.
<http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-June/063010.html>)

Nope.

I'd also like to point out that IR-level passes are pretty much LLVM's
strongest point of decoupling and modularization, so of all code changes to
have no in-tree users (if indeed there are none), this is probably a
best-case scenario from a maintainability perspective (especially if it
becomes the point of collaboration for Emscripten and PNaCl).

Yep, it's definitely a best case scenario. Note that I started off saying
that this was less complex than the proposed x32 changes. I think IR passes
are reasonably well factored for this.

However, it does still have a cost. Having fixed bugs in RegionInfo (prior
to the current excellent Polly bots) and deleted a large number of stale IR
passes that were not used, I can say that unused passes cause confusion and
ongoing maintenance headaches. These aren't extreme, they are eminently
surmountable even! But we do need to have something to overcome them, and
currently I'm not seeing it.

<snip>

The PNaCl team (which I'm a member of) is happy to take on the work of
maintaining this code, such as updating it as LLVM IR evolves and doing
code reviews. We would upstream this gradually, pass by pass, so the
changes would be manageable.

While this is appreciated, the PNaCl team should work to much more
actively contribute to the core of LLVM if it wants to be trusted to
maintain this code.

Is eliben still on the PNaCl team? (e.g.
<http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-June/063010.html>)

Technically, no. While I'm still collaborating with the PNaCl team on some
tasks, I'm not likely to be maintaining these passes as part of my day job
(beyond the usual upstream gardening I do from time to time).

That said, personally I think these passes are very useful in upstream
LLVM. For example, just recently I found the constant-expr elimination
extremely handy in a completely PNaCl-unrelated out-of-tree project I'm
working on. Having this code in upstream LLVM would be wonderful --
otherwise I just maintain my own copy. Not only do the simplification
passes make LLVM IR more palatable for non-traditional backends, they also
touch on the very important conundrum of whether LLVM IR is, or can be,
target independent. PNaCl is an interesting proof of concept that LLVM IR
can indeed, under some circumstances, be useful as a target-independent IR.
I think this opens up many interesting opportunities for LLVM.

As for the maintenance cost... the passes are really quite simple in
essence. Moreover, if two very significant projects rely on them (PNaCl is
officially released in Chrome, Emscripten is extremely popular too), it
seems unlikely to me that they will bit-rot.

The detailed technical concerns are very interesting, of course (for
example, where it's most proper to do integer legalization). These should
definitely be discussed in detail on a case-by-case basis, but I don't see
them as a strong reason not to add this to LLVM at all.

Eli

I want to be clear, I'm not claiming they will bitrot. I'm claiming that
they are a technical burden that is being added to that of the community,
and I don't currently see the balancing contributions from the developers
on those projects to the core of LLVM.

Could the project tolerate the burden? I am not optimistic. For example, I
don't think that the community has the free bandwidth to give the technical
review to the patches that they need.

The thing is, there are simple (if not "easy" as it requires a lot of work)
ways to address this: PNaCl folks could become more active contributors to
the project, or they could make the changes have a significant positive
effect on the existing complexity of the system. Or even better, both! But
I've not yet seen real evidence of either.

I share Chandler’s concern. If these aren’t actively used by something in tree, they will bit rot. The way to counter the bit rot would be to add extensive testcases… but that would just add an even larger burden on core LLVM developers to keep them up to date.

We have seen similar “obviously useful” pieces of infrastructure fall to the same fate (e.g., the C backend, which incidentally had very similar utilities back when it was alive). Why would this be any different?

-Chris

Sorry to reply to myself, I thought of something else I should have mentioned before. Emscripten hopes to eventually upstream its JavaScript backend, if there is interest. It’s a work in progress and far from ready for that right now, and there are probably lots of issues to figure out regarding that (again, far too early to get into detail), but one thing will be the dependence of the backend on the PNaCl IR simplification passes - I guess if they are not upstream at that point, we’d have to figure things out then.

- Alon

Just to chime in with another use case, these passes would have been useful for lowering to MSIL in our C++/CLI compiler.

We initially experimented with LLVM as the backend for our C++/CLI compiler but hit upon problems just like these -- and without the skill set to solve them at the time, we ended up resorting to just blasting out bytecode from clang IRGen.

The IRGen kludge "works for us" but it's always been a regret of mine that we missed out on a lot of what makes LLVM great at the last mile. We're kind of stuck with that decision today but +1 for facilities that help others avoid that fate.

I can see how such facilities may appear orthogonal to people working on "real" machine backends but the same could be said for JIT / MCJIT which currently doesn't have in-tree users.

If a real in-tree user would help how about expediting the inclusion of PNaCl or Emscripten? I know developers on both teams and have confidence in their ability to keep with the programme.

Alp.

The PNaCl project has implemented various IR simplification passes that simplify LLVM IR by lowering complex features to simpler features. We’d like to upstream some of these IR passes to LLVM. We’d like to explore if this acceptable, and if so, how we should go about doing this.

My question is somewhat different. I’m not questioning whether these are acceptable, I’m questioning why these are interesting and important for the LLVM project.

I share Chandler’s concern. If these aren’t actively used by something in tree, they will bit rot. The way to counter the bit rot would be to add extensive testcases… but that would just add an even larger burden on core LLVM developers to keep them up to date.

We have seen similar “obviously useful” pieces of infrastructure fall to the same fate (e.g., the C backend, which incidentally had very similar utilities back when it was alive). Why would this be any different?

My reading of the OP suggests that there are at least two projects that depend on these passes in production (and will for the foreseeable future). I wasn’t around when the CBE was added; were there such users for it then? If not, then I would consider that a major difference in the situation.

– Sean Silva

Framing the problem differently, what I see is this: PNaCl (and, by implication elsewhere in the thread, Emscripten and a hypothetical new C backend or MSIL backend) are basically backends that don’t go through the SelectionDAG mechanism and largely bypass the current backend logic that legalizes IR for a backend. The problem is that basically all the targets in the LLVM tree use SelectionDAG and associated mechanisms. Arguably the NVPTX backend might benefit from such an approach (since it ultimately needs to allocate virtual registers), but I’ve never developed any backends, so I don’t know what the tradeoffs are there. In any case, it’s extremely unlikely that code which is only useful for IR-based backends instead of SelectionDAG-based backends could be useful for any in-tree targets. In that frame of model, though, I do see a potential compromise: instead of proposing virtual clones of what is essentially IR legalization for an IR-based backend, why not attempt to generalize the current legalization logic to work for IR-based backends instead of only SelectionDAG-based backends?

Just in case it gets lost in my longer reply, I want to emphasize that if
these will be used to simplify the in-tree backends and those backend
maintainers are on board, then I am *totally* in favor of this going into
the tree. My concerns are heavily based on the fact that as proposed, none
of that seems likely to happen.

Framing the problem differently, what I see is this:
PNaCl (and, by implication elsewhere in the thread, Emscripten and a
hypothetical new C backend or MSIL backend) are basically backends that
don't go through the SelectionDAG mechanism and largely bypass the current
backend logic that legalizes IR for a backend. The problem is that basically
all the targets in the LLVM tree use SelectionDAG and associated mechanisms.
Arguably the NVPTX backend might benefit from such an approach (since it
ultimately needs to allocate virtual registers), but I've never developed
any backends, so I don't know what the tradeoffs are there. In any case,
it's extremely unlikely that code which is only useful for IR-based backends
instead of SelectionDAG-based backends could be useful for any in-tree
targets.

In that frame of model, though, I do see a potential compromise: instead of
proposing virtual clones of what is essentially IR legalization for an
IR-based backend, why not attempt to generalize the current legalization
logic to work for IR-based backends instead of only SelectionDAG-based
backends?

I agree that it should be a lot easier to create a backend.
See the below comment describing a virtual machine architecture.
A complete CPU definition in 54 lines of text!!!
For those interested, the 54 lines is taken from GCHQ.
I think it would be useful to maybe use the below as an example
backend template.

I would also welcome more IR passes in the LLVM tree, even if the
current backends don't use them.
I can probably re-use them in my project.
So long as there were tests for them, and an interested party could
claim "MAINTAINER" for each one.
A similar model to that of the Linux kernel MAINTAINER.
Anyone can submit a device driver to the linux kernel and it will go
into upstream mainline, so long as a "MAINTAINER" is identified.
If the MAINTAINER becomes absent for a period of time, and no one else
claims it, the driver is then removed again.
A single LLVM IR pass is a relatively unobtrusive piece of the LLVM code base.
How would adding a new LLVM IR pass into the upstream LLVM code base
cause problems for the core MAINTAINERS?

Kind Regards

James

exec: function()
  {
    // virtual machine architecture
    // ++++++++++++++++++++++++++++
    //
    // segmented memory model with 16-byte segment size (notation seg:offset)
    //
    // 4 general-purpose registers (r0-r3)
    // 2 segment registers (cs, ds equiv. to r4, r5)
    // 1 flags register (fl)
    //
    // instruction encoding
    // ++++++++++++++++++++
    //
    // byte 1 byte 2 (optional)
    // bits [ 7 6 5 4 3 2 1 0 ] [ 7 6 5 4 3 2 1 0 ]
    // opcode - - -
    // mod -
    // operand1 - - - -
    // operand2 - - - - - - - -
    //
    // operand1 is always a register index
    // operand2 is optional, depending upon the instruction set specified below
    // the value of mod alters the meaning of any operand2
    // 0: operand2 = reg ix
    // 1: operand2 = fixed immediate value or target segment (depending on instruction)
    //
    // instruction set
    // +++++++++++++++
    //
    // Notes:
    // * r1, r2 => operand 1 is register 1, operand 2 is register 2
    // * movr r1, r2 => move contents of register r2 into register r1
    //
    // opcode | instruction | operands (mod 0) | operands (mod 1)
    // -------+-------------+------------------+-----------------
    // 0x00 | jmp | r1 | r2:r1
    // 0x01 | movr | r1, r2 | rx, imm
    // 0x02 | movm | r1, [ds:r2] | [ds:r1], r2
    // 0x03 | add | r1, r2 | r1, imm
    // 0x04 | xor | r1, r2 | r1, imm
    // 0x05 | cmp | r1, r2 | r1, imm
    // 0x06 | jmpe | r1 | r2:r1
    // 0x07 | hlt | N/A | N/A
    //
    // flags
    // +++++
    //
    // cmp r1, r2 instruction results in:
    // r1 == r2 => fl = 0
    // r1 < r2 => fl = 0xff
    // r1 > r2 => fl = 1
    //
    // jmpe r1
    // => if (fl == 0) jmp r1
    // else nop

    throw "VM.exec not yet implemented";
  }
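
As a rough illustration of how little it takes to consume the encoding described in the comment block above, here is a small decoder sketch in Python. It is not part of the quoted code. The bit positions are an assumption, since the column alignment of the bit diagram is lost in the text: opcode in bits 7-5, mod in bit 4, operand1 in bits 3-0, operand2 in the second byte. Which instructions carry a second byte is also an assumption read off the table, and the names `decode` and `has_operand2` are hypothetical.

```python
# Hypothetical decoder for the instruction encoding described above.
# Assumed layout: opcode = bits 7-5, mod = bit 4, operand1 = bits 3-0,
# operand2 = the optional second byte.

MNEMONICS = ["jmp", "movr", "movm", "add", "xor", "cmp", "jmpe", "hlt"]


def has_operand2(opcode, mod):
    """Assumption read off the table: hlt never has a second byte, and
    jmp/jmpe only have one in mod 1 (the r2:r1 form)."""
    name = MNEMONICS[opcode]
    if name == "hlt":
        return False
    if name in ("jmp", "jmpe"):
        return mod == 1
    return True


def decode(code, pc=0):
    """Decode one instruction starting at code[pc].

    Returns (fields, next_pc), where fields is a dict with the mnemonic,
    the mod bit, operand1 and operand2 (None when absent).
    """
    first = code[pc]
    opcode = (first >> 5) & 0x7
    mod = (first >> 4) & 0x1
    operand1 = first & 0xF
    operand2 = None
    size = 1
    if has_operand2(opcode, mod):
        operand2 = code[pc + 1]
        size = 2
    fields = {
        "name": MNEMONICS[opcode],
        "mod": mod,
        "operand1": operand1,
        "operand2": operand2,
    }
    return fields, pc + size
```

For example, `decode(bytes([0x31, 0x2A]))` decodes a `movr` in mod 1 with operand1 = 1 and an immediate of 0x2A, and reports that the next instruction starts at offset 2.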

All,

Thanks for the insights and thoughtful suggestions ventured here. If I may, I’d like to summarize the discussion so far, add a few small points, and suggest a step forward. I’ll begin with a recap.

First, the central objections to placing these passes in-tree seem to center mostly around additional complexity in the code base and lack of testing, etc. As several have pointed out, the complexity of the passes themselves is very low, and in all cases is intended to reduce the complexity of the IR surface exposed to consumers of the bitcode. We believe, as others on this list have expressed, a smaller IR surface to understand and implement enables new users of the LLVM infrastructure.

Second, there are teams that are already using LLVM today for non-traditional applications outside the tree, in products that their respective companies support. For instance, LLVM is already being used in efforts to make native code on the web something real. To plug the idea we’re all pursuing there: bringing native code to the web increases the reach of developers (reminder: there are now over a billion web browser users out there, and the number is growing fast) and has demonstrated advantages for deployment and update. To name one of the projects pursuing this, Mozilla’s Emscripten effort has a JavaScript backend in very active development, and Alon has stated here that he would like to upstream his work eventually. And of course, my own team has been using LLVM since the beginning and has released PNaCl, based on LLVM bitcode, as a feature available in Chrome.

Third, an objection was raised that, if these simplifications are useful, they should be used by in-tree backends. As someone noted, some of the transformations are more like legalizations for backends, and I would imagine that they would be more useful for some than for others (e.g., GEP simplifications may be a better match for a RISC like MIPS than for x86). I propose we get specific about the passes and see which backend maintainers think they might profit from using them. But even if in-tree backends don’t want to use the passes, there seems to be a class of backend that is not yet in the tree that does want to use similar transformations. It may also be that simplifying the IR in these ways will produce more such users.

Fourth, an objection came up that these passes were similar to ones that used to be in LLVM, but were removed. Sometimes ideas just have their right time, and we seem to have found more users than just the C backend.

Last, some have objected that the passes bake in C/C++/ELF runtime library conventions, such as init/fini/ctor/dtor processing and libc startup/teardown. The passes Mark described are not monolithic, and we would be happy to share any or all of them individually if some of these transformations aren’t deemed interesting.

Now to the steps ahead: I propose Mark sends some patches for the simpler passes out to let you, our esteemed colleagues, discuss them concretely. As several have noted, these patches could be interesting to them, and it seems reasonable to pass them along here for the potential benefit of those folks. Also as Alon noted, it would be nice to have this underway before he comes back with his backend work. Your thoughts are respectfully solicited.

Cheers,

David

Here's a data point on the size of the maintenance burden of PNaCl's IR
simplification passes:

I recently updated PNaCl's branch of LLVM from being based on LLVM 3.3 to
3.4. I made 8 changes to the passes and their tests in order to keep the
tests passing. They were:

1) Compile fix: Missing #includes of "llvm/IR/Constants.h" (
https://chromium.googlesource.com/native_client/pnacl-llvm/+/72676e03d8956af100029b42355d25b8ba102785
)
2) Compile fix: Removed code for handling case ranges in the input, since
case ranges were removed in 3.4 (
https://chromium.googlesource.com/native_client/pnacl-llvm/+/be43d266f0919675062d5b875efb370e264a07da
)
3) Compile fix: Changed a use of PassManager to PassManagerBase to fix an
ambiguity (
https://chromium.googlesource.com/native_client/pnacl-llvm/+/6a38d188d29270fc40fa6af6c6c7d36ba4a34637
)
4) Test fix: Fix test expectations after "readonly" attr was added to
memcpy intrinsic (
https://chromium.googlesource.com/native_client/pnacl-llvm/+/7f634ce6f622188cd551dbf283131b03d019583d
)

Some of the changes happened as a result of 'lit' becoming stricter in 3.4:
5) Fixed a mistake that led to a CHECK being ignored (
https://chromium.googlesource.com/native_client/pnacl-llvm/+/051a09c5e14e6dd00352483127db0d13f38d2359
)
6) Added "not" to a test where an error is expected (
https://chromium.googlesource.com/native_client/pnacl-llvm/+/803421ad03f4ac4db30581a8cff10b3d52ac4362
)
7) Fixed a bug in stripping llvm.invariant.end (
https://chromium.googlesource.com/native_client/pnacl-llvm/+/fc31cc7b91935b59b2011b06c5ed49e82e49bf9b
)

The last change was made based on a test failure in end-to-end compiler
tests rather than a test failure in "make test":
8) Changed StripAttributes to strip unrecognised attrs by default (
https://chromium.googlesource.com/native_client/pnacl-llvm/+/d970bf995e01e96c3c9597f333d7d1f251eee71a
)

This is based on looking through the changes listed by "gitk
lib/Transforms/NaCl/ test/Transforms/NaCl/" in the PNaCl branch.

So if PNaCl's simplification passes had been upstream in LLVM 3.3,
developers committing changes to trunk would have had to make 7 additional
changes in the six months between branching 3.3 and branching 3.4 (about
1.2 changes per month, on average). The changes are fairly minor.

Cheers,
Mark

I agree, I don't think that an abstract discussion is useful here, let's talk details.

One point though: I don't understand the claims that this will make it easier to write a backend. The LLVM backend infrastructure already handles the lowering of constantexprs and other constructs for a target. AFAIK, these sorts of lowering passes would only help someone not using our target infrastructure.

Supporting pnacl and emscripten are still worthwhile goals if the maintenance complexity is balanced right, I just have never understood the point about simplifying target descriptions.

-Chris

There's a lot of questions in your post, so I'll focus on the technical
questions about specific IR passes in this first reply...

Some background: There are two related use cases for these IR
simplification passes:

1) Simplifying the task of writing a new LLVM backend. This is
Emscripten's use case. The IR simplification passes reduce the number of
cases a backend has to handle, so they would be useful for anyone else
creating a new backend.

If these simplify writing a backend, why wouldn't the patches include
commensurate simplifications to LLVM's backends? That would both give them
an in-tree customer, and more immediate value to the community and project
as a whole.

That's a good question. I'll have to have a look around in the LLVM
backend code and see what parts could be replaced by one of PNaCl's
simplification passes.

One answer is that, in some cases, such as calling conventions and global
constructor arrays, LLVM's backend is constrained to follow the ABIs for
particular OSes and architectures. Compatibility makes complexity harder
to remove. I'll elaborate more below. This only applies to a few of
PNaCl's IR passes though.

2) Using a subset of LLVM IR as a stable distribution format for
portable executables. This is PNaCl's use case. PNaCl's IR subset omits
various complex IR features, which we lower using the IR simplification
passes [2]. Renderscript is an example of another project that uses IR as
a stable distribution format, though I think currently Renderscript is not
subsetting IR much.

Given that the bitcode is stable, I don't understand why this is important.

Is the bitcode format stable now? I heard talk that LLVM is trying to do
this now, but I don't remember seeing an llvmdev thread stating that for
sure. Was there a thread about it that I missed? I just remember hearing
complaints last year that the format was still getting changed. :-)

* Calling conventions lowering: ExpandVarArgs and ExpandByVal lower
varargs and by-value argument passing respectively. They would be useful
for any backend that doesn't want to implement varargs or by-value calling
conventions.

Why wouldn't these be applicable to existing backends? What is hard about
the existing representations?

For the calling conventions lowering passes, you wouldn't want to use them
in backends that have to match some existing architecture-specific ABI for
calling conventions. For example, if you use ExpandVarArgs on x86, your .o
file won't be able to successfully call the printf() function provided by
libc.so, because the varargs calling conventions won't match.

But for many targets that is not an issue, either because:

* there is no existing architecture-specific ABI that LLVM must match, or
* you're using static linking, or can make similar "closed world"
assumptions, so that a module can use any calling conventions as long as
they're used consistently within the module.

Both of these are true for PNaCl and Emscripten.
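
To make the "closed world" point concrete, here is a toy Python model of one plausible varargs lowering. It is only a sketch, not what ExpandVarArgs actually emits: the caller packs the variable arguments into a single buffer and passes that buffer, and the callee reads them back at running offsets. The function names are hypothetical, and every vararg is assumed to be a 32-bit signed integer.

```python
import struct

# Toy model only: "memory" is a bytes buffer and every variable argument is
# assumed to be a little-endian 32-bit signed integer.


def call_sum_lowered(*values):
    """Caller side: instead of a variadic call, pack the variable arguments
    into one buffer and pass the buffer together with a count."""
    argbuf = struct.pack("<%di" % len(values), *values)
    return sum_lowered(len(values), argbuf)


def sum_lowered(count, argbuf):
    """Callee side: the va_arg equivalent is a plain load at a running offset."""
    total = 0
    offset = 0
    for _ in range(count):
        (value,) = struct.unpack_from("<i", argbuf, offset)
        total += value
        offset += struct.calcsize("<i")
    return total
```

This works only because caller and callee agree on the buffer layout. A callee compiled against the platform's native varargs ABI (such as the printf() in libc.so) would not understand it, which is the incompatibility described above.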

My suspicion is that one or both of these conditions will be true for other
novel backends, such as for specialised architectures like GPUs.

Aside from PNaCl and Emscripten, I am less familiar with other novel
backends. So one of the things I had hoped to learn from this discussion
was whether other backends would find these passes useful. So far we've
had some people say that yes, they would.

  * Instruction-level lowering:

    * ExpandStructRegs splits up struct values into scalars, removing the
"insertvalue" and "extractvalue" instructions.

There are already passes that do this outside of function arguments and
return values. Why is a new one needed?

Are you referring to the work that SelectionDAGBuilder.cpp does to convert
insertvalue/extractvalue to a SelectionDAG? I don't think there's an
IR-to-IR pass in LLVM for doing this, is there?

The reason PNaCl needs an IR-to-IR pass is that PNaCl's stable IR omits
insertvalue/extractvalue, in order to keep the format simple and reduce the
set of constructs that a PNaCl translator implementation needs to handle.
The reason Emscripten's fastcomp uses ExpandStructRegs is to keep
Emscripten's backend simple, in the context that it doesn't use lib/CodeGen.

And the reason we have to handle insertvalue/extractvalue at all is largely
that Clang outputs them for uses of C++ method pointers. Otherwise,
structs-as-registers aren't really used. At least, that was the case in
3.3 -- maybe some more uses have appeared since then.

How do you handle the overflow-detecting operations?

PNaCl has the ExpandArithWithOverflow pass, which lowers uses of
llvm.*.with.overflow.*.
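
For readers unfamiliar with these intrinsics, the sketch below shows in Python how a 32-bit add-with-overflow can be expressed with ordinary arithmetic and comparisons. It illustrates the general idea only. The thread does not say which rewrites ExpandArithWithOverflow performs, and the function names are made up for this example. Values are treated as 32-bit bit patterns, i.e. integers in [0, 2**32).

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def uadd_with_overflow_i32(a, b):
    """Unsigned 32-bit add: returns (wrapped_sum, overflow_flag)."""
    total = (a + b) & MASK32
    # The wrapped sum is smaller than an operand exactly when a carry was lost.
    return total, total < a


def sadd_with_overflow_i32(a, b):
    """Signed 32-bit add on two's-complement bit patterns."""
    total = (a + b) & MASK32
    # Overflow iff both operands share a sign and the result's sign differs.
    overflow = ((a ^ total) & (b ^ total) & SIGN32) != 0
    return total, overflow
```

Both results are produced with nothing more than add, and, xor and compare, which is the kind of instruction set a simplified backend would already have to support.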

    * PromoteIntegers legalizes integer types (e.g. i30 is converted to i32).

Does it split up too-wide integers?

PNaCl's version currently doesn't. Emscripten's fastcomp has a version
which splits up 64-bit integer operations into 32-bit operations, which
they need because Javascript doesn't support 64-bit integer arithmetic.
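
As a sketch of what such a split looks like for a single operation, the Python below adds two 64-bit values using only 32-bit pieces. It is illustrative only and is not taken from Emscripten's pass. The helper names are hypothetical, and the masks emulate 32-bit registers because Python integers are unbounded.

```python
MASK32 = 0xFFFFFFFF


def split64(x):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return x & MASK32, (x >> 32) & MASK32


def join64(lo, hi):
    """Recombine two 32-bit halves into one 64-bit value."""
    return (hi << 32) | lo


def add64_via_32(a_lo, a_hi, b_lo, b_hi):
    """64-bit add expressed as two 32-bit adds plus a carry."""
    lo = (a_lo + b_lo) & MASK32
    carry = 1 if lo < a_lo else 0  # low half wrapped around
    hi = (a_hi + b_hi + carry) & MASK32
    return lo, hi
```

For instance, adding 1 to 0x00000001FFFFFFFF wraps the low half to 0, sets the carry, and yields a high half of 2.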

PNaCl's version didn't need to do that because we were happy to support
64-bit arithmetic in PNaCl's stable ABI. However, we did find that unusual
C bitfields caused Clang to generate integer types larger than 64-bit
(which we don't support in PNaCl's stable ABI), so we started implementing
a pass to split those up. We should probably sync up with Emscripten and
reuse their code for that.

Do we really want another integer legalization framework in LLVM?

At the risk of not answering your question directly, LLVM already has two
instruction selectors, SelectionDAG and FastISel. So another question
might be, when is it OK to have multiple implementations that perform
similar tasks using different approaches, and when is it not OK? What are
the trade-offs involved here?

I am actually interested in doing (partial) legalization in the IR during
lowering (codegenprep time) in order to simplify the backend, but I don't
think we should develop such a framework independently of the legalization
currently used in the backends.

* Module-level lowering: This implements, at the IR level,
functionality that is traditionally provided by "ld". e.g. ExpandCtors
lowers llvm.global_ctors to the __init_array_start and __init_array_end
symbols that are used by C libraries at startup.

This doesn't make any sense to me. The IR representation is strictly
simpler. It is trivially lowered in a backend. I don't understand what this
would benefit.

To elaborate: In PNaCl, pexes are statically linked modules in which
running global constructors is handled by user code inside the pexe. The
special llvm.global_ctors array isn't part of PNaCl's stable subset of IR,
because there's no need for it to be. Running constructors is done in
normal IR by the pexe's entry point, without constructors needing to be
handled specially by PNaCl's IR format.

LLVM's global_ctors construct is incomplete: it provides a mechanism, at
the IR level, to declare functions to be run at startup, but it assumes
that running these functions will be done by a runtime library. At the IR
level, LLVM doesn't provide a way to implement a runtime library that can
read that constructor list. ld linker scripts provide a way to do that --
e.g. on Linux, see /usr/lib/ldscripts/elf_i386.x, which defines
__init_array_{start,end} -- but that's not at the IR level.

ExpandCtors just provides a mechanism for a runtime library to list the
constructor functions, purely at the IR level, without constructors having
to be a special feature in the PNaCl ABI or in the Emscripten backend.
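
A toy Python model of that idea follows. It is a sketch under stated assumptions and not the pass itself: llvm.global_ctors is modelled as a list of (priority, function) pairs, and the lowered form is a flat array plus two indices standing in for __init_array_start and __init_array_end. The function names are hypothetical.

```python
# Toy model: global_ctors is a list of (priority, function) entries; the
# lowered form is a flat list of functions bounded by [start, end).


def expand_ctors(global_ctors):
    """Return (init_array, start, end), with entries in ascending priority.

    sorted() is stable, so entries with equal priority keep their
    original relative order.
    """
    ordered = sorted(global_ctors, key=lambda entry: entry[0])
    init_array = [fn for _priority, fn in ordered]
    return init_array, 0, len(init_array)


def run_init_array(init_array, start, end):
    """What startup code in a runtime library would do: call each entry."""
    for index in range(start, end):
        init_array[index]()
```

After the lowering, the startup loop is ordinary code over an ordinary array, so the consumer of the module needs no special knowledge of constructors.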

There seems to be plenty of precedent for IR-to-IR lowering passes -- LLVM
already contains passes such as LowerInvoke, LowerSwitch and LowerAtomic.

Note that these are quite different -- they lower from a front-end
convenient form toward the canonical IR form.

Those three passes don't lower towards canonical IR form -- unless we are
taking "canonical IR form" to mean quite different things?

LowerInvoke and LowerAtomic both strip out information irreversibly.

LowerAtomic "lowers atomic intrinsics to non-atomic form for use in a known
non-preemptible environment". LowerInvoke strips out exception handling by
converting invokes to calls, so that landingpads, resumes, etc. become dead
and can be removed by a later pass.

(As an aside, LowerInvoke has an option for using SJLJ exception handling,
but that option appears to be unused and replaced
by lib/CodeGen/SjLjEHPrepare.cpp.)

LowerSwitch "rewrites switch instructions with a sequence of branches,
which allows targets to get away with not implementing the switch
instruction until it is convenient".

These three are very similar in function to PNaCl's IR simplification
passes, since they reduce the set of language features that must be
supported by a backend or by a stable IR format.
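
To illustrate what "reducing the set of language features" means for a consumer, here is a minimal Python sketch of the switch case: a multi-way switch is turned into a chain of compare-and-branch steps, after which the consumer only needs to understand comparisons and branches. It is a toy model with hypothetical names, and the in-tree LowerSwitch pass may organise its comparisons differently.

```python
def lower_switch(cases, default):
    """Turn {case_value: target} into an ordered chain of (value, target)
    compare-and-branch steps plus a final fallthrough target."""
    chain = sorted(cases.items())
    return chain, default


def run_lowered(lowered, selector):
    """Execute the lowered form: one equality test and branch per step."""
    chain, default = lowered
    for value, target in chain:
        if selector == value:  # stands in for icmp eq + conditional br
            return target
    return default
```

A call such as `run_lowered(lower_switch({1: "bb1", 7: "bb7"}, "bb_default"), 7)` selects "bb7", and any unmatched selector falls through to "bb_default".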

You are talking about something totally different that deals with
target-oriented lowering. The correct place to look for analogies is
CodeGenPrep.

CodeGenPrepare.cpp just contains optimisations, doesn't it? It doesn't
lower any language features such that the feature is removed from the
module, so it doesn't seem to be analogous to PNaCl's IR simplification
passes, which do do that. e.g. LowerAtomic strips out atomicrmw entirely
so that anything processing LowerAtomic's output doesn't have to handle
atomicrmw at all. Similarly, ExpandByVal expands out "byval" entirely.

If you're looking for backend IR-to-IR passes which lower language
features, DwarfEHPrepare and SjLjEHPrepare are analogous to PNaCl's passes.
DwarfEHPrepare only lowers resume instructions, while SjLjEHPrepare
handles more.

Cheers,
Mark