NP-hard problems in the LLVM optimizer?

Hi all.

I’ve heard a couple of times that some of the problems solved by various
passes in the optimizer are actually NP-hard, even though the instances
are small enough to be solved quickly in practice.

Is this true? If so, which are these problems?
Register allocation? Instruction scheduling?

Are they solved exactly or by approximation?
Or not solved at all (is the need to solve them avoided in some way)?

Greetings,
Nicola

Hi all.

I’ve heard a couple of times that some of the problems solved by various
passes in the optimizer are actually NP-hard, even though the instances
are small enough to be solved quickly in practice.

Is this true?

Some are NP-hard, some are NP-complete.

If so, which are these problems?
Register allocation? Instruction scheduling?

It depends on how you define register allocation (i.e. optimal register
coloring, optimal spilling, optimal copy coalescing, optimal
rematerialization, etc.), but even some of the subtasks are NP-complete for
normal programs.

You can also prove that strict SSA programs have properties that make
some of these subtasks *not* NP-complete when taken in isolation. For
example, register coloring can be done optimally because strict SSA
programs produce chordal interference graphs, *despite* the fact that
graph coloring is NP-complete in general.

But in general, even the seemingly simple task of "figuring out the smallest
number of registers it takes to color a given register interference graph"
is NP-complete.
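
To make the chordal-graph point concrete, here is a minimal sketch (a toy
example of mine, not LLVM code) of greedy coloring along an order in which
each vertex's already-colored neighbors form a clique (the reverse of a
perfect elimination order, which every chordal graph has). With such an
order the greedy choice never needs more colors than the largest clique, so
it is optimal; for general graphs no such order exists and greedy coloring
can be arbitrarily bad.

```cpp
// Toy sketch, not LLVM code: greedy coloring of an interference graph
// along an order in which every vertex's already-colored neighbors form
// a clique (the reverse of a perfect elimination order of a chordal graph).
#include <algorithm>
#include <cstdio>
#include <set>
#include <vector>

int greedyColor(const std::vector<std::vector<int>> &adj,
                const std::vector<int> &order, std::vector<int> &color) {
  int maxColor = -1;
  for (int v : order) {
    std::set<int> used;
    for (int n : adj[v])
      if (color[n] >= 0)        // neighbor already colored
        used.insert(color[n]);
    int c = 0;
    while (used.count(c))       // smallest color unused by colored neighbors
      ++c;
    color[v] = c;
    maxColor = std::max(maxColor, c);
  }
  return maxColor + 1;          // number of colors (registers) used
}

int main() {
  // Tiny chordal interference graph: triangle {0,1,2} plus edge 2-3.
  std::vector<std::vector<int>> adj = {{1, 2}, {0, 2}, {0, 1, 3}, {2}};
  std::vector<int> color(adj.size(), -1);
  int k = greedyColor(adj, {0, 1, 2, 3}, color);
  std::printf("colors used: %d\n", k);   // 3 == size of the largest clique
}
```

As far as I understand it, for strict SSA you get such an order essentially
for free by visiting values in an order consistent with dominance, which is
what the SSA-based allocation papers exploit.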

Scheduling is also full of NP-complete and NP-hard problems.
Even if you restrict instruction scheduling to a fixed 2-cycle
latency (i.e. no funky pipeline constraints, etc.), it's *still* NP-hard (see
http://dl.acm.org/citation.cfm?id=155183.155190).

Are they solved exactly or by approximation?

Approximations in some cases, not trying in others (i.e. not even
approximating optimality, or considering optimality in the formulation, but
just trying to do something that seems to turn out well in practice). I
wouldn't call these approximations, since they are not related to solving
the NP-complete problems, just to generating good code.

The only case I'm aware of that uses real exact/approximation solutions is
PBQP, which solves a subclass of the NP-complete problem it attacks in
linear time.
I believe it uses heuristics to approximate an optimal answer for the rest,
but don't quote me on that :slight_smile:
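
For a flavor of the part PBQP does solve exactly, here is a toy sketch of
the degree-1 reduction (my own illustration, not LLVM's PBQP allocator): a
node whose only neighbor is v can be eliminated by folding its best-case
cost into v's cost vector, and repeating that solves tree-shaped cost graphs
exactly in linear time; anything that survives the cheap reductions gets
handed to a heuristic.

```cpp
// Toy PBQP degree-1 ("R1") reduction, my own illustration rather than
// LLVM's PBQP allocator.  Each node carries a cost vector over its
// register choices; each edge carries a cost matrix.  A degree-1 node u
// is eliminated by adding min_i(costU[i] + edge[i][j]) to costV[j].
#include <cstdio>
#include <limits>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Fold degree-1 node u into its only neighbor v.  Returns, for each
// choice j of v, the choice of u that realizes the minimum, so u's
// assignment can be recovered once v has been decided.
std::vector<int> reduceR1(const Vec &costU, const Mat &edge, Vec &costV) {
  std::vector<int> bestU(costV.size(), 0);
  for (size_t j = 0; j < costV.size(); ++j) {
    double best = std::numeric_limits<double>::infinity();
    for (size_t i = 0; i < costU.size(); ++i)
      if (costU[i] + edge[i][j] < best) {
        best = costU[i] + edge[i][j];
        bestU[j] = static_cast<int>(i);
      }
    costV[j] += best;
  }
  return bestU;
}

int main() {
  // Two virtual registers u and v with two register choices each.
  Vec costU = {0, 2};   // u prefers choice 0 on its own
  Vec costV = {1, 0};   // v prefers choice 1 on its own
  Mat edge = {{5, 0},   // big penalty if u and v both pick choice 0
              {0, 0}};
  std::vector<int> bestU = reduceR1(costU, edge, costV);
  // v is now unconstrained: pick its cheapest choice, then back-substitute.
  int vChoice = costV[0] <= costV[1] ? 0 : 1;
  std::printf("v -> choice %d, u -> choice %d\n", vChoice, bestU[vChoice]);
}
```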

Or not solved at all (is the need to solve them avoided in some way)?

It only avoids them in the sense that it simply ignores optimality as a
constraint.

The only case I'm aware of that uses real exact/approximation solutions is
PBQP, which solves a subclass of the NP-complete problem it attacks in
linear time. I believe it uses heuristics to approximate an optimal answer
for the rest, but don't quote me on that :slight_smile:

Yep, that's the general idea. Optimal for some simple graphs,
domain-specific heuristic for the complex parts.

Thank you very much for your reply!

Can you point me to more papers where I can find more details about the
optimization problems involved in the optimization (sigh) of a program’s code?

Thank you very much again,
Nicola