Should LLVM JIT default to lazy or non-lazy?

In r85295, in response to the discussion at http://llvm.org/PR5184
(Lazy JIT ain't thread-safe), I changed the default JIT from lazy to
non-lazy. It has since come to my attention that this may have been
the wrong change, so I wanted to ask you guys.

A couple reasons to make the default non-lazy compilation:
* The lack of thread-safety surprises new users
* Crashes due to this will be rare and so hard to track down
* The current lazy scheme is almost never the right answer for performance
* It's only one line of code to turn on lazy compilation when it is
the right answer for you.

And a couple to default to lazy compilation:
* It's safe for single-threaded code.
* There are existing users who have assumed this default.
* PPC and ARM don't support non-lazy compilation yet (the tests
currently run the lazy jit).
* Gratuitous changes are bad.

Thoughts?

We can choose the default for lli separately from the JIT's default if we want.

And what might break if this assumption doesn't hold?

If the objection was about changing the sense of a magic bool, why not change the argument to be an enum instead? That should make it extremely clear in the source what behavior is desired.
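
A minimal sketch of what the enum idea could look like. The names here (CompilationMode, ToyJIT, setCompilationMode) are hypothetical and are not part of LLVM's ExecutionEngine API. Dropping the default and requiring an explicit call is an extra assumption of the sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical names throughout; this is an illustration, not LLVM code.
enum class CompilationMode { Eager, Lazy };

class ToyJIT {
public:
  // No default argument: every caller has to name the mode it wants,
  // so the call site documents the intended behavior.
  void setCompilationMode(CompilationMode NewMode) {
    Mode = NewMode;
    ModeSet = true;
  }

  bool isLazy() const {
    assert(ModeSet && "call setCompilationMode() before using the JIT");
    return Mode == CompilationMode::Lazy;
  }

private:
  CompilationMode Mode = CompilationMode::Eager;
  bool ModeSet = false;
};

// Small helper so a call site can be read back in logs or tests.
inline std::string describe(const ToyJIT &J) {
  return J.isLazy() ? "lazy" : "eager";
}
```

A call such as J.setCompilationMode(CompilationMode::Lazy) reads unambiguously, whereas a bare true or false forces the reader to look up which sense the bool has.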

-Chris

They'll wind up compiling the whole tree of transitive calls at
startup, including calls that are dynamically never taken, instead of
compiling only the taken calls gradually over the running time of the
app.
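
To make the difference concrete, here is a toy model (not LLVM code; CallGraph, compileEagerly and compileLazily are made-up names). It assumes eager compilation follows every call that appears in a function body, while lazy compilation only touches functions that a particular run actually enters.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Each function maps to the callees that appear in its body, whether or
// not a given run ever reaches them.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Eager: compiling Root also compiles everything transitively reachable.
std::set<std::string> compileEagerly(const CallGraph &G,
                                     const std::string &Root) {
  assert(G.count(Root) && "root must be present in the call graph");
  std::set<std::string> Compiled;
  std::vector<std::string> Worklist{Root};
  while (!Worklist.empty()) {
    std::string F = Worklist.back();
    Worklist.pop_back();
    if (!Compiled.insert(F).second)
      continue; // already compiled
    auto It = G.find(F);
    if (It != G.end())
      for (const std::string &Callee : It->second)
        Worklist.push_back(Callee);
  }
  return Compiled;
}

// Lazy: only functions that are actually entered at run time get compiled.
std::set<std::string>
compileLazily(const std::vector<std::string> &ExecutionTrace) {
  return std::set<std::string>(ExecutionTrace.begin(), ExecutionTrace.end());
}
```

For a graph where main calls hot and cold, and cold calls rare, compileEagerly compiles all four functions up front; a run that only ever enters main and hot compiles two under the lazy model.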

No, the magic bools all keep the same sense. The change affects the
behavior of programs without a call to
JIT->DisableLazyCompilation(bool). Anyone who's already calling it
with any parameter keeps their current behavior.

I'd be reasonably happy making that call required (so no default) if
people want that instead.

From where I sit, this boils down to a very simple question (modulo
Chris's point): Either choice will surprise some users. Which surprise
is worse? Personally, I'd always prefer correct but slow behavior by
default, and explicitly enabling dangerous (but in some cases fast)
behavior.

I would also point out that it seems that most of the people new to
the JIT are surprised by the current behavior, whereas those who
would be surprised by needing to enable lazy JIT are those long
familiar with past behavior. In the OSS world, I always favor easing
adoption over maintaining the status quo.

My meager 2 cents.
-Chandler

... but I'd be perfectly happy to rename that function and have it
take an enum instead.

Code that takes the default lazy behavior will still work if the
default is changed to non-lazy. The only way non-lazy breaks (as far
as I know) is if you've already done some lazy JITting and left some
lazy stubs lying around. And code that does that will still work,
after adjusting to an API change. This is one of the least disruptive
breaking API changes imaginable.

And getting rid of rare, non-deterministic crashes is always a big win
in my book. We've got enough deterministic bugs to keep us busy :)

I should say that code that *avoids* leaving the lazy stubs in place
will still work. Code that leaves them around will still break
reliably.

Jeffrey Yasskin <jyasskin@google.com> writes:

[snip]

Thoughts?

I didn't notice the existence of a parameter for disabling JIT
laziness. What I do on my code is

    for (Module::iterator it = M->begin(); it != M->end(); ++it)
      EE->getPointerToFunction(&*it);

thus forcing code emission for all functions.

Native code creation on x86/x86_64 is not only very slow; its time
complexity for N LLVM instructions is N^x (x > 1). For a project I'm
working on, the JIT takes almost 1 minute to emit native code for all
functions on a low-end machine. That application acts as a server, and
clients will time-out if it takes more than a few seconds to respond,
which is very likely to happen with a lazy JIT. This is almost as bad as
the thread-safety issue, when the time required for lazy JITting is
similar to the time-out threshold.

IMHO it would be better for LLVM to default to non-lazy JITting.

From where I sit, this boils down to a very simple question (modulo
Chris's point): Either choice will surprise some users. Which surprise
is worse? Personally, I'd always prefer correct but slow behavior by
default, and explicitly enabling dangerous (but in some cases fast)
behavior.

The behavior is only dangerous because people are using it in new and different ways.

I would also point out that it seems that most of the people new to
the JIT are surprised by the current behavior, whereas those who
would be surprised by needing to enable lazy JIT are those long
familiar with past behavior. In the OSS world, I always favor easing
adoption over maintaining the status quo.

This argues for better documentation. I'd prefer for EE to abort if the user is asking for a known dangerous configuration (multi-threaded and lazy).

The biggest argument I have for staying with lazy is that the LLVM JIT is not a lightweight JIT. It's designed to do all the codegen optimizations a normal static compiler would do. Non-lazy JIT is too slow.

I'd prefer not to change the behavior. If we want to start using it in new and interesting ways, we should just design a new JIT.

Evan

From where I sit, this boils down to a very simple question (modulo
Chris's point): Either choice will surprise some users. Which surprise
is worse? Personally, I'd always prefer correct but slow behavior by
default, and explicitly enabling dangerous (but in some cases fast)
behavior.

The behavior is only dangerous because people are using it in new and different ways.

The fact that an interface, when used in new ways, exposes bugs this
severe and subtle indicates that it is a poor interface. Your argument
doesn't lend much weight, because the same can be said of computed
goto, the fork() system call, etc. Surviving new and different uses
*is* the purpose of a good interface.

I would also point out that it seems that most of the people new to
the JIT are surprised by the current behavior, whereas those who
would be surprised by needing to enable lazy JIT are those long
familiar with past behavior. In the OSS world, I always favor easing
adoption over maintaining the status quo.

This argues for better documentation. I'd prefer for EE to abort if the user is asking for a known dangerous configuration (multi-threaded and lazy).

I do not think you can guarantee that the abort occurs if multiple
threads are present. I would also much rather have a speed limit sign than
a dashboard full of speeding tickets to teach me how to drive. (Even
if in that case I opt for both.... ;])

The biggest argument I have for staying with lazy is that the LLVM JIT is not a lightweight JIT. It's designed to do all the codegen optimizations a normal static compiler would do. Non-lazy JIT is too slow.

The JIT doesn't get faster by being lazy; its slowness is just
amortized over the runtime. As several have pointed out, that's not
always desirable, and in some cases is outright terrible. We always
take the same amount of time to actually JIT the code.

At best, the lazy JIT simply sees less code, but for most dynamic
languages, the only code ever given to the JIT is what is already
known to be needed.

I'd prefer not to change the behavior. If we want to start using it in new and interesting ways, we should just design a new JIT.

We clearly do want to use it in new and interesting ways, there is no
'if'. I'm not sure what you mean by 'design a new JIT'...

-Chandler

From where I sit, this boils down to a very simple question (modulo
Chris's point): Either choice will surprise some users. Which surprise
is worse? Personally, I'd always prefer correct but slow behavior by
default, and explicitly enabling dangerous (but in some cases fast)
behavior.

The behavior is only dangerous because people are using it in new and different ways.

The fact that an interface, when used in new ways, exposes bugs this
severe and subtle indicates that it is a poor interface. Your argument
doesn't lend much weight, because the same can be said of computed
goto, the fork() system call, etc. Surviving new and different uses
*is* the purpose of a good interface.

And your point is? We are talking about changing the default. That fixes the interface?

I'd like to see some concrete proposal rather than flipping default behaviors. That doesn't actually fix the problem; it just hides it. There won't be a consensus, since we are lots of users with different needs. The LLVM community has always let the people who are doing the work make design decisions. Let's stick with that policy. Whoever signs up to tackle the thread-safety issue can decide what to do.

Evan

I don't have much to add to this interesting discussion; I'd just like to
remind people that changes in the default behaviour are only
acceptable (IMHO) when everybody (except a few legacy code bases) is NOT using
(or should not be using) the current default behaviour.

The counterexample (and a good illustration of when not to change the default
even when it's dangerous) is strcpy() in libc. It's dangerous,
malicious, bordering on stupid, but still there. You have strncpy (and
all associated variants), but it's neither compulsory nor the default
behaviour.

Creating another method (getLazyFunctionPointer) or passing a boolean,
enum, whatever seems like the best course of action right now.

I understand that passing lots of Function*s to the JIT and only
generating what's really executed would be "nice" but (as was pointed
out before) this is a sign of bad code. You should only pass to the
JIT what's really going to be executed, and I guess it's not that hard
to do the "lazy evaluation" on your side.

My 2 cents...

cheers,
--renato

Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm

Hi,
what's the problem with keeping the default as it is and adding an assert that triggers when the JIT is initialized with both lazy compilation and multithreading active?
Then you could add a paragraph to the release notes telling everyone that the lazy JIT is not thread-safe, and be done with it.
Whether or not this comes with an API change to an enum parameter, who cares?
Cornelius

From where I sit, this boils down to a very simple question (modulo
Chris's point): Either choice will surprise some users. Which surprise
is worse? Personally, I'd always prefer correct but slow behavior by
default, and explicitly enabling dangerous (but in some cases fast)
behavior.

The behavior is only dangerous because people are using it in new and different ways.

I'd point out that Reid thought he made the JIT thread-safe in r22404
back in 2005. Calling it from multiple threads isn't new and
different, it's just subtly broken. I suggested changing the default
because most people who run into this problem don't think they're
doing anything unusual, and in fact their code works fine with the
eager JIT. People shouldn't stumble into broken behavior, and defaults
are good to prevent stumbling.

To avoid misconceptions, Unladen Swallow added the line to turn off
the lazy jit months ago, and we'll keep that line whatever the
decision is here. We might take advantage of a thread-safe
code-patching facility eventually, but we've been designing assuming
nobody else will implement that for us. I favor changing the default
because it'll help newbies, not because we need it for anything.

I would also point out that it seems that most of the people new to
the JIT are surprised by the current behavior, whereas those who
would be surprised by needing to enable lazy JIT are those long
familiar with past behavior. In the OSS world, I always favor easing
adoption over maintaining the status quo.

This argues for better documentation. I'd prefer for EE to abort if the user is asking for a known dangerous configuration (multi-threaded and lazy).

I think that'd be a decent fix for the thread-safety problem. It'd
require an extra check on each call to runJITOnFunctionUnlocked since
laziness is set by a call, not on construction. Or, we could use the
JIT lock to assert that it's never entered concurrently. On the other
hand, Nicolas Geoffray objected when I suggested that in the bug.
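
A rough sketch of the "assert that it's never entered concurrently" variant, using an atomic flag rather than the JIT's real lock. SingleEntryChecker and LazyCompileScope are invented names; where such a scope would actually sit in the JIT is left open here.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical guard: notices a second entry into a region that is only
// safe for one thread at a time.
class SingleEntryChecker {
public:
  // Returns true if the caller is the only one inside, false on overlap.
  bool tryEnter() { return !Busy.exchange(true, std::memory_order_acquire); }
  void leave() { Busy.store(false, std::memory_order_release); }

private:
  std::atomic<bool> Busy{false};
};

// RAII scope: checks for overlapping entry only while lazy mode is enabled,
// since laziness is toggled by a call rather than fixed at construction.
class LazyCompileScope {
public:
  LazyCompileScope(SingleEntryChecker &C, bool LazyEnabled)
      : Checker(C), Entered(false) {
    if (LazyEnabled) {
      Entered = Checker.tryEnter();
      assert(Entered && "lazy JIT entered concurrently; it is not thread-safe");
    }
  }
  ~LazyCompileScope() {
    if (Entered)
      Checker.leave();
  }

private:
  SingleEntryChecker &Checker;
  bool Entered;
};
```

A check like this only fires when two entries really overlap in time, so it can miss a race on any given run, and as written it would also trip on re-entrant compilation from the same thread.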

The biggest argument I have for staying with lazy is that the LLVM JIT is not a lightweight JIT. It's designed to do all the codegen optimizations a normal static compiler would do. Non-lazy JIT is too slow.

Óscar used the cost of the JIT as an argument _against_ the lazy JIT.
Could you elaborate on why you think it's an argument in favor of lazy
JITting?

I'd prefer not to change the behavior.

I'm guessing, based on your vagueness, that there are some internal
single-threaded Apple users who think that the lazy JIT is the best
performing option for them and who don't want to add a line to their
apps overriding the default. But it's hard for me or anyone else
outside Apple to judge whether they ought to drive the default without
a little more information about why the lazy JIT is good for them.
Could you describe any features of their use that make the lazy JIT a
good idea for them?

I think that this is a great idea. Instead of making "lazy or not" be a policy maintained by the JIT, why don't we approach this as a bug in the current API? Perhaps we should remove getPointerToFunction() and introduce two new methods (one lazy and one eager)?
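
One possible shape for the two-method idea, as a toy engine rather than LLVM code. getPointerToFunctionEager, getPointerToFunctionLazy and ToyEngine are hypothetical names, and "compiling" is modeled as copying a callable into a cache.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Toy stand-in for a JIT; only getPointerToFunction is a name from the
// thread, everything else is invented for illustration.
class ToyEngine {
public:
  using Fn = std::function<int(int)>;

  void addFunction(const std::string &Name, Fn Body) {
    Source[Name] = std::move(Body);
  }

  // Eager: pay the compile cost now and return ready-to-run code.
  Fn getPointerToFunctionEager(const std::string &Name) {
    return compile(Name);
  }

  // Lazy: return a stub that compiles the function on its first call.
  Fn getPointerToFunctionLazy(const std::string &Name) {
    return [this, Name](int X) { return compile(Name)(X); };
  }

  int compileCount() const { return Compiles; }

private:
  // Unsynchronized on purpose: the stub mutates shared state at call time.
  const Fn &compile(const std::string &Name) {
    auto It = Compiled.find(Name);
    if (It == Compiled.end()) {
      assert(Source.count(Name) && "unknown function");
      ++Compiles;
      It = Compiled.emplace(Name, Source[Name]).first;
    }
    return It->second;
  }

  std::map<std::string, Fn> Source, Compiled;
  int Compiles = 0;
};
```

With this split the caller chooses per function pointer, and the lazy stub makes it visible that compilation (and the shared-state mutation that goes with it) happens at call time, which is where the thread-safety concern comes from.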

-Chris

A performance argument in favor of lazy JIT:

If you're creating functions that contain calls to large amounts of
seldom-used code (or that take the address of a ton of functions, only a
few of which end up being used), then lazy JIT is a win regardless of
how carefully you only JIT what you want to call. I'm not sure how
frequent a use case this is, though.

Same as when operating on sparse matrices, split the matrix into
smaller pieces and only operate on them. If you have a huge function
(bad design from start), split into smaller functions and only JIT
what's needed, when needed.

It may not be obvious how to split the function? Split the same way
you would lazy-JIT it and you're done. You can even add it as a
FunctionPass after all optimizations are finished.

cheers,
--renato


Didn't want to sound too radical, but that'd be my approach... ;)

cheers,
--renato
