JIT allocates global data in function body memory

So I (think I) found a bug in the JIT:
http://llvm.org/bugs/show_bug.cgi?id=4483

Basically, globals used by a function are allocated in the same buffer
as the first code that uses it. However, when you free the machine
code, you also free the memory holding the global's data. The address
is still in the GlobalValue map, so any other code using that global
will access freed memory, which will cause problems as soon as you
reallocate that memory for something else.

I tracked down the commit that introduced the bug:
http://llvm.org/viewvc/llvm-project?view=rev&revision=54442

It very nicely explains what it does, but not why it does it, which
I'd like to know before I change it. I couldn't find the author
(johannes) on IRC so ssen told me to ask LLVMdev about this behavior.
There's even a patch to work around this behavior on Apple ARM
platforms:
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/JIT/JIT.cpp?view=diff&pathrev=72630&r1=58687&r2=58688

So what should the right long-term behavior be? It makes sense to me
to use the JITMemoryManager for this so that clients of the JIT can
customize allocation instead of using malloc or new char. On the
other hand, that complicates the API and requires a homegrown malloc
implementation in the DefaultMemoryManager.

Reid

So I (think I) found a bug in the JIT:
http://llvm.org/bugs/show_bug.cgi?id=4483

Basically, globals used by a function are allocated in the same buffer
as the first code that uses it. However, when you free the machine
code, you also free the memory holding the global's data. The address
is still in the GlobalValue map, so any other code using that global
will access freed memory, which will cause problems as soon as you
reallocate that memory for something else.

I tracked down the commit that introduced the bug:
http://llvm.org/viewvc/llvm-project?view=rev&revision=54442

It very nicely explains what it does, but not why it does it, which
I'd like to know before I change it. I couldn't find the author
(johannes) on IRC so ssen told me to ask LLVMdev about this behavior.

That's me (and I'm not on IRC because I like messages to be archived). The reason everything needs to go in the same buffer is that we're JITting code on one machine, then sending it to another to be executed, and references from one buffer to another won't work in that environment. So that model needs to continue to work. If you want to generalize it so other models work as well, go ahead.

That's me (and I'm not on IRC because I like messages to be
archived). The reason everything needs to go in the same buffer is
that we're JITting code on one machine, then sending it to another to
be executed, and references from one buffer to another won't work in
that environment. So that model needs to continue to work. If you
want to generalize it so other models work as well, go ahead.

Maybe what I should do then is change
TargetJITInfo::allocateSeparateGVMemory to allocateGVsWithCode and
invert the meaning, since I feel like most users probably just want
malloc or something similar. You could then subclass the appropriate
TJI class and override that method. Would that be a reasonable API
change? No one else calls or overrides that method. In order to do
that we'd also need to hear from Evan about why Apple ARM needs to use
malloc.

Is it worth allocating global data through the memory manager, or is
better to use malloc? Currently global data lives forever (or rather,
as long as the JIT does, which makes sense since they live forever in
a statically compiled program), so I was thinking it might be good for
memory usage and locality to put small globals together into a buffer
similar to the function stub buffer. That way you don't call malloc
for every global int64, and you can lay them out sequentially in
memory. Large global arrays should probably get their own block.

Reid

So, you're moving code across machines without running any relocations
on it? How can that work? Are you just assuming that everything winds
up at the same addresses? Or is everything PC-relative on your
platform, so all that matters is that globals and the code are in the
same relative positions?

How are you getting the size of the code you need to copy?
MachineCodeInfo didn't exist when you wrote this patch, so I assume
you've written your own JITMemoryManager. Even then, if you JIT more
than one function, and they share any globals, you have to deal with
multiple calls into the MemoryManager and functions that use globals
allocated inside other buffers. You should be able to deal with having
separate calls to allocate global space and allocate code space. You'd
just remember the answers you gave and preserve them when copying to a
new system.

I'd like freeMachineCodeForFunction to avoid corrupting emitted
globals, and with the current arrangement of information within the
JIT, that means globals and code have to live in different
allocations. I think Reid's suggesting a flag of some sort, with one
setting for "freeMachineCodeForFunction works" and another for
"globals and code are allocated by a single call into the
MemoryManager." I'd like to avoid new knobs if it's possible, so do
you really need that second option? Or do you just need globals to be
allocated by some call into the MemoryManager?

Thanks!
Jeffrey

So I (think I) found a bug in the JIT:
http://llvm.org/bugs/show_bug.cgi?id=4483

Basically, globals used by a function are allocated in the same buffer
as the first code that uses it. However, when you free the machine
code, you also free the memory holding the global's data. The address
is still in the GlobalValue map, so any other code using that global
will access freed memory, which will cause problems as soon as you
reallocate that memory for something else.

I tracked down the commit that introduced the bug:
http://llvm.org/viewvc/llvm-project?view=rev&revision=54442

It very nicely explains what it does, but not why it does it, which
I'd like to know before I change it. I couldn't find the author
(johannes) on IRC so ssen told me to ask LLVMdev about this behavior.

That's me (and I'm not on IRC because I like messages to be
archived). The reason everything needs to go in the same buffer is
that we're JITting code on one machine, then sending it to another to
be executed, and references from one buffer to another won't work in
that environment. So that model needs to continue to work. If you
want to generalize it so other models work as well, go ahead.

So, you're moving code across machines without running any relocations
on it? How can that work? Are you just assuming that everything winds
up at the same addresses? Or is everything PC-relative on your
platform, so all that matters is that globals and the code are in the
same relative positions?

I am not the people actually doing this, I am the guy who changed llvm JIT handling so that this model would work. I believe everything is PC-relative, but I don't know details (and probably couldn't talk about them on a public list if I did). I don't think those guys do any freeing, so they don't have your problem.

The current model where code and data share a buffer needs to continue to work, and I have a fairly strong preference (and so will our client) that whatever you do should not require any changes to the existing client code. Beyond that, I am not the kind of person who thinks there's only one way to do things; I won't object to what you do as long as it doesn't break what we're using now.

Dale Johannesen wrote:

So I (think I) found a bug in the JIT:
http://llvm.org/bugs/show_bug.cgi?id=4483

Basically, globals used by a function are allocated in the same
buffer
as the first code that uses it. However, when you free the machine
code, you also free the memory holding the global's data. The
address
is still in the GlobalValue map, so any other code using that global
will access freed memory, which will cause problems as soon as you
reallocate that memory for something else.

I tracked down the commit that introduced the bug:
http://llvm.org/viewvc/llvm-project?view=rev&revision=54442

It very nicely explains what it does, but not why it does it, which
I'd like to know before I change it. I couldn't find the author
(johannes) on IRC so ssen told me to ask LLVMdev about this
behavior.

That's me (and I'm not on IRC because I like messages to be
archived). The reason everything needs to go in the same buffer is
that we're JITting code on one machine, then sending it to another to
be executed, and references from one buffer to another won't work in
that environment. So that model needs to continue to work. If you
want to generalize it so other models work as well, go ahead.

So, you're moving code across machines without running any relocations
on it? How can that work? Are you just assuming that everything winds
up at the same addresses? Or is everything PC-relative on your
platform, so all that matters is that globals and the code are in the
same relative positions?

I presume (hope, really) that we don't end up with code and data in the
same page. From
IntelĀ® 64 and IA-32 Architectures Optimization Reference Manual:
Assembly/Compiler Coding Rule 57. (H impact, L generality) Always put
code and data on separate pages.

Sorry, I guess you know this already.

Andrew.

I presume (hope, really) that we don't end up with code and data in the
same page. From
IntelĀ® 64 and IA-32 Architectures Optimization Reference Manual:
Assembly/Compiler Coding Rule 57. (H impact, L generality) Always put
code and data on separate pages.

Hahaha, nope. So this change could actually be a performance win!

Dale, here's a preliminary patch:
Issue 90053: Fix the LLVM JIT to allocate globals outside function bodies - Code Review (also attached)

Would this kind of change be acceptable to your mysterious client who
won't fully disclose their requirements? All they have to do is flip
a static boolean flag (it's ugly, I know, but I wanted it to be easy
to change programmatically so a CLI flag takes more work).

Reid

JITMemoryManager.diff (17.4 KB)

We have been JITing kernels and having a memory manager for globals
would be a big win there (clean up a few hacks to make things go in
the correct locations).

Andrew

We have been JITing kernels and having a memory manager for globals
would be a big win there (clean up a few hacks to make things go in
the correct locations).

I'm also guessing that Dale's client at Apple is using a custom memory
manager, since without doing that there's no way to get the size of
the code block in order to send it over the wire. Adding an
allocateGlobal method would probably allow them to trap it and also
send it over the wire, but they say they don't want to make any code
changes.

I'd like to get some guidance on what's acceptable before I send a
patch to llvm-commits, though.

Reid

I've asked Evan to look at your patch; I really haven't looked at that area since I put that patch in, and he knows it better.

Apple's security restrictions are annoying at time, trust me, I know. It's not the decision of anybody working on LLVM.

Sorry I am late to the thread. I've been meaning to find the time to respond properly.

1. Yes, the default behavior is to keep GV and function in the same memory buffer. The reason is the JIT can be used in a client and server environment. This makes it possible without doing expensive relocation on the fly. In fact, the currently implementation doesn't allow the client to do relocation since relocation is done by the JIT. That can be changed, of course.
2. There is a hook to change the behavior, i.e. allocating GV separately with malloc. The API is not fully flushed out so it's done with malloc (which is how it was done before #1).
3. We *can* consider changing the default. But we need to do it gently. I'd prefer to complete the API to support both models and *then* consider making the change. We do not want to disrupt existing clients.
4. Most *real* implementation should use its down custom memory manager. The default manager (or malloc for GVs) works for lli. But that's a command line tool for testing, it isn't necessarily how things must be done.

I am glad to see the push to design and implement a proper API to support both models. I intend to review the code later today. Sorry about the delay.

Evan

The patch looks good to me. But we cannot allow AllocateGVsWithCode to be initialized to be false yet. Can you add a mechanism to define the behavior when the JIT is created / initialized?

Also, I am not crazy about this being moved to JITMemoryManager.h. This seems like implementation detail that should be kept hidden.

+ // If the PoisonMemory parameter is true, freed memory should be overwritten
+ // with garbage. This parameter is useful for testing and debugging.
+#ifdef NDEBUG
+#define POISON true
+#else
+#define POISON false
+#endif

Thanks,

Evan

The patch looks good to me. But we cannot allow AllocateGVsWithCode to
be initialized to be false yet. Can you add a mechanism to define the
behavior when the JIT is created / initialized?

That makes four optional arguments to ExecutionEngine::create. Do you
mind if I go ahead and add an EngineBuilder?

Also, I am not crazy about this being moved to JITMemoryManager.h.
This seems like implementation detail that should be kept hidden.

+ // If the PoisonMemory parameter is true, freed memory should be
overwritten
+ // with garbage. This parameter is useful for testing and debugging.
+#ifdef NDEBUG
+#define POISON true
+#else
+#define POISON false
+#endif

I wasn't crazy about it either, but it was hard to construct a
unittest that fails properly under release mode without it. It also
seems like it would be a useful feature for writing unit tests for
other custom memory managers. I can either axe it and let the test
only do proper testing in debug mode, or I could add a
setPoisonMemory(bool poison) method to the JITMemoryManager interface.
Which would you prefer?

Reid

The patch looks good to me. But we cannot allow AllocateGVsWithCode to
be initialized to be false yet. Can you add a mechanism to define the
behavior when the JIT is created / initialized?

That makes four optional arguments to ExecutionEngine::create. Do you
mind if I go ahead and add an EngineBuilder?

Please, thanks.

Also, I am not crazy about this being moved to JITMemoryManager.h.
This seems like implementation detail that should be kept hidden.

+ // If the PoisonMemory parameter is true, freed memory should be
overwritten
+ // with garbage. This parameter is useful for testing and debugging.
+#ifdef NDEBUG
+#define POISON true
+#else
+#define POISON false
+#endif

I wasn't crazy about it either, but it was hard to construct a
unittest that fails properly under release mode without it. It also
seems like it would be a useful feature for writing unit tests for
other custom memory managers. I can either axe it and let the test
only do proper testing in debug mode, or I could add a
setPoisonMemory(bool poison) method to the JITMemoryManager interface.
Which would you prefer?

My preference would be to add a method to JITMemoryManager.

Thanks,

Evan

That makes four optional arguments to ExecutionEngine::create. Do you
mind if I go ahead and add an EngineBuilder?

Please, thanks.

This turned out to be somewhat involved and should probably be a
separate patch. I started it, so I'll send it along later. The patch
I'd like to submit now is attached and linked.
http://codereview.appspot.com/90053/show

Reid

issue90053_2001.diff (26.1 KB)

Looks fine to me. Thanks. Can you commit it?

Evan

I've committed it for Reid. Thanks for the review!