FP Constants spilling to memory in x86 code generation

Hello,

I've tracked down a new memory leak which started happening when I enabled constant propagation (I created my own passmanager which I run before calling getPointerToGlobal to JIT the function I have generated - is this OK?). The reason is that FP constants are being spilled to memory as there is no immediate fp load available. In my case this is a bit unnecessary since the constant is in fact a (constant) global variable which I have already added a memory mapping for. But the constant folding can of course end up making new constants which _have_ to be spilled to memory.

I guess what I'd like to know is if the process of spilling constants to memory could be a bit more controlled, maybe using the JIT memory manager and putting it in with the function stubs? If not, I have to add tracking to the allocations here so they can be freed when the function is deleted. This is not so easy since there is no reference to the function which we are generating machine code for inside the copyConstantToRegister function. Any ideas?

m.

PS. Regarding my earlier post about function stubs: the stub generation is working fine for the x86 target. It was Visual Studio generating the stubs I was seeing, because I have Fix&Continue turned on :P

Hello,

I've tracked down a new memory leak which started happening when I enabled constant propagation (I created my own passmanager which I run before calling getPointerToGlobal to JIT the function I have generated - is this OK?).

Yes, that's perfectly ok! The interface to do this could probably be improved, but I definitely intended for JIT applications like yours to add standard (and even custom) passes to the pass manager to be run automatically before the JIT runs.

The reason is that FP constants are being spilled to memory as there is no immediate fp load available. In my case this is a bit unnecessary since the constant is in fact a (constant) global variable which I have already added a memory mapping for. But the constant folding can of course end up making new constants which _have_ to be spilled to memory.

Okay, if you're using X86 FP math, there is really little choice except to spill the FP constant to the constant pool for a function. This should be completely transparent to you.

I guess what I'd like to know is if the process of spilling constants to memory could be a bit more controlled, maybe using the JIT memory manager and putting it in with the function stubs?

Yes, this can and should definitely be improved. If you look at ExecutionEngine/JIT/JITEmitter.cpp:emitConstantPool, you can see that the JIT is just new'ing a block of memory for every constant pool that is needed. This is, admittedly, antisocial for your application, so if you'd like to make a memory manager for it, feel free.

If not, I have to add tracking to the allocations here so they can be freed when the function is deleted. This is not so easy since there is no reference to the function which we are generating machine code for inside the copyConstantToRegister function. Any ideas?

I think that adding something like the JITMemoryManager for the constant pools would make sense. I'm not sure that reusing the JITMemoryManager is a great idea, though it could be done. In particular, some architectures have cache problems when data and code live too close to each other. I'm not familiar with the details, but it seems safe to put constants somewhere that is not intentionally close to the code. Perhaps others have a more informed opinion about this than I do.

PS. Regarding my earlier post about function stubs: the stub generation is working fine for the x86 target. It was Visual Studio generating the stubs I was seeing, because I have Fix&Continue turned on :P

Ah, ok, good deal! :)

-Chris

Chris Lattner wrote:

I guess what I'd like to know is if the process of spilling constants to memory could be a bit more controlled, maybe using the JIT memory manager and putting it in with the function stubs?

Yes, this can and should definitely be improved. If you look at ExecutionEngine/JIT/JITEmitter.cpp:emitConstantPool, you can see that the JIT is just new'ing a block of memory for every constant pool that is needed. This is, admittedly, antisocial for your application, so if you'd like to make a memory manager for it, feel free.

I think that adding something like the JITMemoryManager for the constant pools would make sense. I'm not sure that reusing the JITMemoryManager is a great idea, though it could be done. In particular, some architectures have cache problems when data and code live too close to each other. I'm not familiar with the details, but it seems safe to put constants somewhere that is not intentionally close to the code. Perhaps others have a more informed opinion about this than I do.

I have made a patch along these lines. Although I reused the JITMemoryManager object, I am allocating constant pools from a separate block of memory. This fixes my remaining leaks. It would be nice if the global variables were also allocated this way, but it's not needed for my application since I'm managing that memory myself and using ExecutionEngine::addGlobalMapping.

Later on I'm going to need either a way of freeing memory for functions/constant pools or a way of recovering from out of memory, as our application is going to run as a server and hopefully be happily JIT'ing away for days on end. For the moment I will just delete the whole ExecutionEngine object and recompile everything every now and again. Although not a perfect solution, at least it works.

m.

diff.txt (3.58 KB)

I have made a patch along these lines. Although I reused the JITMemoryManager object, I am allocating constant pools from a separate block of memory. This fixes my remaining leaks. It would be nice if the global variables were also allocated this way, but it's not needed for my application since I'm managing that memory myself and using ExecutionEngine::addGlobalMapping.

Ok, sounds good. Here are some comments:

1. This does not apply cleanly to mainline CVS, please update and try
    again :)
2. Please keep lines within 80 columns.
3. This will fail if the JIT wants to allocate more than 512K of
    constants. Can you just have it allocate another block of memory if it
    runs out of space? Also, it might be useful to start the initial block
    much smaller, say 4K of memory, and double it when space is
    exhausted, as most programs don't use 1/2 meg of constant pools :)
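
The growth policy suggested in item 3 (start at 4K, add a larger block when space runs out) could look roughly like the sketch below. This is not the code from the patch and not the real JITMemoryManager: `ConstantPoolArena` and all of its members are hypothetical names, and the sketch uses plain heap memory rather than whatever the JIT actually allocates from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical arena for constant pools. NOT the real JITMemoryManager;
// every name here is made up for illustration.
class ConstantPoolArena {
  struct Block {
    std::vector<unsigned char> Mem; // backing storage, never resized
    std::size_t Used;               // bytes handed out so far
  };

  std::deque<Block> Blocks;  // deque: existing blocks never move
  std::size_t NextBlockSize; // size of the next block to allocate

public:
  explicit ConstantPoolArena(std::size_t InitialSize = 4096)
      : NextBlockSize(InitialSize) {
    assert(InitialSize > 0 && "initial block size must be non-zero");
  }

  // Returns Size bytes aligned to Align, which must be a power of two.
  void *allocate(std::size_t Size, std::size_t Align) {
    assert(Align != 0 && (Align & (Align - 1)) == 0 && "bad alignment");

    if (!Blocks.empty()) {
      Block &B = Blocks.back();
      std::uintptr_t Base = reinterpret_cast<std::uintptr_t>(B.Mem.data());
      std::uintptr_t Mask = static_cast<std::uintptr_t>(Align) - 1;
      std::size_t Offset = ((Base + B.Used + Mask) & ~Mask) - Base;
      if (Offset + Size <= B.Mem.size()) {
        B.Used = Offset + Size;
        return B.Mem.data() + Offset;
      }
    }

    // Out of space: double until the request is guaranteed to fit, then
    // start a fresh block. Older blocks stay put, so pointers handed out
    // earlier remain valid.
    while (NextBlockSize < Size + Align)
      NextBlockSize *= 2;
    Block Fresh;
    Fresh.Mem.resize(NextBlockSize);
    Fresh.Used = 0;
    Blocks.push_back(std::move(Fresh));
    NextBlockSize *= 2;
    return allocate(Size, Align);
  }

  std::size_t numBlocks() const { return Blocks.size(); }

  std::size_t capacity() const {
    std::size_t Total = 0;
    for (std::size_t I = 0; I != Blocks.size(); ++I)
      Total += Blocks[I].Mem.size();
    return Total;
  }
};
```

Adding a new block instead of reallocating the old one matters here, because emitted code already holds the addresses of constants in the earlier blocks.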

Later on I'm going to need either a way of freeing memory for functions/constant pools or a way of recovering from out of memory, as our application is going to run as a server and hopefully be happily JIT'ing away for days on end.

There is a (currently unimplemented) method for doing this:
ExecutionEngine::freeMachineCodeForFunction.

It should be straightforward to free the memory for a function, though it will make the JITEmitter a bit more complex: it will have to track regions of freed memory so that they can be reused.

-Chris

Chris Lattner wrote:

I have made a patch along these lines. Although I reused the JITMemoryManager object, I am allocating constant pools from a separate block of memory. This fixes my remaining leaks. It would be nice if the global variables were also allocated this way, but it's not needed for my application since I'm managing that memory myself and using ExecutionEngine::addGlobalMapping.

Ok, sounds good. Here are some comments:

3. This will fail if the JIT wants to allocate more than 512K of
   constants. Can you just have it allocate another block of memory if it
   runs out of space? Also, it might be useful to start the initial block
   much smaller, say 4K of memory, and double it when space is
   exhausted, as most programs don't use 1/2 meg of constant pools :)

I just wanted to keep it simple. The allocation of memory for functions is done in the same way: grab a huge block and hope it's enough. I think the whole JITMemoryManager needs to be improved; this is just a temporary solution to the memory leak problem. My philosophy is that if you can't do it properly, at least change as little as possible...

I attached the updated patch, which fits in 80 columns.

Later on I'm going to need either a way of freeing memory for functions/constant pools or a way of recovering from out of memory, as our application is going to run as a server and hopefully be happily JIT'ing away for days on end.

There is a (currently unimplemented) method for doing this:
ExecutionEngine::freeMachineCodeForFunction.

It should be straightforward to free the memory for a function, though it will make the JITEmitter a bit more complex: it will have to track regions of freed memory so that they can be reused.

I think the reason it's still unimplemented is that it's not at all straightforward. The problem is that the amount of memory needed to compile a function is only known _after_ the function is compiled. The current system just writes the functions one after another into a large block of memory, but if you want to reuse freed space you need to know in advance that it's large enough to hold your function.

One possible solution is to write some low-level code to manage memory pages. The idea is to count how much live code is on each page and, when the count reaches 0, return the page to the OS, creating a gap in the address space. This way you don't have to move anything and you can keep writing new functions at the end.
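
To make the page-counting idea concrete, here is a minimal sketch of the bookkeeping only. `PageLiveness` and its methods are hypothetical names, and the sketch just records which pages become empty; actually handing a page back to the OS (with something like `munmap` or `VirtualFree`) is left out and would be platform specific.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical bookkeeping: live code bytes per page of the code region.
// Offsets are relative to the start of the region.
class PageLiveness {
  std::size_t PageSize;
  std::map<std::size_t, std::size_t> LiveBytes; // page index -> live bytes
  std::vector<std::size_t> Released;            // pages that dropped to zero

  // Walk every page overlapped by [Offset, Offset + Size).
  void adjust(std::size_t Offset, std::size_t Size, bool Add) {
    std::size_t End = Offset + Size;
    while (Offset < End) {
      std::size_t Page = Offset / PageSize;
      std::size_t PageEnd = (Page + 1) * PageSize;
      std::size_t Chunk = (End < PageEnd ? End : PageEnd) - Offset;
      if (Add) {
        LiveBytes[Page] += Chunk;
      } else {
        std::map<std::size_t, std::size_t>::iterator It = LiveBytes.find(Page);
        assert(It != LiveBytes.end() && It->second >= Chunk);
        It->second -= Chunk;
        if (It->second == 0) {
          // No live code left on this page: a real implementation would
          // return it to the OS here.
          LiveBytes.erase(It);
          Released.push_back(Page);
        }
      }
      Offset += Chunk;
    }
  }

public:
  explicit PageLiveness(std::size_t PageSizeInBytes)
      : PageSize(PageSizeInBytes) {
    assert(PageSizeInBytes > 0);
  }

  void addCode(std::size_t Offset, std::size_t Size) {
    adjust(Offset, Size, true);
  }
  void removeCode(std::size_t Offset, std::size_t Size) {
    adjust(Offset, Size, false);
  }

  const std::vector<std::size_t> &releasedPages() const { return Released; }
};
```

One caveat the sketch does not handle: the page that the emitter is currently writing into can also drop to zero, and it must not be released while new functions are still being appended to it.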

Another possible solution is to compile each function to a buffer and, once compilation is finished and the size of the function is known, move it to the smallest free block that is big enough to contain it. This approach requires relocation information to be generated as part of the compilation process.
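
A rough sketch of the second idea, choosing the smallest free block that fits once the size is known. All names (`FreeBlocks`, `placeFunction`, `NoSpace`) are hypothetical, the code region is modelled as a plain byte vector, and the copy assumes that relocations are applied after the move, which the sketch does not do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <utility>
#include <vector>

// Sentinel returned when no free block is large enough.
const std::size_t NoSpace = static_cast<std::size_t>(-1);

// Hypothetical free list keyed by block size, so that lower_bound() finds
// the smallest block that is large enough (best fit).
class FreeBlocks {
  std::multimap<std::size_t, std::size_t> BySize; // size -> offset

public:
  void add(std::size_t Offset, std::size_t Size) {
    if (Size != 0)
      BySize.insert(std::make_pair(Size, Offset));
  }

  // Removes and returns the offset of the best-fitting block, or NoSpace.
  // Any unused tail of the block goes back on the free list.
  std::size_t takeBestFit(std::size_t Size) {
    std::multimap<std::size_t, std::size_t>::iterator It =
        BySize.lower_bound(Size);
    if (It == BySize.end())
      return NoSpace;
    std::size_t BlockSize = It->first;
    std::size_t Offset = It->second;
    BySize.erase(It);
    add(Offset + Size, BlockSize - Size);
    return Offset;
  }
};

// Copies a function that was compiled into Scratch to its final location
// inside CodeRegion. Returns the chosen offset, or NoSpace.
std::size_t placeFunction(std::vector<unsigned char> &CodeRegion,
                          FreeBlocks &Free,
                          const std::vector<unsigned char> &Scratch) {
  std::size_t Offset = Free.takeBestFit(Scratch.size());
  if (Offset == NoSpace)
    return NoSpace;
  assert(Offset + Scratch.size() <= CodeRegion.size());
  if (!Scratch.empty())
    std::memcpy(&CodeRegion[Offset], Scratch.data(), Scratch.size());
  return Offset;
}
```

The sketch does not merge adjacent free blocks, so a long-running process would fragment over time unless coalescing is added.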

m.

PS. Happy new year everyone!

jitmm.patch (3.67 KB)

Ooops, sorry for the major delay on this patch, it fell into the vortex that is my mailbox. I've applied it and have some

Chris Lattner wrote:

3. This will fail if the JIT wants to allocate more than 512K of
   constants. Can you just have it allocate another block of memory if it
   runs out of space? Also, it might be useful to start the initial block
   much smaller, say 4K of memory, and double it when space is
   exhausted, as most programs don't use 1/2 meg of constant pools :)

I just wanted to keep it simple. The allocation of memory for functions is done in the same way: grab a huge block and hope it's enough. I think the whole JITMemoryManager needs to be improved; this is just a temporary solution to the memory leak problem. My philosophy is that if you can't do it properly, at least change as little as possible...

That sounds good. The reason I was pointing this out is that the previous code would correctly handle an unbounded number of constants, so it's technically a regression. However, it's not going to be hit in practice, so I think it's fine. If you want to keep improving the JIT MM, please do! :)

Later on I'm going to need either a way of freeing memory for functions/constant pools or a way of recovering from out of memory, as our application is going to run as a server and hopefully be happily JIT'ing away for days on end.

There is a (currently unimplemented) method for doing this:
ExecutionEngine::freeMachineCodeForFunction.

It should be straightforward to free the memory for a function, though it will make the JITEmitter a bit more complex: it will have to track regions of freed memory so that they can be reused.

I think the reason it's still unimplemented is that it's not at all straightforward. The problem is that the amount of memory needed to compile a function is only known _after_ the function is compiled. The current system just writes the functions one after another into a large block of memory, but if you want to reuse freed space you need to know in advance that it's large enough to hold your function.

Actually, it is possible. The target code generator currently emits a big blob of bits and a set of relocations. It should be possible to move the blob of bits wherever we want before applying the relocations. The code currently does not do this because we used to not have the relocation information.
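
As an illustration of "move the blob, then apply the relocations", here is a deliberately simplified sketch. The `Reloc` and `EmittedFunction` types and the `placeAt` function are hypothetical and do not mirror the JIT's real relocation classes; the sketch only handles one kind of relocation, a 64-bit absolute address pointing back into the same blob.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical relocation: "store (final address of blob + Addend) as a
// 64-bit value at byte Offset inside the blob".
struct Reloc {
  std::size_t Offset;
  std::uint64_t Addend;
};

// Output of the code generator: position-neutral bytes plus fixups.
struct EmittedFunction {
  std::vector<unsigned char> Bytes;
  std::vector<Reloc> Relocs;
};

// Copies the blob to Dest and only then patches in addresses. DestAddress
// is the address the code will run at; it is passed separately so the
// sketch can be exercised with arbitrary addresses.
void placeAt(const EmittedFunction &F, unsigned char *Dest,
             std::uint64_t DestAddress) {
  if (!F.Bytes.empty())
    std::memcpy(Dest, F.Bytes.data(), F.Bytes.size());
  for (std::size_t I = 0; I != F.Relocs.size(); ++I) {
    const Reloc &R = F.Relocs[I];
    assert(R.Offset + sizeof(std::uint64_t) <= F.Bytes.size());
    std::uint64_t Value = DestAddress + R.Addend;
    std::memcpy(Dest + R.Offset, &Value, sizeof(Value));
  }
}
```

Because nothing address-dependent is written until the final location is known, the same emitted function can be placed at any address, which is what makes reusing freed regions feasible.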

Another possible solution is to compile each function to a buffer and, once compilation is finished and the size of the function is known, move it to the smallest free block that is big enough to contain it. This approach requires relocation information to be generated as part of the compilation process.

We do have that relocation info, but I think it would be better to follow the approach we almost have now. IOW, we want to emit the code to a spot, *hoping* to have enough space for it. The change would be to say "oops, we ran out of space halfway through. Let's copy what we have somewhere larger, then keep going". The advantage here is that (in the common case) no copy is required.
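
A small sketch of the "emit in place, copy only on overflow" approach described above. `SpillingEmitter` is a hypothetical class, not the JITEmitter, and the doubling policy for the larger region is an assumption of the sketch; the copy is only safe because relocations would be applied after emission finishes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical emitter: writes bytes straight into its current region and
// only copies when that region turns out to be too small.
class SpillingEmitter {
  std::vector<unsigned char> Region; // fixed-size region being written
  std::size_t Cur;                   // next byte to write
  unsigned Copies;                   // how often we had to move

  void moveToLargerRegion() {
    // "Oops, we ran out of space": copy what we have somewhere larger,
    // then keep going from the same position.
    std::vector<unsigned char> Bigger(Region.size() * 2);
    std::copy(Region.begin(), Region.begin() + Cur, Bigger.begin());
    Region.swap(Bigger);
    ++Copies;
  }

public:
  explicit SpillingEmitter(std::size_t InitialCapacity)
      : Region(InitialCapacity), Cur(0), Copies(0) {
    assert(InitialCapacity > 0);
  }

  void emitByte(unsigned char B) {
    if (Cur == Region.size())
      moveToLargerRegion();
    Region[Cur++] = B;
  }

  std::size_t size() const { return Cur; }
  unsigned copies() const { return Copies; }
  unsigned char byteAt(std::size_t I) const {
    assert(I < Cur);
    return Region[I];
  }
};
```

In the common case the function fits and `copies()` stays at zero, which is the advantage over always compiling into a scratch buffer first.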

-Chris