Accounting for stack space

Given my recent posts, I think it's obvious that I'm trying to figure
out how to build a resource-aware VM for a high-level language.

I've figured out adequate solutions for most of the problems I've
encountered, including separate heaps, quotas, etc. However, I'm not
sure how I can account for a thread's stack space. Given a language
process (LP) running in a heap with a quota, a thread in that LP can
exceed the LP's quota simply by recursing infinitely since stack space
allocation is outside of my VM's control.

So how can I account for the stack space consumed by the thread
running in that LP, or control the allocation of stack space? One
solution is to CPS-transform the program, so that all activation
frames are explicitly allocated from the LP's heap. Is there another
way?

Sandro
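
As a rough illustration of the CPS idea above, here is a minimal C sketch in which the pending work of a recursive factorial lives in frames allocated from a quota-limited heap instead of on the system stack. The names `LpHeap`, `lp_alloc`, `lp_free`, and `lp_factorial` are hypothetical. The continuation is written by hand (defunctionalized into a linked list of frames) rather than produced by a real CPS transform, and `malloc` stands in for whatever allocator the LP heap would actually use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-LP heap: only tracks bytes against a quota. */
typedef struct {
    size_t quota; /* maximum bytes this LP may hold */
    size_t used;  /* bytes currently charged to the LP */
} LpHeap;

/* Allocate n bytes, charging them to the LP; NULL if over quota. */
static void *lp_alloc(LpHeap *h, size_t n) {
    if (n > h->quota - h->used) return NULL;
    void *p = malloc(n);
    if (p) h->used += n;
    return p;
}

/* Release n bytes previously charged to the LP. */
static void lp_free(LpHeap *h, void *p, size_t n) {
    assert(h->used >= n);
    free(p);
    h->used -= n;
}

/* One activation frame: "multiply the result by n, then run next". */
typedef struct Frame {
    unsigned long n;
    struct Frame *next;
} Frame;

/* Factorial with explicit, heap-allocated frames.
 * Returns 0 on success, -1 if the LP's quota is exhausted. */
int lp_factorial(LpHeap *h, unsigned long n, unsigned long *out) {
    Frame *k = NULL; /* current continuation */

    /* Each "recursive call" pushes a frame onto the LP heap. */
    while (n > 1) {
        Frame *f = lp_alloc(h, sizeof *f);
        if (!f) {
            /* Quota exceeded: discard the continuation and report it. */
            while (k) {
                Frame *next = k->next;
                lp_free(h, k, sizeof *k);
                k = next;
            }
            return -1;
        }
        f->n = n;
        f->next = k;
        k = f;
        n--;
    }

    /* "Returning" runs the frames and gives their memory back. */
    unsigned long acc = 1;
    while (k) {
        Frame *next = k->next;
        acc *= k->n;
        lp_free(h, k, sizeof *k);
        k = next;
    }
    *out = acc;
    return 0;
}
```

With a quota of two frames, `lp_factorial(&heap, 10, &r)` returns -1 instead of growing the C stack, so runaway recursion shows up as an ordinary quota failure.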

To this end, are there any implicit allocations being done by
generated LLVM code, other than the system stack?

Sandro

Heap allocations? Only malloc/free. Note that the compiler does generate calls to runtime libraries (e.g. libstdc++ and libgcc), and we don't have control over when they allocate. The libstdc++ calls show up in the .ll file, but the libgcc ones don't. I don't think any libgcc routines do heap allocations.

-Chris

Are these calls distinguishable in the .ll file somehow? Do these
calls map to intrinsics and certain instructions?

Sandro

How about if I were to use LLVM's JIT? I suspect plenty of allocations
are performed in the JIT.

Sandro

The JIT does a ton of heap allocation. There is no way to approximate it from the code you give it.

-Chris

_______________________________________________
LLVM Developers mailing list
LLVMdev@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

I don't need to approximate it, but I'd like to be able to track it,
in the sense of being able to measure the heap allocations as they are
being performed.

For instance, is it possible to entirely replace the malloc/free
called by the LLVM libraries with my own implementation? This would
achieve my goals, as I'm not against heap allocations, I just need to
be able to measure them.

Sandro

Not really. The JIT allocates from the same heap as the program it is running. Your choices are to override malloc/free either for both the JIT and the program or for neither of them.

-Chris

I want to 'intercept' ALL allocations actually, including the stack if
possible, so the above suits me just fine.

Sandro

Ok, just provide your own malloc/free. :)

-Chris
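
A minimal sketch of what such a measuring allocator could look like, assuming single-threaded use. The names `tracked_malloc`, `tracked_free`, and `tracked_bytes_in_use` are hypothetical. To actually intercept the JIT's allocations, these would have to be exported under the real `malloc`/`free` names (along with `calloc` and `realloc`), and sit on top of some other memory source instead of calling the system `malloc` as this sketch does. C++ `operator new` typically ends up in `malloc` as well, but that is implementation-dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Bytes currently handed out through tracked_malloc. Not thread-safe. */
static size_t bytes_in_use;

/* Header stored in front of every block so tracked_free knows its size.
 * The union keeps the user pointer suitably aligned for any type. */
typedef union {
    size_t size;
    max_align_t align;
} Header;

/* Allocate size bytes and record them in the running total. */
void *tracked_malloc(size_t size) {
    if (size > SIZE_MAX - sizeof(Header)) return NULL; /* overflow guard */
    Header *h = malloc(sizeof(Header) + size);
    if (!h) return NULL;
    h->size = size;
    bytes_in_use += size;
    return h + 1; /* user memory starts right after the header */
}

/* Free a block from tracked_malloc and subtract it from the total. */
void tracked_free(void *p) {
    if (!p) return;
    Header *h = (Header *)p - 1;
    assert(bytes_in_use >= h->size);
    bytes_in_use -= h->size;
    free(h);
}

/* Current number of live bytes, excluding header overhead. */
size_t tracked_bytes_in_use(void) {
    return bytes_in_use;
}
```

Because the JIT and the JIT'd program share one heap, a counter like this measures both together; separating them would need extra bookkeeping that this sketch does not attempt.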

One of the existing malloc replacements should provide hints for how to go about that; Hoard (www.hoard.org) is one. The details on Windows are especially complicated.

Be aware that even doing this won’t reliably intercept all allocations. On Mac OS X, Mach’s vm_allocate call is accessible. On Unix, mmap with the MAP_ANON flag allocates memory. Even then, the implementations of these aren’t magic; it’s possible to make the syscalls without linking to these symbols at all. There are likely further facilities to distrust on any given platform, as well. If you’re intending to sandbox untrusted third-party code, then you need to consider these issues.

Good luck,
Gordon
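
To make the mmap point concrete, here is a small sketch (assuming a POSIX system that provides anonymous mappings) of code obtaining memory without ever calling malloc, so a replacement malloc/free would never see it. The helper names `raw_pages_alloc` and `raw_pages_free` are hypothetical.

```c
#define _DEFAULT_SOURCE /* asks glibc to expose MAP_ANON / MAP_ANONYMOUS */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS /* some systems only define the long name */
#endif

/* Get len bytes of zero-filled memory straight from the kernel.
 * Returns NULL on failure. malloc is not involved at any point. */
void *raw_pages_alloc(size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Return the mapping to the kernel; 0 on success, -1 on error. */
int raw_pages_free(void *p, size_t len) {
    assert(p != NULL);
    return munmap(p, len);
}
```

Any library linked into the process can do the same thing, which is why a malloc-level counter measures cooperative code only.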

Ouch. Sandboxing untrusted code would certainly be an interesting
application, but I'm not that ambitious. My intended usage is a custom
VM with a custom bytecode JIT'd using LLVM.

I was going to track activation frames by CPS-transforming my input
program, so the LLVM stack should run in constant space. Please let me
know if there's some type of allocation LLVM might perform that
doesn't fall under these two categories.

Sandro