How to lower the intrinsic function 'llvm.objectsize'?

The documentation of LLVM says that “The llvm.objectsize intrinsic is lowered to a constant representing the size of the object concerned”. I’m attempting to lower this intrinsic function to a constant in a pass. Below is the code snippet that I wrote:

for (BasicBlock::iterator i = b.begin(), ie = b.end();
     (i != ie) && (block_split == false);) {
  IntrinsicInst *ii = dyn_cast<IntrinsicInst>(&*i);
  ++i;
  if (ii) {
    switch (ii->getIntrinsicID()) {
    case Intrinsic::objectsize: {
      IRBuilder<> builder(ii->getParent(), ii);
      Value *op1 = ii->getArgOperand(0); // i8*
      uint64_t bit_size = op1->getType()->getPointerElementType()->getPrimitiveSizeInBits();
      Value *result = ConstantInt::get(ii->getType(), bit_size);
      ii->replaceAllUsesWith(result);
      ii->removeFromParent();
      delete ii;
      break;
    }
    }
  }
}

I’m new to LLVM. Can anybody tell me whether this implementation is correct?

Thanks in advance.

Why do you need to handle this yourself? This should already be handled for you (see InstCombineCalls.cpp). However, your code has a few problems. First, you can’t always determine the size. Just looking at the pointer element type isn’t enough: you have to find the object definition, which can fail, and the existing handling uses llvm::getObjectSize for this. Second, when looking at type sizes you generally don’t want to use getPrimitiveSizeInBits; it returns 0 for types that have no primitive size (structs, arrays, pointers) and knows nothing about padding or alignment, so you should query the DataLayout instead.
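As a rough analogy for why a layout-aware size query matters, here is a plain C++ sketch (not LLVM code). The struct and both helper names are hypothetical and exist only for illustration. It compares a naive sum of member sizes with the size the compiler actually allocates once padding is included.

```cpp
#include <cstddef>

// Hypothetical struct, used only for illustration.
struct Sample {
  char tag;   // 1 byte
  int value;  // commonly 4 bytes with 4-byte alignment
};

// Naive estimate: sum of the member sizes, ignoring ABI padding.
constexpr std::size_t naiveSize() { return sizeof(char) + sizeof(int); }

// Size the compiler really allocates, padding included.
constexpr std::size_t layoutSize() { return sizeof(Sample); }
```

On most ABIs layoutSize() is larger than naiveSize() because of the padding after tag. The DataLayout plays the corresponding role for IR types.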

If you don't want to do it yourself you can probably just use
getObjectSize, declared in include/llvm/Analysis/MemoryBuiltins.h.

Also, two things regarding your implementation:
- the intrinsic returns the size in bytes, not bits; generally, when
dealing with memory, bytes are the relevant unit anyway.
- llvm.objectsize returns the size of the "object", not just the
pointee type. So for instance, if you have an i8* pointer to a 10 byte
array, it would return 10, not 1. Before what you quoted, the
documentation says "An object in this context means an allocation of a
specific class, structure, array, or other object.".
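To make the two points above concrete, here is a small plain C++ sketch, assuming a GCC- or Clang-compatible compiler. It uses __builtin_object_size, the source-level builtin that Clang lowers to llvm.objectsize when it cannot fold it in the frontend. The two helper names are hypothetical.

```cpp
#include <cstddef>

// Size in bytes of the pointee type of a char*. This is what looking only
// at the pointer element type would give you (the question's snippet also
// measures it in bits rather than bytes).
inline std::size_t pointeeSize() { return sizeof(char); }

// Size in bytes of the whole object that the pointer refers to.
// If the compiler cannot determine it, type 0 yields (size_t)-1.
inline std::size_t wholeObjectSize() {
  static char buf[10];
  return __builtin_object_size(buf, 0);
}
```

For a 10-byte array like this one, the builtin is expected to report the size of the array object, while the pointee type alone is a single byte.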

Good luck,

- Ahmed

Thanks for your reply.

I’m attempting to extend KLEE to support this intrinsic function. That’s why I need to handle this myself.

According to the reply, the correct implementation should first find the definition of the object and then determine the size of the object.

BTW, can I just refer to the implementation in InstCombineCalls.cpp?

You may also want to take a look at this:
http://llvm.org/docs/doxygen/html/classllvm_1_1ObjectSizeOffsetEvaluator.html

In summary, this can lower an objectsize computation either to a constant (if possible) or to additional IR instructions that compute the value at run time. That sounds like pretty much what you would want for KLEE.
Note that this lowering is intra-procedural only; anything else is too complicated.

Nuno

Hi Dingbao,

Please see this [1] issue on KLEE's issue tracker.

After talking to Daniel (CC'ed) I was under the impression that the
easiest thing to do would be to treat these as no-ops rather than
trying to lower them.

Reading the docs on llvm.objectsize [2], couldn't we just be really lazy
and return 0 or -1 (indicating unknown), depending on the value of the
"min" argument? If there are programs where control flow depends on
the return value of llvm.objectsize this would probably break things,
but I don't know whether clang ever generates IR in that form.
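A minimal sketch of that lazy fallback, written as plain C++ so it does not depend on any LLVM or KLEE API. The helper name unknownObjectSize is hypothetical. It assumes the semantics described in the LangRef: when the size is unknown, the intrinsic yields 0 if "min" is true and -1 (all bits set in the result type) otherwise.

```cpp
#include <cstdint>

// Hypothetical helper: the value to substitute for an llvm.objectsize call
// whose size is treated as unknown.
//   min      - the intrinsic's "min" argument
//   bitWidth - width of the intrinsic's integer result type (e.g. 32 or 64)
inline std::uint64_t unknownObjectSize(bool min, unsigned bitWidth) {
  if (min)
    return 0; // "unknown" when asking for the minimum size
  // "unknown" when asking for the maximum size: -1 in the result type.
  return bitWidth >= 64 ? ~std::uint64_t{0}
                        : (std::uint64_t{1} << bitWidth) - 1;
}
```

For example, unknownObjectSize(false, 32) gives 0xFFFFFFFF, which is -1 as an i32. An interpreter could materialize this value as a constant of the intrinsic's result type in place of the call.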

What Nuno just suggested sounds like quite a promising way to do this,
and it isn't as lazy as what I suggested.

[1] https://github.com/klee/klee/issues/33
[2] http://llvm.org/docs/LangRef.html#llvm-objectsize-intrinsic

Thanks,
Dan.

Hi,

It depends on what you're trying to accomplish.
I guess for KLEE it would be sufficient to ignore the intrinsic (as you say, replace it with 0/-1). InstCombine will try to lower it properly. If it fails, it means that later CodeGen will ignore the intrinsic as well (i.e., replace it with 0/-1 and fold comparisons/branches depending on it). So, in this way you would mimic LLVM's behavior.
If you want to do full-blown verification, then you'd need to consider the cases where the compiler might be able to lower the intrinsic and the cases it won't. For these, lowering the intrinsic with the API I mentioned would be better, but then you would need to fork and execute both branches: when the compiler can and cannot lower the intrinsic. Probably not worth it.

Nuno