Miscompilation of sizeof

In ASTContext:::getTypeInfo(const Type *T) const we have:

  uint64_t Width=0;

  case Type::ConstantArray: {
    const ConstantArrayType *CAT = cast<ConstantArrayType>(T);

    std::pair<uint64_t, unsigned> EltInfo =
getTypeInfo(CAT->getElementType());
    Width = EltInfo.first*CAT->getSize().getZExtValue();

But this multiplication can overflow (because for reasons that I don't
known getTypeInfo return width specified in bits).

If there are no objections I'd add an assert: probably we'll induce some
crashes, but I believe that this would be *far* better than to
miscompile the code (and assertions will be triggered only when code
would be miscompiled).

Ping.

Did you mean to attach a patch showing the assert you intend to add?

Did you mean to attach a patch showing the assert you intend to add?

I believed it was not needed, but I've attached it now for review.

With that patch applied the following testcase (specific for 64 bit
architecture) now triggers the assertion instead to be horribly miscompiled.

#include <stdio.h>

typedef int x[1UL<<59];

int main() {
  printf("%lu\n", sizeof(x));
}

PATCH (704 Bytes)

Most people use clang with assertions disabled, so this doesn’t really solve the underlying problem.

Reid

Most people use clang with assertions disabled, so this doesn't really
solve the underlying problem.

Yes, it helps only to realize the issue (consider that there is a
pending issue in bugzilla submitted more than one year ago).

I guess this has not been yet solved because it impacts on several things.

This is the reason why I've proposed the patch: hiding the problem while
code is miscompiled does not help anyone, perhaps if the problem is more
visible it will be easy to find someone that will fix it.

Also to decide (once seen the issue) that type size should be evaluated
in bytes and not in bits would help. However I don't know if this is
acceptable... I've asked, but I got no answer about the reason why it
was considered that to compute the size in bits was the right thing to do.

Most people use clang with assertions disabled, so this doesn’t really
solve the underlying problem.

Yes, it helps only to realize the issue (consider that there is a
pending issue in bugzilla submitted more than one year ago).

At a quick glance I couldn’t find the bug you’re referring to - what’s the bug number?

I guess this has not been yet solved because it impacts on several things.

This is the reason why I’ve proposed the patch: hiding the problem while
code is miscompiled does not help anyone, perhaps if the problem is more
visible it will be easy to find someone that will fix it.

I tend to agree - at least having the assert gives some uses some value until the bug is addressed if it’s been around for this long. Adding a FIXME next to the assert so it’s clear that it’s just temporary (possibly even referring to the bug number - I’m not sure if LLVM/Clang coding conventions have rules around whether to avoid or prefer referring to specific bug numbers in the production code). It’d make it easier to diagnose duplicates of the bug, for example - simply by running the code against a debug build.

Also to decide (once seen the issue) that type size should be evaluated
in bytes and not in bits would help. However I don’t know if this is
acceptable… I’ve asked, but I got no answer about the reason why it
was considered that to compute the size in bits was the right thing to do.

I haven’t looked closely - but I expect one of the reasons that sizes are handled in bits is for codegen to LLVM IR which represents the size of types in bits (eg: a Boolean value can be represented as an ‘i1’). I don’t know whether LLVM IR has limitations on the size of its types (eg: if LLVM itself parses these sizes & stores them in uint64s too - in which case the size of your array may be too big to even be described in current LLVM IR which would seem… problematic).

  • David

http://llvm.org/bugs/show_bug.cgi?id=8256

Seems reasonable to me. I'm not a code owner so I can't officially
sign off on anything, but I think it's post-commit reviewable (or that
people have had long enough to voice an opinion on it, if they cared,
already), really.

- David

In the interests of my own record keeping (& anyone else who stumbles
across this thread) the assert was checked in as r146482