inttoptr weirdness

Hi again.

I have a complex type system in my custom language that isn't easily
representable as LLVM IR types, so I figured I could mostly get along
with treating my types as i8* and doing the appropriate bitcasts and
inttoptr instructions, and doing pointer arithmetic myself (by casting
the pointers to ints, adding the appropriate byte offsets, and then
casting back to pointers).

However, I've found some oddities. While I have no problems
generating IR, when I run it through the optimizer (opt -O3), it
generates what appears to be totally wrong code.

Here's my test case:

I have a global called *testObj*. It would look like this in C:

struct TestObjClass
{
  int32_t dummy1, dummy2;
  int32_t* m_array;
};

extern struct TestObjClass* testObj;

and I'm trying to access:

testObj->m_array[1] = 10

Theoretically, this should be a load to get the pointer to testObj
(since it's a global), I should add 8 bytes, then do another load (to
load the address of the array), add 4 bytes to get the array[1]
element, and then store at that pointer the number 10.
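For reference, the sequence described above can be sketched in C using only byte offsets. This is an illustration, not the compiler's actual output: the function name store_via_byte_offsets is hypothetical, int32_t stands in for int32, and the layout assumes m_array sits 8 bytes into the object. Note that the second step reads a whole pointer from the field, not a single byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct TestObjClass {
    int32_t dummy1, dummy2;
    int32_t *m_array;
};

/* Hypothetical helper: performs obj->m_array[index] = value using only
   integer arithmetic on addresses, mirroring the ptrtoint/add/inttoptr steps. */
static void store_via_byte_offsets(void *obj, size_t index, int32_t value) {
    assert(obj != NULL);

    /* Step 1: address of the m_array field (8 bytes into the object). */
    uintptr_t field_addr = (uintptr_t)obj + offsetof(struct TestObjClass, m_array);

    /* Step 2: load the full pointer stored in that field. */
    int32_t *array;
    memcpy(&array, (void *)field_addr, sizeof array);

    /* Step 3: address of the element, index * 4 bytes past the array start. */
    uintptr_t elem_addr = (uintptr_t)array + index * sizeof(int32_t);

    /* Step 4: store the value at that address. */
    memcpy((void *)elem_addr, &value, sizeof value);
}
```

Calling store_via_byte_offsets(testObj, 1, 10) would correspond to testObj->m_array[1] = 10.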

Here's my original output:

@"compile-test::*testObj*" = external constant i8* ; <i8**> [#uses=1]

define void @"compile-test::__toplevel-main"() {
entry:
  store i8* null, i8** @"compile-test::*testObj*"
  %1 = load i8** @"compile-test::*testObj*" ; <i8*> [#uses=1]
  %2 = ptrtoint i8* %1 to i32 ; <i32> [#uses=1]
  %3 = add i32 %2, 8 ; <i32> [#uses=1]
  %4 = inttoptr i32 %3 to i8* ; <i8*> [#uses=1]
  %5 = load i8* %4 ; <i8> [#uses=1]
  %6 = inttoptr i8 %5 to i8* ; <i8*> [#uses=1]
  %7 = ptrtoint i8* %6 to i32 ; <i32> [#uses=1]
  %8 = add i32 %7, 4 ; <i32> [#uses=1]
  %9 = inttoptr i32 %8 to i8* ; <i8*> [#uses=1]
  %10 = bitcast i8* %9 to i32* ; <i32*> [#uses=1]
  store i32 10, i32* %10
  ret void
}

This seems right to me. However, when I run it through opt -O3 and
then through llvm-dis:

@"compile-test::*testObj*" = external constant i8* ; <i8**> [#uses=1]
define void @"compile-test::__toplevel-main"() {
entry:
  %0 = load i8* inttoptr (i64 8 to i8*), align 8 ; <i8> [#uses=1]
  %1 = inttoptr i8 %0 to i8* ; <i8*> [#uses=1]
  %2 = ptrtoint i8* %1 to i32 ; <i32> [#uses=1]
  %3 = add i32 %2, 4 ; <i32> [#uses=1]
  %4 = inttoptr i32 %3 to i32* ; <i32*> [#uses=1]
  store i32 10, i32* %4
  ret void
}

Notice how there's no mention of compile-test::*testObj* at all.
Instead, the first line is loading from (i64 8)! What am I doing
wrong?

Thanks in advance,
Scott

define void @"compile-test::__toplevel-main"() {
entry:
store i8* null, i8** @"compile-test::*testObj*"
%1 = load i8** @"compile-test::*testObj*" ; <i8*> [#uses=1]

Here, %1 is guaranteed to be null.

  %2 = ptrtoint i8* %1 to i32 ; <i32> [#uses=1]
  %3 = add i32 %2, 8 ; <i32> [#uses=1]
  %4 = inttoptr i32 %3 to i8* ; <i8*> [#uses=1]
  %5 = load i8* %4 ; <i8> [#uses=1]

So therefore, this load loads from null+8.

-Eli

Never mind, I'm stupid. I forgot that I was setting the global to
NULL immediately beforehand - so LLVM was optimizing this out.

Scott

define void @"compile-test::__toplevel-main"() {
entry:
store i8* null, i8** @"compile-test::*testObj*"

I'm surprised this store got optimized out, even though LLVM can
optimize away the subsequent load. Writing to an external global
variable is a visible side-effect, and unless there's other undefined
behavior, LLVM shouldn't remove it.

  %1 = load i8** @"compile-test::*testObj*" ; <i8*> [#uses=1]
  %2 = ptrtoint i8* %1 to i32 ; <i32> [#uses=1]
  %3 = add i32 %2, 8 ; <i32> [#uses=1]
  %4 = inttoptr i32 %3 to i8* ; <i8*> [#uses=1]

You may be able to save some instructions (and maybe give the
optimizers more information) by replacing the above with

  %4 = getelementptr i8* %1, i32 8

That'll be equivalent to inttoptr(ptrtoint(%1) + 8) on systems
with 8-bit bytes.
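The same equivalence can be sketched in C, where adding to a char* plays the role of getelementptr on an i8*. This is only an analogy under the usual flat-address-space assumption; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Integer round trip, analogous to inttoptr(ptrtoint(p) + n). */
static char *advance_via_integer(char *p, uintptr_t n) {
    return (char *)((uintptr_t)p + n);
}

/* Direct pointer arithmetic, analogous to getelementptr i8* p, n. */
static char *advance_via_pointer(char *p, uintptr_t n) {
    return p + n;
}

/* Checks that both forms reach the same address for an in-bounds offset. */
static void check_equivalence(char *p, uintptr_t n) {
    assert(advance_via_integer(p, n) == advance_via_pointer(p, n));
}
```

The pointer form keeps the result visibly derived from the original pointer, which is the extra information the optimizers may be able to use.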

  %5 = load i8* %4 ; <i8> [#uses=1]
  %6 = inttoptr i8 %5 to i8* ; <i8*> [#uses=1]
  %7 = ptrtoint i8* %6 to i32 ; <i32> [#uses=1]

The above two lines look odd to me. Aren't they equivalent to

  %7 = zext i8 %5 to i32

?

  %8 = add i32 %7, 4 ; <i32> [#uses=1]
  %9 = inttoptr i32 %8 to i8* ; <i8*> [#uses=1]
  %10 = bitcast i8* %9 to i32* ; <i32*> [#uses=1]
  store i32 10, i32* %10

And then here, if I'm not mistaken, you're converting %5, which is an
i8 (so 0 <= %5 < 256), to a pointer, adding 4, and then storing through
it. Unless you're on an embedded system, that's a guaranteed segfault,
right?
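A small C sketch of the difference, assuming the intent was to read the whole m_array pointer out of the field. The function names are hypothetical; the first mirrors the "load i8" followed by zero-extension, the second a pointer-sized load.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads only the first byte of the field and zero-extends it,
   so the resulting "address" is always in the range 0..255. */
static uintptr_t load_one_byte_as_address(const void *field) {
    unsigned char b;
    memcpy(&b, field, 1);
    return (uintptr_t)b;
}

/* Reads the whole pointer stored in the field. */
static uintptr_t load_full_pointer_as_address(const void *field) {
    assert(field != NULL);
    void *p;
    memcpy(&p, field, sizeof p);
    return (uintptr_t)p;
}
```

Only the second form recovers the address that was actually stored in the field.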

All that said, in the future you can produce a better bug report by
trying to find which pass is making the surprising transformation. See
the "LLVM bugpoint tool: design and usage" page in the LLVM
documentation for instructions on using bugpoint to automatically
reduce the list of passes.

Sure it can. LLVM can delete any non-volatile redundant load or any non-volatile redundant store. It doesn't matter whether it is to a global or not: LLVM's memory model (as with many compilers) is for single-threaded programs. We do try to conform to the C++0x memory model by not introducing memory accesses where they did not exist before, but deleting non-volatile accesses is always fine.

-Chris

That's true, but it's often hard to figure out whether a store to a global can be deleted, as a use may be arbitrarily far away, in a different file for example.

In this case the global is declared constant, so storing to it is invalid. That is probably why it got removed.

Sure, the compiler can only remove it if it can see the redefinition. In practice, stores are only deleted when they are undefined behavior (e.g. the target is read only) or if it sees a subsequent store to the same location (with no uses between them).

-Chris