Relational operators on pointers

Hi:

I got a problem with C++ semantics.

As far as I understand from the standard (Sec. 5.9) and various discussions on comp.std.c++ the use of >=, < etc. is not generally defined for pointers.

However, it says, it should be defined for pointers on the same array.

So, what I would expect to work is a comparison of char* >= char* in any circumstances.
No matter from where I obtain those pointers.

My situation is an interpreter implementation with a fixed heap of 2GB size in a 32bit program.

The heap is allocated with mmap. The example snippet is given below.
`read_mostly_memory_base` and so on are defined as char*.

When the test started failing after switching from GCC 4.2 to LLVM-GCC 4.2 and later Clang, I tried to compensate for that with casting the pointers to uintptr_t.
The reason for that was, that from my testing, I concluded that the compiler treats the pointer values as signed integers. This causes problems, since with a heap of 2GB, we obviously reach into address with the 32nd bit/sign bit set.

And indeed, masking and testing the sign bit explicitly allows me to distinguish these cases.
However, all kind of casts to unsigned integers did not have any observable effect.

So, my question now is, how can I make sure that all pointer comparisons are treated as interpreting the memory address as an unsigned 32bit value?
As you can imagine, there are quite a couple of relational tests all over the place in our code base.

Here the code in question:

void Memory_System::map_read_write_and_read_mostly_memory(int pid,
                                                          size_t total_read_write_memory_size,
                                                          size_t total_read_mostly_memory_size) {
  read_mostly_memory_base = map_heap_memory(total_read_mostly_memory_size + total_read_write_memory_size,
                                                pid, MAP_SHARED);
  read_mostly_memory_past_end = read_mostly_memory_base + total_read_mostly_memory_size;
  
  read_write_memory_base = read_mostly_memory_past_end;
  read_write_memory_past_end = read_write_memory_base + total_read_write_memory_size;
  
  if (On_Tilera) {
    mark_incoherent(read_mostly_memory_base, total_read_mostly_memory_size);
  }
  
  if (read_write_memory_base >= read_write_memory_past_end) {
    unlink(mmap_filename);
    fatal("begining should be before the end");
  }
  
  /* Make sure that the addresses are layed out as expected:
               read-mostly heap first, then read/write heap */
  if (read_mostly_memory_base >= read_write_memory_past_end) {
    printf("grand_total: %lu, read_mostly_memory_base: 0x%llx, read_mostly_memory_past_end: 0x%llx\n"
           " read_write_memory_base: 0x%llx, read_write_memory_past_end: 0x%llx\n",
           total_read_write_memory_size + total_read_mostly_memory_size,
           (u_int64_t)read_mostly_memory_base, (u_int64_t)read_mostly_memory_past_end,
           (u_int64_t)read_write_memory_base, (u_int64_t)read_write_memory_past_end);
    unlink(mmap_filename);
    fatal("contains will fail");
  }
}

On more thing: I also looked at what the Clang produces with -O4 -S for that case, and that actually does not reflect my earlier conclusion but makes me even more confused.
Perhaps someone can explain me why the mark parts (with !!) are valid.

Thanks a lot
Stefan

define void @_ZN13Memory_System37map_read_write_and_read_mostly_memoryEimm(%class.Memory_System* nocapture %this, i32 %pid, i32 %total_read_write_memory_size, i32 %total_read_mostly_memory_size) ssp align 2 {
  %1 = add i32 %total_read_mostly_memory_size, %total_read_write_memory_size
  %2 = tail call i8* @_ZN13Memory_System15map_heap_memoryEmii(i32 %1, i32 %pid, i32 1)
  %3 = getelementptr inbounds %class.Memory_System* %this, i32 0, i32 4
  store i8* %2, i8** %3, align 4
  %4 = getelementptr inbounds i8* %2, i32 %total_read_mostly_memory_size
  %5 = getelementptr inbounds %class.Memory_System* %this, i32 0, i32 5
  store i8* %4, i8** %5, align 4
  %6 = getelementptr inbounds %class.Memory_System* %this, i32 0, i32 2
  store i8* %4, i8** %6, align 4
  %7 = getelementptr inbounds i8* %2, i32 %1
  %8 = getelementptr inbounds %class.Memory_System* %this, i32 0, i32 3
  store i8* %7, i8** %8, align 4

!! ; To my understanding this should correspond to
!! ; `if (read_write_memory_base >= read_write_memory_past_end) {`, but I don't see why that would be correct
!! %9 = icmp sgt i32 %1, %total_read_mostly_memory_size
  br i1 %9, label %12, label %10

; <label>:10 ; preds = %0
  %11 = tail call i32 @unlink(i8* getelementptr inbounds ([1024 x i8]* @_ZN13Memory_System13mmap_filenameE, i32 0, i32 0))
  tail call void @_Z14assert_failurePKcS0_iS0_S0_(i8* getelementptr inbounds ([38 x i8]* @__func__._ZN13Memory_System37map_read_write_and_read_mostly_memoryEimm, i32 0, i32 0), i8* getelementptr inbounds ([74 x i8]* @.str5, i32 0, i32 0), i32 703, i8* getelementptr inbounds ([2 x i8]* @.str6, i32 0, i32 0), i8* getelementptr inbounds ([41 x i8]* @.str30, i32 0, i32 0)) noreturn
  unreachable

; <label>:12 ; preds = %0
!! ; This should be `if (read_mostly_memory_base >= read_write_memory_past_end) { `, I think
!! ; but I also do not see why that would be correct
!! %13 = icmp sgt i32 %1, 0
  br i1 %13, label %20, label %14

; <label>:14 ; preds = %12
  %tmp = ptrtoint i8* %2 to i32
  %15 = zext i32 %tmp to i64
  %tmp1 = ptrtoint i8* %4 to i32
  %16 = zext i32 %tmp1 to i64
  %tmp3 = ptrtoint i8* %7 to i32
  %17 = zext i32 %tmp3 to i64
  %18 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([180 x i8]* @.str31, i32 0, i32 0), i32 %1, i64 %15, i64 %16, i64 %16, i64 %17)
  %19 = tail call i32 @unlink(i8* getelementptr inbounds ([1024 x i8]* @_ZN13Memory_System13mmap_filenameE, i32 0, i32 0))
  tail call void @_Z14assert_failurePKcS0_iS0_S0_(i8* getelementptr inbounds ([38 x i8]* @__func__._ZN13Memory_System37map_read_write_and_read_mostly_memoryEimm, i32 0, i32 0), i8* getelementptr inbounds ([74 x i8]* @.str5, i32 0, i32 0), i32 721, i8* getelementptr inbounds ([2 x i8]* @.str6, i32 0, i32 0), i8* getelementptr inbounds ([26 x i8]* @.str32, i32 0, i32 0)) noreturn
  unreachable

; <label>:20 ; preds = %12
  ret void
}

The transformation in question is assuming a single memory allocation
is less than 2GB for 32-bit code. That's usually correct, but I guess
it isn't in your case. IIRC, -fwrapv will force the compiler to do
what you want here.

-Eli

Hi Eli:

Assuming that the C++ specification says the same thing as the C specification, comparisons between pointers in different object (allocations) is undefined. This is independent of what you cast the pointers to. I believe this part of the specification was intended to allow for segmented memory architectures, where pointers do not necessarily have a meaningful ordering outside of an allocation and can be moved around (e.g. Burroughs Large Systems architecture).

If you want to compare two unrelated pointer types, then you should first cast them to an integer type. Pointer to integer casts (that do not involve truncation) are required to give a unique integer representation of the pointer. So, if you want to do a comparison between two unrelated pointers as unsigned values, the safest thing to do is cast them to uintptr_t and then do the comparison.

David

-- Sent from my STANTEC-ZEBRA

Hi:

So, if you want to do a comparison between two unrelated pointers as unsigned values, the safest thing to do is cast them to uintptr_t and then do the comparison.

Right, done, and failed. In this particular case, as Eli pointed out, some optimization is applied.

All my casting to what ever type, and with whatever mechanism, did not help.
Not even using temp variables of 64bit types with which ever signage.
That all got just optimized away...
Leaving me with a failing test without apparent reason.

Best regards
Stefan

What if I have an object that happens to include addresses both with and without the sign bit set? Does pointer comparison still work correctly? If I read Stefan's mail correctly, that is a possibility since the heap spans that boundary and thus objects allocated within it could as well...
  
Steven Vormwald

Hi:

So, if you want to do a comparison between two unrelated pointers as unsigned values, the safest thing to do is cast them to uintptr_t and then do the comparison.

Right, done, and failed. In this particular case, as Eli pointed out, some optimization is applied.

We shouldn't be messing with integer comparisons...

All my casting to what ever type, and with whatever mechanism, did not help.
Not even using temp variables of 64bit types with which ever signage.
That all got just optimized away...
Leaving me with a failing test without apparent reason.

What version of clang are you using?

-Eli

Hi:

Hi:

So, if you want to do a comparison between two unrelated pointers as unsigned values, the safest thing to do is cast them to uintptr_t and then do the comparison.

Right, done, and failed. In this particular case, as Eli pointed out, some optimization is applied.

We shouldn't be messing with integer comparisons...

I would guess, the transformation implied by assuming certain allocation behavior comes first, and then leaves nothing for the casts to do their thing.

I attached a smaller self-contained test-case.

Executed with:
$ /usr/bin/clang++ -m32 -Wextra -Wno-write-strings -fdiagnostics-show-option -Wno-unused-variable -Wno-unused-value -fno-rtti -O4 bug-report.cpp -o bug-report.bin ; ./bug-report.bin

Results in the output:
beginning should be before the ends

Expected output:
done

The 'incorrect' output is consistently issued by binaries compiled by all compilers listed below, except of i686-apple-darwin10-g++-4.2.1.
And similar, -O0 does produce the correct output, too.

Adding casts like this:

if ((uintptr_t)read_write_memory_base >= (uintptr_t)read_write_memory_past_end) {
  printf("beginning should be before the end\n");
  exit(-1);
}

if ((uintptr_t)read_mostly_memory_base >= (uintptr_t)read_write_memory_past_end) {

does not change the results: incorrect behavior for llvm-gcc, clang 2.9 and 3.0.
  

All my casting to what ever type, and with whatever mechanism, did not help.
Not even using temp variables of 64bit types with which ever signage.
That all got just optimized away...
Leaving me with a failing test without apparent reason.

What version of clang are you using?

$ /usr/bin/clang++ --version
Apple clang version 3.0 (tags/Apple/clang-211.10.1) (based on LLVM 3.0svn)
Target: x86_64-apple-darwin10.8.0
Thread model: posix

$ /opt/local/bin/clang++ --version
clang version 2.9 (tags/RELEASE_29/final)
Target: x86_64-apple-darwin10
Thread model: posix

$ /usr/bin/llvm-g++-4.2 --version
i686-apple-darwin10-llvm-g++-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.1.00)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ g++-4.2 --version
i686-apple-darwin10-g++-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Side note:
For llvm-g++-4.2 I had to use -O3 instead of -O4 because it aborted with the following error:

ld: lto: could not merge in /var/folders/In/InPYSzVvHvGU0ndnO3LOyU+++TI/-Tmp-//cc9rOCdg.o because Unknown instruction for architecture i386
collect2: ld returned 1 exit status
-bash: ./bug-report.bin: No such file or directory

Best regards
Stefan

bug-report.cpp (2.1 KB)

Hi:

Hi:

So, if you want to do a comparison between two unrelated pointers as unsigned values, the safest thing to do is cast them to uintptr_t and then do the comparison.

Right, done, and failed. In this particular case, as Eli pointed out, some optimization is applied.

We shouldn't be messing with integer comparisons...

I would guess, the transformation implied by assuming certain allocation behavior comes first, and then leaves nothing for the casts to do their thing.

Oh, right, we make assumptions about the result of the pointer
arithmetic. Again, -fwrapv should suppress those assumptions and make
everything work.

-Eli