c const

How is C's const keyword translated when compiling C into LLVM bytecode? I'm specifically interested in const pointer function arguments. Consider a function declared as follows in C:

void f(const int* arg);

When I examine f in LLVM bytecode, how can I tell that arg is a pointer whose contents can only be read, not written?

Regards,
Ryan

How is c's const keyword translated when compiling c into
llvm bytecode.

It isn't. You can verify this quite simply with the following
test program:

void a(const void *p)
{
}

void b(void *p)
{
}

$ clang --emit-llvm test.c
; ModuleID = 'foo'

define void @a(i8* %p) {
entry:
        %p.addr = alloca i8* ; <i8**> [#uses=1]
        %allocapt = bitcast i32 undef to i32 ; <i32> [#uses=0]
        store i8* %p, i8** %p.addr
        ret void
}

define void @b(i8* %p) {
entry:
        %p.addr = alloca i8* ; <i8**> [#uses=1]
        %allocapt = bitcast i32 undef to i32 ; <i32> [#uses=0]
        store i8* %p, i8** %p.addr
        ret void
}

Now, I tried this with the gcc 4.0 frontend as well:

$ gcc -c --emit-llvm test.c -o - | llvm-dis -o -
; ModuleID = '<stdin>'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32"
target triple = "i686-pc-linux-gnu"

define void @a(i8* %p) {
entry:
        %p_addr = alloca i8* ; <i8**> [#uses=1]
        %"alloca point" = bitcast i32 0 to i32 ; <i32> [#uses=0]
        store i8* %p, i8** %p_addr
        br label %return

return: ; preds = %entry
        ret void
}

define void @b(i8* %p) {
entry:
        %p_addr = alloca i8* ; <i8**> [#uses=1]
        %"alloca point" = bitcast i32 0 to i32 ; <i32> [#uses=0]
        store i8* %p, i8** %p_addr
        br label %return

return: ; preds = %entry
        ret void
}

As you can see, with both C compilers the generated intermediate language
is the same for both function a() and b().

Thanks for the help. I'm surprised that llvm doesn't preserve the const property. I thought that it was supposed to be useful for various compiler optimizations.

Holger Schurig wrote:

This property isn't preserved in the LLVM IR, because const can always be cast away. If you want mod information, I suggest using the AliasAnalysis interface to get mod/ref info for a call.

-Chris

http://nondot.org/sabre
http://llvm.org
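
The "const can always be cast away" point can be sketched in a few lines of C. This is only an illustration; the function name sneaky_write is hypothetical and does not come from the thread. The write is well-defined as long as the object the pointer refers to was not itself declared const.

```c
/* Hypothetical example: the parameter type promises read-only access,
 * but a cast removes that promise. The IR for this function contains a
 * store through the pointer, so the const qualifier alone proves nothing. */
void sneaky_write(const int *arg)
{
    int *writable = (int *)arg;  /* cast away const */
    *writable = 42;              /* legal if the pointee is not a const object */
}
```

A caller that passes the address of an ordinary int will see that int changed to 42 after the call, even though the prototype says const int *.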

Hi,

I think I found a bug. I don't know if it's in upstream gcc or llvm-gcc4.

int func()
{
    const int *arr;
    arr[0] = 1;
}

$ llvm-gcc main.c -c; echo $?
0

$ gcc main.c -c
main.c: In function 'func':
main.c:4: error: assignment of read-only location

The difference disappears when arr[0] is replaced by *arr.

(I tried the above with gcc 4.1.2, 3.4.6, and 4.0.3. I don't have access
to 4.0.1, from which llvm-gcc seems to be derived.)

nikhil

I think I found a bug. I don't know if it's in upstream gcc or llvm-gcc4.

Looks like a bug, please file a bugzilla entry.

-Chris

_______________________________________________
LLVM Developers mailing list
LLVMdev@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

This certainly doesn't occur in gcc mainline.
In fact, I improved the error message and added an error test to gcc
just yesterday.

Yep, clang reports:

t.c:4:12: error: read-only variable is not assignable
     arr[0] = 1;
     ~~~~~~ ^
1 diagnostic generated.

so this is specific to llvm-gcc somehow.

-Chris

I have filed the bug, and added your comments.
    1603 – llvm-gcc accepts illegal assignment to const ptr

nikhil

I don't mean to be a pain, but I was thinking about this a bit more. Does gcc ignore the const keyword? If not, why has LLVM chosen to deviate from gcc with respect to the const keyword? If so, then why do we bother using const in LLVM API code? I'm just curious and wanted to understand the thinking behind not preserving const.

Thanks,
Ryan

Chris Lattner wrote:

Constness is preserved by the front-end. It's the front-end's job to check that a "const" attribute for a language (and its meaning differs among them) is used correctly according to the semantics of that language. So, if you have a language that prevents you from assigning a value to a const variable, then there will never be an assignment to that variable in LLVM's IR.

-bw

I don't follow what you mean: gcc doesn't ignore const, and LLVM doesn't deviate from gcc or from the relevant language standards. Note that if you declare a global as const, we do capture this in the IR. What specifically do you want? Please provide an example.

-Chris

http://nondot.org/sabre
http://llvm.org
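
As a rough illustration of the const-global case mentioned above: the variable names below are hypothetical, and the IR shown in the comments is abbreviated. The exact text (linkage, alignment, and other attributes) varies between front-ends and versions, so treat the comments as a sketch of what to look for rather than literal output.

```c
/* A const-qualified global definition is typically emitted with the
 * "constant" keyword in the IR, roughly: @table_size = constant i32 5 */
const int table_size = 5;

/* A non-const global uses the "global" keyword instead,
 * roughly: @hit_count = global i32 5 */
int hit_count = 5;

/* Reads both globals so that neither definition is unused. */
int read_both(void)
{
    return table_size + hit_count;
}
```

The difference is visible on the global definition itself, which is why it can be captured, whereas a const qualifier on a pointer parameter only describes one access path to an object.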

I guess Bill Wendling cleared it up a little more for me. Neither LLVM nor gcc ignores const. The type checking in each of their front-ends makes sure that const is not violated.

The reason I was asking about const is as follows. I was under the impression that const was part of C to aid the compiler with optimization, not just for type checking. As you previously pointed out, I could get mod info from the LLVM alias analysis interface. However, I was wondering: why go through the expense of computing alias analysis, and put up with its imprecision, if the programmer is going to tell you that the memory pointed to by a pointer argument is never written?

Chris Lattner wrote:

if the programmer is going to tell you that the memory pointed
to by a pointer argument is never written.

There are ways to cast away a const.

Even if there weren't, a const value can change, just not through a const reference. Consider this code:

   int counter = 0;
   bool f(const int *ci) {
     int i = *ci;
     counter++;
     return *ci == i;
   }

Naively, it seems trivial to observe that *ci is const and that i == *ci, and to conclude that *ci == i is always true, which would allow f to be rewritten as:

   int counter = 0;
   bool f(const int *ci) {
     counter++;
     return true;
   }

However, that is a miscompilation since ci and &counter may alias:

   f(&counter); // must return false

So even in the absence of const_cast, most uses of const are not very useful for optimizers. As has been pointed out, alias analysis provides similar data that can be used for optimization.

— Gordon

You don't even need casts:

void foo(const int *P, int *Q) {
   int x = *P;
   *Q = 1;
   int y = *P; // redundant?
}

void bar() {
   int X = 0;
   foo(&X, &X);
}

-Chris
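
To see the aliasing effect at run time, here is a small variant of the example above that reports whether the two loads agree. The function name loads_agree is hypothetical and is only meant as a sketch of the same idea.

```c
/* Returns 1 if both loads through P saw the same value, 0 otherwise.
 * P is a pointer to const, but Q may point at the same object, so the
 * store through Q can change what the second load through P observes. */
int loads_agree(const int *P, int *Q)
{
    int x = *P;   /* first load */
    *Q = 1;       /* may modify *P when P and Q alias */
    int y = *P;   /* second load: not redundant in general */
    return x == y;
}
```

Calling loads_agree(&X, &X) with X initially 0 yields 0, because the second load sees the value stored through Q. With two distinct objects the result is 1. A compiler that treated the second load as redundant purely because P is a pointer to const would get the aliased call wrong.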

If you use a const * __restrict pointer, then you should get this benefit, as the alias analysis will assume that the pointed-to object is neither aliased nor written.

Hi Christopher,

>>> if the programmer is going to tell you that the memory pointed
>>> to by a pointer argument is never written.

Can you please explain more about what restrict means? It may help
in improving code quality for Ada. In Ada you have runtime constants
that are really constant, for example array bounds. The bounds are
passed around by pointer, which causes LLVM to think they may be
aliased and changed by function calls (which is impossible). This
results in rotten code, since (for example) the array length and
bounds checks are recalculated again and again when one calculation
would do. [The front-end outputs a bounds check for every array access,
expecting the optimizers to remove redundant checks, which LLVM often
does not do.] If I could teach LLVM that array bounds are really constant,
that would presumably solve the problem.

Thanks,

Duncan.
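
The cost described above can be approximated in C terms. This is only an analogy, not what the Ada front-end actually emits, and every name below (opaque_step, sum_reloading, sum_hoisted) is hypothetical. Imagine that opaque_step lives in another translation unit, so the optimizer cannot see its body.

```c
/* Stand-in for a call whose effects the optimizer cannot see. */
void opaque_step(int *slot)
{
    *slot += 1;
}

/* The bound is read through a pointer in the loop condition. Because the
 * call might, as far as the optimizer knows, modify *len, the bound has to
 * be reloaded on every iteration. */
int sum_reloading(const int *data, const int *len, int *scratch)
{
    int total = 0;
    for (int i = 0; i < *len; i++) {
        opaque_step(scratch);
        total += data[i];
    }
    return total;
}

/* Hoisting by hand: the bound is loaded once into a local, which is what
 * one would like the optimizer to do when the bound is truly constant. */
int sum_hoisted(const int *data, const int *len, int *scratch)
{
    const int n = *len;
    int total = 0;
    for (int i = 0; i < n; i++) {
        opaque_step(scratch);
        total += data[i];
    }
    return total;
}
```

Both functions compute the same result whenever the bound really does not change. The difference is only in how many loads of the bound are needed when the compiler cannot prove that.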

Here’s a thread about it:

http://lists.cs.uiuc.edu/pipermail/llvmdev/2007-March/thread.html#8291

I don’t think anything has been implemented.

— Gordon

The benefits of a const * __restrict come from two different places. The const part is essentially enforced by the front-end, and the restrict part is used to inform the alias analysis (it becomes a noalias parameter attribute). The noalias parameter attribute may be of use to you eventually, but the full noalias implementation isn’t yet complete. Specifically, when a function with noalias parameter attributes is inlined, the noalias information is not currently preserved.
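
A minimal sketch of what a restrict-qualified version of the earlier example looks like, assuming a compiler that accepts the __restrict spelling (GCC and Clang do; C99 spells it restrict). The function name read_around_store is hypothetical.

```c
/* With __restrict on both parameters, the caller promises that the int
 * read through p is not the one written through q. Under that promise the
 * second load of *p may be treated as redundant by the optimizer. */
int read_around_store(const int *__restrict p, int *__restrict q)
{
    int x = *p;
    *q = 1;       /* assumed not to touch *p */
    int y = *p;   /* may be folded into x */
    return x + y;
}
```

The promise is the caller's responsibility: passing the same address for both parameters, as in the foo(&X, &X) call earlier in the thread, would make the behavior undefined rather than merely surprising.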

You should also take a look at PR 1373, as that is where progress is being tracked. http://llvm.org/bugs/show_bug.cgi?id=1373

Per the discussion and the PR, work has been done to implement the ‘noalias’ parameter attribute in LLVM, and BasicAA currently uses this attribute to inform the alias queries that are made. There has also been work to map __restrict C/C++ pointer and reference parameters onto the noalias parameter attribute. There is still much work to be done to fully implement noalias in LLVM, notably the intrinsic and updates to tolerate/use it, as well as full support for all uses of the __restrict qualifier in the C/C++ front end.