Minimum Array Size

Hello,

clang currently seems to generate the same code for both:

double something_a(char A[const static 256]) {
  ...
}

and for:

double something_b(char (*const A)) {
  ...
}

even though in the first case the programmer has told us that the array
A is at least 256 bytes in length (and, thus, will not be null). Do we
currently have a way to pass this information to LLVM?

Thanks again,
Hal

No.

-Eli

No, but I'm interested in this. C++ references imply that 'n' bytes of the pointer may be dereferenced, and we have no way of capturing that fact. It would feed into llvm::isSafeToLoadUnconditionally().

Nick
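To make the isSafeToLoadUnconditionally() point concrete, here is a minimal C sketch. The function names are made up for illustration and do not come from the thread. The two functions compute the same result, but the second one loads A[idx] before testing the condition. Turning the first form into the second is only valid if the load cannot trap, and the `static 256` bound is the kind of information that would justify it:

```c
/* Branchy form: A[idx] is loaded only when `use` is nonzero. */
int pick_branchy(const char A[static 256], unsigned char idx, int use) {
    if (use)
        return A[idx];
    return 0;
}

/* Speculated form: the load happens unconditionally, then the result
   is selected. idx is an unsigned char, so it stays within 0..255,
   which the `static 256` promise covers (assuming 8-bit chars). */
int pick_speculated(const char A[static 256], unsigned char idx, int use) {
    char loaded = A[idx];
    return use ? loaded : 0;
}
```

Without the size promise, an optimizer would have to assume that A might be null or too short whenever `use` is zero, so it could not hoist the load on its own.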

At the risk of starting yet another metadata proposal discussion, shall
we add some metadata to describe this? These cases could be handled by
metadata with an integer parameter. Is there a reasonable way of
handling cases like:
void foo(int n, int q[n][n]) { ... } ?

-Hal
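For readers less familiar with variably modified parameters, a small sketch of the foo(int n, int q[n][n]) case follows. The function names and bodies are hypothetical. The parameter is adjusted to a pointer to rows of n ints, and only the `static` variant promises anything about how many rows exist, with a bound that is known only at run time:

```c
#include <assert.h>

/* `q` is adjusted to `int (*q)[n]`. Without `static`, the declaration
   promises nothing about the number of rows behind the pointer. */
long trace_plain(int n, int q[n][n]) {
    assert(n > 0);
    long sum = 0;
    for (int i = 0; i < n; ++i)
        sum += q[i][i];
    return sum;
}

/* With `static n`, the caller promises at least n rows of n ints,
   that is n * n * sizeof(int) accessible bytes. */
long trace_static(int n, int q[static n][n]) {
    assert(n > 0);
    long sum = 0;
    for (int i = 0; i < n; ++i)
        sum += q[i][i];
    return sum;
}
```

A fixed integer parameter on a piece of metadata cannot express the second bound, because it depends on the value of another argument.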

Hal Finkel wrote:

> At the risk of starting yet another metadata proposal discussion, shall
> we add some metadata to describe this? These cases could be handled by
> metadata with an integer parameter.

How about function attributes?

> Is there a reasonable way of handling cases like:
> void foo(int n, int q[n][n]) { ... } ?

Hmm, I hadn't thought of that.

Does that ever give us useful information with which to optimize? I don't see how isSafeToLoadUnconditionally could ever answer "yes" if it doesn't know what 'n' is at compile time.

I suppose the cases where it's useful would involve a loop-aware (SCEV powered?) version of isSafeToLoadUnconditionally where we know that we're going through 'i = [0..n)', or where 'n' is solved through other optimizations and gets inlined.

What about a new intrinsic indicating that right here (where the intrinsic is called) it is safe to load-unconditionally 'n' bytes from pointer 'p'?

Nick
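A rough sketch of the loop-aware case mentioned above, with hypothetical names: the bound n is not a compile-time constant, but the loop only touches indices in [0, n), which is exactly the range the `static n` declarations cover. The second function is the if-converted shape a vectorizer would want, and it loads a[i] on every iteration:

```c
#include <stddef.h>

/* Branchy: a[i] is loaded only when keep[i] is nonzero. */
double masked_sum_branchy(size_t n, const double a[static n],
                          const unsigned char keep[static n]) {
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        if (keep[i])
            sum += a[i];
    return sum;
}

/* If-converted: every a[i] with i in [0, n) is loaded, then a select
   decides whether it contributes to the sum. */
double masked_sum_ifconverted(size_t n, const double a[static n],
                              const unsigned char keep[static n]) {
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double v = a[i];            /* unconditional load */
        sum += keep[i] ? v : 0.0;
    }
    return sum;
}
```

Proving that the second form is safe needs the trip count and the symbolic bound to be related, which is where something SCEV-like would come in.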

> How about function attributes?

Good point. A parameter attribute would make sense; something like
validfor(<n>), with n being the number of bytes (or would it be better
to specify things in terms of sizeof(type)?). Your intrinsic suggestion
below is, however, probably better.

>> Is there a reasonable way of handling cases like:
>> void foo(int n, int q[n][n]) { ... } ?
>
> Hmm, I hadn't thought of that.
>
> Does that ever give us useful information with which to optimize?

I'm not sure. There are certainly cases when vectorizing where that
might be useful. I have no idea whether any of these cases (in real code)
are in functions where the parameters are declared like that.

> I don't see how isSafeToLoadUnconditionally could ever answer "yes" if
> it doesn't know what 'n' is at compile time.
>
> I suppose the cases where it's useful would involve a loop-aware
> (SCEV powered?) version of isSafeToLoadUnconditionally where we know
> that we're going through 'i = [0..n)', or where 'n' is solved through
> other optimizations and gets inlined.
>
> What about a new intrinsic indicating that right here (where the
> intrinsic is called) it is safe to load-unconditionally 'n' bytes
> from pointer 'p'?

I like that idea; we can start by accepting and ignoring all of the
complicated cases. Something like:
i32 @llvm.pointervalidfor.i32(i8* <object>, i32 <size>)
i64 @llvm.pointervalidfor.i64(i8* <object>, i64 <size>) ?

-Hal

Is this really the case? In what language(s)? As I understand it, the C99
standard simply states that a function argument declared as "array of T" is
equivalent to "pointer to T", so it does not really guarantee anything about
whether the argument is a non-null pointer to a valid array of the given
number of elements.

"static" is special though. 6.7.5.3p7 says:

"If the keyword static also appears within the [ and ] of the array
type derivation, then for each call to the function, the value of the
corresponding actual argument shall provide access to the first
element of an array with at least as many elements as specified by the
size expression".

In practice, this mostly just means a compiler can access the
specified number of elements with impunity. In theory you could have a
situation where arrays were allocated differently from normal pointers
and the callee could do even fewer checks (or perhaps look at a cookie
before each array giving its real length...)

Tim.
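As a sketch of what that paragraph asks of callers (the function names here are invented for illustration): the callee may touch all 256 elements without checking anything, and it is the caller's job to pass a pointer that makes this valid. The calls that would break the promise are left in comments, since executing them is undefined behavior:

```c
#include <stddef.h>

/* The callee may assume data[0] .. data[255] are all accessible. */
unsigned checksum256(const unsigned char data[static 256]) {
    unsigned sum = 0;
    for (size_t i = 0; i < 256; ++i)
        sum += data[i];
    return sum;
}

/* Hypothetical caller showing which calls keep the promise. */
unsigned demo_calls(void) {
    static unsigned char exact[256];    /* zero-initialized */
    static unsigned char bigger[300];
    for (size_t i = 0; i < 300; ++i)
        bigger[i] = 1;

    unsigned total = 0;
    total += checksum256(exact);        /* fine: exactly 256 elements */
    total += checksum256(bigger);       /* fine: more than 256 elements */

    /* Both of these would violate 6.7.5.3p7:
         checksum256(NULL);
         unsigned char small[16]; checksum256(small);              */
    return total;
}
```

Some compilers can warn about the commented-out calls when the problem is visible at the call site, but the standard does not require a diagnostic, so that should be treated as a convenience and not a guarantee.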