GSOC proposal submission

How do I submit my GSOC proposal to the llvmdev mailing list ?

Hi,

  You can refer to this GSOC proposal.

  GSoC: PTX Back-End for LLVM
  http://groups.google.com/group/llvm-dev/msg/a64b2bd9700ac41f

Regards,
chenwj

How do I submit my GSOC proposal to the llvmdev mailing list ?

Posting your proposal to the mailing list is just to get feedback before submitting it (which is very much recommended). To actually submit your proposal, you need to register on the Google Summer of Code website and submit it there.

http://www.google-melange.com/gsoc/homepage/google/gsoc2011

Hi,

I'm trying to know if an llvm::Value Object is signed or unsigned (when its Type is integer).
I don't find where this information is located.
Do you have an idea ?

Julien Henry

Hi Julien,

I'm trying to know if an llvm::Value Object is signed or unsigned (when
its Type is integer).
I don't find where this information is located.
Do you have an idea ?

this information does not exist: integer types are not signed or unsigned.
Instead, operations on integers that work differently depending on the sign
have signed and unsigned versions in LLVM. For example LLVM has both udiv
and sdiv instructions (unsigned and signed divisions).

Ciao, Duncan.

Hello

I'm trying to know if an llvm::Value Object is signed or unsigned (when
its Type is integer).
I don't find where this information is located.
Do you have an idea ?

Values cannot be signed or unsigned since they represent some object
in memory / register. It's the operation which says whether the value
should be treated as signed or not.

Values cannot be signed or unsigned since they represent some object
in memory / register. It's the operation which says whether the value
should be treated as signed or not.

Ok. But typically if I compile a program with a variable of type "unsigned" and another one of type "int":

void f() {
  unsigned x;
  int y;
  ...
}

The compiler remembers for debugging purpose that x is defined as unsigned, and y as signed, no ? I'm not familiar with LLVM debug info, but maybe I can find this info there ?

Hi Julien,

Values cannot be signed or unsigned since they represent some object
in memory / register. It's the operation which says whether the value
should be treated as signed or not.

Ok. But typically if I compile a program with a variable of type
"unsigned" and another one of type "int":

void f() {
  unsigned x;
  int y;
  ...
}

The compiler remembers for debugging purpose that x is defined as
unsigned, and y as signed, no ? I'm not familiar with LLVM debug info,
but maybe I can find this info there ?

probably it can be extracted from debug info, but what if there is no debug
info? Can you please explain what you intend to do with this information -
then maybe we can suggest another way to solve your problem.

Ciao, Duncan.

The compiler remembers for debugging purpose that x is defined as
unsigned, and y as signed, no ? I'm not familiar with LLVM debug info,
but maybe I can find this info there ?

probably it can be extracted from debug info, but what if there is no debug
info? Can you please explain what you intend to do with this information -
then maybe we can suggest another way to solve your problem.

Thanks for your help.

Actually, I'm working on a static analyzer that computes invariants at each basicBlock: "In basicBlock B, what is the set of possible assignments for each live values ?"
and I obtains results such as "In B, we have 0 <= x <= 42"

For an function's argument of type T, at the beginning of my analysis, I consider its set of possible values is all the set of all elements of type T.
In the case of an int, it is [-2^31; 2^31-1], whereas it is [0, 2^32-1] for an unsigned...
That's the reason why I was searching in the llvm bitcode something distinguishing these two types.

Julien Henry

Hi Julien,

For an function's argument of type T, at the beginning of my analysis, I
consider its set of possible values is all the set of all elements of
type T.
In the case of an int, it is [-2^31; 2^31-1], whereas it is [0, 2^32-1]
for an unsigned...
That's the reason why I was searching in the llvm bitcode something
distinguishing these two types.

there is no such information. You can still consider every type to have values
in, say, T = [-2^31; 2^31-1]. Probably you are trying to deduce an interval of
possible values for each register. You will need to allow intervals to wrap
around the end of T since (eg) the basic "add" operator in LLVM uses modulo
arithmetic, i.e. if you add 1 to 2^31-1 you get -2^31, which forces you to use
wrapping intervals. Having wrapping intervals makes interval arithmetic more
tricky but it can still be done. Whether an operation is signed (eg: sdiv, i.e.
signed division) or unsigned (eg: udiv, i.e. unsigned division) you need to
calculate the smallest interval that contains all possible result values given
the intervals the arguments are known to belong to, or a conservative
approximation to it. This is perfectly do-able, it's just that your interval
for the possible values for the result probably won't always be as precise as
you would like [*]. Some operations may be annotated with the "nsw" (no signed
wrap) or "nuw" (no unsigned wrap) flags which allows you to do a better job
if you are willing to assume that the program does not perform undefined
behaviour.

Ciao, Duncan.

[*] LLVM has a ConstantRange that implements logic of this kind, so you don't
even need to write it yourself!

Well, you have to find that from the comparisons that control the
branches to the blocks, so you determine signedness from the
signedness of the comparison instructions. (And get to have the fun
of figuring out the right thing to do when the instructions mix
signedness.)

~ Scott

me22 wrote:

Actually, I'm working on a static analyzer that computes invariants at
each basicBlock: "In basicBlock B, what is the set of possible
assignments for each live values ?"
and I obtains results such as "In B, we have 0<= x<= 42"

Well, you have to find that from the comparisons that control the
branches to the blocks, so you determine signedness from the
signedness of the comparison instructions. (And get to have the fun
of figuring out the right thing to do when the instructions mix
signedness.)

It's not that hard :slight_smile: There's already two ways to get a ConstantRange out of an ICmpInst; ICmpInst::makeConstantRange(predicate, apint) (eg., produce the range representing everything signed-less than 100) and ConstantRange::makeICmpRegion(predicate, constant-range) (eg., produce the range of all values that *could satisty* the predicate (signed less-than, etc.) any of the values in a range).

Then you union them (if either way leads to block) or intersect them (if the conditions are and'd together).

Nick

there is no such information. You can still consider every type to have values
in, say, T = [-2^31; 2^31-1]. Probably you are trying to deduce an interval of
possible values for each register. You will need to allow intervals to wrap
around the end of T since (eg) the basic "add" operator in LLVM uses modulo
arithmetic, i.e. if you add 1 to 2^31-1 you get -2^31, which forces you to use
wrapping intervals. Having wrapping intervals makes interval arithmetic more
tricky but it can still be done. Whether an operation is signed (eg: sdiv, i.e.
signed division) or unsigned (eg: udiv, i.e. unsigned division) you need to
calculate the smallest interval that contains all possible result values given
the intervals the arguments are known to belong to, or a conservative
approximation to it. This is perfectly do-able, it's just that your interval
for the possible values for the result probably won't always be as precise as
you would like [*].

Thanks for your suggestions.

Actually, for now, I consider integers as mathematical integers (in ]-oo; +oo[), and that operators such as add don't overflow.
(indeed, that's incorrect, but I'll work on this later :wink: )

Suppose I approximate the set of possible values for my variable "x" with intervals.
I think there are some cases for which I'll be unable to find precise enough invariants, for example, I'd like this:

unsigned x;
// x >= 0
if (x != 0) {
  // x >= 1
  ...
}

but since I consider x in ]-oo; +oo[, I have this:

unsigned x;
// x in ]-oo; +oo[
if (x != 0) {
  // x in ]-oo; +oo[ (because of the abstraction by intervals)
  ...
}

Unfortunately, I think this loss of precision is unavoidable. Maybe if I consider programs that never do comparisons between signed and unsigned, I could retrieve types from the signedness of the comparison, as me22 suggests.

Some operations may be annotated with the "nsw" (no signed
wrap) or "nuw" (no unsigned wrap) flags which allows you to do a better job
if you are willing to assume that the program does not perform undefined
behaviour.

I tried some examples to understand when these flags are set, because maybe it could help me retrieve types as well, but I confess I didn't understand well in which case they are set or not.

Julien

Hi Julian,

Suppose I approximate the set of possible values for my variable "x" with
intervals.
I think there are some cases for which I'll be unable to find precise enough
invariants, for example, I'd like this:

unsigned x;
// x >= 0
if (x != 0) {
// x >= 1
...
}

but since I consider x in ]-oo; +oo[, I have this:

unsigned x;
// x in ]-oo; +oo[
if (x != 0) {
// x in ]-oo; +oo[ (because of the abstraction by intervals)
...
}

in fact there is no problem here, because "x >= 1" becomes an unsigned uge
comparison in LLVM. The values for which "x uge 1" are true are
]-oo; -1], [1, +oo[ (this range is an interval if you allow intervals to
wrap around the end), so you can deduce that "x uge 1" is true from x != 0.

Ciao, Duncan.