How to get LLVM SSA IR?

Hi,

I want to use LLVM to do pointer analysis, and I use -mem2reg to get the SSA IR. But when I analyze source code that uses pointers, the IR does not appear to be in SSA form. For example, take the following C code:

void get2(int **);  /* defined elsewhere; signature as seen in the IR below */

void test() {
    int a, b, c, d;
    int *p, *q, *m, *n;

    p = &a;
    get2(&p);
    q = &b;
    get2(&q);
    m = &c;
    get2(&m);
    n = &d;
    get2(&n);
    p = q;
    m = p;
    p = n;
    get2(&n);
}

I use the following commands to get the IR:

clang -S -emit-llvm -Xclang -disable-O0-optnone -c new.c -o new.ll
opt -S -mem2reg new.ll -o new.opt.ll

The resulting IR:

define dso_local void @test() #0 {
entry:
  %a = alloca i32, align 4
  %b = alloca i32, align 4
  %c = alloca i32, align 4
  %d = alloca i32, align 4
  %p = alloca i32*, align 4
  %q = alloca i32*, align 4
  %m = alloca i32*, align 4
  %n = alloca i32*, align 4
  store i32* %a, i32** %p, align 4
  call void @get2(i32** %p)
  store i32* %b, i32** %q, align 4
  call void @get2(i32** %q)
  store i32* %c, i32** %m, align 4
  call void @get2(i32** %m)
  store i32* %d, i32** %n, align 4
  call void @get2(i32** %n)
  %0 = load i32*, i32** %q, align 4
  store i32* %0, i32** %p, align 4
  %1 = load i32*, i32** %p, align 4
  store i32* %1, i32** %m, align 4
  %2 = load i32*, i32** %n, align 4
  store i32* %2, i32** %p, align 4
  call void @get2(i32** %n)
  ret void
}

I want to distinguish the two definitions of variable p in p = q; p = n, like p0 = q; p1 = n. How can I get SSA IR in which variables defined multiple times in the original representation are split into separate instances?

Also, when I use -mem2reg to get the IR, it performs some optimization, such as removing unnecessary stack allocations. Can I get SSA IR without any other optimization?

Thank you,
panqaq

Hi Panqaq,

> I want to distinguish the two definitions of variable p in p = q; p = n, like p0 = q; p1 = n. How can I get SSA IR in which variables defined multiple times in the original representation are split into separate instances?

LLVM doesn't try to model memory in any kind of SSA form, so I don't
think it even has the representation for what you're asking unless it
can see through the memory accesses to p entirely. But &p has escaped
via the first call to get2 so LLVM has very little flexibility to do
that (get2 could have saved &p somewhere and later calls could access
it, and the data stored there).

What IR would you expect to see for this example if LLVM was doing
what you wanted?
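
To make the escape problem concrete, here is a minimal sketch in C. The body of get2 below is purely hypothetical (the original post never shows it), as are the names saved_slot, other, and escape_demo. It only illustrates one way a callee could keep &p and change p during a later call that is never handed &p.

```c
#include <stddef.h>

/* Hypothetical definition of get2, invented for illustration only. */
static int **saved_slot = NULL;  /* address remembered from the first call */
static int other = 99;

void get2(int **pp) {
    if (saved_slot == NULL) {
        saved_slot = pp;        /* first call: remember the caller's &p */
    } else {
        *saved_slot = &other;   /* later call: overwrite p through the saved address */
    }
}

int escape_demo(void) {
    int a = 1, b = 2;
    int *p, *q;

    saved_slot = NULL;  /* reset so the demo can be called more than once */
    p = &a;
    get2(&p);           /* &p escapes here */
    q = &b;
    get2(&q);           /* changes p, although only &q is passed */
    return *p;          /* p now points to other, so it has to be reloaded from memory */
}
```

Because the compiler generally cannot rule out a callee like this one, every use of p after a call has to go back to memory, which is why the alloca for p stays in place.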

> Also, when I use -mem2reg to get the IR, it performs some optimization, such as removing unnecessary stack allocations. Can I get SSA IR without any other optimization?

As far as I know there's nothing more minimal than mem2reg for this
purpose, but on the other hand removing stack allocations is exactly
what mem2reg is designed for so I'm not really sure what you're after.

Cheers.

Tim.

Qiuhong Pan via llvm-dev <llvm-dev@lists.llvm.org> writes:

> I want to use LLVM to do pointer analysis, and I use -mem2reg to get
> the SSA IR. But when I analyze source code that uses pointers, the IR
> does not appear to be in SSA form.

LLVM IR is always in SSA form. All of the temporaries (the %<name>
values) are assigned exactly once in the static representation.
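
As a rough illustration of the difference between SSA temporaries and memory, the following sketch is a variant of the original test function in which no address of p, q, or n is passed to a call. The function name split_demo and the extra variable first are made up for this example. The comments describe what mem2reg would be expected to do, not captured output.

```c
int split_demo(void) {
    int b = 2, d = 4;
    int *q = &b, *n = &d;
    int *p;
    int first;

    p = q;                   /* first definition of p  (p0 in the question's notation) */
    first = *p;              /* uses p0 */
    p = n;                   /* second definition of p (p1) */
    return first * 10 + *p;  /* uses p1 */
}

/*
 * Before mem2reg, both assignments to p are stores to the same alloca.
 * Since &p, &q and &n never escape here, mem2reg should be able to promote
 * those allocas, so each load of p is replaced by the value that was stored
 * most recently (the address of b for the first use, of d for the second).
 * b and d are expected to remain allocas, because their addresses are used
 * as values rather than only as load/store operands.
 */
```

Running the clang and opt commands from the original post on a file containing this function is one way to check how the two definitions of p end up as separate values.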

> I want to distinguish the two definitions of variable p in p = q; p = n,
> like p0 = q; p1 = n. How can I get SSA IR in which variables defined
> multiple times in the original representation are split into separate
> instances?
>
> Also, when I use -mem2reg to get the IR, it performs some optimization,
> such as removing unnecessary stack allocations. Can I get SSA IR
> without any other optimization?

allocas in LLVM IR aren't really stack locations (though isel may
translate them to such), they are just a way to generate a pointer value
for something that lives in memory.

mem2reg attempts to promote loads and stores of "well-known" locations
like allocas to temporaries and introduces phi nodes to handle the data
flow through control structures. By its nature it will remove the
allocas because the temporaries no longer live in memory and so their
addresses are meaningless.
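
A small sketch of the phi-node case described above, assuming a function whose pointer variable never has its address taken. The name phi_demo is invented for this example, and the comment describes the expected shape of the result rather than actual opt output.

```c
int phi_demo(int cond) {
    int a = 1, b = 2;
    int *p = &a;     /* definition of p on the fall-through path */

    if (cond) {
        p = &b;      /* definition of p on the taken path */
    }
    return *p;       /* single use that must see whichever definition reached it */
}

/*
 * After mem2reg, the alloca for p is expected to disappear. At the join
 * point following the if, a phi node selects between the address of a and
 * the address of b depending on which predecessor block was executed, and
 * the final load reads through that phi value.
 */
```

This is the situation where the p0/p1 style of naming falls out naturally: each definition is its own value, and the phi merges them where control flow joins.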

                               -David