I got the code to work in LLVM 2.5. I think the reason your code wasn't
working is that the declaration and definition of foo were missing 'void'
in the parameter list, so foo was emitted as a varargs function,
void (...). This is perfectly legal C, but maybe there's a bug in the
inliner pass that sees the function in the module and the symbol table as
different. Here are the files I used:
[test.c]
int a;
void foo(void);
int main(int argc, int *argv) {
foo();
return 0;
}
__attribute__((always_inline)) void foo(void) {
a++;
}
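To make the 'void' point concrete, here is a small sketch contrasting the
two declaration styles. The names old_style, prototyped, counter and
call_both are made up for illustration. Before C23, an empty parameter list
leaves the parameters unspecified, and the IR in the original report shows
such a function coming out as void (...); with (void) the function is
declared as taking no arguments.

```c
/* Illustration only: all names here are hypothetical. */
static int counter;

/* Empty parentheses: before C23 this leaves the parameters unspecified. */
void old_style();

/* (void): a real prototype, the function takes no arguments. */
void prototyped(void);

void old_style() { counter++; }

void prototyped(void) { counter++; }

/* Calls each function once and returns the shared counter. */
int call_both(void)
{
    old_style();
    prototyped();
    return counter;
}
```

Both calls behave the same at run time; the difference is only in the
function type the front end records for each declaration.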
[test.ll]
; ModuleID = 'test.c'
target datalayout =
"e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32"
target triple = "i386-pc-win32"
%struct.__block_descriptor = type { i32, i32 }
%struct.__block_literal_generic = type { i8*, i32, i32, i8*,
%struct.__block_descriptor* }
@a = common global i32 0, align 4 ; <i32*> [#uses=2]
define i32 @main(i32 %argc, i32* %argv) nounwind {
entry:
call void @foo()
ret i32 0
}
define void @foo() nounwind alwaysinline {
entry:
%tmp = load i32* @a ; <i32> [#uses=1]
%inc = add i32 %tmp, 1 ; <i32> [#uses=1]
store i32 %inc, i32* @a
ret void
}
[test_out.ll]
; ModuleID = 'test_out.bc'
target datalayout =
"e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32"
target triple = "i386-pc-win32"
%struct.__block_descriptor = type { i32, i32 }
%struct.__block_literal_generic = type { i8*, i32, i32, i8*,
%struct.__block_descriptor* }
@a = common global i32 0, align 4 ; <i32*> [#uses=4]
define i32 @main(i32 %argc, i32* %argv) nounwind {
entry:
%tmp.i = load i32* @a ; <i32> [#uses=1]
%inc.i = add i32 %tmp.i, 1 ; <i32> [#uses=1]
store i32 %inc.i, i32* @a
ret i32 0
}
define void @foo() nounwind alwaysinline {
entry:
%tmp = load i32* @a ; <i32> [#uses=1]
%inc = add i32 %tmp, 1 ; <i32> [#uses=1]
store i32 %inc, i32* @a
ret void
}
For bonus points, if you want the definition of foo to be removed after it
has been inlined, you have to add linkonce_odr to foo's LLVM definition (see
below). I don't know how to get the same result from the .c file.
[test2.ll]
; ModuleID = 'test.c'
target datalayout =
"e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32"
target triple = "i386-pc-win32"
%struct.__block_descriptor = type { i32, i32 }
%struct.__block_literal_generic = type { i8*, i32, i32, i8*,
%struct.__block_descriptor* }
@a = common global i32 0, align 4 ; <i32*> [#uses=2]
define i32 @main(i32 %argc, i32* %argv) nounwind {
entry:
call void @foo()
ret i32 0
}
define linkonce_odr void @foo() nounwind alwaysinline {
entry:
%tmp = load i32* @a ; <i32> [#uses=1]
%inc = add i32 %tmp, 1 ; <i32> [#uses=1]
store i32 %inc, i32* @a
ret void
}
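One possible way to get a similar effect from the C side, offered as a
sketch rather than something checked against 2.5: declare foo static. The IR
in the original report shows a static foo emitted with internal linkage, and
an internal function with no remaining callers can be dropped by the
optimizer. The attribute syntax assumes GCC or Clang, and run_once is a
hypothetical stand-in for main.

```c
int a;

/* static gives foo internal linkage; (void) keeps it from being varargs. */
static inline void foo(void) __attribute__((always_inline));

static inline void foo(void)
{
    a++;
}

/* Hypothetical caller standing in for main. */
int run_once(void)
{
    foo();
    return a;
}
```

Whether the leftover definition actually disappears depends on the passes
that run after the inliner, so it is worth checking the resulting .ll file.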
Javier
Thanks for your message. However, even with version 2.5 of LLVM the
function is not inlined. Can you try this with your LLVM using the same opt
flags, which are as follows?
-O3 -debug-only=inline -mem2reg
If it does work for you, can you tell me the exact flags you used for clang
or llvm-gcc to produce the .bc file and the exact flags you specified when
you invoked opt?
Does the inlining happen at clang level or after opt?
Dale Johannesen wrote:
I have the following code:
static inline void foo() __attribute((always_inline));
int a;
static inline void foo()
{
a++;
}
int main(int argc, char *argv)
{
foo();
return 0;
}
This works fine in current sources. You should upgrade; 2.5 has been
out for a while and 2.6 will be soon.
However, the code generated by the LLVM 2.4 toolchain does not inline this
function. opt retains the function call. Here is the output when I try to
compile with -debug-only=inline:
..\..\..\win32\bin\win32\debug\opt.exe -O3 -debug-only=inline -mem2reg -f -o test_opt.bc test.bc
Inliner visiting SCC: foo: 0 call sites.
Inliner visiting SCC: main: 1 call sites.
Inlining: cost=always, Call: call void (...)* @foo()
Inliner visiting SCC: INDIRECTNODE: 0 call sites.
Here is the .ll file:
; ModuleID = 'test_opt.bc'
target datalayout =
"e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32"
target triple = "i386-pc-win32"
@a = common global i32 0, align 4 ; <i32*> [#uses=2]
define i32 @main(i32 %argc, i8** %argv) nounwind {
entry:
tail call void (...)* @foo()
ret i32 0
}
define internal void @foo(...) nounwind alwaysinline {
entry:
%tmp = load i32* @a ; <i32> [#uses=1]
%inc = add i32 %tmp, 1 ; <i32> [#uses=1]
store i32 %inc, i32* @a
ret void
}
What am I doing wrong here? Is there a way to force a function to be
inlined by opt?
Best Regards,
--
View this message in context:
http://www.nabble.com/Forcing-function-inline-not-working-tp25483934p25483934.html