Performance alloca + memcpy vs alloca + inline

Hi,

While I was looking for vectorisation solutions I stumbled on a strange performance difference between clang and gcc.

The attached code can be compiled with:

clang -DSINGLE -std=c99 -O3 expanding-inline-generic.c
(poor performance > ~)

vs

clang -DINLINE -std=c99 -O3 expanding-inline-generic.c
(performs fast < 1s)

Using GCC 4.7/4.8, the two builds show no noticeable difference.

Stefan

expanding-inline-generic.c (1.11 KB)
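
The attachment is not reproduced in this thread. As a rough illustration of the kind of comparison the subject line describes, here is a minimal sketch, assuming a buffer obtained from alloca is filled either by memcpy (the default, or -DSINGLE) or by a hand-written byte loop (-DINLINE). The names copy_inline, sum_via_stack and run are hypothetical and are not taken from expanding-inline-generic.c; the real file may be structured quite differently.

```c
#include <alloca.h>   /* alloca: non-standard, provided by glibc and BSD-like systems */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy n bytes with a plain loop instead of memcpy. */
static inline void copy_inline(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Copy src into a stack buffer obtained from alloca, then sum the bytes. */
static uint32_t sum_via_stack(const uint8_t *src, size_t n)
{
    uint8_t *buf = alloca(n);
#if defined(INLINE)
    copy_inline(buf, src, n);   /* assumed meaning of -DINLINE */
#else
    memcpy(buf, src, n);        /* assumed meaning of -DSINGLE (also the default here) */
#endif
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}

/* Call sum_via_stack repeatedly so that any difference becomes measurable. */
uint32_t run(size_t iterations)
{
    uint8_t data[16];
    for (size_t i = 0; i < sizeof data; i++)
        data[i] = (uint8_t)i;

    uint32_t total = 0;
    for (size_t k = 0; k < iterations; k++) {
        data[0] = (uint8_t)k;   /* vary the input between calls */
        total += sum_via_stack(data, sizeof data);
    }
    return total;
}
```

To compare the two variants, one could call run() from a small main() with a large iteration count, build once with -DSINGLE and once with -DINLINE, and time each binary. Both variants should return the same total; only the generated code is expected to differ.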

From: "Stefan de Konink" <stefan@konink.de>
To: cfe-dev@cs.uiuc.edu
Sent: Saturday, February 1, 2014 3:34:10 PM
Subject: [cfe-dev] Performance alloca + memcpy vs alloca + inline

Hi,

While I was looking for vectorisation solutions I stumbled on a strange performance difference between clang and gcc.

The attached code can be compiled with:

clang -DSINGLE -std=c99 -O3 expanding-inline-generic.c
(poor performance > ~)

Approximately what? (At least in my mail client, I see no number here.)

vs

clang -DINLINE -std=c99 -O3 expanding-inline-generic.c
(performs fast < 1s)

Using GCC 4.7/4.8, the two builds show no noticeable difference.

Please file a bug report (http://llvm.org/bugs/) and we'll look at it.

Thanks,
Hal

Submitted as:

http://llvm.org/bugs/show_bug.cgi?id=18695

Stefan