Here are my notes on the LLVM 1.7 release, which will go into the final release announcement. As Tanya mentioned, it has been far too long since the last release, and there have been a lot of CVS commits since November. I went through them all and pulled out some of the major improvements, which I've listed below. I'm certain that I have forgotten some things, so please let me know if I have, and I'll be happy to add them. I'm going to start working on the release notes now.
If you're interested in helping out with the release, please take a look at Tanya's release plan:
In particular, we will have tarballs ready real soon now, and we'd appreciate it if people could give them a try and report back as soon as possible (include platform, OS, system compiler version, and whether you tested a debug or release build of LLVM). Our goal is to get testing done by April 18th.
Finally, if you'd like to check in something to the release branch, please check it into mainline CVS first, then ask Tanya if it is okay to check it into the branch. Assuming it's ok, either you or she can pull it in.
----------------- 8< ----------------- 8< --------------------
<will insert overview blurb here> Big new things: llvm-gcc4, new sparc backend, Generic vector/SSE/Altivec support, X86 Scalar SSE support, debugging support, many target-independent codegen improvements, inline asm, llvm.org/web-reg.
Core LLVM IR Improvements:
* The LLVM IR now has full support for representing target-specific
inline assembly code, as general as GCC's inline assembly.
* Rob Bocchino added new LLVM insertelement and extractelement
instructions, for accessing/updating scalar elements of a vector.
* LLVM now has a new shufflevector instruction, for permuting the
elements of a vector. http://llvm.org/docs/LangRef.html#vectorops
* LLVM now supports first class global ctor/dtor initialization lists, no
longer forcing targets to use "__main".
* LLVM supports assigning globals and functions to a particular section
in the result executable.
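To make the semantics of the three vector instructions above concrete, here is a toy model on Python lists. It is only a sketch: the helper names mirror the instruction names but are hypothetical and are not an LLVM API, and it assumes that a shuffle mask indexes into the first input followed by the second.

```python
# Toy model of the vector instructions on Python lists.
# The function names are hypothetical; this is not an LLVM API.

def extractelement(vec, idx):
    """Return the scalar at position idx of vec."""
    if not 0 <= idx < len(vec):
        raise IndexError("element index out of range")
    return vec[idx]


def insertelement(vec, value, idx):
    """Return a copy of vec with position idx replaced by value."""
    if not 0 <= idx < len(vec):
        raise IndexError("element index out of range")
    out = list(vec)
    out[idx] = value
    return out


def shufflevector(v1, v2, mask):
    """Build a new vector by picking elements from v1 followed by v2.

    Assumption: mask index i < len(v1) selects v1[i], and larger
    indices select from v2.
    """
    if len(v1) != len(v2):
        raise ValueError("input vectors must have the same length")
    both = list(v1) + list(v2)
    if any(not 0 <= i < len(both) for i in mask):
        raise IndexError("mask index out of range")
    return [both[i] for i in mask]
```

For example, shufflevector([1, 2, 3, 4], [5, 6, 7, 8], [0, 4, 1, 5]) interleaves the low halves of the two inputs and yields [1, 5, 2, 6].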
LLVM Intrinsic Improvements:
* Adding target-specific intrinsics to LLVM is now really easy: entries
are added to .td files and necessary support code is generated from them.
* Reid contributed flexible support for "autoupgrading" intrinsics. This
is useful when we decide to change an intrinsic in a new release of
LLVM: .ll and .bc files from old releases get upgraded to the new form.
* Andrew added support for a new LLVM "readcyclecounter" intrinsic, for
accessing low-level target timing interfaces.
* LLVM now supports llvm.stacksave/llvm.stackrestore intrinsics, for
proper C99 Variable Length Array support.
* Reid changed many intrinsics to have fixed types instead of being
overloaded based on type.
Mid-Level Analysis and Transformation Improvements:
* The -loop-unswitch pass has had several bugs fixed, has several new
features, and is enabled by default now.
* Evan improved the loop strength reduction pass to use a parameterized
target interface and to take advantage of strided loads on targets
that support them (e.g. X86).
* The -instcombine pass has a framework and implementation for simplifying
code based on whether computed bits are demanded or not, following
Nate's design and implementation in the code generator.
* Nate reimplemented post-dominator analysis using the Lengauer and
Tarjan algorithm, replacing the old iterative implementation. On one
extreme example his implementation is 40x faster than the old one
(PR681) and uses far less memory.
* Daniel Berlin contributed an ET-Forest implementation, which
replaces the old LLVM DominatorSet with a far more efficient data
structure (in both space and time).
* Andrew wrote a new "reg2mem" pass which transforms an LLVM function so that
there are no SSA values live across basic blocks.
* The -scalarrepl pass can now promote simple unions to registers.
* The inliner can now inline functions that have dynamic 'alloca'
instructions in them (without increasing stack usage).
* The -reassociate pass knows how to factor expressions in several ways,
e.g. turning (A*A+A*B) into (A*(A+B)) and (X+X+Y+Y) into ((X+Y) << 1)
* Saem Ghani contributed support to allow different implementations of
the abstract callgraph interface, e.g., based on pointer analysis.
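As a quick sanity check on the two -reassociate factorings mentioned above, the following sketch compares the original and factored forms under 32-bit wraparound arithmetic. The 32-bit width and the function names are assumptions made for illustration; this is not the pass itself.

```python
MASK32 = 0xFFFFFFFF  # assumption: model 32-bit integers with wraparound


def mul_unfactored(a, b):
    """A*A + A*B, truncated to 32 bits."""
    return (a * a + a * b) & MASK32


def mul_factored(a, b):
    """A*(A+B), truncated to 32 bits after each step."""
    return (a * ((a + b) & MASK32)) & MASK32


def add_unfactored(x, y):
    """X+X+Y+Y, truncated to 32 bits."""
    return (x + x + y + y) & MASK32


def add_factored(x, y):
    """(X+Y) << 1, truncated to 32 bits after each step."""
    return (((x + y) & MASK32) << 1) & MASK32


def forms_agree(samples):
    """True if both factorings match the original forms on all pairs."""
    return all(
        mul_unfactored(a, b) == mul_factored(a, b)
        and add_unfactored(a, b) == add_factored(a, b)
        for a in samples
        for b in samples
    )
```

The forms agree even when intermediate results overflow, because addition, multiplication and left shift all commute with truncation to a fixed bit width.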
Debugging Support Improvements:
* Jim implemented almost complete debugging support in the llvm-gcc 4.0
front-end and the x86/powerpc darwin backends. This includes line
number information, variable information, function information, frame
information, etc. This is a huge leap in debug support over previous
releases; the only major missing piece is support for debugging
* Jim added support to the C backend for turning line number information
into #line directives in the output C file.
* Jim expanded http://llvm.org/docs/SourceLevelDebugging.html and filled
in many details.
Target-Independent Code Generator Improvements:
* Nate contributed the foundation of vector support including instruction
selection and tblgen pieces.
* Evan contributed a new target-independent bottom-up list scheduler.
* The new list scheduler was enhanced to support top-down scheduling and
to support target-specific priority functions and resource conflict
* The code generator now supports many simple inline assembly
expressions, though there are still cases that are not handled. If you
get errors or assertions using inline assembly, please file a bugzilla
bug. Inline assembly is not currently supported in the JIT or C backend.
* Evan contributed extensive additions to 'tblgen', the code
generator generator, providing more expressive .td files.
* Nate integrated switch statement lowering directly into the
SelectionDAG machinery, instead of depending on the lower-switch pass
to reduce them to branches. In the process, he improved the algorithm
to avoid emitting some dead comparisons.
* Evan significantly improved SelectionDAG support for chain and flag
handling, and added support for describing these nodes in .td files.
* Nate contributed a framework and implementation for simplifying code
based on whether computed bits are demanded or not, which works well on
bitfield manipulations and other bit-twiddling code, particularly for
removing unneeded sign extensions.
* Evan added support for adding per-instruction predicates that
enable/disable specific instructions. This is used to disable
instructions that are not supported by specific subtargets, etc.
* LLC has a new -fast option, instructing it to generate code quickly
instead of optimizing the generated code.
* Many compile-time speedups in the code generator.
* The target-independent AsmPrinter module has many new features, such as
support for emitting ".asciz" instead of ".ascii" when possible,
support for .zerofill, support for targets that accept quoted labels,
etc, and it reduces the amount of target-specific code that needs to
* Nate added support for byte-swap and bit rotate nodes.
* The legalizer pass is now non-iterative (==faster), simpler, and
several nasty libcall insertion bugs are now fixed.
* The register spiller is better at optimizing inserted spill code.
* Evan modified the instruction selector generator to produce code that
doesn't run out of stack space when compiled with GCC 4.x.
* Evan added support for lowering memset/memcpy with small fixed sizes
into discrete load and store instructions.
* LLVM can now inline the copysign C99/FORTRAN functions.
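To illustrate the demanded-bits idea mentioned above (removing unneeded sign extensions), here is a deliberately simplified sketch. It is not Nate's implementation: the helper names and the single-mask formulation are assumptions made for illustration. The point is that if every user of a sign-extended value only looks at bits below the original width, the extension can be dropped.

```python
def sign_extend(value, src_bits, dst_bits=32):
    """Sign-extend the low src_bits of value to dst_bits bits."""
    value &= (1 << src_bits) - 1
    if value & (1 << (src_bits - 1)):
        # Sign bit is set: fill bits src_bits..dst_bits-1 with ones.
        value |= ((1 << dst_bits) - 1) & ~((1 << src_bits) - 1)
    return value


def sext_is_removable(demanded_mask, src_bits):
    """True if no demanded bit lies at or above bit position src_bits."""
    return (demanded_mask >> src_bits) == 0


def low_byte_with_sext(x):
    """Sign-extend an 8-bit value, then keep only the low byte."""
    return sign_extend(x, 8) & 0xFF


def low_byte_without_sext(x):
    """Same result with the sign extension dropped."""
    return x & 0xFF
```

With a demanded mask of 0xFF the two low_byte functions agree for every 8-bit input, so the extension is dead. With a mask of 0xFFFF they would differ (for the input 0x80, the extended value masked with 0xFFFF is 0xFF80), so the extension has to stay.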
X86-Specific Code Generator Improvements:
* Evan added a new DAG-to-DAG instruction selector for X86,
replacing the 'pattern' selector.
* Evan added Scalar SSE support, which provides significantly
better performance than the LLVM FP stack code.
* Evan added a register-pressure reducing scheduler priority function,
which is now used by default on X86.
* Evan added support for -fpic and -static codegen on Darwin.
* Evan added initial support for subtargets in the X86 backend, including
a broad range of -mcpu=* values.
* Evan improved the loop strength reduction on X86, and it is now turned
on by default.
* Evan added support for generation of SSE3 instructions (e.g. fisttp) on
subtargets that support them.
PowerPC-Specific Code Generator Improvements:
* Full support for the Altivec instruction set, accessible with the GCC
generic vector extension and the altivec.h intrinsics (llvmgcc4 only),
including support for -faltivec and -maltivec.
* Nate greatly simplified the PowerPC branch selector, making it more
aggressive and removing support code from the target-independent code
in the process.
* Support for -static and -fpic codegen on Darwin.
* Many improvements in the generated code.
IA64-Specific Code Generator Improvements:
* Duraid transitioned the code generator to the new DAG-to-DAG isel
framework, which is more reliable and produces better code.
* The Itanium backend now has a bundling pass, which improves performance
by ~10% and reduces code size. Bundling can be improved in the future
by implementing a hazard recognizer for the scheduler to build better
* LLVM has been built with the HP aCC compiler and stdcxx, the Apache C++
Standard Library (see http://incubator.apache.org/stdcxx/ ). While
building with compilers other than g++ is not supported, doing so
should now be more straightforward.
Alpha-specific Code Generator Improvements:
* Andrew rewrote the Alpha instruction selector to use the new DAG-to-DAG
instruction selection framework.
* Andrew fixed several bugs handling weak and linkonce linkage.
SPARC-Specific Code Generator Improvements:
* LLVM 1.7 includes a completely rewritten SPARC backend. This backend
has several advantages over the previous LLVM SPARC backend, and will
replace it entirely in LLVM 1.8. This backend is about 3700 lines of
code (making it a good reference for new targets), supports Sparc V8
and V9 instructions, and produces code that is slightly better than GCC
on SPEC2000. For more details:
llvm-gcc4 Improvements:
* llvm-gcc4 is a new C/C++/ObjC/ObjC++ front-end, rewritten from scratch,
based on GCC 4.0.1. This front-end is currently only supported on
Mac OS/X PowerPC and Intel systems, but we hope to extend support to
the other LLVM-supported systems in the future.
* Support for the GCC "section", "used" and "align" attributes.
* Full support for the GCC generic vector extension.
* Full support for PowerPC/Altivec and IA32/SSE intrinsics.
* Full support for GCC inline assembly (note that there are currently
some limitations in the code generator though).
* Full support for C99 Variable Length Arrays.
* llvm-gcc 4.0 fixes a broad range of long-term bugs that have afflicted
llvm-gcc3 in areas such as ABI compliance, union layout, and bitfield
handling. There are 28 bugs dependent on http://llvm.org/PR498.
Other Improvements:
* The primary LLVM domain name is now http://llvm.org/.
* Web form registration is no longer required to download LLVM releases.
* Eric Kidd contributed the llvm-config utility, to make it easier to
build and link programs against the LLVM libraries:
* Saem Ghani extended the PostOrderIterator class to permit external
* The nightly tester output now color codes performance deltas to make it
easier to read at a glance.
* Reid added support for multiple -rpath options to the linker.
* Reid finished consolidating the host specific code into the libsystem
* Reid removed use of fork() from bugpoint, allowing it to work on Win32
* Andrew improved bugpoint's handling of dynamically loaded
* Morten contributed patches for better support of Visual C++ 2005.
In addition to the new features and infrastructure we have built, we
have also fixed many minor bugs and have made many small optimization
improvements. LLVM 1.7 is clearly our best release yet, and upgrading
from a previous release is highly recommended.
As usual, if you have any questions or comments about LLVM or any of the
features in this status update, please feel free to contact the LLVMdev
mailing list (llvmdev at cs.uiuc.edu)!
Finally, here is the previous status report, the LLVM 1.6 announcement: