UNREACHABLE executed! error while trying to generate PTX

I am trying to generate PTX code for the kernel of the 'nbody' sample
program (nbody_kernel.cu) using clang/LLVM version 3.2. The nbody CUDA
program is available in Nvidia's SDK.

I am following the https://github.com/jholewinski/llvm-ptx-samples project.

These are my commands:

clang++ -O4 -S -I/usr/local/cuda/include -emit-llvm -target nvptx64 \
    nbody_kernel.cu -o nbody_kernel.ll

opt -O3 -loop-unroll -unroll-allow-partial nbody_kernel.ll \
    -o nbody_kernel.ll

llc nbody_kernel.ll -o nbody_kernel.ptx

After running the last command (llc), I get an "UNREACHABLE executed!" error
with the following stack trace:

[DEVICE-C++] nbody.kernel.cpp
unexpected address space
UNREACHABLE executed at
/home/pratnali/LLVM/llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp:1317!
0 libLLVM-3.3svn.so 0x00007f3857bdf0cb
llvm::sys::PrintStackTrace(_IO_FILE*) + 43
1 libLLVM-3.3svn.so 0x00007f3857bde74a
2 libpthread.so.0 0x00007f3856c3c460
3 libc.so.6 0x00007f3855a90b15 gsignal + 53
4 libc.so.6 0x00007f3855a91f96 abort + 390
5 libLLVM-3.3svn.so 0x00007f3857bc30f7 llvm::llvm_unreachable_internal(char
const*, char const*, unsigned int) + 359
6 libLLVM-3.3svn.so 0x00007f385722967d
7 libLLVM-3.3svn.so 0x00007f385722b6d7
8 libLLVM-3.3svn.so 0x00007f3857341723
llvm::FPPassManager::doInitialization(llvm::Module&) + 99
9 libLLVM-3.3svn.so 0x00007f385734639d
llvm::MPPassManager::runOnModule(llvm::Module&) + 205
10 libLLVM-3.3svn.so 0x00007f3857349b7c
llvm::PassManagerImpl::run(llvm::Module&) + 268
11 llc 0x000000000040b534
12 llc 0x000000000040d131 main + 465
13 libc.so.6 0x00007f3855a7d4bd __libc_start_main + 253
14 llc 0x0000000000406e59
Stack dump:
0. Program arguments: llc nbody.kernel.ll -o nbody.kernel.ptx
make: *** [nbody.kernel.ptx] Aborted

I replaced the built-in index variables, e.g. threadIdx.x, with
__builtin_ptx_read_tid_x() and the corresponding builtins for the others.
There are no problems generating the LLVM IR (i.e. the .ll file). The error
only appears when generating PTX from the IR using llc.
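
For illustration, the replacement looks roughly like the sketch below. This
is not the actual nbody code: global_index_x() and the TID_X/NTID_X/CTAID_X
macros are names made up for this example. It also assumes that the ntid and
ctaid builtins follow the same naming as __builtin_ptx_read_tid_x(), and that
clang defines __NVPTX__ when targeting nvptx/nvptx64. The host branch uses
fixed stand-in values only so the index arithmetic can be compiled and
checked without a device.

```cpp
// Hypothetical sketch, not taken from nbody_kernel.cu.
#if defined(__NVPTX__)
// Device build: read the PTX special registers through clang builtins.
// (Assumption: these builtin names are available in the clang being used.)
#define TID_X()   __builtin_ptx_read_tid_x()    // threadIdx.x
#define NTID_X()  __builtin_ptx_read_ntid_x()   // blockDim.x
#define CTAID_X() __builtin_ptx_read_ctaid_x()  // blockIdx.x
#else
// Host build: fixed stand-in values, for checking the arithmetic only.
#define TID_X()   3
#define NTID_X()  256
#define CTAID_X() 2
#endif

// Equivalent of the usual CUDA expression:
//   blockIdx.x * blockDim.x + threadIdx.x
static inline int global_index_x() {
    return CTAID_X() * NTID_X() + TID_X();
}
```

Inside the kernel, each use of threadIdx.x, blockDim.x and blockIdx.x is
rewritten along these lines.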

Any pointers on what might be going on here? I would appreciate any help in
moving forward.

I have attached my program and observations in a README here.
LLVM_PTX_nbody.tar.gz
<http://clang-developers.42468.n3.nabble.com/file/n4031020/LLVM_PTX_nbody.tar.gz>

You can easily reproduce the problem using this.