This is Jeehoon Kang, a Ph.D. student at the Software Foundations Laboratory (http://sf.snu.ac.kr), Dept. of Computer Science & Engineering, Seoul National University. Our group has been studying the mem2reg pass, and we have a question about its algorithm.
As far as I understand, the mem2reg pass essentially uses the SSA construction algorithm to promote allocas into registers, with shortcuts for some special cases. One of these special cases is when an alloca is “only used within a single basic block” (http://llvm.org/docs/doxygen/html/PromoteMemoryToRegister_8cpp_source.html#l00435).
However, I do not understand the algorithm for this special case. Here the mem2reg pass “perform[s] a single linear pass over the basic block using the Alloca.” In other words, each load is replaced by a read from the register corresponding to the nearest preceding store in the block. The part I cannot follow is: “If there is no store before this load, the load takes the undef value” (http://llvm.org/docs/doxygen/html/PromoteMemoryToRegister_8cpp_source.html#l00471). If the block is inside a loop, I think replacing the load with undef is unsound, because the load may observe a value stored during a previous iteration of the loop.
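To make my reading concrete, here is a minimal sketch in C of the single-block case as I understand it. It is a toy model and not LLVM's code: `Inst`, `UNDEF`, `promote_single_block`, and `run_in_loop` are names I made up for illustration, and a single `int` stands in for the alloca's memory cell.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names come from LLVM. */
enum { UNDEF = -1 };                 /* stands in for LLVM's undef value */

typedef enum { OP_STORE, OP_LOAD } OpKind;

typedef struct {
    OpKind kind;
    int value;  /* OP_STORE: value being stored; OP_LOAD: filled in by the pass */
} Inst;

/* Single linear pass over one block: every load takes the value of the
 * nearest preceding store in the same block, or UNDEF if there is none. */
static void promote_single_block(Inst *block, size_t n) {
    assert(block != NULL || n == 0);
    int current = UNDEF;
    for (size_t i = 0; i < n; i++) {
        if (block[i].kind == OP_STORE)
            current = block[i].value;
        else
            block[i].value = current;
    }
}

/* Reference behaviour: execute the block `iterations` times against one
 * memory cell and return what the first load of the last iteration reads. */
static int run_in_loop(const Inst *block, size_t n, int iterations) {
    assert(block != NULL || n == 0);
    int memory = UNDEF;              /* the alloca's cell, initially undefined */
    int first_load = UNDEF;
    for (int it = 0; it < iterations; it++) {
        int seen_load = 0;
        for (size_t i = 0; i < n; i++) {
            if (block[i].kind == OP_STORE) {
                memory = block[i].value;
            } else if (!seen_load) {
                first_load = memory;
                seen_load = 1;
            }
        }
    }
    return first_load;
}
```

For a block that loads and then stores 7, `promote_single_block` gives the load `UNDEF`, whereas `run_in_loop` with two iterations reports 7 for the same load. That gap is the one I am asking about.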
Here is an example C program that I think Clang miscompiles: