After modifying the test case to use vector size 64, it still fails, even with the -msse flag. With vector size 16, gcc uses xmm0 to pass the argument. With vector size 32, gcc passes the argument on the stack, whereas clang uses the xmm0 and xmm1 registers to pass the value. So the issue is visible for any vector size > 16: the calling convention differences between gcc and clang are still there if vector_size > 16.
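For reference, here is a minimal sketch of the kind of test case I mean, assuming the GCC/Clang `vector_size` extension. The type and function names (`v16`, `sum32`, `check_vectors`, etc.) are made up for illustration and are not taken from the original test case. Each callee takes its vector by value, so the argument-passing convention is the thing being exercised.

```c
#include <assert.h>

/* GCC/Clang vector extension: the attribute value is the total size in bytes. */
typedef int v16 __attribute__((vector_size(16)));  /* 4 ints  */
typedef int v32 __attribute__((vector_size(32)));  /* 8 ints  */
typedef int v64 __attribute__((vector_size(64)));  /* 16 ints */

/* Callees take the vector by value; noinline keeps a real call in place. */
__attribute__((noinline)) int sum16(v16 v) {
    int s = 0;
    for (int i = 0; i < 4; i++) s += v[i];
    return s;
}

__attribute__((noinline)) int sum32(v32 v) {
    int s = 0;
    for (int i = 0; i < 8; i++) s += v[i];
    return s;
}

__attribute__((noinline)) int sum64(v64 v) {
    int s = 0;
    for (int i = 0; i < 16; i++) s += v[i];
    return s;
}

/* Caller side: each callee must see exactly the lanes that were passed. */
void check_vectors(void) {
    v16 a = {1, 2, 3, 4};
    v32 b = {1, 2, 3, 4, 5, 6, 7, 8};
    v64 c = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};

    assert(sum16(a) == 10);
    assert(sum32(b) == 36);
    assert(sum64(c) == 136);
}
```

Built as a single file with one compiler, the caller and callees agree with each other, so the checks are expected to pass. To see the difference described above, the callees and `check_vectors` would have to live in separate files, one compiled with gcc and the other with clang, then linked together. Compiling with `-S` and comparing the generated assembly also shows how each compiler passes the 32-byte and 64-byte arguments.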