Why does Clang perform poorly with SVE?

I am using Clang 16.0.0, and it gives comparatively poor performance with SVE.
I tested with the TSVC benchmark.

Could you provide some details? What do you mean by bad performance, what hardware did you use, and which parameters did you pass to Clang?

I've been using Clang 16.0.0 on a local Arm cluster. My example program is a matrix multiplication of order 5000 x 5000 (data type: double), and I use execution time as the performance metric. The compile command is:
…/clang -O3 -march=armv8.2a+sve -msve-vector-bits=512
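
For anyone who wants to reproduce this, here is a minimal sketch of the kind of benchmark described above. It is not the original program: the function names `matmul` and `time_matmul`, the i-k-j loop order, and the constant input values are all assumptions made for illustration. It times a single double-precision multiplication of order n using wall-clock time.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime */
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical kernel: C = A * B for n x n row-major matrices.
   The i-k-j order (an assumption, not the poster's code) keeps the
   innermost loop unit-stride, which is the easy case for a vectorizer. */
void matmul(size_t n, const double *restrict a, const double *restrict b,
            double *restrict c)
{
    for (size_t i = 0; i < n * n; ++i)
        c[i] = 0.0;

    for (size_t i = 0; i < n; ++i) {
        for (size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (size_t j = 0; j < n; ++j)
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}

/* Hypothetical harness: returns the wall-clock seconds taken by one
   multiplication of order n, or -1.0 if allocation fails. */
double time_matmul(size_t n)
{
    double *a = malloc(n * n * sizeof *a);
    double *b = malloc(n * n * sizeof *b);
    double *c = malloc(n * n * sizeof *c);
    if (!a || !b || !c) {
        free(a); free(b); free(c);
        return -1.0;
    }

    /* Arbitrary fill values; only the timing matters here. */
    for (size_t i = 0; i < n * n; ++i) {
        a[i] = 1.0;
        b[i] = 2.0;
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    matmul(n, a, b, c);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Read one result through a volatile so the work is not optimized away. */
    volatile double sink = c[n * n - 1];
    (void)sink;

    free(a); free(b); free(c);
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
}
```

Calling `time_matmul(5000)` from `main` and printing the result gives one number per build, so the same source can be compared with and without `+sve` in `-march`. Adding `-Rpass=loop-vectorize -Rpass-missed=loop-vectorize` to the Clang command should also report whether the inner loop was vectorized at all, which helps tell a vectorizer problem apart from a memory-bound kernel.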