These flags should be enough for Clang to vectorise your code for
AVX2.
If that doesn't give you the vectorisation you want, you can try to
force the vector width or the unroll factor, either with command-line
options or with pragmas in the code:
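
As a rough sketch of both routes (the function name, the width of 8 and
the factor of 2 are made up for illustration, not recommendations): the
pragma applies to the loop that follows it, and the comment shows what
I believe are the matching command-line options. Recent LLVM calls the
vectoriser's unroll factor the "interleave count"; older releases
spelled the option differently, so check what your version accepts.

```c
#include <stddef.h>

/* Hypothetical example kernel: dst[i] += k * src[i].
 *
 * Command-line route (assumed spelling for recent Clang/LLVM; the
 * values 8 and 2 are illustrative only):
 *
 *   clang <your flags> -mllvm -force-vector-width=8 \
 *                      -mllvm -force-vector-interleave=2 -c kernel.c
 */
void scale_add(float *restrict dst, const float *restrict src,
               float k, size_t n)
{
    /* Pragma route: request 8 lanes (8 x float fills one 256-bit AVX2
     * register) and 2 interleaved copies of the vector body. Compilers
     * that don't know this pragma just ignore it. */
#pragma clang loop vectorize_width(8) interleave_count(2)
    for (size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}
```

The pragma is the narrower tool, since it touches only the one loop,
while the -mllvm options affect every loop in the translation unit.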
If you don't understand why a loop is not being vectorised, you can
try the Clang diagnostics:
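
A small sketch of how that might look, assuming the -Rpass family of
remark options with "loop-vectorize" as the pass name (the file name
and the function are invented for the example). The loop below carries
a dependence from one iteration to the next, so it is the sort of loop
I'd expect the vectoriser to comment on; the exact wording of the
remarks depends on your Clang version.

```c
#include <stddef.h>

/* Build with remarks enabled (diag.c is a hypothetical file name, and
 * <your flags> stands for whatever optimisation/target flags you use):
 *
 *   clang <your flags> -c diag.c \
 *         -Rpass=loop-vectorize \
 *         -Rpass-missed=loop-vectorize \
 *         -Rpass-analysis=loop-vectorize
 *
 * -Rpass reports loops that were vectorised, -Rpass-missed reports
 * loops that were not, and -Rpass-analysis gives the analysis behind
 * the decision.
 */
void running_sum(float *a, const float *b, size_t n)
{
    /* Each a[i] needs the a[i - 1] computed in the previous iteration,
     * so the iterations can't simply be executed side by side. */
    for (size_t i = 1; i < n; ++i)
        a[i] = a[i - 1] + b[i];
}
```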
And, of course, if you spot a loop or a basic block that could have
been vectorised but wasn't, please open a bug on our bugzilla with the
results of the diagnostics and your experimentation with widths and
factors: