Should we enable TBAA by default in clang-cl mode?

Note that clang-cl mainly serves as a drop-in replacement for MSVC cl.exe. The current code base turns strict aliasing off by default in clang-cl mode.
See lib/Driver/ToolChains/Clang.cpp:

  // We turn strict aliasing off by default if we're in CL mode, since MSVC
  // doesn't do any TBAA.
  bool TBAAOnByDefault = !D.IsCLMode();
  if (!Args.hasFlag(options::OPT_fstrict_aliasing, StrictAliasingAliasOption,
                    options::OPT_fno_strict_aliasing, TBAAOnByDefault))
    CmdArgs.push_back("-relaxed-aliasing");
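
To make the effect of that default concrete, here is a small standalone model of the flag resolution. It is only a sketch, not LLVM code: `passesRelaxedAliasing` is a hypothetical helper, the arguments are plain strings, and treating -Ofast as an alias for the positive flag is a simplification based on the O3 vs Ofast behaviour described in this post (the real driver's handling of optimization levels is more involved).

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the driver logic quoted above; not LLVM code.
// Returns true when -relaxed-aliasing would be forwarded (i.e. TBAA is off).
bool passesRelaxedAliasing(bool isCLMode, const std::vector<std::string> &args) {
    // Default: strict aliasing is on for plain clang, off in clang-cl mode.
    bool strictAliasing = !isCLMode;

    // The last matching flag on the command line wins over the default.
    for (const std::string &arg : args) {
        if (arg == "-fstrict-aliasing" || arg == "-Ofast")  // -Ofast: assumed alias
            strictAliasing = true;
        else if (arg == "-fno-strict-aliasing")
            strictAliasing = false;
    }
    return !strictAliasing;
}
```

With no aliasing flags, this model returns true for clang-cl mode and false otherwise, which matches the default described above.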

When invoking through clang-cl, the TBAA-related optimizations come from LLVM, not from MSVC. clang-cl needs to be compatible with MSVC cl.exe in terms of command-line options, but do we really want to drop optimizations to match MSVC's capabilities?

Users can pass -fstrict-aliasing to clang-cl to turn on TBAA, but this is easily missed.
In a recent customer code study, I noticed a missed vectorization with clang-cl at O3. The customer didn't pass -fstrict-aliasing and didn't realize that TBAA is off at O3 and on at Ofast in clang-cl mode.

Use the reduced test t.cpp below.
Run: clang-cl --target=aarch64 -clang:"-mcpu=cortex-a57" -c /clang:-O3 t.cpp -clang:"-Rpass=vectorize"

When you replace O3 with Ofast, or add -fstrict-aliasing at O3, you will see the while loop get vectorized. (The while loop is vectorizable at O3 outside clang-cl mode.)
Without TBAA information, LICM fails to promote the load/store of m_size out of the loop, since the store through thePointer is then assumed to possibly alias m_size. The loop vectorizer currently does not recognize m_size as a reduction, so it fails to vectorize.

class MyVector
{
public:
    typedef unsigned short*             iterator;
    MyVector();

    void
    insert(
            iterator  theFirst,
            iterator  theLast)
    {
            unsigned short*     thePointer = m_data1 + m_size;
            while (theFirst != theLast)
            {
                *thePointer++ = *theFirst++;
                ++m_size;
            }
    }

    float*                  m_data0;
    unsigned int           m_size;
    unsigned short*         m_data1;
};

unsigned int test(MyVector t, MyVector lookahead) {
    t.insert(lookahead.m_data1, lookahead.m_data1 + lookahead.m_size);
    return t.m_size;
}
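
For anyone hitting the same thing in their own code, one possible source-level workaround for the LICM limitation described above is to keep the count in a local variable and store it back once after the loop. This is only a sketch under my own assumptions: `MyVectorLocalCount` and `newSize` are names I made up for illustration, and I have not checked the -Rpass=vectorize output for this variant, so whether the loop actually gets vectorized will still depend on the compiler version and on the remaining runtime checks between source and destination.

```cpp
// Hypothetical variant of the reduced test: the element count lives in a
// local during the loop, so the loop body no longer loads or stores m_size.
class MyVectorLocalCount
{
public:
    typedef unsigned short* iterator;

    void insert(iterator theFirst, iterator theLast)
    {
        unsigned short* thePointer = m_data1 + m_size;
        // A local whose address is never taken cannot alias *thePointer,
        // so no type-based aliasing information is needed for it.
        unsigned int newSize = m_size;
        while (theFirst != theLast)
        {
            *thePointer++ = *theFirst++;
            ++newSize;
        }
        m_size = newSize;  // single store after the loop
    }

    float*          m_data0;
    unsigned int    m_size;
    unsigned short* m_data1;
};
```

The behaviour is the same as the original insert as long as the source range does not overlap m_size itself; the difference is only in what the optimizer has to prove.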

Should we enable TBAA by default in clang-cl mode? Or would this cause other issues that I am not aware of?
Cc @rnk @hansw2000

(For reference, the current behaviour was implemented in "clang-cl: Disable TBAA by default for MSVC compatibility", llvm/llvm-project@2a24e3a.)

I think the reasoning in that commit (and in the comment) is still correct.

clang-cl is meant to be used as a drop-in replacement for MSVC. Enabling TBAA could silently miscompile code that’s been developed with MSVC in mind, including system header files.
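
To illustrate the kind of code this is about, here is a generic example of a pattern that only works reliably when the compiler does no type-based alias analysis, next to an alias-safe alternative. It is a sketch of my own, not taken from any particular MSVC code base or header; the function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Pattern that depends on the compiler not using TBAA: a float object is
// read through an lvalue of an unrelated type. This is undefined behaviour
// in standard C++, so an optimizer with strict aliasing enabled is allowed
// to reorder or drop accesses around it.
inline std::uint32_t bitsViaCast(float f)
{
    return *reinterpret_cast<std::uint32_t*>(&f);
}

// Alias-safe alternative: copy the object representation with memcpy.
// Compilers typically lower this to a plain register move.
inline std::uint32_t bitsViaMemcpy(float f)
{
    static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Code written in the first style can appear to work for years under MSVC, which is why flipping the default in clang-cl would be risky.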

I do think we could do a better job at documenting this, and pointing out that -fstrict-aliasing is available in clang-cl. Would you like to send a patch to the clang-cl section of clang/docs/UsersManual.rst?

We would probably only reconsider this if MSVC changed its behavior. The question of TBAA seems to keep coming up, for example on Developer Community, but I don’t believe there’s been any change so far.

Thanks Hans!
Created a pull request: [clang-cl] Document behavior difference of strict aliasing in clang-cl vs clang. by huihzhang · Pull Request #68460 · llvm/llvm-project
Please take a look and let me know if any changes are needed.
