fhahn
October 20, 2022, 8:06pm
1
Hi,
A while ago, @hfinkel proposed a new sanitizer for type-based aliasing violations. He also shared an initial implementation as well:
Recently I spent some time rebasing the patches, ported them to current main
and also made them work in macOS.
At the moment, there are a few key limitations on type-based aliasing in LLVM/Clang and one factor blocking progress in that area is concerns for correctness; in particular the fact that we don’t have a way to check a program is free of type-based aliasing violations makes it difficult the determine if a mis-compile is caused by a new bug in the TBAA implementation or UB in the source.
Therefore I propose to revive the type sanitizer patches and try to get an initial version submitted. To do that I would need help, especially on the sanitizer runtime side. At the moment, the all tysan tests added in the runtime patch work on macOS, but it would be great if someone could give it a try on Linux as well.
Please let me know if you are interested in helping out, there certainly are many areas that would need further tuning and testing.
cc’ing some of the people on the original review on discourse: @jdoerfert @kcc
14 Likes
Enna1
October 21, 2022, 1:37am
2
Hi @fhahn , I’m familiar with ASan and TSan but not familiar with TySan. I’m very interested in helping this.
Glad to see this is getting picked up again. We had this on queue as an intern project. What kind of help would you need landing these patches?
fhahn
November 2, 2022, 12:23pm
4
Thanks both @Enna1 & @leonardchan !
I think one of the main things is that I am only able to test this on macOS so far. Any help with checking the tests (in particular the end-to-end tests in compiler-rt) on Linux would be great. Then there’s the technical review of the implementation.
It would also be great if we could collect additional test cases for violations we want to detect or should not detect.
Finally, performance is also an area that will likely need improvements. Both that and increasing the test coverage is probably easier done once we have a baseline in tree.
Enna1
November 4, 2022, 1:08pm
5
Hi @fhahn , I tried TypeSanitizer on my Linux environment, I think I make TySan build and tests passed on Linux.
Should I update the diff in the initial implementation patch or send a new patch( ⚙ D137414 [TySan] Fix Type Sanitizer build on Linux )?
Thanks!
1 Like
fhahn
November 15, 2022, 11:33pm
6
That’s great, thank you very much!
I think I’ll commandeer the reviews and restart the reviews.
3 Likes
Hiya. What’s the current status of this? If there’s any blockers or reviewers are needed, I’d be happy to volunteer.
fhahn
March 10, 2023, 8:36pm
8
I recently rebased the patches and addressed the latest comments (thanks everyone!).
Any additional help with the reviews + testing on Linux/X86 and Linux/AArch64 would be very much appreciated
1 Like
xry111
December 3, 2023, 12:45pm
9
Any hope to continue the work in GitHub PR era?
IMO this would be very helpful for “legacy” code like OpenSSL…
2 Likes
fhahn
December 22, 2023, 7:23pm
10
Yes hopefully!
Just moved the patches to GH as PRs:
llvm:main
← llvm:users/fhahn/tysan-a-type-sanitizer-llvm
opened 07:18PM - 22 Dec 23 UTC
This patch introduces the LLVM components of a type sanitizer: a sanitizer for t… ype-based aliasing violations.
C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help.
For each TBAA type-access descriptor, encoded in LLVM's IR using metadata, the corresponding instrumentation pass generates descriptor tables. Thus, for each type (and access descriptor), we have a unique pointer representation. Excepting anonymous-namespace types, these tables are comdat, so the pointer values should be unique across the program. The descriptors refer to other descriptors to form a type aliasing tree (just like LLVM's TBAA metadata does). The instrumentation handles the "fast path" (where the types match exactly and no partial-overlaps are detected), and defers to the runtime to handle all of the more-complicated cases. The runtime, of course, is also responsible for reporting errors when those are detected.
The runtime uses essentially the same shadow memory region as tsan, and we use 8 bytes of shadow memory, the size of the pointer to the type descriptor, for every byte of accessed data in the program. The value 0 is used to represent an unknown type. The value -1 is used to represent an interior byte (a byte that is part of a type, but not the first byte). The instrumentation first checks for an exact match between the type of the current access and the type for that address recorded in the shadow memory. If it matches, it then checks the shadow for the remainder of the bytes in the type to make sure that they're all -1. If not, we call the runtime. If the exact match fails, we next check if the value is 0 (i.e. unknown). If it is, then we check the shadow for the remainder of the byes in the type (to make sure they're all 0). If they're not, we call the runtime. We then set the shadow for the access address and set the shadow for the remaining bytes in the type to -1 (i.e. marking them as interior bytes). If the type indicated by the shadow memory for the access address is neither an exact match nor 0, we call the runtime.
The instrumentation pass inserts calls to the memset intrinsic to set the memory updated by memset, memcpy, and memmove, as well as allocas/byval (and for lifetime.start/end) to reset the shadow memory to reflect that the type is now unknown. The runtime intercepts memset, memcpy, etc. to perform the same function for the library calls.
The runtime essentially repeats these checks, but uses the full TBAA algorithm, just as the compiler does, to determine when two types are permitted to alias. In a situation where access overlap has occurred and aliasing is not permitted, an error is generated.
Clang's TBAA representation currently has a problem representing unions, as demonstrated by the one XFAIL'd test in the runtime patch. We'll update the TBAA representation to fix this, and at the same time, update the sanitizer.
When the sanitizer is active, we disable actually using the TBAA metadata for AA. This way we're less likely to use TBAA to remove memory accesses that we'd like to verify.
As a note, this implementation does not use the compressed shadow-memory scheme discussed previously (http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That scheme would not handle the struct-path (i.e. structure offset) information that our TBAA represents. I expect we'll want to further work on compressing the shadow-memory representation, but I think it makes sense to do that as follow-up work.
[TySan] LLVM part.
Move from https://reviews.llvm.org/D32198
llvm:users/fhahn/main.tysan-a-type-sanitizer-clang
← llvm:users/fhahn/tysan-a-type-sanitizer-clang
opened 07:18PM - 22 Dec 23 UTC
This patch introduces the runtime components of a type sanitizer: a sanitizer fo… r type-based aliasing violations.
C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help.
The Clang changes seems mostly formulaic, the one specific change being that when the TBAA sanitizer is enabled, TBAA is always generated, even at -O0.
Clang's TBAA representation currently has a problem representing unions, as demonstrated by the one XFAIL'd test in the runtime patch. We'll update the TBAA representation to fix this, and at the same time, update the sanitizer.
Based on https://reviews.llvm.org/D32199.
https://github.com/llvm/llvm-project/pull/76259 (LLVM support)
llvm:users/fhahn/main.tysan-a-type-sanitizer-runtime-library
← llvm:users/fhahn/tysan-a-type-sanitizer-runtime-library
opened 07:18PM - 22 Dec 23 UTC
This patch introduces the runtime components of a type sanitizer: a sanitizer fo… r type-based aliasing violations.
C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help.
For each TBAA type-access descriptor, encoded in LLVM's IR using metadata, the corresponding instrumentation pass generates descriptor tables. Thus, for each type (and access descriptor), we have a unique pointer representation. Excepting anonymous-namespace types, these tables are comdat, so the pointer values should be unique across the program. The descriptors refer to other descriptors to form a type aliasing tree (just like LLVM's TBAA metadata does). The instrumentation handles the "fast path" (where the types match exactly and no partial-overlaps are detected), and defers to the runtime to handle all of the more-complicated cases. The runtime, of course, is also responsible for reporting errors when those are detected.
The runtime uses essentially the same shadow memory region as tsan, and we use 8 bytes of shadow memory, the size of the pointer to the type descriptor, for every byte of accessed data in the program. The value 0 is used to represent an unknown type. The value -1 is used to represent an interior byte (a byte that is part of a type, but not the first byte). The instrumentation first checks for an exact match between the type of the current access and the type for that address recorded in the shadow memory. If it matches, it then checks the shadow for the remainder of the bytes in the type to make sure that they're all -1. If not, we call the runtime. If the exact match fails, we next check if the value is 0 (i.e. unknown). If it is, then we check the shadow for the remainder of the byes in the type (to make sure they're all 0). If they're not, we call the runtime. We then set the shadow for the access address and set the shadow for the remaining bytes in the type to -1 (i.e. marking them as interior bytes). If the type indicated by the shadow memory for the access address is neither an exact match nor 0, we call the runtime.
The instrumentation pass inserts calls to the memset intrinsic to set the memory updated by memset, memcpy, and memmove, as well as allocas/byval (and for lifetime.start/end) to reset the shadow memory to reflect that the type is now unknown. The runtime intercepts memset, memcpy, etc. to perform the same function for the library calls.
The runtime essentially repeats these checks, but uses the full TBAA algorithm, just as the compiler does, to determine when two types are permitted to alias. In a situation where access overlap has occurred and aliasing is not permitted, an error is generated.
Clang's TBAA representation currently has a problem representing unions, as demonstrated by the one XFAIL'd test. We'll update the TBAA representation to fix this, and at the same time, update the sanitizer.
As a note, this implementation does not use the compressed shadow-memory scheme discussed previously (http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That scheme would not handle the struct-path (i.e. structure offset) information that our TBAA represents. I expect we'll want to further work on compressing the shadow-memory representation, but I think it makes sense to do that as follow-up work.
(This includes build fixes for Linux from Mingjie Xu)
Based on https://reviews.llvm.org/D32197.
2 Likes
fhahn
April 19, 2024, 11:36am
11
Just rebased the patches again and addressed the current comments.
I also did a quick search of the issue tracker to find strict-alias violations that caused mis-compiles (reported by users) to check if the current implementation catches them.
Caught:
https://github.com/llvm/llvm-project/issues/86685
https://github.com/llvm/llvm-project/issues/68655
https://github.com/llvm/llvm-project/issues/62828
https://github.com/llvm/llvm-project/issues/62544
https://github.com/llvm/llvm-project/issues/47137
https://github.com/llvm/llvm-project/issues/45282
Not caught (Did not check closely yet if they may not be violations)
https://github.com/llvm/llvm-project/issues/78154
https://github.com/llvm/llvm-project/issues/51837
On the SingleSource subset of the test-suite on ARM64 macOS, there currently are the following failures. I had a quick look at some of them (SingleSource/UnitTests/Vectorizer/*
and the violations there are in libc++'s random number generator, not yet confirmed if this is a false positive)
Failed Tests (30):
test-suite :: SingleSource/Benchmarks/BenchmarkGame/fannkuch.test
test-suite :: SingleSource/Benchmarks/CoyoteBench/fftbench.test
test-suite :: SingleSource/Benchmarks/Misc-C++/Large/ray.test
test-suite :: SingleSource/Benchmarks/Misc-C++/stepanov_container.test
test-suite :: SingleSource/Benchmarks/Misc/flops.test
test-suite :: SingleSource/Benchmarks/Misc/himenobmtxpa.test
test-suite :: SingleSource/Benchmarks/Misc/richards_benchmark.test
test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists.test
test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists1.test
test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-sieve.test
test-suite :: SingleSource/Benchmarks/Shootout/Shootout-methcall.test
test-suite :: SingleSource/Regression/C++/EH/Regression-C++-class_hierarchy.test
test-suite :: SingleSource/Regression/C++/EH/Regression-C++-ctor_dtor_count.test
test-suite :: SingleSource/Regression/C++/EH/Regression-C++-exception_spec_test.test
test-suite :: SingleSource/Regression/C++/EH/Regression-C++-inlined_cleanup.test
test-suite :: SingleSource/Regression/C++/EH/Regression-C++-throw_rethrow_test.test
test-suite :: SingleSource/Regression/C/Regression-C-2003-05-21-BitfieldHandling.test
test-suite :: SingleSource/UnitTests/2003-05-02-DependentPHI.test
test-suite :: SingleSource/UnitTests/2004-11-28-GlobalBoolLayout.test
test-suite :: SingleSource/UnitTests/2006-01-23-UnionInit.test
test-suite :: SingleSource/UnitTests/2009-04-16-BitfieldInitialization.test
test-suite :: SingleSource/UnitTests/2020-01-06-coverage-008.test
test-suite :: SingleSource/UnitTests/SignlessTypes/Large/cast.test
test-suite :: SingleSource/UnitTests/Vectorizer/VPlanNativePath/outer-loop-vect.test
test-suite :: SingleSource/UnitTests/Vectorizer/dependences.test
test-suite :: SingleSource/UnitTests/Vectorizer/index-select.test
test-suite :: SingleSource/UnitTests/Vectorizer/recurrences.test
test-suite :: SingleSource/UnitTests/Vectorizer/runtime-checks.test
test-suite :: SingleSource/UnitTests/byval-alignment.test
test-suite :: SingleSource/UnitTests/initp1.test
4 Likes
I’ve tested TySan on Linux/X86 and the tests are passing. I also tested it on some historical internal issues that involved type aliasing and it caught all of them.
The two cases you say aren’t caught both involve unions. In the pull requests you mention that Clang’s TBAA representation currently has issues representing unions. Do we know if that’s still the case?
fhahn
May 2, 2024, 6:12pm
13
Great, thanks for checking!
That’s an interesting observation! Clang doesn’t generate TBAA metadata when accessing union fields, but at least in wrong code at -O2 and above on x86_64-linux-gnu (affecting clang-6.0 and above versions) · Issue #51837 · llvm/llvm-project · GitHub the problematic accesses are through short & long pointer variables, which get assigned a pointer to the union members.
Need to check if there’s anything that’s causing the missed tracking of the issue.
I’ve looked into this and Clang is generating metadata for the union access, just the wrong metadata. It defines accesses to union fields to be “omnipotent char”. All scalar types alias with omnipotent char.
While the problematic accesses are through short and long pointer variables, the first access is through a union field. This means the shadow memory says this memory space is of type omnipotent char. As short and long both alias with omnipotent char, the alias violation is not caught.
I went through the IR generated pre-TySan passes, changed the omnipotent char tags to be the correct types, and then ran the TySan passes. When running the resulting program, TySan correctly pick up the strict aliasing violations. Gist is linked below.
Old llvm issues which have strict aliasing violations, yet are not caught by the current TySan. They are put here with edited LLVM IR explaining why they are not caught. (github.com)
It would be nice if we could catch issues like this, but I think this is currently a limitation of Clang
From Clang’s side, accesses to a union fields visibly through the union type are indeed allowed to type-pun between any of the types in the union. That’s why it annotates such accesses as ‘omnipontent char’.
But, I’d expect the behavior from TySan to be:
d.c = 42;
, stored through the union: OK, and sets the memory type to omnipotent char
.
*e = 413;
: OK, and modifies the type to unsigned long
.
*i = 612;
: report an error, as it’s attempting to store type short
to memory typed with unsigned long
.
Hi @fhahn , in light of the test investigation, do you feel more work on TySan is needed for its current state to be accepted into upstream?
Hi
Was trying to test this on a project called PCSX2 (https://github.com/PCSX2/pcsx2/tree/master )
When specifying -fsanitize=type
, Clang gets stuck compiling MTGS.cpp
(MTGS.cpp )
Clang also consumes up an excessive amount of ram attempting to compile it.
The clang command is as follows;
/home/air/llvm-project/build/bin/clang++ -DCPUINFO_SUPPORTED_PLATFORM=1 -DENABLE_OPENGL -DENABLE_VULKAN -DPCSX2_DEVBUILD -DSOUNDTOUCH_FLOAT_SAMPLES -DST_NO_EXCEPTION_HANDLING -DWAYLAND_API -DX11_API -DXBYAK_NO_EXCEPTION -DZYCORE_STATIC_DEFINE -DZYDIS_STATIC_DEFINE -D_DEVEL -D_M_X86=1 -D__POSIX__ -I/home/air/pcsx2/pcsx2 -I/home/air/pcsx2/build/pcsx2 -I/home/air/pcsx2/build/common/include -I/home/air/pcsx2/3rdparty/xbyak -I/home/air/pcsx2/3rdparty/glad/include -I/home/air/pcsx2/3rdparty/vulkan-headers/include -I/home/air/pcsx2/common/../3rdparty/include -I/home/air/pcsx2/common/.. -I/home/air/pcsx2/3rdparty/fmt/fmt/include -I/home/air/pcsx2/3rdparty/rapidyaml/rapidyaml/ext/c4core/src/c4/ext/fast_float/include -I/home/air/pcsx2/3rdparty/imgui/include -I/home/air/pcsx2/3rdparty/rapidyaml/rapidyaml/src -I/home/air/pcsx2/3rdparty/rapidyaml/rapidyaml/ext/c4core/src -I/home/air/pcsx2/3rdparty/libchdr/include -I/home/air/pcsx2/3rdparty/libzip/lib -I/home/air/pcsx2/build/3rdparty/libzip -I/home/air/pcsx2/3rdparty/cpuinfo/include -I/home/air/pcsx2/3rdparty/zydis/dependencies/zycore/include -I/home/air/pcsx2/3rdparty/zydis/include -I/home/air/pcsx2/3rdparty/cubeb/include -I/home/air/pcsx2/3rdparty/rcheevos/include -I/home/air/pcsx2/3rdparty/discord-rpc/include -I/home/air/pcsx2/3rdparty/rapidjson/include -I/home/air/pcsx2/3rdparty/demangler/include -I/home/air/pcsx2/3rdparty/simpleini/include -I/home/air/pcsx2/3rdparty/freesurround/include -I/home/air/pcsx2/3rdparty/soundtouch/soundtouch -I/home/air/pcsx2/3rdparty/lzma/include -isystem /home/air/pcsx2/deps/include -isystem /usr/include/freetype2 -isystem /home/air/pcsx2/deps/include/SDL2 -DFMT_EXCEPTIONS=0 -O2 -g -DNDEBUG -march=native -pipe -fvisibility=hidden -pthread -fno-rtti -fno-exceptions -Wall -Wextra -Wno-unused-function -Wno-unused-parameter -Wno-missing-field-initializers -ffp-contract=fast -fno-strict-aliasing -fsanitize=type -Wstrict-aliasing -Wno-parentheses -Wno-missing-braces -Wno-unknown-pragmas -std=gnu++20 -Winvalid-pch -Xclang -include-pch -Xclang /home/air/pcsx2/build/pcsx2/CMakeFiles/PCSX2.dir/cmake_pch.hxx.pch -Xclang -include -Xclang /home/air/pcsx2/build/pcsx2/CMakeFiles/PCSX2.dir/cmake_pch.hxx -MD -MT pcsx2/CMakeFiles/PCSX2.dir/MTGS.cpp.o -MF pcsx2/CMakeFiles/PCSX2.dir/MTGS.cpp.o.d -o pcsx2/CMakeFiles/PCSX2.dir/MTGS.cpp.o -c /home/air/pcsx2/pcsx2/MTGS.cpp
PCSX2 is configured for build type Devel
This project is a little unusual, as you need to build some of its dependences.
You can use the dependences build script as mentioned here (build-dependencies ) to do so
For Clang, I had locally merged the LLVM, Clang & Runtime TySAN PRs (in that order) onto commit 5f243b3fffca42ed320529a54aefd86087aa85f8
Edit: Should mention I’m running on Ubuntu 22.04 LTS
fhahn
June 5, 2024, 9:59am
18
Thanks for looking into this. As long as we understand why we don’t catch those cases I don’t think that should be a blocker.
1 Like
fhahn
June 5, 2024, 10:01am
19
Thanks for trying this out! TySan generates a large amount of instrumentation code, so in some cases that can be a challenge w.r.t. to compile-time and memory usage. I don’t think we will be able to mitigate this in all circumstances, but I think there are a number of efficiency improvements we could iteration on.
1 Like
I had looked further into it
Modifying this structure in MTGS.cpp
struct BufferedData
{
u128 m_Ring[RingBufferSize];
u8 Regs[Ps2MemSize::GSregs];
u128& operator[](uint idx)
{
pxAssert(idx < RingBufferSize);
return m_Ring[idx];
}
};
to this
struct BufferedData
{
u128* m_Ring;
u8 Regs[Ps2MemSize::GSregs];
BufferedData()
{
m_Ring = new u128[RingBufferSize];
}
u128& operator[](uint idx)
{
pxAssert(idx < RingBufferSize);
return m_Ring[idx];
}
~BufferedData()
{
delete m_Ring;
}
};
Resolved the excessive memory utilisation during compile
RingBufferSize
is 524288, and with the size u128
, gives an array that is 8mb big.
It seems there is an issue dealing with (the use of) large static arrays, or large structures/classes.
Edit: Also building as Debug rather than Devel/ReleaseWithDebug Info seems to help