[libcxx] Win32 port must not use __popcnt intrinsect

Hello,

While browsing win32 support headers of the libcxx, I found these macros:

#define __builtin_popcount __popcnt
#define __builtin_popcountl __popcnt
#define __builtin_popcountll(__i) static_cast(__popcnt64(__i))

While using _BitScanReverse and other built-in is fine, using __popcnt is not safe at all. Whatever the cpu target is, the compiler will generate a __popcnt instruction, even if it does not support it.

From MSDN documentation:

To determine hardware support for the popcnt instruction, call the __cpuid intrinsic with InfoType=0x00000001 and check bit 23 ofCPUInfo[2] (ECX). This bit is 1 if the instruction is supported, and 0 otherwise. If you run code that uses this intrinsic on hardware that does not support the popcnt instruction, the results are unpredictable.

Where “unpredictable” generally means raising an “invalid instruction” exception.

This macros should be either removed if there are not used, or replaced by a safe implementation.

– Jean-Daniel

Hello,
While browsing win32 support headers of the libcxx, I found these macros:
#define __builtin_popcount __popcnt
#define __builtin_popcountl __popcnt
#define __builtin_popcountll(__i) static_cast<int>(__popcnt64(__i))
While using _BitScanReverse and other built-in is fine, using __popcnt is
not safe at all. Whatever the cpu target is, the compiler will generate a
__popcnt instruction, even if it does not support it.
From MSDN documentation:
To determine hardware support for the popcnt instruction, call
the __cpuid intrinsic with InfoType=0x00000001 and check bit 23 ofCPUInfo[2]
(ECX). This bit is 1 if the instruction is supported, and 0 otherwise. If
you run code that uses this intrinsic on hardware that does not support
the popcnt instruction, the results are unpredictable.
Where "unpredictable" generally means raising an "invalid instruction"
exception.
This macros should be either removed if there are not used, or replaced by a
safe implementation.

That's my fault. After further investigation, you are completely
correct: I thought I found that all x86 processors supported these
instructions, but that was only valid for the _BitScan* functions). I
don't know how to fix this (I'm just a humble hobbyist), so patches
are welcome.

The problem is only theoretical though, as libc++ is unusable with
current (and near-future) versions of MSVC, as their C++11 support is
not good enough for the rest of the library as far as I can tell.
There's problems with decltype and overload resolution (which is
really to compiler-level to work around). The intrinsics in question
are only used for MSVC/non-Clang/non-GCC compiles.

I'm working (or rather, as working) on using a MSVC-based Clang to
further enhance libc++ compatibility with Visual Studio headers, but
haven't gotten to useful parts yet. Help is welcome!

Thanks for the interest,

Ruben

Hello,
While browsing win32 support headers of the libcxx, I found these macros:
#define __builtin_popcount __popcnt
#define __builtin_popcountl __popcnt
#define __builtin_popcountll(__i) static_cast<int>(__popcnt64(__i))
While using _BitScanReverse and other built-in is fine, using __popcnt is
not safe at all. Whatever the cpu target is, the compiler will generate a
__popcnt instruction, even if it does not support it.
From MSDN documentation:
To determine hardware support for the popcnt instruction, call
the __cpuid intrinsic with InfoType=0x00000001 and check bit 23 ofCPUInfo[2]
(ECX). This bit is 1 if the instruction is supported, and 0 otherwise. If
you run code that uses this intrinsic on hardware that does not support
the popcnt instruction, the results are unpredictable.
Where "unpredictable" generally means raising an "invalid instruction"
exception.
This macros should be either removed if there are not used, or replaced by a
safe implementation.

That's my fault. After further investigation, you are completely
correct: I thought I found that all x86 processors supported these
instructions, but that was only valid for the _BitScan* functions). I
don't know how to fix this (I'm just a humble hobbyist), so patches
are welcome.

The problem is only theoretical though, as libc++ is unusable with
current (and near-future) versions of MSVC, as their C++11 support is
not good enough for the rest of the library as far as I can tell.
There's problems with decltype and overload resolution (which is
really to compiler-level to work around). The intrinsics in question
are only used for MSVC/non-Clang/non-GCC compiles.

You're right. I recently read about the new feature of VS 2011, and it look like we may have to wait for a long time before it append.
I just sketched a patch with a standard implementation of popcnt, but as I didn't try to compile it.
But as this code cannot really be tested until the MS compiler is able to compile libc++, may commenting out these macros is enough (with a comment explaining why).

I'm working (or rather, as working) on using a MSVC-based Clang to
further enhance libc++ compatibility with Visual Studio headers, but
haven't gotten to useful parts yet. Help is welcome!

As many people, I really lack of spare time to do all the things I'd like, and hacking on clang and libc++ on windows is part of this things.
So if I find time to help, be sure I will let you know :wink:

Thanks for the interest,

Ruben

-- Jean-Daniel

popcnt.patch (2.1 KB)

There are several portable implementations of popcount on http://graphics.stanford.edu/~seander/bithacks.html (under Counting bits set), with various speed/space tradeoffs.

Committed revision 145684.

Sorry it took me so long to review this.

Howard