Hi all,
What’s the default stack size for OpenMP threads on x86 and ppc64 in the OpenMP runtime?
I have a Fortran program which allocates more than 1 MB on the stack for each thread. On x86 it runs correctly, but on ppc64 it segfaults.
I am seeing that each thread on x86 has a stack size of 4096 KB, while on ppc64 it is 1024 KB.
If I increase the stack size with OMP_STACKSIZE, the program also runs correctly on Power, but it fails if I only set “ulimit -s unlimited”, which makes sense, since ulimit should only change the stack size for the main thread and not for the other threads.
Is this the expected behavior, and is setting OMP_STACKSIZE the correct solution?
Thanks.
Simone
I do not know what the default OpenMP stack size is on PPC, but the rest of your logic seems sound, and setting OMP_STACKSIZE seems entirely reasonable.
The standard says that the default stack size is “implementation defined”, so it is entirely reasonable that different implementations have different default values.
It looks as though the LLVM OpenMP runtime has two different stack sizes for x86 (4096 KB) and Power (1024 KB), so I thought that this is not implementation related but maybe architecture dependent?
Anyway, it makes sense to set OMP_STACKSIZE.
Thanks.
Simone
I cannot answer the question of why the PPC port chose a different stack size.
But that doesn’t affect the solution, which, as you suggested, is to force the issue with OMP_STACKSIZE.
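Since the sizes in this thread are quoted in KB, here is a minimal sketch of how an OMP_STACKSIZE value maps to bytes. It assumes the format described in the OpenMP specification: a number with an optional B, K, M or G suffix, where a bare number is taken as kilobytes. The function name parse_omp_stacksize is hypothetical and is not part of any OpenMP runtime; a real runtime may accept or reject inputs differently.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not a runtime API): convert an OMP_STACKSIZE-style
 * string to bytes. Returns 0 if the string is malformed or overflows. */
size_t parse_omp_stacksize(const char *s)
{
    if (s == NULL)
        return 0;
    while (isspace((unsigned char)*s))
        s++;
    if (!isdigit((unsigned char)*s))
        return 0;

    errno = 0;
    char *end;
    unsigned long long n = strtoull(s, &end, 10);
    if (errno != 0)
        return 0;

    /* Allow white space between the number and the unit letter. */
    while (isspace((unsigned char)*end))
        end++;

    size_t unit = 1024; /* assumption from the spec: no suffix means KB */
    switch (toupper((unsigned char)*end)) {
    case 'B': unit = 1; end++; break;
    case 'K': unit = (size_t)1 << 10; end++; break;
    case 'M': unit = (size_t)1 << 20; end++; break;
    case 'G': unit = (size_t)1 << 30; end++; break;
    case '\0': break;
    default: return 0;
    }

    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0')
        return 0;
    if (n > SIZE_MAX / unit)
        return 0; /* result would not fit in size_t */
    return (size_t)n * unit;
}
```

Under these assumptions, OMP_STACKSIZE=4096 and OMP_STACKSIZE=4M both request 4 MiB per thread, which matches the x86_64 default mentioned above.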
1M is the default stack size for everything except x86. x86 and x86_64 have larger stack size defaults, presumably because somebody at Intel decided that was better for some reason.
It would certainly improve the portability of the LLVM OpenMP runtime if the non-x86 default was 4M, at least on 64-bit systems.
From kmp.h:
#if KMP_ARCH_X86
#define KMP_DEFAULT_STKSIZE ((size_t)(2 * 1024 * 1024))
#elif KMP_ARCH_X86_64
#define KMP_DEFAULT_STKSIZE ((size_t)(4 * 1024 * 1024))
#define KMP_BACKUP_STKSIZE ((size_t)(2 * 1024 * 1024))
#else
#define KMP_DEFAULT_STKSIZE ((size_t)(1024 * 1024))
#endif
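To make the effect of that selection concrete, here is a rough standalone sketch of the same logic. It is not runtime code: the compiler-predefined macros (__i386__, __x86_64__, _M_IX86, _M_X64) are stand-ins I am assuming for KMP_ARCH_X86 and KMP_ARCH_X86_64, and both function names are made up for illustration.

```c
#include <stddef.h>

/* Illustrative only: mirrors the kmp.h values quoted above, using
 * compiler-predefined macros as assumed stand-ins for KMP_ARCH_*. */
size_t default_omp_stacksize(void)
{
#if defined(__i386__) || defined(_M_IX86)
    return (size_t)2 * 1024 * 1024;   /* 32-bit x86: 2 MiB */
#elif defined(__x86_64__) || defined(_M_X64)
    return (size_t)4 * 1024 * 1024;   /* x86_64: 4 MiB */
#else
    return (size_t)1024 * 1024;       /* everything else, e.g. ppc64: 1 MiB */
#endif
}

/* Hypothetical check: returns 1 if a per-thread stack demand of
 * bytes_needed is strictly below the default, 0 otherwise. Real code
 * needs extra headroom for call frames and runtime bookkeeping. */
int fits_default_stack(size_t bytes_needed)
{
    return bytes_needed < default_omp_stacksize();
}
```

With these values, a thread that puts slightly more than 1 MB of automatic arrays on its stack fits on x86_64 but not on ppc64, which is consistent with the segfault reported at the top of the thread.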
If I were going to change it, I would group all the 32-bit and all the 64-bit architectures to be the same, at least until somebody proves that one of them deserves specialization. I don’t know what KMP_BACKUP_STKSIZE does, but if it is x86_64-specific, then one can uncomment the relevant lines below.
#if KMP_ARCH_X86 || KMP_ARCH_ARM || KMP_ARCH_MIPS
#define KMP_DEFAULT_STKSIZE ((size_t)(2 * 1024 * 1024))
#elif KMP_ARCH_X86_64 || KMP_ARCH_AARCH64 || KMP_ARCH_PPC64 || KMP_ARCH_MIPS64
#define KMP_DEFAULT_STKSIZE ((size_t)(4 * 1024 * 1024))
//#if KMP_ARCH_X86_64
#define KMP_BACKUP_STKSIZE ((size_t)(2 * 1024 * 1024))
//#endif
#else
#define KMP_DEFAULT_STKSIZE ((size_t)(4 * 1024 * 1024))
#endif
Jeff