Linux/ARM: Segfault when clang builds sources containing __thread variables with the -O2 flag

A few days ago, I changed the optimization flag from -O0 to -O2 to speed up an application on 32-bit Ubuntu 14.04 for ARM.
With the source code compiled at -O2 instead of -O0, the application no longer runs normally: it always crashes with a segmentation fault.

Here is the debugging information from GDB for that case. As you can see, we could not even run a simple "hello!!" console application
because of a bug in clang/LLVM that appears when a "static __thread" variable is compiled with the -O2 flag.

There are currently four thread-local storage (TLS) models in Clang/LLVM:

  1. global-dynamic TLS model
  2. local-dynamic TLS model
  3. local-exec TLS model
  4. initial-exec TLS model
  (and possibly emulated TLS, for the Android software platform?)

Although we can build and run normally with static relocation by selecting the initial-exec TLS model instead of the default global-dynamic TLS model, we need to run the application code with the global-dynamic (or local-dynamic) TLS model, because the code has to keep working when it is loaded through the dlopen(3) library call.
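For reference, here is a minimal sketch of how the TLS model can be pinned for a single variable with the tls_model attribute that clang and GCC accept. The variable and function names are hypothetical and are not taken from CoreCLR.

```cpp
// Hypothetical per-thread counters, for illustration only.
// No attribute: the compiler picks the model (typically global-dynamic for -fPIC code).
static __thread int counter_default = 0;

// Pinned to initial-exec: cheaper access, but the containing shared library
// may then fail to load through dlopen(3).
static __thread int counter_initial_exec __attribute__((tls_model("initial-exec"))) = 0;

// Each call increments the calling thread's own copy and returns the new value.
int bump_default() { return ++counter_default; }
int bump_initial_exec() { return ++counter_initial_exec; }
```

The same choice can be made for a whole translation unit with the -ftls-model=initial-exec compiler option.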

If anyone has already found a suitable solution for clang/LLVM applications that 1) use __thread variables and 2) are built with -O2/-O3, could you share it with us?

To share more information, below are the machine instructions generated for the "PAL_ThrowExceptionFromContext" function, which uses a __thread variable, from CoreCLR's seh.cpp source. I wonder why Clang/LLVM cannot guarantee that application code containing __thread variables runs correctly at aggressive optimization levels (e.g., -O2 and -O3).

========= source: __thread variable in the src/pal/src/exception/seh.cpp ===============================

VOID
PALAPI
PAL_ThrowExceptionFromContext(CONTEXT* context, PAL_SEHException* ex)
{
    // We need to make a copy of the exception off stack, since the "ex" is located in one of the stack
    // frames that will become obsolete by the ThrowExceptionFromContextInternal and the ThrowExceptionHelper
    // could overwrite the "ex" object by stack e.g. when allocating the low level exception object for "throw".
    static __thread BYTE threadLocalExceptionStorage[sizeof(PAL_SEHException)];
    ThrowExceptionFromContextInternal(context, new (threadLocalExceptionStorage) PAL_SEHException(*ex));
}
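
A standalone sketch of the same pattern (a placement-new copy into a static __thread byte buffer) might help to isolate the problem. I have not confirmed that it reproduces the crash. FakeException and copy_to_thread_local are hypothetical stand-ins for PAL_SEHException and PAL_ThrowExceptionFromContext, and the alignas specifier is an addition that the original buffer does not have.

```cpp
#include <new>

// Hypothetical stand-in for PAL_SEHException; the real type has a different layout.
struct FakeException {
    int code;
    unsigned char payload[508];
};

// Copies *ex into per-thread static storage with placement new,
// mirroring the pattern used in PAL_ThrowExceptionFromContext.
FakeException* copy_to_thread_local(const FakeException* ex)
{
    // alignas is added for this sketch; the original buffer is a plain BYTE array.
    alignas(FakeException) static __thread unsigned char storage[sizeof(FakeException)];
    return new (storage) FakeException(*ex);
}
```

Building this as a shared library (-fPIC -shared) at -O1 and at -O2 and comparing the objdump -d output would show whether the TLS access sequence differs between the two levels.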

======== debug: -O0 ← Build Ok, Running Success =============================================

0085bc14 <PAL_ThrowExceptionFromContext>:
  85bc14: b580       push {r7, lr}
  85bc16: 466f       mov r7, sp
  85bc18: b08a       sub sp, #40 ; 0x28
  85bc1a: 460a       mov r2, r1
  85bc1c: 4603       mov r3, r0
  85bc1e: 9009       str r0, [sp, #36] ; 0x24
  85bc20: 9108       str r1, [sp, #32]
  85bc22: 2000       movs r0, #0
  85bc24: 4601       mov r1, r0
  85bc26: f8dd c024  ldr.w ip, [sp, #36] ; 0x24
  85bc2a: 2800       cmp r0, #0
  85bc2c: 9207       str r2, [sp, #28]
  85bc2e: 9306       str r3, [sp, #24]
  85bc30: f8cd c014  str.w ip, [sp, #20]
  85bc34: 9104       str r1, [sp, #16]
  85bc36: d10d       bne.n 85bc54 <PAL_ThrowExceptionFromContext+0x40>
  85bc38: e7ff       b.n 85bc3a <PAL_ThrowExceptionFromContext+0x26>
  85bc3a: 9908       ldr r1, [sp, #32]
  85bc3c: 480a       ldr r0, [pc, #40] ; (85bc68 <PAL_ThrowExceptionFromContext+0x54>)
  85bc3e: 4478       add r0, pc
  85bc40: 9103       str r1, [sp, #12]
  85bc42: f7db cf96  blx 37b70 <_init+0xb34>
  85bc46: 9002       str r0, [sp, #8]
  85bc48: 9903       ldr r1, [sp, #12]
  85bc4a: f66d f1ef  bl 2c902c <_ZN16PAL_SEHExceptionC2ERKS_>
  85bc4e: 9802       ldr r0, [sp, #8]
  85bc50: 9004       str r0, [sp, #16]
  85bc52: e7ff       b.n 85bc54 <PAL_ThrowExceptionFromContext+0x40>
  85bc54: 9804       ldr r0, [sp, #16]
  85bc56: 9905       ldr r1, [sp, #20]
  85bc58: 9001       str r0, [sp, #4]
  85bc5a: 4608       mov r0, r1
  85bc5c: 9901       ldr r1, [sp, #4]
  85bc5e: f0aa fa47  bl 9060f0
  85bc62: b00a       add sp, #40 ; 0x28
  85bc64: bd80       pop {r7, pc}
  85bc66: bf00       nop
  85bc68: 00293806   .word 0x00293806

========== debug: -O1 ← Build Ok, Running Success ====================================

005103f0 <PAL_ThrowExceptionFromContext>:
  5103f0: e92d 48f0  stmdb sp!, {r4, r5, r6, r7, fp, lr}
  5103f4: af03       add r7, sp, #12
  5103f6: 4605       mov r5, r0
  5103f8: 4807       ldr r0, [pc, #28] ; (510418 <PAL_ThrowExceptionFromContext+0x28>)
  5103fa: 460c       mov r4, r1
  5103fc: 4478       add r0, pc
  5103fe: f727 e06e  blx 374dc <_init+0xba0>
  510402: 4606       mov r6, r0
  510404: 4621       mov r1, r4
  510406: f49f f8cf  bl 1af5a8 <_ZN16PAL_SEHExceptionC2ERKS_>
  51040a: 4628       mov r0, r5
  51040c: 4631       mov r1, r6
  51040e: e8bd 48f0  ldmia.w sp!, {r4, r5, r6, r7, fp, lr}
  510412: f073 bb87  b.w 583b24
  510416: bf00       nop
  510418: 00270038   .word 0x00270038

========== debug: -O2 ← Build Ok, Running Failed ======================================

006ba338 <PAL_ThrowExceptionFromContext>:
  6ba338: e92d 41f0  stmdb sp!, {r4, r5, r6, r7, r8, lr}
  6ba33c: af03       add r7, sp, #12
  6ba33e: 4680       mov r8, r0
  6ba340: 4812       ldr r0, [pc, #72] ; (6ba38c <PAL_ThrowExceptionFromContext+0x54>)
  6ba342: 460c       mov r4, r1
  6ba344: 4478       add r0, pc
  6ba346: f577 e13a  blx 315bc <_init+0xca4>
  6ba34a: 4606       mov r6, r0
  6ba34c: f04f 31ff  mov.w r1, #4294967295 ; 0xffffffff
  6ba350: 6031       str r1, [r6, #0]
  6ba352: f106 000c  add.w r0, r6, #12
  6ba356: f104 010c  add.w r1, r4, #12
  6ba35a: 2250       movs r2, #80 ; 0x50
  6ba35c: f106 0560  add.w r5, r6, #96 ; 0x60
  6ba360: 6070       str r0, [r6, #4]
  6ba362: 60b5       str r5, [r6, #8]
  6ba364: f576 e7ea  blx 3133c <_init+0xa24>
  6ba368: f104 0160  add.w r1, r4, #96 ; 0x60
  6ba36c: 4628       mov r0, r5
  6ba36e: f44f 72d0  mov.w r2, #416 ; 0x1a0
  6ba372: f576 e7e4  blx 3133c <_init+0xa24>
  6ba376: f8d4 0200  ldr.w r0, [r4, #512] ; 0x200
  6ba37a: 4631       mov r1, r6
  6ba37c: f8c6 0200  str.w r0, [r6, #512] ; 0x200
  6ba380: 4640       mov r0, r8
  6ba382: e8bd 41f0  ldmia.w sp!, {r4, r5, r6, r7, r8, lr}
  6ba386: f080 bd45  b.w 73ae14
  6ba38a: bf00       nop
  6ba38c: 002820e4   .word 0x002820e4

========== debug: -O3 ← Build Ok, Running Failed =====================================

006f3d54 <PAL_ThrowExceptionFromContext>:
  6f3d54: e92d 41f0  stmdb sp!, {r4, r5, r6, r7, r8, lr}
  6f3d58: af03       add r7, sp, #12
  6f3d5a: 4680       mov r8, r0
  6f3d5c: 4812       ldr r0, [pc, #72] ; (6f3da8 <PAL_ThrowExceptionFromContext+0x54>)
  6f3d5e: 460c       mov r4, r1
  6f3d60: 4478       add r0, pc
  6f3d62: f53d e410  blx 31584 <_init+0xca4>
  6f3d66: 4606       mov r6, r0
  6f3d68: f04f 31ff  mov.w r1, #4294967295 ; 0xffffffff
  6f3d6c: 6031       str r1, [r6, #0]
  6f3d6e: f106 000c  add.w r0, r6, #12
  6f3d72: f104 010c  add.w r1, r4, #12
  6f3d76: 2250       movs r2, #80 ; 0x50
  6f3d78: f106 0560  add.w r5, r6, #96 ; 0x60
  6f3d7c: 6070       str r0, [r6, #4]
  6f3d7e: 60b5       str r5, [r6, #8]
  6f3d80: f53d e2c0  blx 31304 <_init+0xa24>
  6f3d84: f104 0160  add.w r1, r4, #96 ; 0x60
  6f3d88: 4628       mov r0, r5
  6f3d8a: f44f 72d0  mov.w r2, #416 ; 0x1a0
  6f3d8e: f53d e2ba  blx 31304 <_init+0xa24>
  6f3d92: f8d4 0200  ldr.w r0, [r4, #512] ; 0x200
  6f3d96: 4631       mov r1, r6
  6f3d98: f8c6 0200  str.w r0, [r6, #512] ; 0x200
  6f3d9c: 4640       mov r0, r8
  6f3d9e: e8bd 41f0  ldmia.w sp!, {r4, r5, r6, r7, r8, lr}
  6f3da2: f081 bee1  b.w 775b68
  6f3da6: bf00       nop
  6f3da8: 002906c8   .word 0x002906c8

End of line.

Hi Geusnik,

Ah, I see you meant you had the whole corerun VM trying to execute a
trivial example. I'm afraid we're going to need more information to
help. Ideally a small C test-case that has the problem, though it
might be interesting to compare the libcoreclr.so binaries in the -O1
and -O2 cases if you can upload or attach them somewhere.

Tim.