[newbie] trouble with global variables and CreateLoad/Store in JIT

Nikodemus_Siivola · June 4, 2017, 8:39pm

Emitting calls to these functions (written in an .ll file linked in) works fine, and does the right thing.

%Any = type { i8*, i32 }

define dllexport void @setGlobal(%Any* %ptr, %Any %value) {
store %Any %value, %Any* %ptr
ret void
}

define dllexport %Any @getGlobal(%Any* %ptr) {
%val = load %Any, %Any* %ptr
ret %Any %val
}

Trying to replace the setGlobal call with what should be equivalent

builder.CreateStore(value, ptr)

results in what should end up in the second (i32) slot being stored in the first (i8*).

I’ve added ::dump() calls where the CreateStore is, and this is what I get:

{ i8*, i32 } { i8* @FixnumClass, i32 32 } ; for value
@foo = external global { i8*, i32 } ; for ptr

Even more bizarrely, trying to replace the getGlobal call with

builder.CreateLoad(val)

results in what has been stored in the first (i8*) slot being loaded correctly, but the second (i32) getting garbage out despite the correct value being stored in memory. Dump call there reports the @foo pointer identically.

This is using LLVM 4.0.0

Just so I’m not leaving anything out, what follows are IR dumps of the sample functions using either the direct store / load or the setGlobal / getGlobal functions. As far as I can tell they should do exactly the same thing…

; Does NOT do the right thing

define { i8*, i32 } @"__anonToplevel/2"() {
entry:
%.unpack = load i8*, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
%0 = insertvalue { i8*, i32 } undef, i8* %.unpack, 0
%.unpack1 = load i32, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
%1 = insertvalue { i8*, i32 } %0, i32 %.unpack1, 1
ret { i8*, i32 } %1
}

; Does NOT do the right thing

define { i8*, i32 } @"__anonToplevel/0"() {
entry:
store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

; DOES the right thing

define { i8*, i32 } @"__anonToplevel/0"() {
entry:
call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8* @FixnumClass, i32 123 })
ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

; DOES the right thing

define { i8*, i32 } @"__anonToplevel/1"() {
entry:
%0 = call { i8*, i32 } @getGlobal({ i8*, i32 }* nonnull @foo)
ret { i8*, i32 } %0
}

I’m at my wit’s end. Any hints as to what I might be messing up would be much appreciated. I expect it is something ridiculously obvious…

Cheers,

– nikodemus

_sean_silva · June 5, 2017, 2:02am

This is a bit mystifying. Can you also show the assembly? What offsets are actually used for the stores in the “bad” versions? In other words, try verifying that the offsets that the getelementptr’s should generate match your expectations (and if they deviate, in what ways they deviate).

– Sean Silva

Nikodemus_Siivola · June 5, 2017, 7:57pm

Since the getelementptrs were implicitly generated by the CreateStore/Load I’m not sure how to get access to them.

So I hacked the assignment to be done thrice: once using a manual decomposition into two GEPs and stores, once using the “big” CreateStore, once via the setGlobal function, printing addresses and memory contents at each point to the degree that I have access to them.

It seems the following GEPs compute the same address?! I can buy myself not understanding how GEP works and doing it wrong, but builder.CreateStore() creates what look like identical GEPs implicitly…

i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
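
For reference, the offsets these two constant GEPs are expected to produce can be worked out by hand. The following is a minimal C++ sketch of the usual layout rule for a non-packed struct; `FieldLayout`, `fieldOffset` and `anyLayout32` are names made up for this illustration (they are not LLVM APIs), and the 4-byte pointer size is an assumption based on the 32-bit target and the `align 4` accesses in the dumps.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one struct member: size and alignment in bytes.
struct FieldLayout {
    std::size_t size;
    std::size_t align;
};

// Byte offset of member `index` in a non-packed struct, i.e. the amount that
// `getelementptr T, T* @g, i32 0, i32 index` should add to the base address.
std::size_t fieldOffset(const std::vector<FieldLayout>& fields, std::size_t index) {
    assert(index < fields.size() && "field index out of range");
    std::size_t offset = 0;
    for (std::size_t i = 0; i <= index; ++i) {
        // Round the running offset up to this member's alignment.
        offset = (offset + fields[i].align - 1) / fields[i].align * fields[i].align;
        if (i == index) break;
        offset += fields[i].size;
    }
    return offset;
}

// { i8*, i32 } under the assumption of 4-byte, 4-aligned pointers.
std::vector<FieldLayout> anyLayout32() {
    return {{4, 4}, {4, 4}};
}
```

Under those assumptions `fieldOffset(anyLayout32(), 0)` is 0 and `fieldOffset(anyLayout32(), 1)` is 4, so the two GEPs should never yield the same address.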

The details.

This is the relevant part from my codegen:

auto ty = val->getType();
cout << "val type:" << endl;
ty->dump();
cout << "ptr type:" << endl;
ptr->getType()->dump();
// Print memory
ctx.EmitCall1("debugPointer", ptr);
// Set class pointer
auto c = ctx.bld.CreateExtractValue(val, 0, "class");
auto cp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 0);
auto cx = ctx.bld.CreatePtrToInt(cp, ctx.Int32Type());
ctx.EmitCall1("debugInt", cx);
ctx.bld.CreateStore(c, cp);
// Set datum
auto d = ctx.bld.CreateExtractValue(val, 1, "datum");
auto dp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 1);
auto dx = ctx.bld.CreatePtrToInt(dp, ctx.Int32Type());
ctx.EmitCall1("debugInt", dx);
ctx.bld.CreateStore(d, dp);
// Print memory
ctx.EmitCall1("debugPointer", ptr);
// Do the same with a single store
ctx.bld.CreateStore(val, ptr);
// Print memory
ctx.EmitCall1("debugPointer", ptr);
// Call out
ctx.EmitCall2("setGlobal", ptr, val);
// Print memory
ctx.EmitCall1("debugPointer", ptr);

Here is the compile-time output showing types of the value and the pointer:

val type:
{ i8*, i32 }
ptr type:
{ i8*, i32 }*

Here is the IR dump for the function (after a couple of passes), right before it’s fed to the JIT:

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
%0 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
%1 = call { i8*, i32 } @debugInt(i32 ptrtoint ({ i8*, i32 }* @foo to i32))
store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
%2 = call { i8*, i32 } @debugInt(i32 ptrtoint (i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1) to i32))
store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
%3 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
%4 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8* @FixnumClass, i32 123 })
%5 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Here is the runtime output from calling the JITed function, including memory addresses and contents, with my annotations:

Before

p = 03C10000
class: 00000000
datum: 00000000

Should be address of the class slot → correct

x = 03C10000

Should be address of the datum slot, i.e. address of class slot + 4 → incorrect

x = 03C10000

Yeah, both values went to the class slot, so the actual class pointer got clobbered

p = 03C10000
class: 0000007B
datum: 00000000

Same result from the single CreateStore

p = 03C10000
class: 0000007B
datum: 00000000

Calling out to setGlobal as in my first email works

p = 03C10000
class: 039D2E98
datum: 0000007B

Finally, I didn’t manage nice disassembly yet, so here is the last output from --print-after-all for the function. The bizarre thing is that even this looks correct: the debugInt is called first with @foo, then @foo+4, and the stores seem to be going to the right addresses as well: @foo and @foo+4!

BB#0: derived from LLVM BB %entry
PUSHi32 ga:foo, %ESP, %ESP
CFI_INSTRUCTION
CALLpcrel32 ga:debugPointer, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP, %ESP, %EAX<imp-def,dead>, %EDX<imp-def,dead>
%ESP<def,tied1> = ADD32ri8 %ESP, 4, %EFLAGS<imp-def,dead>
CFI_INSTRUCTION
PUSHi32 ga:foo, %ESP, %ESP
CFI_INSTRUCTION
CALLpcrel32 ga:debugInt, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP, %ESP, %EAX<imp-def,dead>, %EDX<imp-def,dead>
%ESP<def,tied1> = ADD32ri8 %ESP, 4, %EFLAGS<imp-def,dead>
CFI_INSTRUCTION
MOV32mi %noreg, 1, %noreg, ga:foo, %noreg, ga:JazzFixnumClass; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
PUSHi32 ga:foo+4, %ESP, %ESP
CFI_INSTRUCTION
CALLpcrel32 ga:debugInt, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP, %ESP, %EAX<imp-def,dead>, %EDX<imp-def,dead>
%ESP<def,tied1> = ADD32ri8 %ESP, 4, %EFLAGS<imp-def,dead>
CFI_INSTRUCTION
MOV32mi %noreg, 1, %noreg, ga:foo+4, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
PUSHi32 ga:foo, %ESP, %ESP
CFI_INSTRUCTION
CALLpcrel32 ga:debugPointer, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP, %ESP, %EAX<imp-def,dead>, %EDX<imp-def,dead>
%ESP<def,tied1> = ADD32ri8 %ESP, 4, %EFLAGS<imp-def,dead>
CFI_INSTRUCTION
MOV32mi %noreg, 1, %noreg, ga:foo, %noreg, ga:JazzFixnumClass; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
MOV32mi %noreg, 1, %noreg, ga:foo+4, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
PUSHi32 ga:foo, %ESP, %ESP
CFI_INSTRUCTION
CALLpcrel32 ga:debugPointer, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP, %ESP, %EAX<imp-def,dead>, %EDX<imp-def,dead>
%ESP<def,tied1> = ADD32ri8 %ESP, 4, %EFLAGS<imp-def,dead>
CFI_INSTRUCTION
PUSH32i8 123, %ESP, %ESP
CFI_INSTRUCTION
PUSHi32 ga:JazzFixnumClass, %ESP, %ESP
CFI_INSTRUCTION
PUSHi32 ga:foo, %ESP, %ESP
CFI_INSTRUCTION
CALLpcrel32 ga:setGlobal, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP, %ESP
%ESP<def,tied1> = ADD32ri8 %ESP, 12, %EFLAGS<imp-def,dead>
CFI_INSTRUCTION
PUSHi32 ga:foo, %ESP, %ESP
CFI_INSTRUCTION
CALLpcrel32 ga:debugPointer, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP, %ESP, %EAX<imp-def,dead>, %EDX<imp-def,dead>
%ESP<def,tied1> = ADD32ri8 %ESP, 4, %EFLAGS<imp-def,dead>
CFI_INSTRUCTION
%EAX = MOV32ri ga:JazzFixnumClass
%EDX = MOV32ri 123
RETL %EAX, %EDX

Also, I have essentially identical code working perfectly fine when the memory being written to is from @alloca.

I am completely clueless. Any suggestions most welcome.

Cheers,

– nikodemus

Nikodemus_Siivola · June 5, 2017, 8:34pm

Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it through an opaque identity function … then everything works fine.

Is this a bug in LLVM or is there some magic involving globals I’m misunderstanding?

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
%0 = call { i8*, i32 }* @identity({ i8*, i32 }* nonnull @foo)
%1 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
%2 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 0
%3 = ptrtoint { i8*, i32 }* %0 to i32
%4 = call { i8*, i32 } @debugInt(i32 %3)
store i8* @FixnumClass, i8** %2, align 4
%5 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 1
%6 = ptrtoint i32* %5 to i32
%7 = call { i8*, i32 } @debugInt(i32 %6)
store i32 123, i32* %5, align 4
%8 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
store i8* @FixnumClass, i8** %2, align 4
store i32 123, i32* %5, align 4
%9 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
call void @setGlobal({ i8*, i32 }* %0, { i8*, i32 } { i8* @FixnumClass, i32 123 })
%10 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Output, now with correct addresses out of the GEPs, and memory being modified as expected:

p = 02F80000
class: 00000000
datum: 00000000
x = 02F80000
x = 02F80004
p = 02F80000
class: 028D3E98
datum: 0000007B
p = 02F80000
class: 028D3E98
datum: 0000007B
p = 02F80000
class: 028D3E98
datum: 0000007B
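
A note on why hiding the pointer changes the generated code: in the IR above the GEPs are ordinary instructions on `%0`, so the `+4` is computed at run time, whereas with `@foo` visible the address is folded into a constant expression that the loader has to resolve. A rough C++ analogue of the workaround is sketched below. The names `Any`, `identity`, `opaqueIdentity` and `setFooThroughOpaquePointer` are illustrative only, and the struct uses two 4-byte slots to model the 32-bit layout rather than a real pointer.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the { i8*, i32 } global; both slots are 4 bytes wide so the
// layout models the 32-bit target regardless of the host.
struct Any {
    uint32_t cls;
    uint32_t datum;
};

Any foo = {0, 0};

Any* identity(Any* p) { return p; }

// Calling through a volatile function pointer keeps the compiler from
// proving that the result is simply &foo.
Any* (*volatile opaqueIdentity)(Any*) = identity;

void setFooThroughOpaquePointer(uint32_t cls, uint32_t datum) {
    Any* p = opaqueIdentity(&foo);
    assert(p != nullptr);
    p->cls = cls;      // store to p + 0
    p->datum = datum;  // store to p + 4, address computed at run time
}
```

This only illustrates the shape of the workaround; it does not explain why the constant-address form misbehaves.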

Cheers,

– nikodemus

_sean_silva · June 6, 2017, 12:18am

> Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it
> through an opaque identity function ... then everything works fine.
>
> Is this a bug in LLVM or is there some magic involving globals I'm
> misunderstanding?

This looks like a bug in the handling of constant GEPs. Specifically the
`getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)`
used to calculate the address of the integer inside the struct. Your
observation "The bizarre thing is that even this looks correct: the
debugInt is called first with @foo, then @foo+4, and the stores seem to be
going to the right addresses as well: @foo and @foo+4!" at the level of the
MachineInstr dump rules out problems before that.

After MachineInstr comes MC to emit the object file, but `foo+4` is one of
the most basic relocation types, so I doubt that there's a bug in the
lowering there or else "everything" would be broken.
Just to verify though, checking assembly of a small example across 32-bit
targets of all 3 object file formats looks fine at a glance (MC is getting
the +4 addend, though you would need to run `llvm-objdump -d -r` to see the
actual relocation in the binary).

Beyond MC, you already have your static object file. If that is fine, then
in a JIT context you might be running into issues with RuntimeDyld. The
actual GEPs that clang generates are identical to the ones in your code,
further suggesting that this is JIT specific and that static links are
unaffected (if you could verify that, it would help to narrow down the
possibilities).
Maybe look at the output of `llvm-objdump -d -r` on a static .o file
generated from your IR and see where the relocation is handled
in lib/ExecutionEngine/RuntimeDyld (this will depend on your platform;
grepping for the name of the relocation shown by llvm-objdump should find
the right code to look at).
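
To make the `foo+4` relocation concrete, here is a small C++ model of how a 32-bit absolute fixup with an in-place addend is applied: COFF relocation records carry no addend field, so the `+4` lives in the instruction bytes and the loader adds the symbol address to it. This is a sketch for reasoning only and is not the code in RuntimeDyld; `applyDir32` and `applyDir32IgnoringAddend` are hypothetical names, and the second function shows one possible way a loader could go wrong, which has not been verified as the cause here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a 32-bit little-endian value from `bytes` at `offset`.
uint32_t readLE32(const std::vector<uint8_t>& bytes, std::size_t offset) {
    assert(offset + 4 <= bytes.size());
    return uint32_t(bytes[offset]) | uint32_t(bytes[offset + 1]) << 8 |
           uint32_t(bytes[offset + 2]) << 16 | uint32_t(bytes[offset + 3]) << 24;
}

// Write a 32-bit little-endian value into `bytes` at `offset`.
void writeLE32(std::vector<uint8_t>& bytes, std::size_t offset, uint32_t value) {
    assert(offset + 4 <= bytes.size());
    for (std::size_t i = 0; i < 4; ++i)
        bytes[offset + i] = uint8_t(value >> (8 * i));
}

// Expected behaviour: final value = symbol address + addend already stored
// at the fixup location.
void applyDir32(std::vector<uint8_t>& section, std::size_t fixupOffset,
                uint32_t symbolAddress) {
    uint32_t addend = readLE32(section, fixupOffset);
    writeLE32(section, fixupOffset, symbolAddress + addend);
}

// Hypothetical faulty variant: overwrites the field and loses the addend,
// so every field of the global would resolve to the base address.
void applyDir32IgnoringAddend(std::vector<uint8_t>& section,
                              std::size_t fixupOffset, uint32_t symbolAddress) {
    writeLE32(section, fixupOffset, symbolAddress);
}
```

With a symbol address of `0x03C10000` and an in-place addend of 4, the first function yields `0x03C10004` and the second yields `0x03C10000`; the latter would match the identical addresses printed by debugInt, but that is only a hypothesis to check against the real relocation handling.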

By the way, what platform are you JIT'ing on? I noticed that it is a 32-bit
target, and I suspect that the 32-bit support in the JIT infrastructure
isn't as well tested / commonly used as the 64-bit code, possibly
explaining why this sort of bug could sneak through.

-- Sean Silva

Nikodemus_Siivola · June 6, 2017, 8:09am

This is on Windows 10: didn’t yet manage to get a 64-bit toolchain set up that agreed on everything necessary.

Dumped bitcode, but when I did that everything landed in the same module (normally the global is defined in a different module than its uses) → the relocations are different… different enough that when I loaded the bitcode back in and handed the single module to JIT it worked fine.

I’ll try to dump a case where the definition is in a different module tomorrow.

Anyhow, below is what clang-cl turned the bitcode from my IR into – probably not very useful though as this code does what it should…

$ llvm-objdump.exe -r -d test.o

test.o: file format COFF-i386

Disassembly of section .text:
.text:
0: 00 00 addb %al, (%eax)
00000000: IMAGE_REL_I386_DIR32 _XEP:setfoo
2: 00 00 addb %al, (%eax)

_setfoo:
4: 56 pushl %esi
5: 83 ec 40 subl $64, %esp
8: 89 e0 movl %esp, %eax
0000000c: IMAGE_REL_I386_DIR32 _foo
10: e8 00 00 00 00 calll 0 <_setfoo+0x11>
00000011: IMAGE_REL_I386_REL32 _debugPointer
15: 89 e1 movl %esp, %ecx
17: c7 01 00 00 00 00 movl $0, (%ecx)
00000019: IMAGE_REL_I386_DIR32 _foo
1d: 89 44 24 3c movl %eax, 60(%esp)
21: 89 54 24 38 movl %edx, 56(%esp)
25: e8 00 00 00 00 calll 0 <_setfoo+0x26>
00000026: IMAGE_REL_I386_REL32 _debugInt
2a: c7 05 00 00 00 00 00 00 00 00 movl $0, 0
0000002c: IMAGE_REL_I386_DIR32 _foo
00000030: IMAGE_REL_I386_DIR32 _JazzFixnumClass
34: b9 00 00 00 00 movl $0, %ecx
00000035: IMAGE_REL_I386_DIR32 _JazzFixnumClass
39: 89 e6 movl %esp, %esi
3b: c7 06 04 00 00 00 movl $4, (%esi)
0000003d: IMAGE_REL_I386_DIR32 _foo
41: 89 44 24 34 movl %eax, 52(%esp)
45: 89 54 24 30 movl %edx, 48(%esp)
49: 89 4c 24 2c movl %ecx, 44(%esp)
4d: e8 00 00 00 00 calll 0 <_setfoo+0x4E>
0000004e: IMAGE_REL_I386_REL32 _debugInt
52: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
00000054: IMAGE_REL_I386_DIR32 _foo
5c: 89 e1 movl %esp, %ecx
5e: c7 01 00 00 00 00 movl $0, (%ecx)
00000060: IMAGE_REL_I386_DIR32 _foo
64: 89 44 24 28 movl %eax, 40(%esp)
68: 89 54 24 24 movl %edx, 36(%esp)
6c: e8 00 00 00 00 calll 0 <_setfoo+0x6D>
0000006d: IMAGE_REL_I386_REL32 _debugPointer
71: c7 05 00 00 00 00 00 00 00 00 movl $0, 0
00000073: IMAGE_REL_I386_DIR32 _foo
00000077: IMAGE_REL_I386_DIR32 _JazzFixnumClass
7b: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
0000007d: IMAGE_REL_I386_DIR32 _foo
85: 89 e1 movl %esp, %ecx
87: c7 01 00 00 00 00 movl $0, (%ecx)
00000089: IMAGE_REL_I386_DIR32 _foo
8d: 89 44 24 20 movl %eax, 32(%esp)
91: 89 54 24 1c movl %edx, 28(%esp)
95: e8 00 00 00 00 calll 0 <_setfoo+0x96>
00000096: IMAGE_REL_I386_REL32 _debugPointer
9a: 89 e1 movl %esp, %ecx
9c: c7 41 08 d5 00 00 00 movl $213, 8(%ecx)
a3: c7 41 04 00 00 00 00 movl $0, 4(%ecx)
000000a6: IMAGE_REL_I386_DIR32 _JazzFixnumClass
aa: c7 01 00 00 00 00 movl $0, (%ecx)
000000ac: IMAGE_REL_I386_DIR32 _foo
b0: 89 44 24 18 movl %eax, 24(%esp)
b4: 89 54 24 14 movl %edx, 20(%esp)
b8: e8 00 00 00 00 calll 0 <_setfoo+0xB9>
000000b9: IMAGE_REL_I386_REL32 _setGlobal
bd: 89 e0 movl %esp, %eax
bf: c7 00 00 00 00 00 movl $0, (%eax)
000000c1: IMAGE_REL_I386_DIR32 _foo
c5: e8 00 00 00 00 calll 0 <_setfoo+0xC6>
000000c6: IMAGE_REL_I386_REL32 _debugPointer
ca: b9 d5 00 00 00 movl $213, %ecx
cf: 8b 74 24 2c movl 44(%esp), %esi
d3: 89 44 24 10 movl %eax, 16(%esp)
d7: 89 f0 movl %esi, %eax
d9: 89 54 24 0c movl %edx, 12(%esp)
dd: 89 ca movl %ecx, %edx
df: 83 c4 40 addl $64, %esp
e2: 5e popl %esi
e3: c3 retl
e4: 66 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%eax,%eax)

_XEP:setfoo:
f0: 8b 44 24 04 movl 4(%esp), %eax
f4: 83 f8 00 cmpl $0, %eax
f7: 0f 84 05 00 00 00 je 5 <_XEP:setfoo+0x12>
fd: e8 00 00 00 00 calll 0 <_XEP:setfoo+0x12>
000000fe: IMAGE_REL_I386_REL32 _typeError
102: e8 00 00 00 00 calll 0 <_XEP:setfoo+0x17>
00000103: IMAGE_REL_I386_REL32 _setfoo
107: c3 retl
108: 0f 1f 84 00 00 00 00 00 nopl (%eax,%eax)
110: 00 00 addb %al, (%eax)
00000110: IMAGE_REL_I386_DIR32 _XEP:getfoo
112: 00 00 addb %al, (%eax)

_getfoo:
114: 50 pushl %eax
115: 89 e0 movl %esp, %eax
117: c7 00 00 00 00 00 movl $0, (%eax)
00000119: IMAGE_REL_I386_DIR32 _foo
11d: e8 00 00 00 00 calll 0 <_getfoo+0xE>
0000011e: IMAGE_REL_I386_REL32 _getGlobal
122: 59 popl %ecx
123: c3 retl
124: 66 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%eax,%eax)

_XEP:getfoo:
130: 8b 44 24 04 movl 4(%esp), %eax
134: 83 f8 00 cmpl $0, %eax
137: 0f 84 05 00 00 00 je 5 <_XEP:getfoo+0x12>
13d: e8 00 00 00 00 calll 0 <_XEP:getfoo+0x12>
0000013e: IMAGE_REL_I386_REL32 _typeError
142: e8 00 00 00 00 calll 0 <_XEP:getfoo+0x17>
00000143: IMAGE_REL_I386_REL32 _getfoo
147: c3 retl

_sean_silva · June 6, 2017, 9:16pm

That’s useful to know that the static compilation code path works. Furthermore, as expected from that:

52: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
00000054: IMAGE_REL_I386_DIR32 _foo

It looks like the offset 4 of the second field of your struct is correct in the object file, so this does seem to be a problem in the JIT-specific linking/loading.
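
The bytes quoted above can be decoded by hand to confirm that reading. The sketch below assumes the standard x86 encoding `C7 05 disp32 imm32` for a 32-bit immediate store to an absolute address; `MovImm32ToAbs32`, `decodeMovImm32ToAbs32` and `kStoreDatum` are names invented for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fields of `movl $imm32, disp32` (opcode C7 /0, ModRM 05 = absolute disp32).
struct MovImm32ToAbs32 {
    uint32_t displacement;  // address field that the DIR32 relocation patches
    uint32_t immediate;     // value being stored
};

// Read a 32-bit little-endian value from the instruction bytes.
uint32_t readLE32(const std::array<uint8_t, 10>& bytes, std::size_t offset) {
    assert(offset + 4 <= bytes.size());
    return uint32_t(bytes[offset]) | uint32_t(bytes[offset + 1]) << 8 |
           uint32_t(bytes[offset + 2]) << 16 | uint32_t(bytes[offset + 3]) << 24;
}

MovImm32ToAbs32 decodeMovImm32ToAbs32(const std::array<uint8_t, 10>& bytes) {
    assert(bytes[0] == 0xC7 && bytes[1] == 0x05 && "not the expected encoding");
    return {readLE32(bytes, 2), readLE32(bytes, 6)};
}

// The instruction at offset 0x52 in the listing.
const std::array<uint8_t, 10> kStoreDatum = {0xC7, 0x05, 0x04, 0x00, 0x00,
                                             0x00, 0xD5, 0x00, 0x00, 0x00};
```

Decoding `kStoreDatum` gives a displacement of 4 and an immediate of 213 (0xD5). The displacement field starts two bytes into the instruction, at 0x52 + 2 = 0x54, which is exactly where the listing places the `IMAGE_REL_I386_DIR32 _foo` relocation.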

Can you try debugging into lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h to see if the relocation is getting applied correctly in the context of your JIT?

You may be able to repro this more easily using lli. It has a -jit-kind argument that should get you into the JIT codepath. (see test/ExecutionEngine/{MCJIT,ORCMCJIT}/)

– Sean Silva

Nikodemus_Siivola · June 6, 2017, 9:41pm

I just managed a quick experiment today to dump and load the definition of the variable and the function that sets it into separate modules.

…loading those bitcode files into separate modules (and handing those modules to JIT) works as expected. What should be the same code going directly into JIT does not work.

Which smells like the problem may be in my JIT hookup and not in RuntimeDyld.

I’ll try to sort out my codepaths before digging into RuntimeDyld, so I can be sure I’m doing the same things in “live” JIT and when dumping/loading bitcode.

I’ll let you know what turns up.

Cheers,

– nikodemus

Nikodemus_Siivola · June 7, 2017, 5:30am

My code was hinky, but only in the sense that I was accidentally duplicating the definition of the variable in the module where the function was. With only the declaration in the second module, loading the bitcode reproduces the issue.

Managed an lli reproduction:

$ cat jit-0.ll

target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

@foo = global { i8*, i32 } undef

$ cat jit-1-clobber.ll

target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

@foo = external global { i8*, i32 }

define void @setfoo() {
entry:
%p = inttoptr i32 42 to i8*
store i8* %p, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
store i32 13, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
ret void
}

$ cat jit-1-noclobber.ll

target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

@foo = external global { i8*, i32 }

define void @setfoo() {
entry:
%p = inttoptr i32 42 to i8*
store i8* %p, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
ret void
}

$ lli -jit-kind=orc-mcjit -extra-module=jit-0.ll -extra-module=jit-1-clobber.ll main.ll; echo $?

13

$ lli -jit-kind=orc-mcjit -extra-module=jit-0.ll -extra-module=jit-1-noclobber.ll main.ll; echo $?

42

(Same happens with -jit-kind=mcjit.)

Cheers,

– nikodemus

_sean_silva · June 7, 2017, 5:40am

Great work!

This is ready to post into a bug on llvm.org/bugs. If you’re feeling a bit adventurous, feel free to also try to debug it and post any clues (setting breakpoints in the functions in lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h is how I would start).

Lang (CC’d) may have some other tips for where to look. (I’m actually not very familiar with the JIT infrastructure myself, so take my advice with a grain of salt)

– Sean Silva

Nikodemus_Siivola · June 7, 2017, 3:59pm

Done: https://bugs.llvm.org//show_bug.cgi?id=33344

Thanks for the assistance!

Cheers,

– nikodemus
