Load from abs address generated bad code on LLVM 2.4

This is x86_64. I have a problem where an absolute memory load

define i32 @foo() {
entry:
        %0 = load i32* inttoptr (i64 12704196 to i32*) ; <i32> [#uses=1]
        ret i32 %0
}

generates incorrect code on LLVM 2.4:

0x7ffff6d54010: mov 0xc1d9c4(%rip),%eax # 0x7ffff79719da
0x7ffff6d54016: retq

should be

0x7ffff6d54010: mov 0xc1d9c4, %eax
0x7ffff6d54016: retq

i.e. the IP-relative addressing mode is incorrect.
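
To make the difference between the two encodings concrete, here is a minimal sketch of the address arithmetic. The function names are illustrative only; the numbers come from the disassembly above. A RIP-relative operand adds the displacement to the address of the next instruction, while the absolute form uses the displacement as the address itself.

```cpp
#include <cstdint>

// Address requested by the IR: inttoptr (i64 12704196 to i32*).
constexpr std::uint64_t kTarget = 12704196;  // 0xc1d9c4

// RIP-relative operand: the 32-bit displacement is added to the address
// of the *next* instruction (here the retq at 0x7ffff6d54016).
constexpr std::uint64_t ripRelative(std::uint64_t nextInsn, std::int32_t disp) {
  return nextInsn + static_cast<std::uint64_t>(static_cast<std::int64_t>(disp));
}

// Absolute operand: the sign-extended displacement is the address itself.
constexpr std::uint64_t absolute(std::int32_t disp) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(disp));
}

// The bad code reads from the address shown in the disassembler comment...
static_assert(ripRelative(0x7ffff6d54016ULL, 0xc1d9c4) == 0x7ffff79719daULL,
              "RIP-relative form lands far away from the requested address");
// ...while the intended code reads from the address the IR asked for.
static_assert(absolute(0xc1d9c4) == kTarget,
              "absolute form matches the inttoptr constant");
```

Both checks are evaluated at compile time, so the sketch only needs to compile to confirm the arithmetic.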

The current LLVM trunk does not have this bug. This seems quite a nasty
bug; is there any chance of a bug-fix release for LLVM 2.4, or should I
just use LLVM trunk until LLVM 2.5 ?

Andrew.

#include <llvm/Module.h>
#include <llvm/Function.h>
#include <llvm/PassManager.h>
#include <llvm/CallingConv.h>
#include <llvm/Analysis/Verifier.h>
#include <llvm/Assembly/PrintModulePass.h>
#include <llvm/Support/IRBuilder.h>
#include <llvm/Support/Debug.h>
#include <llvm/ExecutionEngine/ExecutionEngine.h>
#include <llvm/ModuleProvider.h>
#include <llvm/Support/raw_ostream.h>
#include <stdint.h>
#include <stdio.h>

using namespace llvm;

Module *makeLLVMModule ();

Function *foo;

int
main (int argc, char **argv)
{
  Module *Mod = makeLLVMModule ();
  verifyModule (*Mod, PrintMessageAction);

  PassManager PM;
  PM.add (new PrintModulePass ());
  PM.run (*Mod);

  ExecutionEngine *engine = ExecutionEngine::create (Mod);

// DebugFlag = true;
// CurrentDebugType = "x86-emitter";

  typedef int (*pf)();
  pf F = (pf)engine->getPointerToFunction(foo);
  int res = F();

  printf ("ret = %d\n", res);

  delete Mod;
  return 0;
}

int32_t p = 99;

Module *
makeLLVMModule ()
{
  // Module Construction
  Module *mod = new Module ("test");

  Constant *c
    = mod->getOrInsertFunction ("foo", IntegerType::get(32),
        NULL);

  foo = cast <Function> (c);

  BasicBlock *block = BasicBlock::Create ("entry", foo);
  IRBuilder<> builder (block);

  Value *tmp = ConstantInt::get(Type::Int64Ty, (long)&p);
  tmp = builder.CreateIntToPtr(tmp,
             PointerType::getUnqual(IntegerType::get(32)));
  tmp = builder.CreateLoad(tmp);

  builder.CreateRet (tmp);

  return mod;
}

Andrew Haley <aph@redhat.com> writes:

This is x86_64. I have a problem where an absolute memory load

define i32 @foo() {
entry:
        %0 = load i32* inttoptr (i64 12704196 to i32*) ; <i32> [#uses=1]
        ret i32 %0
}

generates incorrect code on LLVM 2.4:

0x7ffff6d54010: mov 0xc1d9c4(%rip),%eax # 0x7ffff79719da
0x7ffff6d54016: retq

should be

0x7ffff6d54010: mov 0xc1d9c4, %eax
0x7ffff6d54016: retq

i.e. the IP-relative addressing mode is incorrect.

This seems the same as

http://www.llvm.org/bugs/show_bug.cgi?id=2920

which was fixed by an unknown change after the 2.4 release.

The current LLVM trunk does not have this bug. This seems quite a nasty
bug; is there any chance of a bug-fix release for LLVM 2.4, or should I
just use LLVM trunk until LLVM 2.5 ?

As there are no 2.x.y releases, your only option is to use LLVM trunk.

Andrew Haley <aph@redhat.com> writes:

This is x86_64. I have a problem where an absolute memory load

define i32 @foo() {
entry:
        %0 = load i32* inttoptr (i64 12704196 to i32*) ; <i32> [#uses=1]
        ret i32 %0
}

generates incorrect code on LLVM 2.4:

0x7ffff6d54010: mov 0xc1d9c4(%rip),%eax # 0x7ffff79719da
0x7ffff6d54016: retq

IIRC, one workaround is to use a GlobalValue instead of an IntToPtr on a
Constant.

Andrew Haley <aph@redhat.com> writes:

Óscar Fuentes wrote:

Andrew Haley <aph@redhat.com> writes:

This is x86_64. I have a problem where an absolute memory load

define i32 @foo() {
entry:
        %0 = load i32* inttoptr (i64 12704196 to i32*) ; <i32> [#uses=1]
        ret i32 %0
}

generates incorrect code on LLVM 2.4:

0x7ffff6d54010: mov 0xc1d9c4(%rip),%eax # 0x7ffff79719da
0x7ffff6d54016: retq

IIRC, one workaround is to use a GlobalValue instead of an IntToPtr on a
Constant.

Err, how? I can't figure out how to do it. The only documentation for
GlobalValue describes it as a superclass of GlobalVariables and Functions.

IIRC, the stuff I used was something like...

GlobalVariable *gv = new GlobalVariable( /* your pointer type */,
                         other required parameters...);
// addGlobalMapping is a method of ExecutionEngine.
engine->addGlobalMapping(gv, your_pointer_value); // i.e. (void*)0x938742

then, where you use

  Constant* thp = ConstantExpr::getCast(Instruction::IntToPtr,
  your_pointer_value, /* i.e. 0x938742 */
  /* your pointer type */);

/* Now use thp */

change it to

/* Now just use gv */

GetGlobalValueAtAddress may be useful for housekeeping.

If something is wrong or not as efficient as it could be, I hope someone
on the mailing list will correct me.
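
As a rough illustration of the idea behind the workaround, the sketch below models a table that binds a named global to an existing host address and supports a reverse lookup. It is a toy stand-in written for this thread, not the LLVM ExecutionEngine; the class and method names (`GlobalMappingTable`, `addMapping`, `addressOf`, `nameAt`) are hypothetical.

```cpp
#include <map>
#include <string>

// Toy stand-in for a JIT's global-mapping table; NOT the LLVM class.
class GlobalMappingTable {
  std::map<std::string, void *> addrOf_;  // symbol name -> host address
  std::map<void *, std::string> nameAt_;  // host address -> symbol name

public:
  // Bind a named global to memory that already exists in the host process.
  void addMapping(const std::string &name, void *addr) {
    addrOf_[name] = addr;
    nameAt_[addr] = name;
  }

  // Forward lookup; returns nullptr when the name is unknown.
  void *addressOf(const std::string &name) const {
    std::map<std::string, void *>::const_iterator it = addrOf_.find(name);
    return it == addrOf_.end() ? nullptr : it->second;
  }

  // Reverse lookup (the "housekeeping" direction); empty when unmapped.
  std::string nameAt(void *addr) const {
    std::map<void *, std::string>::const_iterator it = nameAt_.find(addr);
    return it == nameAt_.end() ? std::string() : it->second;
  }
};
```

With a table like this, generated code refers to the symbol and the address is resolved when the symbol is looked up, rather than being baked into the IR as an integer constant.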

Hi Andrew,

As others have pointed out, using a global and addGlobalMapping is a great workaround for the problem.

We generally don't do "dot" releases, since we have a short release cycle anyway. The 2.5 release process is slated to start this week.

-Chris

Chris Lattner wrote:

This is x86_64. I have a problem where an absolute memory load
The current LLVM trunk does not have this bug. This seems quite a nasty
bug; is there any chance of a bug-fix release for LLVM 2.4, or should I
just use LLVM trunk until LLVM 2.5 ?

As others have pointed out, using a global and addGlobalMapping is a
great workaround for the problem.

Thanks.

We generally don't do "dot" releases, since we have a short release
cycle anyway. The 2.5 release process is slated to start this week.

Mmm, but the problem is that interfaces keep changing, so simply
upgrading to the latest release isn't possible. Even the tiny
test case I posted doesn't work with the latest version: there
were changes needed to get it to compile. Also, I can no longer
figure out how to turn on debugging dumps in the JIT. The simple

   DebugFlag = true;
   CurrentDebugType = "x86-emitter";

no longer works, and there seems to be no replacement for it.

Andrew.

That should work fine, just use "jit" instead of "x86-emitter" as the debug type.

-Chris

GetGlobalValueAtAddress may be useful for housekeeping.

How does one use GetGlobalValueAtAddress? It returns a const GlobalValue *,
but it seems that all the LLVM operations take a Value *. Any attempt to do
anything with a const GlobalValue * is rejected by the C++ compiler. Perhaps
I'm supposed to cast away the const?

    const GlobalValue *vv = engine->getGlobalValueAtAddress(&p);
    if (vv)
      {
        Value *tmp = builder.CreateLoad(vv);

results in

tut4.cpp:55: error: invalid conversion from 'const llvm::Value*' to 'llvm::Value*'
tut4.cpp:55: error: initializing argument 1 of 'llvm::LoadInst* llvm::IRBuilder<preserveNames, T>::CreateLoad(llvm::Value*, const char*) [with bool preserveNames = true, T = llvm::ConstantFolder]'

Andrew.
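
For readers hitting the same compiler error, here is a small self-contained sketch of the const mismatch, using hypothetical stand-in types rather than the LLVM classes. It shows why the call is rejected and what casting away the const looks like. Whether that is appropriate for the real API is a separate question that this sketch does not settle.

```cpp
// Hypothetical stand-ins; these are not the LLVM classes.
struct Value { int payload; };
struct GlobalValue : Value {};

// Mirrors an API that takes a pointer to non-const, like the CreateLoad
// overload named in the error message.
int loadFrom(Value *v) { return v->payload; }

// Mirrors a lookup that hands back a pointer-to-const.
const GlobalValue *lookup(const GlobalValue &g) { return &g; }

int loadViaLookup(GlobalValue &g) {
  const GlobalValue *vv = lookup(g);
  // loadFrom(vv);  // rejected: invalid conversion from const Value* to Value*
  // const_cast removes the qualifier. Reading through the result is fine;
  // writing through it is only valid if the object was not declared const.
  return loadFrom(const_cast<GlobalValue *>(vv));
}
```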

Andrew Haley <aph@redhat.com> writes:

How does one use GetGlobalValueAtAddress?

Sorry, GetGlobalValueAtAddress is a helper function in my code; it is
not part of LLVM.

The getGlobalValueAtAddress method is not required for what you need. Go
ahead without it.

[snip]

Ah ok. I'd suggest doing what llvm-gcc does here, it is a much more stable API:

   std::vector<const char*> Args;
   Args.push_back(""); // program name
   Args.push_back("-debug-only=jit");
   ...
   Args.push_back(0); // Null terminator.
   cl::ParseCommandLineOptions(Args.size()-1, (char**)&Args[0]);

This also gives you control over optimizations and codegen options.

-Chris
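
The argument-vector idiom above can be sketched without LLVM. This version keeps the strings in owned, mutable storage so that no cast from `const char*` is needed, and it passes `size() - 1` as the count because the trailing null terminator is not an argument. `ArgvBuilder` and `countOptions` are hypothetical names; `countOptions` merely stands in for a parser with an `(argc, argv)` interface.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a parser that takes (argc, argv).
int countOptions(int argc, char **argv) {
  int n = 0;
  for (int i = 1; i < argc; ++i)  // argv[0] is the program name
    if (argv[i][0] == '-')
      ++n;
  return n;
}

// Builds a mutable, null-terminated argv from a list of option strings.
struct ArgvBuilder {
  std::vector<std::string> storage;  // owns the characters
  std::vector<char *> argv;          // points into storage

  explicit ArgvBuilder(const std::vector<std::string> &opts) {
    storage.push_back("");  // program name
    storage.insert(storage.end(), opts.begin(), opts.end());
    // Take pointers only after storage has stopped growing.
    for (std::string &s : storage)
      argv.push_back(&s[0]);
    argv.push_back(nullptr);  // null terminator, not counted in argc
  }

  // Copying would leave argv pointing at the other object's strings.
  ArgvBuilder(const ArgvBuilder &) = delete;
  ArgvBuilder &operator=(const ArgvBuilder &) = delete;

  int argc() const { return static_cast<int>(argv.size()) - 1; }
  char **data() { return argv.data(); }
};
```

A caller would construct the builder from its option strings and then pass `b.argc()` and `b.data()` to the parser.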

Chris Lattner wrote:

That should work fine, just use "jit" instead of "x86-emitter" as the
debug type.

That's impossible: CurrentDebugType is now private; it appears
nowhere in the installed headers. I can't find any public interface
to allow a JIT to set it.

Ah ok. I'd suggest doing what llvm-gcc does here, it is a much more
stable API:

   std::vector<const char*> Args;
   Args.push_back(""); // program name
   Args.push_back("-debug-only=jit");
   ...
   Args.push_back(0); // Null terminator.
   cl::ParseCommandLineOptions(Args.size()-1, (char**)&Args[0]);

This also gives you control over optimizations and codegen options,

Yes, thanks. It's a slightly weird hack, but it works perfectly. :-)

"-debug-only=jit" generates just a binary dump, like this:

JIT: Binary code:
JIT: 00000000: 4a04b848 000000ca 048b0000 c320

whereas "-debug-only=x86-emitter" generates this:

%RAX<def> = MOV64ri <ga:poo1>
%EAX<def> = MOV32rm %RAX<kill>, 1, %reg0, 0, Mem:LD(4,4) [poo1 + 0]
RET %EAX<imp-use,kill>

which may be more useful.
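
The binary dump becomes easier to compare with the instruction listing once it is turned back into a byte sequence. The sketch below assumes each word in the dump is a little-endian integer printed most-significant digit first; that is an assumption about the dump format, not something stated in the thread. The helper name `bytesFromDumpWords` is made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Converts the hex words of a dump line into bytes in memory order,
// assuming each word is a little-endian value printed as one hex number.
std::vector<std::uint8_t>
bytesFromDumpWords(const std::vector<std::string> &words) {
  std::vector<std::uint8_t> out;
  for (const std::string &w : words) {
    // Walk the hex digits two at a time, starting from the right-hand end.
    for (std::size_t i = w.size(); i >= 2; i -= 2) {
      unsigned long byte = std::stoul(w.substr(i - 2, 2), nullptr, 16);
      out.push_back(static_cast<std::uint8_t>(byte));
    }
  }
  return out;
}
```

Under that assumption the four words above decode to `48 b8 04 4a ca 00 00 00 00 00 8b 04 20 c3`, which appears to match the three instructions in the x86-emitter listing: a 64-bit immediate move into RAX, a 32-bit load through RAX, and a return.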

Can I put in a request for some machine-independent names for the debug
dumps? It's "x86-emitter", "alpha-emitter", "x86-codegen", and so on.
We'd have to put a big target-dependent switch statement in our code to
enable asm dumps.

Andrew.
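
To show what the target-dependent switch would look like in client code, here is a minimal sketch. The `Target` enum and both function names are hypothetical, and only the two emitter names quoted in the message are used; a real client would need one case per supported backend.

```cpp
#include <string>

// Hypothetical target identifiers used by the client, not an LLVM type.
enum class Target { X86, Alpha };

// Maps a target to the debug-type name mentioned in the thread.
std::string emitterDebugType(Target t) {
  switch (t) {
  case Target::X86:
    return "x86-emitter";
  case Target::Alpha:
    return "alpha-emitter";
  }
  return "";  // unknown target: no emitter dump available
}

// Builds the command-line flag that selects that debug type.
std::string debugOnlyFlag(Target t) {
  const std::string type = emitterDebugType(t);
  return type.empty() ? std::string() : "-debug-only=" + type;
}
```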

Chris Lattner wrote:

That should work fine, just use "jit" instead of "x86-emitter" as the
debug type.

That's impossible: CurrentDebugType is now private; it appears
nowhere in the installed headers. I can't find any public interface
to allow a JIT to set it.

Ah ok. I'd suggest doing what llvm-gcc does here, it is a much more
stable API:

  std::vector<const char*> Args;
  Args.push_back(""); // program name
  Args.push_back("-debug-only=jit");
  ...
  Args.push_back(0); // Null terminator.
  cl::ParseCommandLineOptions(Args.size()-1, (char**)&Args[0]);

This also gives you control over optimizations and codegen options,

Yes, thanks. It's a slightly weird hack, but it works perfectly. :-)

"-debug-only=jit" generates just a binary dump, like this:

JIT: Binary code:
JIT: 00000000: 4a04b848 000000ca 048b0000 c320

whereas "-debug-only=x86-emitter" generates this:

This has been changed in ToT (top of trunk): -debug-only=jit now dumps both target-independent and target-specific debug info.

Evan

Chris Lattner wrote:

That should work fine, just use "jit" instead of "x86-emitter" as the
debug type.

That's impossible: CurrentDebugType is now private; it appears
nowhere in the installed headers. I can't find any public interface
to allow a JIT to set it.

Ah ok. I'd suggest doing what llvm-gcc does here, it is a much more
stable API:

   std::vector<const char*> Args;
   Args.push_back(""); // program name
   Args.push_back("-debug-only=jit");
   ...
   Args.push_back(0); // Null terminator.
   cl::ParseCommandLineOptions(Args.size()-1, (char**)&Args[0]);

This also gives you control over optimizations and codegen options,

Sadly, that doesn't work either. Well, it works once, but then you
can't change it back. We need something more like

--- ./lib/Support/Debug.cpp~ 2009-01-23 12:15:27.000000000 +0000
+++ lib/Support/Debug.cpp 2009-01-23 12:15:53.000000000 +0000
@@ -48,7 +48,7 @@
   static cl::opt<DebugOnlyOpt, true, cl::parser<std::string> >
   DebugOnly("debug-only", cl::desc("Enable a specific type of debug output"),
             cl::Hidden, cl::value_desc("debug string"),
-            cl::location(DebugOnlyOptLoc), cl::ValueRequired);
+            cl::location(DebugOnlyOptLoc), cl::ValueRequired, cl::ZeroOrMore);
 #endif
 }

to allow it to be dynamically used by a JIT. Of course, with a large program
being JITted it's really important to be able just to use debugging exactly
where you need it.

Andrew.