setDataLayout segfault

I get a segfault with this code when setting the data layout:

int main(int argc, char** argv)
{
   llvm::InitializeNativeTarget();

   llvm::LLVMContext TheContext;
   unique_ptr<Module> Mod(new Module("A",TheContext));

   llvm::EngineBuilder engineBuilder(std::move(Mod));
   std::string mcjit_error;

   engineBuilder.setMCPU(llvm::sys::getHostCPUName());

   engineBuilder.setEngineKind(llvm::EngineKind::JIT);
   engineBuilder.setOptLevel(llvm::CodeGenOpt::Aggressive);
   engineBuilder.setErrorStr(&mcjit_error);

   llvm::TargetOptions targetOptions;
   targetOptions.AllowFPOpFusion = llvm::FPOpFusion::Fast;
   engineBuilder.setTargetOptions( targetOptions );

   TargetMachine *targetMachine = engineBuilder.selectTarget();

   assert(targetMachine && "failed to create target machine");

   std::cout << targetMachine->createDataLayout().getStringRepresentation() << "\n";

   Mod.get()->setDataLayout( targetMachine->createDataLayout().getStringRepresentation() ); // this segfaults
   Mod.get()->setDataLayout( targetMachine->createDataLayout() ); // as well as this
}

Backtrace:

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
e-m:e-i64:64-f80:128-n8:16:32:64-S128

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff5f65832 in llvm::SmallVectorTemplateCommon<unsigned char,

::end (this=0x148) at

/home/fwinter/svn/llvm-3.9/include/llvm/ADT/SmallVector.h:117
117 iterator end() { return (iterator)this->EndX; }
(gdb) bt
#0 0x00007ffff5f65832 in llvm::SmallVectorTemplateCommon<unsigned char,

::end (this=0x148) at

/home/fwinter/svn/llvm-3.9/include/llvm/ADT/SmallVector.h:117
#1 llvm::SmallVectorImpl<unsigned char>::clear (this=0x148) at /home/fwinter/svn/llvm-3.9/include/llvm/ADT/SmallVector.h:345
#2 0x00007ffff5fa13b5 in llvm::DataLayout::clear (this=0x138) at /home/fwinter/svn/llvm-3.9/lib/IR/DataLayout.cpp:545
#3 0x00007ffff5f9f561 in llvm::DataLayout::reset (this=0x138, Desc=...) at /home/fwinter/svn/llvm-3.9/lib/IR/DataLayout.cpp:179
#4 0x00007ffff60b25ca in llvm::Module::setDataLayout (this=0x0, Desc=...) at /home/fwinter/svn/llvm-3.9/lib/IR/Module.cpp:377
#5 0x0000000000406c46 in main (argc=1, argv=0x7fffffffde68) at main.cc:95

Any idea?

Frank

I am constructing the engine builder in the following way:

llvm::SMDiagnostic Err;
unique_ptr<Module> Mod = getLazyIRFileModule("f.ll", Err, TheContext);
llvm::EngineBuilder engineBuilder(std::move(Mod));

However, after moving the pointer to the constructor it is no longer retrievable from the unique_ptr object.

Mod.get()->dump(); // this segfaults after the move, but not before

So I conclude that any type of operation on the module is no longer valid.

Am I constructing the engine builder as it is supposed to?

Frank

Ok. I can make a copy of the unique_ptr before moving it into the builder's constructor and use the copy later on. It is confusing to require a unique_ptr.

Frank

Ok. I can make a copy of the unique_ptr before moving it into the
builder's constructor and use the copy later on. It is confusing to
require a unique_ptr.

It's (almost?) always a bad idea to make a copy of a std::unique_ptr.

Frank

I am constructing the engine builder in the following way:

llvm::SMDiagnostic Err;
unique_ptr<Module> Mod = getLazyIRFileModule("f.ll", Err, TheContext);
llvm::EngineBuilder engineBuilder(std::move(Mod));

However, after moving the pointer to the constructor it is no longer
retrievable from the unique_ptr object.

Moving re-assigns ownership of the pointed-to memory to the EngineBuilder.

Mod.get()->dump(); // this segfaults after the move, but not before

That is expected.

So I conclude that any type of operation on the module is no longer
valid.

Correction: any operation via the *moved* unique_ptr is no longer valid.

The Module is a null pointer here.

I am constructing the engine builder in the following way:

llvm::SMDiagnostic Err;
unique_ptr<Module> Mod = getLazyIRFileModule("f.ll", Err, TheContext);
llvm::EngineBuilder engineBuilder(std::move(Mod));

However, after moving the pointer to the constructor it is no longer retrievable from the unique_ptr object.

Mod.get()->dump(); // this segfaults after the move, but not before

So I conclude that any type of operation on the module is no longer valid.

Oh it seems you already figured, my previous email was not helpful :slight_smile:

I think a large part of the confusion comes from two conflicting ideas:

  1. Things that create modules return a unique_ptr, which indicates an intention to make it difficult to have multiple copies of a module “pointer”.

  2. However, the LLVM API has several methods that return or require bare Module* pointers.

The result is that you pretty much have to “smash” the unique_ptr to get a raw pointer out to get any work done, which is potentially dangerous.

In my case, I can create a Module in one of two ways:

a) By reading in bitcode:

ErrorOr<std::unique_ptrllvm::Module > eom =
parseBitcodeFile(mb->getMemBufferRef(), context);

if (eom) {
m_Module = std::move(eom.get());

b) By newing one:

std::unique_ptr upm(new Module(name, context));

m_Module = std::move(upm);

If I want to support a) then I am forced to store a unique_ptr in my own code.

But then if I want to pass a Module* into some other LLVM API, I need to:

Module &getModule() const { return *m_Module; }

m_Func = Function::Create(as_type, Function::ExternalLinkage, name, &getModule());

All of this seems quite stilted. On the one hand, we have a strong effort to ensure that there are no dangling pointers to Modules, but on the other, we have to jump through hoops to get values required for arguments to other API calls, and in doing so, undo the “guarantees” we get from using unique_ptr in the first place.

Or am I totally missing the point? Is there a better way for me to organize this code?

I think a large part of the confusion comes from two conflicting ideas:

1. Things that create modules return a unique_ptr<Module>, which
indicates an intention to make it difficult to have multiple copies of a
module "pointer".
2. However, the LLVM API has several methods that return or require bare
Module* pointers.

These two just reflect different ownership semantics. /Most/ of the time in llvm's codebase, if you see a raw pointer in an interface, it means that the given function does not take ownership of the pointer.

The result is that you pretty much have to "smash" the
unique_ptr<Module> to get a raw pointer out to get any work done, which
is potentially dangerous.

Getting the raw pointer out is ok, you just have to be careful that none of your uses of it live longer than the thing you gave ownership of it (or however long it's API says its supposed to keep the thing it took ownership of).

In my case, I can create a Module in one of two ways:

a) By reading in bitcode:

        ErrorOr<std::unique_ptr<llvm::Module> > eom =
            parseBitcodeFile(mb->getMemBufferRef(), context);

        if (eom) {
            m_Module = std::move(eom.get());

b) By newing one:

        std::unique_ptr<Module> upm(new Module(name, context));

        m_Module = std::move(upm);

If I want to support a) then I am forced to store a unique_ptr<Module>
in my own code.

But then if I want to pass a Module* into some other LLVM API, I need to:

    Module &getModule() const { return *m_Module; }

    m_Func = Function::Create(as_type, Function::ExternalLinkage, name,
&getModule());

All of this seems quite stilted. On the one hand, we have a strong
effort to ensure that there are no dangling pointers to Modules, but on

The actual aim is to clarify ownership in order to simplify memory management (prevent leaks).

the other, we have to jump through hoops to get values required for
arguments to other API calls, and in doing so, undo the "guarantees" we
get from using unique_ptr in the first place.

The guarantee that unique_ptr gives you is that the memory it points to will get freed precisely once.

I think a large part of the confusion comes from two conflicting ideas:

  1. Things that create modules return a unique_ptr, which indicates an intention to make it difficult to have multiple copies of a module "pointer”.
  1. However, the LLVM API has several methods that return or require bare Module* pointers.

The result is that you pretty much have to “smash” the unique_ptr to get a raw pointer out to get any work done, which is potentially dangerous.

In my case, I can create a Module in one of two ways:

a) By reading in bitcode:

ErrorOr<std::unique_ptrllvm::Module > eom =
parseBitcodeFile(mb->getMemBufferRef(), context);

if (eom) {
m_Module = std::move(eom.get());

b) By newing one:

std::unique_ptr upm(new Module(name, context));

m_Module = std::move(upm);

If I want to support a) then I am forced to store a unique_ptr in my own code.

But then if I want to pass a Module* into some other LLVM API, I need to:

Module &getModule() const { return *m_Module; }

m_Func = Function::Create(as_type, Function::ExternalLinkage, name, &getModule());

All of this seems quite stilted. On the one hand, we have a strong effort to ensure that there are no dangling pointers to Modules, but on the other, we have to jump through hoops to get values required for arguments to other API calls, and in doing so, undo the “guarantees” we get from using unique_ptr in the first place.

Or am I totally missing the point? Is there a better way for me to organize this code?

IMO, you’re expecting from unique_ptr something that it is not intended to, i.e. it is not expressing “an intention to make it difficult to have multiple copies”. It is just here to handle the ownership responsibility.

This does not seem conflicting with API that takes raw pointers to me: raw pointers don’t imply ownership transfer.
(That said I prefer reference in APIs to make explicit what is/isn’t optional, but that’s an orthogonal issue).