From: "Joerg Sonnenberger" <joerg@britannica.bec.de>
To: llvmdev@cs.uiuc.edu
Sent: Monday, October 20, 2014 5:13:03 AM
Subject: Re: [LLVMdev] RFC: Are we ready to completely move away from the optionality of a DataLayout?
> I think that, generally speaking, this does not make sense. You could
> imagine linking together two modules where one data layout was a
> "subset" of the other (one is missing details of the vector types, for
> example, in a module that used no vector types), but even that seems
> tenuous.
I think linking modules with different vector types makes perfect sense.
Consider a larger program that includes optimised SSE2 vs AVX routines,
switching between them at run time.
Yes, we certainly want to support that. But even in that case, we'd use the same DataLayout for both modules (just specify different target cpus for the functions in the different modules).
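For example, two functions in one module can share a single DataLayout while being compiled for different CPUs via per-function attributes (an illustrative sketch; the layout string and CPU names are placeholders, not taken from any particular target):

```llvm
; One module-level layout shared by both functions.
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

; SSE2 variant, selected at run time by the caller.
define void @kernel_sse2(<4 x float>* %p) #0 { ret void }
; AVX variant of the same routine.
define void @kernel_avx(<8 x float>* %p) #1 { ret void }

attributes #0 = { "target-cpu"="core2" "target-features"="+sse2" }
attributes #1 = { "target-cpu"="corei7-avx" "target-features"="+avx" }
```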
> I'm biased towards making DataLayout mandatory, but it does break
> legitimate use cases. Target-independent bitcode is not in the best
> shape, but this change would kill it off entirely, so we had better
> make sure the maintenance is causing enough pain to justify the change.
Target-independent bitcode exists in the form of things like SPIR and
PNaCl. These all have a DataLayout. The IR already implicitly depends on
some of these things (e.g. pointer size), making it explicit doesn't break
things.
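For instance, even a "target-independent" module in the PNaCl style pins down pointer size and endianness in its layout string (an illustrative, PNaCl-like 32-bit example; the exact string is not quoted from any shipping toolchain):

```llvm
; The layout string makes the implicit assumptions explicit:
; little-endian, 32-bit pointers, 64-bit-aligned i64.
target datalayout = "e-p:32:32-i64:64"
target triple = "le32-unknown-nacl"
```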
Correct.
+1 on Chandler's proposal from PNaCl's perspective.
I agree. LLVMContext is about threading, not about datalayout. It is perfectly reasonable to use the same LLVMContext to hold IR for two different LLVM modules with different datalayout strings.
Agreed. The DataLayout should move (back) to the TargetMachine and live
there (I'm doing that part right now). I don't particularly want to put it
on the module because of (admittedly pie in the sky) plans of being able to
compile a module with two target machines at the same time.
Ha. And yet currently some of them are dependent upon subtarget
features. I'm separating them out, but making them "ARM" or "X86" or
what have you seems to be the best route.
Can you elaborate on your goals and what problem you are trying to solve? As Chandler points out, DataLayout is part of module for a reason.
Which is an interesting point - it's not really. (This was also going
to be part of my talk next week, but since it's been brought up...)
So the storage for DataLayout right now is on a per-subtarget basis.
That is, if you don't construct one in the module, the backend will make
one up based on information in the subtarget (the particular subtarget,
ABI, etc.). I've been pulling that apart so that it's at least on a
per-TargetMachine basis, so that you'll get one based on the ABI and the
OS, neither of which is going to change between subtargets. (I don't
really see the point in dealing with ABI changes between functions at
the moment, though it's possible.)
So, after I'm done, we remove all of the subtarget dependencies from the
DataLayout and we can put it on the TargetMachine. This would mean that
it would be effectively module-wide.
So on to that Pie in the Sky idea:
Be able to compile, in a single module, code for multiple targets at
once. The idea is an accelerator of some kind on the chip: you store
pre-compiled versions of code for different GPUs, etc. (things with a
different architecture), which can then be dispatched to by grabbing the
code out of the object file and sending it over an I/O connection.
It's pie in the sky and not very important right now. Having it at the
module level isn't an issue at the moment, or even for the foreseeable
future - and it won't be difficult to move off of later if necessary.
It's just that if we were going to store it somewhere, the TargetMachine
seemed to be the best place, because that's how the backend will
construct one if it's not supplied anyway.
From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu] On
Behalf Of Chris Lattner
>> Hi Eric,
>>
>> Can you elaborate on your goals and what problem you are trying to
solve? As Chandler points out, DataLayout is part of module for a reason.
>
> Which is an interesting point - it's not really. (This was also going
> to be part of my talk next week, but since it's been brought up...)
>
> So the storage for DataLayout right now is on a per-subtarget basis.
> I.e. if you don't construct one in the module the backend will make
> one up based on information in the subtarget (everything from
I think this is what Chandler is proposing to fix: every module will have
a DataLayout string.
-Chris
I am not sure exactly where Eric is headed, but I can imagine that it would
be cool to be able to compile most functions for the CPU and some for the
GPU out of the same Module. Attaching a single DataLayout to the Module
would seem to require that these guys both use the same DataLayout; is that
really how things work in the real world? I'm out of my depth technically
here, but I feel compelled to raise the question even if it's only to
address my own ignorance. I can't say necessarily whether my employer
would care but I sure don't want to close out any intriguing options.
--paulr
Right. I was figuring there'd be some way of having the backend specify
it if there isn't one in the module IR - or a way of calling into the
backend to generate one, since memorizing all of the possible target
layouts for all of the targets would probably be a pain. These bits
would probably hang off of the TargetMachine right now. It'd make moving
the DataLayout string onto the TargetMachine easier later if we decide
to do that.
Yes and no. In the general case, no. In the specific case where you want to do this kind of outlining, more or less. At least, you want to make sure that they use the same struct layouts and pointer sizes, because to get any benefit from the accelerator, you don't want to have to do complex operations when handing data off to it (and you do want it to be able to access your memory directly). In these cases, you want to ignore any ABI that the accelerator might have 'natively' and force it to use the host's data layout (it keeps its own internal calling conventions, but those aren't part of the data layout. Unfortunately, calling conventions *are* something that we don't sanely abstract in the IR, which makes this kind of thing difficult).
Is it safe to copy a function's contents out of an ObjectImage?
I made something like:
void (*fptr)() = nullptr;
llvm::object::symbol_iterator i = obj.begin_symbols();
llvm::object::SymbolRef::Type type;
if (!i->getType(type) && type == llvm::object::SymbolRef::ST_Function)
{
    llvm::StringRef name;
    uint64_t addr = 0, size = 0;
    if (!i->getName(name) && !i->getAddress(addr) && !i->getSize(size))
    {
        // Copy the function body out of the object image into fresh RWX memory.
        llvm::sys::MemoryBlock mem = llvm::sys::Memory::AllocateRWX(size, nullptr);
        memcpy(mem.base(), reinterpret_cast<void *>(addr), size);
        fptr = reinterpret_cast<void (*)()>(mem.base());
    }
}
...
My first experiments showed that it works, but I'd like to know whether
it could have any side effects.
My goal is to delete a finalized module and keep just the copied
function (to decrease memory use).
> My first experiments showed that it works, but I'd like to know whether
> it could have any side effects.
I'd be very worried about doing that. For an isolated function with no
external references, and referenced by nothing externally, you might
get away with it (make sure it's compiled PIC!). But outside those
bounds, all kinds of things could go wrong:
+ Global variables going away when the memory is freed
+ GOT & PLT entries going away when the memory is freed.
+ External functions calling freed memory or getting unequal
function pointers (at best).
> My goal is to delete a finalized module and keep just the copied
> function (to decrease memory use).
I *think* you should be able to just keep the output around and delete
the input Module & associated compile-time data. That's a large part
of what the various MemoryManagers are there to handle, I think.
I think when doing this kind of accelerator stuff, it makes sense to always
use packed structs. Then you don't need to worry about LLVM's data layout,
you just always have something explicit that means the same thing to the
accelerator and the host.
*nod* I'm mostly thinking about calling conventions and data types not
supported by the host (half float, etc). I'm not even worried about
each thing calling the other, just having precompiled things sit in
the binary.
I think I see what you’re saying: you’re saying that it doesn’t make sense for each target to know the DL string *and* to require each frontend to know the DL string for each target it supports.
If that’s the problem you’re trying to solve, the approach I would take is to have Clang (and other frontends) query the TargetMachine directly when it is setting up the module. This will give you the layering that you’re looking for, and avoid duplicating the magic strings.
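A rough, non-runnable sketch of that flow (it assumes LLVM's C++ API; the exact names such as `createTargetMachine` and `createDataLayout` vary between LLVM versions, and `TripleStr` and the Module `M` are placeholders that the frontend would already have):

```cpp
// Sketch: the frontend asks the backend for the layout instead of
// duplicating the magic string per target.
std::string Err;
const llvm::Target *T = llvm::TargetRegistry::lookupTarget(TripleStr, Err);
std::unique_ptr<llvm::TargetMachine> TM(T->createTargetMachine(
    TripleStr, /*CPU=*/"generic", /*Features=*/"", llvm::TargetOptions(),
    /*RelocModel=*/llvm::None));
// The TargetMachine is the single source of truth for the DataLayout.
M->setDataLayout(TM->createDataLayout());
```

This keeps the layering Chris describes: only the backend knows the string, and every frontend gets it by construction.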