llvm.memcpy for struct copy

ma_jun · January 30, 2018, 7:12am

Hi all
I’m new here, and I have some questions about the llvm.memcpy intrinsic.

Why does the llvm.memcpy intrinsic only support i8* for its first two arguments? Will clang also transform a struct copy into llvm.memcpy? What does the resulting IR look like?
Thanks !

Regards

Jun

topperc · January 30, 2018, 7:24am

The i8 type in the pointers doesn’t matter a whole lot. There’s a long-term plan to remove the type from all pointers in LLVM IR.

Yes, clang will use memcpy for struct copies. You can see example IR here: https://godbolt.org/g/8gQ18m. You’ll see that the struct pointers are bitcast to i8* before the call.

Hongbin_Zheng · January 30, 2018, 7:25am

hi

ma_jun · January 30, 2018, 7:36am

Hi
Thanks !
so for this example
void foo(X &src, X &dst) {
dst = src;
}
and the IR:

define void @foo(X&, X&)(%struct.X* dereferenceable(8), %struct.X* dereferenceable(8)) #0 {
%3 = alloca %struct.X*, align 8
%4 = alloca %struct.X*, align 8
store %struct.X* %0, %struct.X** %3, align 8
store %struct.X* %1, %struct.X** %4, align 8
%5 = load %struct.X*, %struct.X** %3, align 8
%6 = load %struct.X*, %struct.X** %4, align 8
%7 = bitcast %struct.X* %6 to i8*
%8 = bitcast %struct.X* %5 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %7, i8* align 4 %8, i64 8, i1 false)
ret void
}

how can I transform the llvm.memcpy into data move loop IR and eliminate the bitcast instruction ?

Regards
Jun

ma_jun · January 30, 2018, 7:45am

Hi

topperc · January 30, 2018, 8:11am

The pointers must always be i8*. The alignment is independent of the pointer type and is controlled by the attributes on the arguments in the call to memcpy.

ma_jun · January 30, 2018, 8:22am

Hi Craig
Thank you very much!

kuhar · January 31, 2018, 5:36pm

Hi Ma,

how can I transform the llvm.memcpy into data move loop IR and eliminate the bitcast instruction ?

I’m not sure why you are concerned about memcpy and bitcasts, but if you call MCpyInst->getSource() and MCpyInst->getDest() it will look through casts and give you the ‘true’ source/destination.

If you want to get rid of memcpy altogether, you can take a look at this pass: https://github.com/seahorn/seahorn/blob/master/lib/Transforms/Scalar/PromoteMemcpy.cc .
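As a rough source-level illustration of the two forms being discussed (not the output of any LLVM pass), the sketch below contrasts a single block copy with a byte-by-byte "data move loop". The struct `X` here is an assumption: the thread only shows that it is 8 bytes with 4-byte alignment, so two 32-bit integers are used as a stand-in, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed stand-in for the thread's %struct.X (8 bytes, align 4).
struct X {
    std::int32_t a;
    std::int32_t b;
};
static_assert(sizeof(X) == 8, "sketch assumes an 8-byte struct");

// What the memcpy intrinsic expresses: one opaque block copy.
inline void copy_via_memcpy(X &dst, const X &src) {
    std::memcpy(&dst, &src, sizeof(X));
}

// Roughly what a loop expansion does: move the same bytes one at a time.
inline void copy_via_loop(X &dst, const X &src) {
    auto *d = reinterpret_cast<unsigned char *>(&dst);
    const auto *s = reinterpret_cast<const unsigned char *>(&src);
    for (std::size_t i = 0; i < sizeof(X); ++i) {
        d[i] = s[i];
    }
}
```

Both functions leave `dst` byte-for-byte identical to `src`; they differ only in how much structure a later optimisation can see.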

Best,
Kuba

ma_jun · February 1, 2018, 5:40am

Hi Jakub
thanks, I saw the pass with code:

David_Chisnall3 · February 1, 2018, 10:03am

There are at least four different places in LLVM where memcpy intrinsics are expanded to either sequences of instructions or calls:

- InstCombine does it for very small memcpys (with a broken heuristic).

- PromoteMemCpy does it mostly to expose other optimisation opportunities.

- SelectionDAG does it (though in a pretty terrible way, because it can’t create new basic blocks and so can’t emit small loops)

- Some back ends do it in cooperation with SelectionDAG to provide their own implementation.

Whether you want a memcpy intrinsic or a sequence of loads and stores depends a little bit on what optimisation you’re doing next - some work better treating individual fields separately, some prefer to have a blob of memory that they can treat as a single entity.

It’s also worth noting that LLVM’s handling of padding in structure fields is particularly bad. LLVM IR has two kinds of struct: packed and non-packed. The documentation doesn’t make it clear whether non-packed structs have padding at the end (and clang assumes that they don’t, some of the time). Non-packed structs do have padding in between fields for alignment.

When lowering from C (or a language needing to support a C ABI), you sometimes end up with padding fields inserted by the front end. Optimisers have no way of distinguishing these fields from non-padding fields, and so we only get rid of them if SROA extracts them and finds that they have no side-effect-free consumers. In contrast, the padding between fields in non-packed structs disappears as soon as SROA runs. This can lead to violations of C semantics, where padding fields should not change (because C defines bitwise comparisons on structs using memcmp). It can also lead to subtly different behaviour in C code depending on the target ABI (we’ve seen cases where trailing padding is copied in one ABI but not in another, depending solely on pointer size).
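To make the memcmp concern concrete, here is a minimal sketch that works on raw byte images of a struct so that the padding bytes have well-defined values. The struct `S`, the fill values, and all function names are hypothetical and chosen only for illustration; this models the effect of a field-wise copy, not what SROA itself emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical struct; most ABIs insert padding after `tag`.
struct S {
    char tag;
    int value;
};

// Bytes of S that are not covered by either field.
constexpr std::size_t padding_bytes() {
    return sizeof(S) - sizeof(char) - sizeof(int);
}

// Build a byte image of S: every byte is `fill`, then the fields are written.
inline void make_image(unsigned char *img, unsigned char fill, char tag, int value) {
    std::memset(img, fill, sizeof(S));
    std::memcpy(img + offsetof(S, tag), &tag, sizeof tag);
    std::memcpy(img + offsetof(S, value), &value, sizeof value);
}

// Blob copy: padding bytes travel with the fields.
inline void blob_copy(unsigned char *dst, const unsigned char *src) {
    std::memcpy(dst, src, sizeof(S));
}

// Field-wise copy: only bytes covered by fields are written,
// so whatever was in the destination's padding stays there.
inline void field_copy(unsigned char *dst, const unsigned char *src) {
    std::memcpy(dst + offsetof(S, tag), src + offsetof(S, tag), sizeof(char));
    std::memcpy(dst + offsetof(S, value), src + offsetof(S, value), sizeof(int));
}

// The bitwise comparison that C code may perform on structs.
inline bool same_bytes(const unsigned char *a, const unsigned char *b) {
    return std::memcmp(a, b, sizeof(S)) == 0;
}
```

If the source image is filled with 0x00 and the destination with 0xFF, `blob_copy` always yields images that compare equal, while `field_copy` yields equal images only when `padding_bytes()` is zero.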

David

ma_jun · February 1, 2018, 1:25pm

Hi David
Thanks a lot, that makes it much clearer!

Regards
Jun

Eli_Friedman · February 1, 2018, 6:39pm

The IR type of an alloca isn't supposed to affect the semantics; it's just a sizeof(type) block of bytes. We haven't always gotten this right in the past, but it should work correctly on trunk, as far as I know. If you have an IR testcase where this still doesn't work correctly, please file a bug.

-Eli

David_Chisnall3 · February 2, 2018, 10:59am

It’s not an IR test case. We have a C struct that is {void*, int}. On a system with 8-byte pointers, this becomes an LLVM struct { i8*, i32 }. On a system with 16-byte pointers, clang lowers it to { i8*, i32, [12 x i8] }. From the perspective of SROA, the [12 x i8] is a real field. When a function is called with the struct, it is lowered to taking an explicit [12 x i8] argument, whereas the other version takes only i8* and i32 in registers. This means that if the callee writes the data out to memory and then performs a memcmp, the 8-byte-pointer version may not have the same padding, whereas the 16-byte-pointer version will.
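The 12 in [12 x i8] follows from simple layout arithmetic, sketched below. The model assumes that the pointer's alignment equals its size and that the int is placed directly after the pointer; the names `PtrInt`, `model_trailing_padding`, and `host_trailing_padding` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// The C struct from the post: a pointer followed by an int.
struct PtrInt {
    void *p;
    int i;
};

// Modelled trailing padding of a {pointer, int} struct, assuming the struct
// is aligned to ptr_size and the int immediately follows the pointer.
constexpr std::size_t model_trailing_padding(std::size_t ptr_size, std::size_t int_size) {
    const std::size_t end_of_fields = ptr_size + int_size;
    // Round the struct size up to a multiple of its alignment.
    const std::size_t rounded = (end_of_fields + ptr_size - 1) / ptr_size * ptr_size;
    return rounded - end_of_fields;
}

// Trailing padding actually used by the compiler building this file.
constexpr std::size_t host_trailing_padding() {
    return sizeof(PtrInt) - (offsetof(PtrInt, i) + sizeof(int));
}
```

Under these assumptions, 16-byte pointers with a 4-byte int give 32 - 20 = 12 bytes of trailing padding, and 8-byte pointers give 16 - 12 = 4 bytes. In the first case the front end spells the padding out as a field; in the second it is implicit in the struct's size.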

In the code that we were using (the DukTape JavaScript interpreter), the callee didn’t actually look at the padding bytes in either case, so we just ended up with less efficient code in the 16-byte-pointer case, but the same could equally have generated incorrect code for the 8-byte-pointer case.

David

Hongbin_Zheng · February 2, 2018, 9:27pm

I wonder whether it is possible to explicitly mark the padding bytes, so that later optimizations know which bytes are padding and can optimize accordingly.

Thanks
Hongbin
