Comparison of 2 schemes to implement OpenMP 5.0 declare mapper codegen

Hi Lingda,

may I ask to start discussions about important decisions related to Clang's OpenMP support on the public mailing list instead of having private conversations? That would help to get feedback from people not being part of the selected circle participating in the "OpenMP / HPC in Clang / LLVM Multi-company Telecom".

Thanks,
Jonas

Hi Jonas,

Sure, we are trying to do so. The public lists often reject my emails because it is large and I cannot fill all people in that mailing list here, though.

Thanks,
Lingda Li

I believe the key question here is whether it is true that (the
overhead of push_back() > the overhead of precalculating the total
number + the memory allocation overhead + directly memory write).

Not familiar with this area, nevertheless given

I believe the key question here is whether it is true that (the
overhead of push_back() > the overhead of precalculating the total
number + the memory allocation overhead + directly memory write).

What we appear to be looking at is something very close to vectors of vectors in using position pointers and indexes as against a separate key index as in the case of std::map.

Given that we have vectors of vectors (nested mappers), the nested vector, the vector under the parent vector would almost necessarily need to be reached via a pointer in the parent vector. The reductio ad absurdum would be if we have a nested dynamic vector or a resizable vector, it would be impossible to use pointer arithmetic on the parent if the parent object could change size in containing the dynamic nested vector.

The result is that the size of the parent object would be fixed in using a pointer to the nested dynamic vector. Hence the ability to allocate these vectors at compile time depends on knowing the lengths of the vectors in descending order, parent first. When the length of a child vector is not known, that vector and its children would need to be allocated at run time.

I believe the key question here is whether it is true that (the
overhead of push_back() > the overhead of precalculating the total
number + the memory allocation overhead + directly memory write).

There are two inefficiencies associated with dynamic vectors. That is, the question appears to be whether or not we know the vector lengths at compile time and hence if we do not have the lengths, use dynamic vectors.

Dynamic vectors require a reserve area that can receive push_back objects. Some of that empty memory area will be wasted as it will likely never be used and if you do not have some reasonable idea of how long the eventual vector will likely be, you can waste a large amount of memory.

And if the initial estimate of the vector length is too small, reallocating memory and copying to a larger vector when the first vector is used up takes time.

A deque may be a better choice than a dynamic vector. You can still use position indexes but not pointer arithmetic and the deque would automatically grow in chunks and not need to reallocate.

I suggest that the downsides of dynamic vectors prefers allocation in the compiler when the lengths can be known.

Neil Nelson

Hi Jonas,

Sure, we are trying to do so. The public lists often reject my emails because it is large and I cannot fill all people in that mailing list here, though.

That shouldn't be necessary. Everyone on that list should be subscribed to openmp-dev. That's where technical discussions about libomptarget should be held. If there are particular people of interest, then cc them, but there's no need to directly cc everyone who might be on the biweekly call.

Thanks again,

Hal

Thanks,
Lingda Li

Hi Lingda,

may I ask to start discussions about important decisions related to
Clang's OpenMP support on the public mailing list instead of having
private conversations? That would help to get feedback from people not
being part of the selected circle participating in the "OpenMP / HPC in
Clang / LLVM Multi-company Telecom".

Thanks,
Jonas