Get class template cursor from class template instantiation cursor

Hey,

I am traversing a translation unit with the Python bindings to libclang.
When I encounter a CXX_BASE_SPECIFIER, I want to know which class
derives from which.
I get the base class from the USR of the referenced cursor.
I can get the derived class using cursor.lexical_parent.

This is fine, but when the base class is a template instantiation, I do
not want to get the USR of the specific instantiation. Instead I want
the USR of the class template.

How can I get the class template cursor, when I have the class template
instantiation cursor?

Thanks!
Nathan

Hi all,

I've noticed that fixed-size arrays are not always preserved in LLVM assembly. This happens in the case of procedure argument arrays. While an alloca declares the size of fixed-size local arrays (for instance), argument arrays are always accessed as pointers (with getelementptr, which is not intuitive for an LLVM n00b).

While C (and other languages that share this approach) does not impose any bounds checking, maintaining the sizes may be meaningful for certain targets, e.g. non-programmable hardware.

1) Do the in-memory data structures that represent LLVM IR maintain argument array sizes? If so, how is this information accessed? Or is it lost in translation?

2) If this is not the case, can the LLVM IR (in principle) cope with array arguments without converting them to pointers?

Thank you in advance.

Best regards
Nikolaos Kavvadias

I've noticed that fixed-size arrays are not always preserved in LLVM assembly. This happens in the case of procedure argument arrays.

This is just how C works. Arguments and parameters of array type decay to pointers; they are not passed by copying the array.
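
To make the decay concrete, here is a minimal C11 sketch. The names `IS_INT_PTR`, `takes_array` and `demo_decay` are made up for illustration. Inside the callee, a parameter written as `int a[8]` already has type `int *`, so the bound 8 is not visible there, while the caller still sees two distinct array types.

```c
#include <assert.h>

/* Hypothetical helper macro: 1 if the expression has type int*, else 0. */
#define IS_INT_PTR(x) _Generic((x), int *: 1, default: 0)

/* Written as an array of 8, but the parameter type is adjusted to int*. */
static int takes_array(int a[8]) {
    return IS_INT_PTR(a);
}

/* Passes arrays of two different lengths; both arrive as a plain int*. */
static int demo_decay(void) {
    int exact[8] = {0};
    int large[16] = {0};
    /* In the caller the two arrays still have different sizes. */
    assert(sizeof exact != sizeof large);
    return takes_array(exact) && takes_array(large);
}
```

Calling `demo_decay()` returns 1, since both arguments reach `takes_array` as `int *`.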

While an alloca declares the size of fixed-size local arrays (for instance), argument arrays are always accessed as pointers (with getelementptr, which is not intuitive for an LLVM n00b).

Sounds like you ought to learn about getelementptr, then.

While C (and other languages that share this approach) does not impose any bounds checking, maintaining the sizes may be meaningful for certain targets, e.g. non-programmable hardware.

There is no C-level language requirement at all that the argument actually be an array of that size or greater.

It's vaguely possible that this information might be preserved in clang's DWARF output, but I don't think so.

2) If this is not the case, can the LLVM IR (in principle) cope with array arguments without converting them to pointers?

In principle, yes. In practice, I don't know if the calling-convention lowering code is likely to behave well in the presence of a first-class array argument.

John.

Hi,

While C (and other languages that share this approach) does not impose any
bounds checking, maintaining the sizes may be meaningful for certain
targets, e.g. non-programmable hardware.

There is no C-level language requirement at all that the argument actually
be an array of that size or greater.

I'm not a language lawyer, but just as a trifling technicality, I gather that with the
construct

void f(int m,int array[static m])

"the value of the corresponding actual argument shall provide access to
the first element of an array with at least as many elements as specified by
the size expression". So you might get user code using this construct, in which case
the compiler is entitled to expect a minimum size. But that's really quite a
weak requirement, and it's also the wrong way round ("at least as big as"
rather than "not bigger than").
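
As a rough illustration of that construct, here is a small C99 sketch. The function names `sum_first` and `demo_static_bound` are hypothetical. It only shows the "at least m elements" promise being satisfied by a larger array; it says nothing about what a particular compiler does with the information.

```c
#include <assert.h>

/* C99: `static m` is a promise by the caller that at least m elements
   are accessible. Fewer is undefined behaviour; more is allowed. */
static int sum_first(int m, int array[static m]) {
    int total = 0;
    assert(m >= 0);
    for (int i = 0; i < m; ++i) {
        total += array[i];
    }
    return total;
}

/* data has 6 elements, which satisfies the "at least 4" promise. */
static int demo_static_bound(void) {
    int data[6] = {1, 2, 3, 4, 5, 6};
    return sum_first(4, data);
}
```

Here `demo_static_bound()` sums only the first four elements, even though the array passed in is larger.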

Cheers,
Dave

Hi John, David,

I'm not a language lawyer, but just as a trifling technicality I gather the
construct
void f(int m,int array[static m])

I was thinking that, for certain targets, knowledge of the static array size allows for interesting optimizations, e.g. parallelized forms for accessing the interface array (ranging from single-element to simultaneous access of all elements, as a partial or full unroller would permit). If the size information known at compile time cannot propagate into the IR, then this opportunity is lost.

David, your approach offers a way to circumvent the problem from a software view (and spilling to a large stack/heap). I guess it will work with C99. It might also be directly usable in a JIT setting; in general, however, the m argument (size) will not be known at compile time.

For a hardware target, an LLVM procedure (CDFG) would be implemented as a control+datapath block, by a high-level synthesis process that maps it to a model of computation such as an FSMD (Finite-State Machine with Datapath).

then "the value of the corresponding actual argument shall provide access to
the first element of an array with at least as many elements as specified by
the size expression". So you might get user code using this construct whence
the compiler is entitled to expect a minimum size.

I agree and will put it to the test to check how it is mapped to the IR.

But that's really quite a
weak requirement, and it's also the wrong way round ("at least as big as"
rather than "not bigger than").

Yes, the hard requirement for the case I'm thinking of is indeed "not bigger than": a static limit that should not be surpassed. The memory model I assume is physically segmented: each array is mapped into physically separate storage, and by default the array layout is not further optimized. For safe accesses, scratch storage or spilling to slower memory (external memory if using an FPGA-based system) would be needed, but either option has its disadvantages. Defining a default static size for bounds checking would also be arbitrary (it would usually be too large or too small).
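
One way to keep an exact bound visible at the C level, sketched here as an assumption rather than anything proposed in this thread, is to pass a pointer to the whole array, so that the element count is part of the parameter's type. The names `BUF_LEN`, `fill_buffer` and `demo_exact_bound` are hypothetical, and the sketch does not show how a front end carries the bound into the IR.

```c
#include <assert.h>
#include <stddef.h>

enum { BUF_LEN = 8 };  /* hypothetical fixed size of the interface array */

/* `buf` points to an array of exactly BUF_LEN ints, so sizeof *buf
   recovers the element count inside the callee. */
static size_t fill_buffer(int (*buf)[BUF_LEN], int value) {
    size_t n = sizeof *buf / sizeof (*buf)[0];
    for (size_t i = 0; i < n; ++i) {
        (*buf)[i] = value;
    }
    return n;
}

/* &storage has type int (*)[BUF_LEN]; passing the address of an array of
   a different length would be a constraint violation that compilers diagnose. */
static int demo_exact_bound(void) {
    int storage[BUF_LEN];
    size_t n = fill_buffer(&storage, 7);
    assert(n == (size_t)BUF_LEN);
    return storage[0] == 7 && storage[BUF_LEN - 1] == 7;
}
```

Unlike the `static` form, this expresses "exactly this many elements" rather than "at least this many", at the cost of a less convenient call syntax.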

Best regards
Nikolaos Kavvadias

Hi,

David, your approach offers a way to circumvent the problem from a
software view (and spilling to a large stack/heap). I guess it will
work with C99. It might also be directly usable in a JIT setting,
however, in general, the m argument (size) will not be known at
compile time.

Note that I was just clarifying John's point about array sizes not being in any way
known in functions taking arrays, pointing out one obscure bit of syntax that means
C does in theory provide a way to express a constraint on arrays. However,
I believe that _all_ it does is mean that if the size isn't met by the actual arguments
then you're in the realm of undefined behaviour. In particular, it's about allowing
a compiler to do things it couldn't validly otherwise do (loading elements out-of-order, etc)
so that a conservative compiler can choose to just ignore it. I'd certainly be hesitant about
relying on anything beyond those semantics.

then "the value of the corresponding actual argument shall provide access to
the first element of an array with at least as many elements as specified by
the size expression".

That quote is from 6.7.5.3 of the C99 standard, and as far as I can see that's
_all_ the compiler is allowed to infer from it.

Cheers,
Dave

Point taken; I usually forget about the C99 array-parameter qualifiers.

John.