template instantiation, typedef used as template argument and dependent type for data members.

Hi,

In the context of ROOT (<http://root.cern.ch>), its automated I/O of C++
objects, and of the cling Interpreter (<http://root.cern.ch/drupal/content/cling>),
we are strongly hindered by the fact that template instantiations that have
a typedef as template parameter 'forget' about it in the AST (at least
as far as the data members and member functions of the template instance
are concerned).

In ROOT I/O we use two typedefs (Float16_t and Double32_t) to indicate
that, even though the data member should have full precision while in
memory and while involved in computation, it should be stored
with lower precision.

When using for example:

    // Start of file templateAndTypedef.C
    //
    // This typedef indicates that the user would
    // like to store the data on disk with only
    // 32 bits of precision.
    typedef double Double32_t;
    // This typedef indicates that the user would
    // like to store the data on disk with only
    // 16 bits of precision.
    typedef float Float16_t;

    template <typename T>
    class Data
    {
    private:
       Float16_t m_Common;
       T m_UserDetermined;
    };

    class Event {
    private:
       Data<double> m_FullPrecision;
       Data<Double32_t> m_LimitedRange;
    };
    // end of file templateAndTypedef.C

we get:

    clang++ -cc1 -ast-dump templateAndTypedef.C
    ....
    typedef double Double32_t;
    typedef float Float16_t;
    template <typename T = double> class Data {
        class Data;
    private:
        Float16_t m_Common;
        double m_UserDetermined;
    }
    template <typename T> class Data {
        class Data;
    private:
       Float16_t m_Common;
       T m_UserDetermined;
    };
    class Event {
        class Event;
    private:
        Data<double> m_FullPrecision;
        Data<Double32_t> m_LimitedRange;
    };

In particular, this means that when we follow the AST nodes
from the class Event down into the type of m_LimitedRange,
we are brought to 'just' the content of the generic Data<double>
instantiation, where we have lost the
information that 'm_UserDetermined' was 'meant' by the user
to be a Double32_t. In contrast, we still have the information
that the developer meant m_Common to be a Float16_t.

Having access to the 'intended' type description from a template
instantiation is fundamental here for the ROOT I/O mechanism
to work properly (the semantics and on-file format of 'double'
and 'Double32_t' being very different).

Using an alternative to the typedef for this purpose has its
own set of problems as we also need things like
std::vector<Double32_t> and std::vector<double> to be
interchangeable (for example as far as parameter passing
is concerned).

In addition to our use case, having access to this type
of information could be useful for a document generator
or to clarify/simplify some error messages. For example:

    // start of stringError.C
    #include <string>
    #include <vector>
    void test ()
    {
       std::vector<std::string> vec;
       vec.push_back("abc");
       vec.push_back(1.3);
    }
    // end of stringError.C

when compiled with clang, the output currently contains the error message:

    stringError.C:8:18: error: reference to type 'const value_type' (aka 'const std::basic_string<char>') could not bind to an rvalue of type 'double'
    vec.push_back(1.3);
                  ^

which would be even nicer if it read:

    stringError.C:8:18: error: reference to type 'const value_type' (aka 'const std::string') could not bind to an rvalue of type 'double'
    vec.push_back(1.3);
                  ^

where 'std::basic_string<char>' has been replaced by 'std::string'.

As a first approximation, one plausible step could be to generate a distinct
template instantiation for each distinct typedef used
as a template argument, plus a template instantiation for the case
without a typedef as a template argument (i.e. the primary template
instantiation), and then make sure that for the rest of the semantic
analysis vector<Double32_t> and vector<double> are
treated as equivalent/interchangeable/aliases.

Any ideas/thoughts on if and how being able to access the intended
type of a template instantiation's members whose type depends on the
template parameter(s) could be supported in any way?

Thanks,
Philippe.

In the context of ROOT (<http://root.cern.ch>), its automated I/O of C++
objects, and of the cling Interpreter (<http://root.cern.ch/drupal/content/cling>),
we are strongly hindered by the fact that template instantiations that have
a typedef as template parameter ‘forget’ about it in the AST (at least
as far as the data members and member functions of the template instance
are concerned).

[…]

In addition to our use case, having access to this type
of information could be useful for a document generator
or to clarify/simplify some error messages. For example:

    // start of stringError.C
    #include <string>
    #include <vector>
    void test ()
    {
       std::vector<std::string> vec;
       vec.push_back("abc");
       vec.push_back(1.3);
    }
    // end of stringError.C

when compiled with clang, the output currently contains the error message:

    stringError.C:8:18: error: reference to type 'const value_type' (aka 'const std::basic_string<char>') could not bind to an rvalue of type 'double'
    vec.push_back(1.3);
                  ^

which would be even nicer if it read:

    stringError.C:8:18: error: reference to type 'const value_type' (aka 'const std::string') could not bind to an rvalue of type 'double'
    vec.push_back(1.3);
                  ^

where ‘std::basic_string<char>’ has been replaced by ‘std::string’.

Yes, this is a real problem which is affecting more people than just you. IIRC, there are several bugs in bugzilla with this as the root cause (see for instance llvm.org/PR12853).

As a first approximation, one plausible step could be to generate a distinct
template instantiation for each distinct typedef used
as a template argument, plus a template instantiation for the case
without a typedef as a template argument (i.e. the primary template
instantiation), and then make sure that for the rest of the semantic
analysis vector<Double32_t> and vector<double> are
treated as equivalent/interchangeable/aliases.

I do not think this is the right approach: it could be extremely expensive to perform all the redundant template instantiations for canonically-equivalent-but-not-identical template arguments, and would significantly complicate the AST to have multiple definitions for the same declaration.

Any ideas/thoughts on if and how being able to access the intended
type of a template instantiation’s members whose type depends on the
template parameter(s) could be supported in any way?

I think we should address this by adding type sugar for references to instantiated entities. For instance, in this case:

    template<typename U> struct T { template<typename V> decltype(U()+V()) get(V); };
    T<IntTypedef> x;
    FloatTypedef d;
    int a = x.get(d);

… we can track that the type of ‘get’ is equivalent to substituting U=IntTypedef, V=FloatTypedef into the type in the primary template, and then instantiate (and cache) the type of that member. That should be significantly cheaper (and simpler) than instantiating the declaration, and would provide the same benefits.