GSoC project ideas

Thank you for the follow-up, that's good news. I will do my best on the proposal (and will probably ask some questions in the process).

Based on Douglas Gregor's comment, I can rule out working on Modules or a Doxygen-like tool.

Improving libclang still seems to be a good project. Working on cpp11-migrate interests me a bit more than improving libclang, though. I'm not sure whether I should write two proposals, in case the cpp11-migrate one is not accepted for reasons independent of me.

I'm not sure if you were expecting a response about the two-proposals question. I'm afraid I don't know what to do in this case. You can get advice about this from the LLVM GSoC coordinator on the llvm-dev mailing list, unless somebody here pipes up.

Comments below.

From: Guillaume Papin [mailto:guillaume.papin@epitech.eu]
Sent: Sunday, April 28, 2013 9:13 PM
To: Vane, Edwin
Cc: cfe-dev@cs.uiuc.edu
Subject: Re: [cfe-dev] GSoC project ideas

I'm working on the proposal and would like to have your feedback about the
following plan regarding cpp11-migrate.

                                Tasks
                                =====

Table of Contents

1 Transform to replace 'auto_ptr' by 'unique_ptr'
2 Transform for delegating constructors
3 Transform for non-static data member initializers
4 Add support for interactive actions
5 Default transformation profile
6 Integrating LibFormat
7 Transform to make use of existing move constructors
8 Generate a diff of the changes
9 Other incomplete ideas

1 Transform to replace 'auto_ptr' by 'unique_ptr'

   Seems like a good transform to start.

I agree. It's not completely trivial due to semantic differences between auto_ptr and unique_ptr (e.g. no destructive copy in unique_ptr) but should be a good first big project.
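
To illustrate the semantic difference mentioned above, here is a minimal sketch of what migrated code could look like. Copying an auto_ptr silently transfers ownership, while unique_ptr has no copy at all, so the transfer has to be written out with std::move. The names Widget, make_widget, consume and transfer_example are hypothetical, and the auto_ptr originals appear only in comments because auto_ptr was removed in C++17.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Widget {
  int value;
};

// Before (C++03):  std::auto_ptr<Widget> make_widget(int v);
std::unique_ptr<Widget> make_widget(int v) {
  return std::unique_ptr<Widget>(new Widget{v});
}

// Takes ownership of the widget and returns its value.
int consume(std::unique_ptr<Widget> w) { return w->value; }

int transfer_example() {
  std::unique_ptr<Widget> a = make_widget(42);
  // Before:  std::auto_ptr<Widget> b = a;   // the copy silently empties 'a'
  // After:   the ownership transfer has to be spelled out.
  std::unique_ptr<Widget> b = std::move(a);
  assert(a == nullptr);  // 'a' no longer owns the object
  return consume(std::move(b));
}
```

A transform would presumably have to find every such implicit copy and decide whether inserting std::move preserves the intent, which is where the non-triviality comes from.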

2 Transform for delegating constructors

   A transform that can convert code such as:

  struct A
  {
    int x;

    A() : x(0) { }
    A(int _x) : x(_x) { }
  };

   Into:

  struct A
  {
    int x;

    A() : A(0) { } // now use delegation
    A(int _x) : x(_x) { }
  };

   This is a really trivial case here but I expect this transform to
   be non-trivial to implement.

A test for determining if the functionality of one constructor is completely subsumed by another would be really difficult to do. I'm not sure the benefit of a few less lines of code and some improved maintainability is really worth it. There is the common workaround of having constructors call init() functions that might be easier to handle but still, I think there are more useful things to focus on first.

3 Transform for non-static data member initializers

   When one or more constructors initialize a member variable with
   a value independent of the constructor arguments, the
   initialization can be placed in-class.

   This might be beneficial when multiple constructors are duplicating
   member initialization.

   Note that this transform might easily lead to conflicts with the
   previous transform (delegating constructors).

Also questionable implementation/benefit ratio. You'd have to ensure every member variable is initialized the same way by every constructor. If you detect such a case, that would mean removing all the existing initializations and adding the in-class initialization. All that's left is to hope the user didn't mind making some vars initialized by constructors and some by the in-class initializers.

4 Add support for interactive actions

   Some actions might need user interaction.

   Example (maybe not the best one):
   If some replacement code needs to introduce a new variable and
   that the default identifier is already taken then we might want to
   prompt the user for an alternative name.

   Or simply to ask confirmation before a risky replacement.

Definitely something we'd like to add to the migrator but requires some design first. User interactivity should be implemented in such a way that the actual user interface doesn't matter. That way one could write a plugin for an editor/IDE or just have a simple command-line interface. This implies some sort of library interface for cpp11-migrate and cpp11-migrate itself then turns into a library. LibFormat and clang-format have the same relationship. I'm not sure if this much design work is suitable to a GSoC project.

5 Default transformation profile

   Apply a list of transformations by default and allow different
   profiles. By profile I'm talking about an option such as:

     cpp11-migrate -target-profile=[clang-3.2|gcc-4.7|...] ...

   This option will apply to the input files all known safe
   (low-risk/zero-risk) transformations that are supported by the
   given target.

   This could allow incremental migration toward C++11. Let's say the
   project has to support Clang 3.1 at first and later on the
   minimum version switches to 3.2; they can re-run the tool with the
   new profile.

This is kinda cool. It's certainly not much work right now since there are only a handful of transforms. It'd be a slightly nicer way than just saying --all-transforms (if such an option existed) especially for people out there migrating code that's tied to a particular compiler version.

Remembering the discussion about C++11 on llvm-dev a while back, maybe you could even specify a list of compilers to this flag and have the common subset of supported features applied :)
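
The "common subset" idea could be sketched as a set intersection over per-compiler lists of supported transforms. Everything below is an assumption made for illustration: the profile names, the transform labels and the table contents are made up, and nothing here reflects how cpp11-migrate stores or would store this information.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using TransformSet = std::set<std::string>;
using ProfileTable = std::map<std::string, TransformSet>;

// Illustrative data only: profile names and their contents are made up.
ProfileTable example_table() {
  ProfileTable table;
  table["compiler-a"] = {"use-auto", "use-nullptr"};
  table["compiler-b"] = {"loop-convert", "use-auto", "use-nullptr"};
  return table;
}

// Returns the transforms supported by every requested profile.
// An unknown profile contributes an empty set, so the result becomes empty.
TransformSet common_transforms(const ProfileTable& table,
                               const std::vector<std::string>& profiles) {
  TransformSet result;
  bool first = true;
  for (const std::string& name : profiles) {
    const auto it = table.find(name);
    const TransformSet current =
        (it == table.end()) ? TransformSet() : it->second;
    if (first) {
      result = current;
      first = false;
      continue;
    }
    TransformSet both;
    std::set_intersection(result.begin(), result.end(),
                          current.begin(), current.end(),
                          std::inserter(both, both.begin()));
    result = std::move(both);
  }
  return result;
}

void demo_common_transforms() {
  const TransformSet common =
      common_transforms(example_table(), {"compiler-a", "compiler-b"});
  assert(common.size() == 2);
  assert(common.count("loop-convert") == 0);
}
```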

6 Integrating LibFormat

   In order to correctly format inserted code.

Would definitely be nice. The transforms don't do too much to mangle code right now but any that use the TypePrinter to print out types will cause the 'const' to go on the wrong side of the type specifier according to most styles. (i.e. const MyType *A => MyType const *A). I don't think LibFormat handles const locations yet, though, probably for the same reason the transforms are limited in dealing with const qualifiers currently: clang doesn't provide enough TypeLoc info.

7 Transform to make use of existing move constructors

   With move semantics added to the language and the standard library
   being updated accordingly (move constructors added to many types),
   it is now worthwhile to take an argument by value and then move
   it (as opposed to taking it by 'const &' and then copying).

Could be useful. Also in this category would be use of stl_container::emplace() functions. You'll have to be very, very careful about semantics though.
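
As a rough sketch of the pass-by-value idea, the two classes below show the shape of code before and after such a transform. The class names are hypothetical. Note that for an lvalue argument the "after" form costs one copy plus one move instead of a single copy, so the rewrite only pays off when callers often pass temporaries and the type is cheap to move.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Before: the argument is taken by const reference and always copied.
class PersonBefore {
public:
  explicit PersonBefore(const std::string& name) : name_(name) {}
  const std::string& name() const { return name_; }

private:
  std::string name_;
};

// After: the argument is taken by value and moved into the member.
// A temporary passed by the caller is moved all the way, never copied.
class PersonAfter {
public:
  explicit PersonAfter(std::string name) : name_(std::move(name)) {}
  const std::string& name() const { return name_; }

private:
  std::string name_;
};

void demo_pass_by_value() {
  const std::string n = "Ada";
  assert(PersonBefore(n).name() == PersonAfter(n).name());
}
```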

8 Generate a diff of the changes

   Add an option to print a diff of the modifications against the
   original source file.

Could be useful as a kind of 'dry-run' mode where changes are not actually made but one could find out how many and what sort of changes were made.

9 Other incomplete ideas

   If the workload of the previous ideas is not sufficient for the
   GSoC, I'm confident there is more work to do.

   - initializer_list and uniform initialization transforms (use
     cases not identified yet)

Someone once suggested to me looking for:

std::vector<int> A;
A.push_back(a);
A.push_back(b);
...
A.push_back(z);

And replacing with

std::vector<int> A = {a,b,...,z};

I'm not entirely sure this is worth the effort. That is, how often is a vector initialization done this way? I'm not aware of other use cases right now.

   - tr1 replacements. Doing everything might not be possible but at
     least some would be useful such as: unordered_map, smart
     pointers, function<> & bind(), tuple.

This one in particular is high priority. I think pretty much everything in TR1 except the extra math functions is in C++11.

   - fixing existing bugs (I think it's a good way to get around the
     project before starting the GSoC to get acquainted with the
     code)

I agree.

   - and (much) more...

Another option could be looking at additions to STL for C++11 and making changes based on those additions. I mentioned emplace earlier. Another option could be looking for nested calls to std::max or std::min to do an N-wise horizontal max/min op: std::max(std::max(a,b), std::max(c,d)) => std::max({a,b,c,d}); Again, not sure how useful this particular case is. Another suggestion was replacing use of C arrays with std::array. I haven't looked into the implications of this myself though. Yet another option is something done by the remove-cstr tool in clang-tools-extra. C++11 allows you to create std::fstreams with a std::string directly now instead of calling std::string::c_str().
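
The nested std::max rewrite mentioned above can be sketched as follows; the function names are hypothetical. One caveat a transform would have to respect: the initializer-list overload requires all arguments to have the same type and returns by value, whereas the two-argument overload returns a const reference.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

// Before: nested two-argument calls.
int max_nested(int a, int b, int c, int d) {
  return std::max(std::max(a, b), std::max(c, d));
}

// After: the C++11 initializer-list overload.
int max_flat(int a, int b, int c, int d) {
  return std::max({a, b, c, d});
}

void demo_max() {
  assert(max_nested(3, 9, 4, 1) == max_flat(3, 9, 4, 1));
}
```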

I'd rate this as low priority.
There are lots of diffing programs out there. git/svn/hg/whatever your source control system is will do this for you.

Of course, (since it was my idea), I'd like to suggest the "tr1 killer" as a project. ;)
  http://clang.llvm.org/docs/ClangTools.html#ideas-for-new-tools

-- Marshall

Marshall Clow Idio Software <mailto:mclow.lists@gmail.com>

A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait).
        -- Yu Suzuki

I added some comments and wrote a summary of the new plan at the end of the
mail.

"Vane, Edwin" <edwin.vane@intel.com> writes:

Comments below.

From: Guillaume Papin [mailto:guillaume.papin@epitech.eu]
Sent: Sunday, April 28, 2013 9:13 PM
To: Vane, Edwin
Cc: cfe-dev@cs.uiuc.edu
Subject: Re: [cfe-dev] GSoC project ideas

I'm working on the proposal and would like to have your feedback about the
following plan regarding cpp11-migrate.

                                Tasks
                                =====

Table of Contents

1 Transform to replace 'auto_ptr' by 'unique_ptr'
2 Transform for delegating constructors
3 Transform for non-static data member initializers
4 Add support for interactive actions
5 Default transformation profile
6 Integrating LibFormat
7 Transform to make use of existing move constructors
8 Generate a diff of the changes
9 Other incomplete ideas

1 Transform to replace 'auto_ptr' by 'unique_ptr'

   Seems like a good transform to start.

I agree. It's not completely trivial due to semantic differences
between auto_ptr and unique_ptr (e.g. no destructive copy in
unique_ptr) but should be a good first big project.

I had this in mind (non-triviality) as you mentioned it in an earlier mail.

2 Transform for delegating constructors

   A transform that can convert code such as:

  struct A
  {
    int x;

    A() : x(0) { }
    A(int _x) : x(_x) { }
  };

   Into:

  struct A
  {
    int x;

    A() : A(0) { } // now use delegation
    A(int _x) : x(_x) { }
  };

   This is a really trivial case here but I expect this transform to
   be non-trivial to implement.

A test for determining if the functionality of one constructor is
completely subsumed by another would be really difficult to do. I'm
not sure the benefit of a few less lines of code and some improved
maintainability is really worth it. There is the common workaround of
having constructors call init() functions that might be easier to
handle but still, I think there are more useful things to focus on
first.

Okay, I will remove this from the list then.

I was considering handling only constructors with empty bodies (at least for
the one 'delegated') and only simple expressions in initialization (such as
parameters, literals, ...). But it was mostly for aesthetic reasons and some
other transforms might be more beneficial (tr1?).

3 Transform for non-static data member initializers

   When one or more constructors initialize a member variable with
   a value independent of the constructor arguments, the
   initialization can be placed in-class.

   This might be beneficial when multiple constructors are duplicating
   member initialization.

   Note that this transform might easily lead to conflicts with the
   previous transform (delegating constructors).

Also questionable implementation/benefit ratio. You'd have to ensure
every member variable is initialized the same way by every
constructor. If you detect such a case, that would mean removing all
the existing initializations and adding the in-class initialization.
All that's left is to hope the user didn't mind making some vars
initialized by constructors and some by the in-class initializers.

I totally agree. I will remove it from the list.

4 Add support for interactive actions

   Some actions might need user interaction.

   Example (maybe not the best one):
   If some replacement code needs to introduce a new variable and
   that the default identifier is already taken then we might want to
   prompt the user for an alternative name.

   Or simply to ask confirmation before a risky replacement.

Definitely something we'd like to add to the migrator but requires
some design first. User interactivity should be implemented in such a
way that the actual user interface doesn't matter. That way one could
write a plugin for an editor/IDE or just have a simple command-line
interface. This implies some sort of library interface for
cpp11-migrate and cpp11-migrate itself then turns into a library.
LibFormat and clang-format have the same relationship. I'm not sure if
this much design work is suitable to a GSoC project.

I see. Actually this idea was very vague. I will remove it from the list as I
don't think I'm well suited (yet?) to start designing such a library.

5 Default transformation profile

   Apply a list of transformations by default and allow different
   profiles. By profile I'm talking about an option such as:

     cpp11-migrate -target-profile=[clang-3.2|gcc-4.7|...] ...

   This option will apply to the input files all known safe
   (low-risk/zero-risk) transformations that are supported by the
   given target.

   This could allow incremental migration toward C++11. Let's say the
   project has to support Clang 3.1 at first and later on the
   minimum version switches to 3.2; they can re-run the tool with the
   new profile.

This is kinda cool. It's certainly not much work right now since there
are only a handful of transforms. It'd be a slightly nicer way than
just saying --all-transforms (if such an option existed) especially
for people out there migrating code that's tied to a particular
compiler version.

Remembering the discussion about C++11 on llvm-dev a while back, maybe you
could even specify a list of compilers to this flag and have the common subset
of supported features applied :)

I actually had this in mind as well (but maybe an unconscious memory from the
discussion on llvm-dev?).

6 Integrating LibFormat

   In order to correctly format inserted code.

Would definitely be nice. The transforms don't do too much to mangle
code right now but any that use the TypePrinter to print out types
will cause the 'const' to go on the wrong side of the type specifier
according to most styles. (i.e. const MyType *A => MyType const *A). I
don't think LibFormat handles const locations though yet, probably for
the same reason the transforms are limited in dealing with const
qualifiers currently: clang doesn't provide enough TypeLoc info.

Good.

7 Transform to make use of existing move constructors

   With move semantics added to the language and the standard library
   being updated accordingly (move constructors added to many types),
   it is now worthwhile to take an argument by value and then move
   it (as opposed to taking it by 'const &' and then copying).

Could be useful. Also in this category would be use of
stl_container::emplace() functions. You'll have to be very, very careful
about semantics though.

Well, I guess this idea will be a good fit for the second half of the GSoC.

8 Generate a diff of the changes

   Add an option to print a diff of the modifications against the
   original source file.

Could be useful as a kind of 'dry-run' mode where changes are not actually made but one could find out how many and what sort of changes were made.

I will remove this one from the list. It has been pointed out the SCM tools
already provide such functionality quite well. I think for most projects using
cpp11-migrate they will already be under source control management.

I was thinking about users that are curious about the tool (or C++11) who might
want to try cpp11-migrate on a file non-destructively. But an option for the
output file or directory would be easier to implement and as useful. But then,
what if an included file is modified? Is it necessary to reproduce the source
tree structure?

9 Other incomplete ideas

   If the workload of the previous ideas is not sufficient for the
   GSoC, I'm confident there is more work to do.

   - initializer_list and uniform initialization transforms (use
     cases not identified yet)

Someone once suggested to me looking for:

std::vector<int> A;
A.push_back(a);
A.push_back(b);
...
A.push_back(z);

And replacing with

std::vector<int> A = {a,b,...,z};

I'm not entirely sure this is worth the effort. That is, how often is a
vector initialization done this way? I'm not aware of other use cases right
now.

I was thinking about easier cases (more commonly used?) such as:

  struct A
  {
    A(int a, const char *b);

    int a;
    const char *b;
  };

  A bar()
  {
    return A(1, "toto"); // -> return { 1, "toto" };
  }

  // code such as:
  A ary[] = { A(1, "foo"), A(2, "bar"), A(3, "foobar") };
  // becomes:
  A ary[] = { {1, "foo"}, {2, "bar"}, {3, "foobar"} };

  // returning object by calling the constructor
  std::vector<int> foo(bool arg)
  {
    if (!arg)
      return std::vector<int>(); // -> return { };
  
    std::vector<int> results;
    // <fill-in results...>
    return results;
  }

But I think they have a limited usefulness and I don't want to add this to my
proposal.

And I agree, I don't think it's that common to initialize a vector in such a
way. Maybe to initialize some static containers using a factory function
(see: http://stackoverflow.com/questions/3701903/initialisation-of-static-vector).
In this situation it would be good to get rid of the factory function and
initialize the vector directly, but that seems to add a lot of complexity.

   - tr1 replacements. Doing everything might not be possible but at
     least some would be useful such as: unordered_map, smart
     pointers, function<> & bind(), tuple.

This one in particular is high priority. I think pretty much everything in
TR1 except the extra math functions is in C++11.

One thing I'm afraid of with this task is that, to be useful, it requires
implementing all the changes from tr1. If we change an include by dropping
'tr1/', it means we should support the transformation of everything that
#include provides. Maybe it's not risky at all to drop 'tr1/' from the include
directives and the references to the namespace 'tr1::', but I don't know yet.
If I understand correctly, C++11 differs from tr1 only through additions,
mostly to take advantage of the new language features.

Also, I think someone already mentioned this: it would be interesting to find
some open source projects using tr1 to apply the transformation to. I took a
quick look, and it doesn't seem impossible to find some.

   - fixing existing bugs (I think it's a good way to get around the
     project before starting the GSoC to get acquainted with the
     code)

I agree.

   - and (much) more...

Another option could be looking at additions to STL for C++11 and
making changes based on those additions. I mentioned emplace earlier.

I hadn't thought of looking at this but it's a good idea. Functions such as
emplace, as you pointed out, are a perfect example of a transform people might
want to benefit from by using cpp11-migrate.
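
A minimal sketch of what an emplace transform could rewrite is shown below. The function name build_pairs is hypothetical. One semantic trap worth keeping in mind: emplace_back forwards its arguments directly to the constructor, so it can call explicit constructors that push_back would reject, which means the two forms are not always interchangeable.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

std::vector<std::pair<int, std::string>> build_pairs() {
  std::vector<std::pair<int, std::string>> v;
  // Before: a temporary pair is built, then moved into the vector.
  v.push_back(std::pair<int, std::string>(1, "one"));
  // After: the pair is constructed in place from the arguments.
  v.emplace_back(2, "two");
  return v;
}

void demo_emplace() {
  const auto v = build_pairs();
  assert(v.size() == 2);
  assert(v[1].second == "two");
}
```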

Another option could be looking for nested calls to std::max or
std::min to do an N-wise horizontal max/min op:
std::max(std::max(a,b), std::max(c,d)) => std::max({a,b,c,d}); Again,
not sure how useful this particular case is. Another suggestion was
replacing use of C arrays with std::array. I haven't looked into the
implications of this myself though. Yet another option is something
done by the remove-cstr tool in clang-tools-extra. C++11 allows you to
create std::fstreams with a std::string directly now instead of
calling std::string::c_str().

For std::array I'm not sure, I think its usefulness is limited to a small
number of situations.

I like the idea of removing std::string::c_str() calls for std::fstream.
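
For reference, the c_str() removal would look roughly like the sketch below. The function name round_trip and the file name used in the demo are hypothetical; the sketch writes a small file in the current directory and deletes it again.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Writes one line to 'path', reads it back, then deletes the file.
std::string round_trip(const std::string& path, const std::string& text) {
  {
    // Before (C++03):  std::ofstream out(path.c_str());
    std::ofstream out(path);  // C++11 accepts a std::string directly
    out << text << '\n';
  }
  std::string line;
  {
    // Before (C++03):  std::ifstream in(path.c_str());
    std::ifstream in(path);
    std::getline(in, line);
  }
  std::remove(path.c_str());
  return line;
}

void demo_fstream() {
  assert(round_trip("cstr_demo.txt", "hello") == "hello");
}
```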

Also:
- access to vector data can be changed from '&vec[0]'/'&vec.front()' to
  'vec.data()'. I haven't looked at whether something more has to be taken
  care of here.
- already mentioned in the tooling doc: replace the member functions
  begin()/end() with their free-function equivalents.
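
The data() replacement from the first item could look like the sketch below, where sum stands in for a hypothetical C-style API taking a pointer and a length. The rewrite appears safe in this direction: '&v[0]' is undefined behaviour on an empty vector, while data() may be called on an empty vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a C-style API that takes a pointer and a length.
int sum(const int* values, std::size_t count) {
  int total = 0;
  for (std::size_t i = 0; i < count; ++i)
    total += values[i];
  return total;
}

// Before: '&v[0]' must be guarded because it is undefined on an empty vector.
int sum_before(const std::vector<int>& v) {
  return v.empty() ? 0 : sum(&v[0], v.size());
}

// After: data() is valid even when the vector is empty.
int sum_after(const std::vector<int>& v) {
  return sum(v.data(), v.size());
}

void demo_data() {
  const std::vector<int> v = {1, 2, 3};
  assert(sum_before(v) == sum_after(v));
}
```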

To summarize the list of apparently interesting ideas:
- Transform to replace 'auto_ptr' by 'unique_ptr'
- Transform to use free-function std::begin()/std::end()
- Integrating LibFormat
- Default transformations, profiles
- Transform to remove calls to std::string::c_str() when using std::fstream
- Transform to make use of existing move constructors
- Transform to make use of new emplace functions for STL containers
- [maybe] tr1 replacement (need to know more about the implications)
- [maybe] Command line option for output file / output directory
- [maybe] Make use of new std::vector::data() / std::string::data()?

Comments below,

Marshall Clow <mclow.lists@gmail.com> writes:

8 Generate a diff of the changes

  Add an option to print a diff of the modifications against the
  original source file.

Could be useful as a kind of 'dry-run' mode where changes are not actually made but one could find out how many and what sort of changes were made.

I'd rate this as low priority.
There are lots of diffing programs out there. git/svn/hg/whatever your source control system is will do this for you.

Thank you! Your comment has been addressed in my update of the tasks.

Of course, (since it was my idea), I'd like to suggest the "tr1 killer" as a project. ;)
  http://clang.llvm.org/docs/ClangTools.html#ideas-for-new-tools

As explained in my update, I understand the benefits of such a transform but I'm
a bit uncertain about what it implies.

Is it possible to just "drop" every reference to 'tr1', or does the work have to
be done on a case-by-case basis for each addition to the STL?

From: Guillaume Papin [mailto:guillaume.papin@epitech.eu]
Sent: Monday, April 29, 2013 5:25 PM
To: Vane, Edwin
Cc: cfe-dev@cs.uiuc.edu
Subject: Re: [cfe-dev] GSoC project ideas

I added some comments and wrote a summary of the new plan at the end of the
mail.

"Vane, Edwin" <edwin.vane@intel.com> writes:

> Comments below.
>
>> From: Guillaume Papin [mailto:guillaume.papin@epitech.eu]
>> Sent: Sunday, April 28, 2013 9:13 PM
>> To: Vane, Edwin
>> Cc: cfe-dev@cs.uiuc.edu
>> Subject: Re: [cfe-dev] GSoC project ideas
>>
>> I'm working on the proposal and would like to have your feedback
>> about the following plan regarding cpp11-migrate.
>>
>>
>> Tasks
>> =====
>>
>> Table of Contents
>> =================
>> 1 Transform to replace 'auto_ptr' by 'unique_ptr'
>> 2 Transform for delegating constructors
>> 3 Transform for non-static data member initializers
>> 4 Add support for interactive actions
>> 5 Default transformation profile
>> 6 Integrating LibFormat
>> 7 Transform to make use of existing move constructors
>> 8 Generate a diff of the changes
>> 9 Other incomplete ideas
>>
>>
>> 1 Transform to replace 'auto_ptr' by 'unique_ptr'
>> ==================================================
>>
>> Seems like a good transform to start.
>>
> I agree. It's not completely trivial due to semantic differences
> between auto_ptr and unique_ptr (e.g. no destructive copy in
> unique_ptr) but should be a good first big project.
>

I had this in mind (non-triviality) as you mentioned it in an earlier mail.

>> 2 Transform for delegating constructors
>> ========================================
>>
>> A transform that can convert code such as:
>>
>> struct A
>> {
>> int x;
>>
>> A() : x(0) { }
>> A(int _x) : x(_x) { }
>> };
>>
>>
>> Into:
>>
>> struct A
>> {
>> int x;
>>
>> A() : A(0) { } // now use delegation
>> A(int _x) : x(_x) { }
>> };
>>
>>
>> This is a really trivial case here but I expect this transform to
>> be non-trivial to implement.
>>
>
> A test for determining if the functionality of one constructor is
> completely subsumed by another would be really difficult to do. I'm
> not sure the benefit of a few less lines of code and some improved
> maintainability is really worth it. There is the common workaround of
> having constructors call init() functions that might be easier to
> handle but still, I think there are more useful things to focus on
> first.
>

Okay, I will remove this from the list then.

I was considering handling only constructors with empty bodies (at least for the
one 'delegated') and only simple expressions in initialization (such as
parameters, literals, ...). But it was mostly for aesthetic reasons and some
other transforms might be more beneficial (tr1?).

>> 3 Transform for non-static data member initializers
>> ====================================================
>>
>> When one or more constructors initialize a member variable with
>> a value independent of the constructor arguments, the
>> initialization can be placed in-class.
>>
>> This might be beneficial when multiple constructors are duplicating
>> member initialization.
>>
>> Note that this transform might easily lead to conflicts with the
>> previous transform (delegating constructors).
>>
>
> Also questionable implementation/benefit ratio. You'd have to ensure
> every member variable is initialized the same way by every
> constructor. If you detect such a case, that would mean removing all
> the existing initializations and adding the in-class initialization.
> All that's left is to hope the user didn't mind making some vars
> initialized by constructors and some by the in-class initializers.
>

I totally agree. I will remove it from the list.

>> 4 Add support for interactive actions
>> ======================================
>>
>> Some actions might need user interaction.
>>
>> Example (maybe not the best one):
>> If some replacement code needs to introduce a new variable and
>> that the default identifier is already taken then we might want to
>> prompt the user for an alternative name.
>>
>> Or simply to ask confirmation before a risky replacement.
>>
>
> Definitely something we'd like to add to the migrator but requires
> some design first. User interactivity should be implemented in such a
> way that the actual user interface doesn't matter. That way one could
> write a plugin for an editor/IDE or just have a simple command-line
> interface. This implies some sort of library interface for
> cpp11-migrate and cpp11-migrate itself then turns into a library.
> LibFormat and clang-format have the same relationship. I'm not sure if
> this much design work is suitable to a GSoC project.
>

I see. Actually this idea was very vague. I will remove it from the list as I don't
think I'm well suited (yet?) to start designing such a library.

>> 5 Default transformation profile
>> =================================
>>
>> Apply a list of transformations by default and allow different
>> profiles. By profile I'm talking about an option such as:
>>
>> cpp11-migrate -target-profile=[clang-3.2|gcc-4.7|...] ...
>>
>>
>> This option will apply to the input files all known safe
>> (low-risk/zero-risk) transformations that are supported by the
>> given target.
>>
>> This could allow incremental migration toward C++11. Let's say the
>> project has to support Clang 3.1 at first and later on the
>> minimum version switches to 3.2; they can re-run the tool with the
>> new profile.
>>
>
> This is kinda cool. It's certainly not much work right now since there
> are only a handful of transforms. It'd be a slightly nicer way than
> just saying --all-transforms (if such an option existed) especially
> for people out there migrating code that's tied to a particular
> compiler version.
>
> Remembering the discussion about C++11 on llvm-dev a while back, maybe
> you could even specify a list of compilers to this flag and have the
> common subset of supported features applied :)
>

I actually had this in mind as well (but maybe an unconscious memory from the
discussion on llvm-dev?).

>> 6 Integrating LibFormat
>> ========================
>>
>> In order to correctly format inserted code.
>>
>
> Would definitely be nice. The transforms don't do too much to mangle
> code right now but any that use the TypePrinter to print out types
> will cause the 'const' to go on the wrong side of the type specifier
> according to most styles. (i.e. const MyType *A => MyType const *A). I
> don't think LibFormat handles const locations though yet, probably for
> the same reason the transforms are limited in dealing with const
> qualifiers currently: clang doesn't provide enough TypeLoc info.
>

Good.

>> 7 Transform to make use of existing move constructors
>> ======================================================
>>
>> With move semantics added to the language and the standard library
>> being updated accordingly (move constructors added to many types),
>> it is now worthwhile to take an argument by value and then move
>> it (as opposed to taking it by 'const &' and then copying).
>>
>
> Could be useful. Also in this category would be use of
> stl_container::emplace() functions. You'll have to be very, very
> careful about semantics though.
>

Well, I guess this idea will be a good fit for the second half of the GSoC.

>> 8 Generate a diff of the changes
>> =================================
>>
>> Add an option to print a diff of the modifications against the
>> original source file.
>>
>
> Could be useful as a kind of 'dry-run' mode where changes are not actually
> made but one could find out how many and what sort of changes were made.
>

I will remove this one from the list. It has been pointed out the SCM tools
already provide such functionality quite well. I think for most projects using
cpp11-migrate they will already be under source control management.

I was thinking about users that are curious about the tool (or C++11) who might
want to try cpp11-migrate on a file non-destructively. But an option for the
output file or directory would be easier to implement and as useful. But then,
what if an included file is modified? Is it necessary to reproduce the source tree
structure?

These questions you ask indicate why it's just more complex for cpp11-migrate to handle this sort of thing. The easiest option would be to run the migrator on your source, use SCM to see the diff and then use SCM to undo the changes: easy non-destructive investigation. Without SCM you could just copy your code-base and do a directory diff. Also easy.

>> 9 Other incomplete ideas
>> =========================
>>
>> If the workload of the previous ideas is not sufficient for the
>> GSoC, I'm confident there is more work to do.
>>
>> - initializer_list and uniform initialization transforms (use
>> cases not identified yet)
>
> Someone once suggested to me looking for:
>
> std::vector<int> A;
> A.push_back(a);
> A.push_back(b);
> ...
> A.push_back(z);
>
> And replacing with
>
> std::vector<int> A = {a,b,...,z};
>
> I'm not entirely sure this is worth the effort. That is, how often is
> a vector initialization done this way? I'm not aware of other use
> cases right now.
>

I was thinking about easier cases (more commonly used?) such as:

  struct A
  {
    A(int a, const char *b);

    int a;
    const char *b;
  };

  A bar()
  {
    return A(1, "toto"); // -> return { 1, "toto" };
  }

I actually kinda like this use of uniform initialization. Using braced init lists in return statements is really helpful. Granted, it's more helpful in new code that you're writing. I wouldn't be against adding this as a smallish project to add to your proposal if you liked.

  // code such as:
  A ary[] = { A(1, "foo"), A(2, "bar"), A(3, "foobar") };
  // becomes:
  A ary[] = { {1, "foo"}, {2, "bar"}, {3, "foobar"} };

  // returning object by calling the constructor
  std::vector<int> foo(bool arg)
  {
    if (!arg)
      return std::vector<int>(); // -> return { };

    std::vector<int> results;
    // <fill-in results...>
    return results;
  }

But I think they have a limited usefulness and I don't want to add this to my
proposal.

And I agree, I don't think it's that common to initialize a vector in such a way.
Maybe it happens when initializing static containers through a factory function
(see: http://stackoverflow.com/questions/3701903/initialisation-of-static-vector).
In this situation it would be good to get rid of the factory function and initialize
the vector directly, but that seems to add a lot of complexity.

>> - tr1 replacements. Doing everything might not be possible but at
>> least some would be useful such as: unordered_map, smart
>> pointers, function<> & bind(), tuple.
>
> This one in particular is high priority. I think pretty much
> everything in
> TR1 except the extra math functions is in C++11.
>

One thing I'm afraid of with this task is that, to be useful, it requires implementing
all the changes from tr1. If we change the include by dropping 'tr1/',
it means we should support the transformation of everything the #include provides.
Maybe it's not risky at all to drop 'tr1/' from the include directives and the
references to the namespace 'tr1::', but I don't know yet. If I understand correctly,
C++11 has some differences from tr1, but only additions, mostly to take advantage of
the new language features.

Also, I think someone already talked about this: it would be interesting to find
some open source projects using tr1 to apply the transformation to. I took a quick
look and it doesn't seem impossible to find some.

Since Marshall suggested the transform I bet he has some TR1 code he'd like to transform. Perhaps he can point us at some open-source code to test on.

The first part of implementing this transform would be to do an inventory of TR1 and research what made it into C++11 and what didn't and what changes, if any, were made to things that did make it in. I'd split this inventory into three lists: Stuff that appears exactly in C++11 as it does in TR1, stuff that didn't make it at all, and stuff that made it but with changes. The first list is the easiest to address: just drop 'tr1::' and modify the #includes to use the right STD header. The stuff that didn't make it is also pretty easy: don't change anything. The third list will just need to be a bunch of special cases hard-coded into the transform.

I think this transform has high value and could be straightforward to get something useful working. It only requires a bit of research to start.

>> - fixing existing bugs (I think it's a good way to get around the
>> project before starting the GSoC to get acquainted with the
>> code)
>
> I agree.
>
>> - and (much) more...
>>
>
> Another option could be looking at additions to STL for C++11 and
> making changes based on those additions. I mentioned emplace earlier.

I hadn't thought of looking at this but it's a good idea. Functions such as emplace,
as you pointed out, are a perfect example of a transform people might want to
benefit from by using cpp11-migrate.

> Another option could be looking for nested calls to std::max or
> std::min to do an N-wise horizontal max/min op:
> std::max(std::max(a,b), std::max(c,d)) => std::max({a,b,c,d}); Again,
> not sure how useful this particular case is. Another suggestion was
> replacing use of C arrays with std::array. I haven't looked into the
> implications of this myself though. Yet another option is something
> done by the remove-cstr tool in clang-tools-extra. C++11 allows you to
> create std::fstreams with a std::string directly now instead of
> calling std::string::c_str().
>
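To make the std::max suggestion concrete, here is a minimal sketch of the two forms side by side. The function names are made up for the example. One difference a transform would have to keep in mind: the initializer-list overload requires all arguments to have the same type and returns its result by value, while the two-argument overload returns a const reference.

```cpp
#include <algorithm>
#include <cassert>

// Before: nested pairwise calls.
int max_nested(int a, int b, int c, int d) {
  return std::max(std::max(a, b), std::max(c, d));
}

// After: a single call using the C++11 initializer-list overload.
int max_flat(int a, int b, int c, int d) {
  int result = std::max({a, b, c, d});
  assert(result == max_nested(a, b, c, d)); // both forms agree for ints
  return result;
}
```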

For std::array I'm not sure, I think its usefulness is limited to a small number of
situations.

I like the idea of removing std::string::c_str() calls for std::fstream.

Also:
- access to vector data can be changed from '&vec[0]'/'&vec.front()' to
  'vec.data()'. I haven't looked at whether anything more has to be taken
  care of here.
- already mentioned in the tooling doc: replace member functions
  begin()/end() by their free function equivalent.

To summarize the list of apparently interesting ideas:
- Transform to replace 'auto_ptr' by 'unique_ptr'
- Transform to use free-function std::begin()/std::end()
- Integrating LibFormat
- Default transformations, profiles
- Transform to remove call to std::string::c_str() when using std::fstream
- Transform to make use of existing move constructors
- Transform to make use of new emplace functions for STL containers
- [maybe] tr1 replacement (need to know more about the implications)
- [maybe] Command line option for output file / output directory
- [maybe] Make use of new std::vector.data() / std::string::data()?

Do you want our feedback on prioritizing this list?

"Vane, Edwin" <edwin.vane@intel.com> writes:

From: Guillaume Papin [mailto:guillaume.papin@epitech.eu]
Sent: Monday, April 29, 2013 5:25 PM
To: Vane, Edwin
Cc: cfe-dev@cs.uiuc.edu
Subject: Re: [cfe-dev] GSoC project ideas

I added some comments and wrote a summary of the new plan at the end of the
mail.

"Vane, Edwin" <edwin.vane@intel.com> writes:

> Comments below.
>
>> From: Guillaume Papin [mailto:guillaume.papin@epitech.eu]
>> Sent: Sunday, April 28, 2013 9:13 PM
>> To: Vane, Edwin
>> Cc: cfe-dev@cs.uiuc.edu
>> Subject: Re: [cfe-dev] GSoC project ideas
>>
>> I'm working on the proposal and would like to have your feedback
>> about the following plan regarding cpp11-migrate.
>>
>>
>> Tasks
>> =====
>>
>> Table of Contents
>> =================
>> 1 Transform to replace 'auto_ptr' by 'unique_ptr'
>> 2 Transform for delegating constructors
>> 3 Transform for non-static data member initializers
>> 4 Add support for interactive actions
>> 5 Default transformation profile
>> 6 Integrating LibFormat
>> 7 Transform to make use of existing move constructors
>> 8 Generate a diff of the changes
>> 9 Other incomplete ideas
>>
>>
>> 1 Transform to replace 'auto_ptr' by 'unique_ptr'
>> ==================================================
>>
>> Seems like a good transform to start.
>>
> I agree. It's not completely trivial due to semantic differences
> between auto_ptr and unique_ptr (e.g. no destructive copy in
> unique_ptr) but should be a good first big project.
>

I had this in mind (non-triviality) as you mentioned it in an earlier mail.
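As an illustration of the semantic difference mentioned above (no destructive copy in unique_ptr), here is a minimal sketch. The Widget type and the function name are made up for the example, and the auto_ptr version is kept in comments because std::auto_ptr was deprecated in C++11 and removed in C++17.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Widget {
  int value;
  explicit Widget(int v) : value(v) {}
};

// C++03 original:
//   std::auto_ptr<Widget> a(new Widget(42));
//   std::auto_ptr<Widget> b = a;   // the "copy" silently steals ownership from a
//
// Migrated form: the ownership transfer has to be spelled out with std::move,
// so the transform cannot simply rename the type.
int transfer_ownership() {
  std::unique_ptr<Widget> a(new Widget(42));
  std::unique_ptr<Widget> b = std::move(a); // explicit transfer
  assert(!a);                               // a moved-from unique_ptr is null
  return b->value;
}
```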

>> 2 Transform for delegating constructors
>> ========================================
>>
>> A transform that can convert code such as:
>>
>> struct A
>> {
>> int x;
>>
>> A() : x(0) { }
>> A(int _x) : x(_x) { }
>> };
>>
>>
>> Into:
>>
>> struct A
>> {
>> int x;
>>
>> A() : A(0) { } // now use delegation
>> A(int _x) : x(_x) { }
>> };
>>
>>
>> This is a really trivial case here but I expect this transform to
>> be non-trivial to implement.
>>
>
> A test for determining if the functionality of one constructor is
> completely subsumed by another would be really difficult to do. I'm
> not sure the benefit of a few less lines of code and some improved
> maintainability is really worth it. There is the common workaround of
> having constructors call init() functions that might be easier to
> handle but still, I think there are more useful things to focus on
> first.
>

Okay, I will remove this from the list then.

I was considering handling only constructors with empty bodies (at least for the
'delegated' one) and only simple expressions in initializers (such as
parameters, literals, ...). But it was mostly for aesthetic reasons and some
other transforms might be more beneficial (tr1?).

>> 3 Transform for non-static data member initializers
>> ====================================================
>>
>> When one or more constructors initialize a member variable with
>> a value independent of the constructor arguments, the
>> initialization can be placed in-class.
>>
>> This might be beneficial when multiple constructors are duplicating
>> member initialization.
>>
>> Note that this transform might easily lead to conflicts with the
>> previous transform (delegating constructors).
>>
>
> Also questionable implementation/benefit ratio. You'd have to ensure
> every member variable is initialized the same way by every
> constructor. If you detect such a case, that would mean removing all
> the existing initializations and adding the in-class initialization.
> All that's left is to hope the user didn't mind making some vars
> initialized by constructors and some by the in-class initializers.
>

I totally agree. I will remove it from the list.

>> 4 Add support for interactive actions
>> ======================================
>>
>> Some actions might need user interaction.
>>
>> Example (maybe not the best one):
>> If some replacement code needs to introduce a new variable and
>> that the default identifier is already taken then we might want to
>> prompt the user for an alternative name.
>>
>> Or simply to ask confirmation before a risky replacement.
>>
>
> Definitely something we'd like to add to the migrator but requires
> some design first. User interactivity should be implemented in such a
> way that the actual user interface doesn't matter. That way one could
> write a plugin for an editor/IDE or just have a simple command-line
> interface. This implies some sort of library interface for
> cpp11-migrate and cpp11-migrate itself then turns into a library.
> LibFormat and clang-format have the same relationship. I'm not sure if
> this much design work is suitable to a GSoC project.
>

I see. Actually this idea was very vague. I will remove it from the list as I don't
think I'm well suited (yet?) to start designing such a library.

>> 5 Default transformation profile
>> =================================
>>
>> Apply a list of transformations by default and allow different
>> profiles. By profile I mean an option such as:
>>
>> cpp11-migrate -target-profile=[clang-3.2|gcc-4.7|...] ...
>>
>>
>> This option will apply to the input files all known safe
>> (low-risk/zero-risk) transformations that are supported by the
>> given target.
>>
>> This could allow incremental migration toward C++11. Let's say the
>> project has to support Clang 3.1 at first and later on the
>> minimum version switches to 3.2: they can re-run the tool with the
>> new profile.
>>
>
> This is kinda cool. It's certainly not much work right now since there
> are only a handful of transforms. It'd be a slightly nicer way than
> just saying --all-transforms (if such an option existed) especially
> for people out there migrating code that's tied to a particular
> compiler version.
>
> Remembering the discussion about C++11 on llvm-dev a while back, maybe
> you could even specify a list of compilers to this flag and the common
> subset of supported features is applied :)
>

I actually had this in mind as well (but maybe an unconscious memory from the
discussion on llvm-dev?).
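The "common subset" idea amounts to a set intersection over per-compiler feature lists. Here is a rough sketch of that step only; FeatureSet and common_features are hypothetical names, and nothing here reflects how cpp11-migrate stores or would store its profiles.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

// Hypothetical representation: one set of feature names per target compiler.
typedef std::set<std::string> FeatureSet;

// Features usable with both targets: the intersection of the two sets.
// For more than two targets, fold this function over the list.
FeatureSet common_features(const FeatureSet &a, const FeatureSet &b) {
  FeatureSet out;
  // std::set iterates in sorted order, which set_intersection requires.
  std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                        std::inserter(out, out.begin()));
  assert(out.size() <= a.size() && out.size() <= b.size());
  return out;
}
```

Only transforms whose required feature is in the resulting set would then be enabled.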

>> 6 Integrating LibFormat
>> ========================
>>
>> In order to format correctly inserted code.
>>
>
> Would definitely be nice. The transforms don't do too much to mangle
> code right now but any that use the TypePrinter to print out types
> will cause the 'const' to go on the wrong side of the type specifier
> according to most styles. (i.e. const MyType *A => MyType const *A). I
> don't think LibFormat handles const locations though yet, probably for
> the same reason the transforms are limited in dealing with const
> qualifiers currently: clang doesn't provide enough TypeLoc info.
>

Good.

>> 7 Transform to make use of existing move constructors
>> ======================================================
>>
>> With move semantics added to the language and the standard library
>> being updated accordingly (move constructors added to many types),
>> it is now worthwhile to take an argument by value and then move
>> it (as opposed to taking it by 'const &' and then copying).
>>
>
> Could be useful. Also in this category would be use of
> stl_container::emplace() functions. You'll have to be very, very
> careful about semantics though.
>

Well, I guess this idea will be a good fit for the second half of the GSoC.
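For reference, the pass-by-value-then-move pattern described in item 7 looks roughly like this. The class names are made up for the example, and the sketch assumes the parameter type (std::string here) has a cheap move constructor, which is the condition the transform would need to check.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Before: take by const reference, then copy into the member.
class NameByRef {
public:
  explicit NameByRef(const std::string &name) : name_(name) {}
  const std::string &name() const { return name_; }
private:
  std::string name_;
};

// After: take by value, then move into the member.
class NameByValue {
public:
  explicit NameByValue(std::string name) : name_(std::move(name)) {}
  const std::string &name() const { return name_; }
private:
  std::string name_;
};

bool same_observable_result() {
  std::string s = "cpp11-migrate";
  NameByRef a(s);
  NameByValue b(s);                            // lvalue: one copy, then a move
  NameByValue c(std::string("cpp11-migrate")); // rvalue: moves only, no copy
  assert(s == "cpp11-migrate");                // the caller's string is untouched
  return a.name() == b.name() && b.name() == c.name();
}
```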

>> 8 Generate a diff of the changes
>> =================================
>>
>> Add an option to print a diff of the modifications against the
>> original source file.
>>
>
> Could be useful as a kind of 'dry-run' mode where changes are not actually
> made but one could find out how many and what sort of changes were made.
>

I will remove this one from the list. It has been pointed out that SCM tools
already provide such functionality quite well. I think most projects using
cpp11-migrate will already be under source control management.

I was thinking about users that are curious about the tool (or C++11) who might
want to try cpp11-migrate on a file non-destructively. But an option for the
output file or directory would be easier to implement and as useful. But then,
what if an included file is modified? Is it necessary to reproduce the source tree
structure?

These questions you ask indicate why it's just more complex for
cpp11-migrate to handle this sort of thing. The easiest option would
be to run the migrator on your source, use SCM to see the diff and
then use SCM to undo the changes: easy non-destructive investigation.
Without SCM you could just copy your code-base and do a directory
diff. Also easy.

>> 9 Other incomplete ideas
>> =========================
>>
>> If the workload of the previous ideas is not sufficient for the
>> GSoC I'm confident there is more work to do.
>>
>> - initializer_list and uniform initialization transforms (use
>> cases not identified yet)
>
> Someone once suggested to me looking for:
>
> std::vector<int> A;
> A.push_back(a);
> A.push_back(b);
> ...
> A.push_back(z);
>
> And replacing with
>
> std::vector<int> A = {a,b,...,z};
>
> I'm not entirely sure this is worth the effort. That is, how often is
> a vector initialization done this way? I'm not aware of other use
> cases right now.
>

I was thinking about easier cases (more commonly used?) such as:

  struct A
  {
    A(int a, const char *b);

    int a;
    const char *b;
  };

  A bar()
  {
    return A(1, "toto"); // -> return { 1, "toto" };
  }

I actually kinda like this use of uniform initialization. Using braced
init lists in return statements is really helpful. Granted, it's more
helpful in new code that you're writing. I wouldn't be against adding
this as a smallish project to add to your proposal if you liked.

I will add this (restricted?) case to the list. It can be a basis for future
work on using uniform initialization.
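A compilable version of the return-statement case, as a minimal sketch. The function names are made up, and it assumes the constructor is not marked 'explicit': a braced return is copy-list-initialization, so the rewrite would not compile against an explicit constructor, which is something the transform would have to check.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

struct A {
  A(int a, const char *b) : a(a), b(b) {} // deliberately not 'explicit'
  int a;
  const char *b;
};

// Before: the return type is repeated in the return statement.
A bar_before() { return A(1, "toto"); }

// After: the braced list initializes the returned A directly.
A bar_after() { return {1, "toto"}; }

// The empty-container case from the same mail.
std::vector<int> foo(bool arg) {
  if (!arg)
    return {}; // was: return std::vector<int>();
  std::vector<int> results(3, 7);
  assert(results.size() == 3);
  return results;
}
```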

  // code such as:
  A ary[] = { A(1, "foo"), A(2, "bar"), A(3, "foobar") };
  // becomes:
  A ary[] = { {1, "foo"}, {2, "bar"}, {3, "foobar"} };

  // returning object by calling the constructor
  std::vector<int> foo(bool arg)
  {
    if (!arg)
      return std::vector<int>(); // -> return { };

    std::vector<int> results;
    // <fill-in results...>
    return results;
  }

But I think they have a limited usefulness and I don't want to add this to my
proposal.

And I agree, I don't think it's that common to initialize a vector in such a way.
Maybe it happens when initializing static containers through a factory function
(see: http://stackoverflow.com/questions/3701903/initialisation-of-static-vector).
In this situation it would be good to get rid of the factory function and initialize
the vector directly, but that seems to add a lot of complexity.

>> - tr1 replacements. Doing everything might not be possible but at
>> least some would be useful such as: unordered_map, smart
>> pointers, function<> & bind(), tuple.
>
> This one in particular is high priority. I think pretty much
> everything in
> TR1 except the extra math functions is in C++11.
>

One thing I'm afraid of with this task is that, to be useful, it requires implementing
all the changes from tr1. If we change the include by dropping 'tr1/',
it means we should support the transformation of everything the #include provides.
Maybe it's not risky at all to drop 'tr1/' from the include directives and the
references to the namespace 'tr1::', but I don't know yet. If I understand correctly,
C++11 has some differences from tr1, but only additions, mostly to take advantage of
the new language features.

Also, I think someone already talked about this: it would be interesting to find
some open source projects using tr1 to apply the transformation to. I took a quick
look and it doesn't seem impossible to find some.

Since Marshall suggested the transform I bet he has some TR1 code he'd like to transform. Perhaps he can point us at some open-source code to test on.

The first part of implementing this transform would be to do an
inventory of TR1 and research what made it into C++11 and what didn't
and what changes, if any, were made to things that did make it in. I'd
split this inventory into three lists: Stuff that appears exactly in
C++11 as it does in TR1, stuff that didn't make it at all, and stuff
that made it but with changes. The first list is the easiest to
address: just drop 'tr1::' and modify the #includes to use the right
STD header. The stuff that didn't make it is also pretty easy: don't
change anything. The third list will just need to be a bunch of
special cases hard-coded into the transform.

I think this transform has high value and could be straightforward to get something useful working. It only requires a bit of research to start.

Okay, it really sounds like a valuable addition to the project. I will add it
to the list.
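For the "first list" described above (things that appear in C++11 as they did in TR1), the rewrite could look like the following sketch. The TR1 side is kept in comments because the <tr1/...> headers are vendor-specific (that spelling is the libstdc++ one) and not available everywhere. The function name is made up for the example.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <tuple>
#include <unordered_map>

// Before (TR1):
//   #include <tr1/unordered_map>
//   #include <tr1/memory>
//   std::tr1::unordered_map<std::string, int> counts;
//   std::tr1::shared_ptr<int> p(new int(7));
//
// After: same names with 'tr1::' dropped and the standard headers included.
int tr1_free_version() {
  std::unordered_map<std::string, int> counts;
  counts["auto_ptr"] = 3;
  std::shared_ptr<int> p(new int(7));
  std::function<int(int)> add_three =
      std::bind(std::plus<int>(), std::placeholders::_1, 3);
  std::tuple<int, int> t = std::make_tuple(*p, add_three(counts["auto_ptr"]));
  assert(p.use_count() == 1);
  return std::get<0>(t) + std::get<1>(t); // 7 + (3 + 3)
}
```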

>> - fixing existing bugs (I think it's a good way to get around the
>> project before starting the GSoC to get acquainted with the
>> code)
>
> I agree.
>
>> - and (much) more...
>>
>
> Another option could be looking at additions to STL for C++11 and
> making changes based on those additions. I mentioned emplace earlier.

I hadn't thought of looking at this but it's a good idea. Functions such as emplace,
as you pointed out, are a perfect example of a transform people might want to
benefit from by using cpp11-migrate.
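A minimal sketch of what an emplace transform would rewrite (the function name is made up). The semantics caveat raised earlier applies: emplace_back forwards its arguments to any constructor, including 'explicit' ones that push_back would reject, so the two calls are not always interchangeable.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

std::vector<std::pair<std::string, int>> build() {
  std::vector<std::pair<std::string, int>> v;
  // Before: construct a temporary pair, then move it into the vector.
  v.push_back(std::pair<std::string, int>("push", 1));
  // After: forward the constructor arguments; the pair is built in place.
  v.emplace_back("emplace", 2);
  assert(v.size() == 2);
  return v;
}
```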

> Another option could be looking for nested calls to std::max or
> std::min to do an N-wise horizontal max/min op:
> std::max(std::max(a,b), std::max(c,d)) => std::max({a,b,c,d}); Again,
> not sure how useful this particular case is. Another suggestion was
> replacing use of C arrays with std::array. I haven't looked into the
> implications of this myself though. Yet another option is something
> done by the remove-cstr tool in clang-tools-extra. C++11 allows you to
> create std::fstreams with a std::string directly now instead of
> calling std::string::c_str().
>

For std::array I'm not sure, I think its usefulness is limited to a small number of
situations.

I like the idea of removing std::string::c_str() calls for std::fstream.
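The c_str() case is small enough to show in full. This is only a sketch with a made-up helper name; it relies on the std::string constructor overloads that C++11 added to the file stream classes.

```cpp
#include <cassert>
#include <fstream>
#include <string>

bool can_open(const std::string &path) {
  // Before (C++03): std::ifstream in(path.c_str());
  std::ifstream in(path); // C++11: constructor taking a std::string
  return in.is_open();
}
```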

Also:
- access to vector data can be changed from '&vec[0]'/'&vec.front()' to
  'vec.data()'. I haven't looked at whether anything more has to be taken
  care of here.
- already mentioned in the tooling doc: replace member functions
  begin()/end() by their free function equivalent.
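For the begin()/end() item, the main benefit of the free functions is that the same code then works for C arrays as well as containers. A small sketch, with made-up function names:

```cpp
#include <cassert>
#include <iterator>
#include <numeric>
#include <vector>

template <typename Range>
int sum(const Range &r) {
  // Before: std::accumulate(r.begin(), r.end(), 0), which fails for C arrays.
  return std::accumulate(std::begin(r), std::end(r), 0);
}

int sum_of_array_and_vector() {
  int arr[] = {1, 2, 3};
  std::vector<int> vec(std::begin(arr), std::end(arr));
  assert(sum(arr) == sum(vec)); // same code path for both kinds of range
  return sum(arr) + sum(vec);
}
```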

To summarize the list of apparently interesting ideas:
- Transform to replace 'auto_ptr' by 'unique_ptr'
- Transform to use free-function std::begin()/std::end()
- Integrating LibFormat
- Default transformations, profiles
- Transform to remove call to std::string::c_str() when using std::fstream
- Transform to make use of existing move constructors
- Transform to make use of new emplace functions for STL containers
- [maybe] tr1 replacement (need to know more about the implications)
- [maybe] Command line option for output file / output directory
- [maybe] Make use of new std::vector.data() / std::string::data()?

Do you want our feedback on prioritizing this list?

Here is my list, in the order I would like to implement things (not in
order of importance). I haven't thought carefully yet about the time it would
take.

- Transform to replace 'auto_ptr' by 'unique_ptr' [*]
- Transform to use free-function std::begin()/std::end()
- Transform to use uniform-initialization on return by calling a constructor
- Transform to remove call to std::string::c_str() when using std::fstream
- Integrating LibFormat [*]
- Default transformations, profiles [*]
- Transform to replace uses of tr1 [*]
- Transform to make use of existing move constructors [*]
- Transform to make use of new emplace functions for STL containers
- [maybe] Make use of new std::vector.data() / std::string::data()?

I marked with [*] the ones I consider the most important to have.

Yes, any feedback is most welcome!

Thank you.

Your list has a good mix of different sized projects. Some of the points are clearly more useful to the community than others so I would recommend organizing your list that way. That said, starting with projects that are interesting to you is a good idea for keeping your motivation high:)

The only item I'm not sure of is using vector::data() and string::data(). I haven't really seen any compelling reason for converting existing &operator[i] calls. I'm fine to be convinced otherwise.

One minor reason is that the &op[i] version is undefined in the case of an empty vector, but people often don't bounds check this operation (because they also pass a zero size to whatever code receives the buffer, so it's never dereferenced, which is usually sufficient).
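The empty-vector point can be shown with a short sketch. sum_buffer is a hypothetical stand-in for whatever C-style API receives the buffer and its size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style API taking a buffer and a length.
int sum_buffer(const int *buf, std::size_t n) {
  int total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += buf[i];
  return total;
}

int sum_vector(const std::vector<int> &v) {
  // Before: sum_buffer(&v[0], v.size());  // v[0] is undefined when v is empty
  // After: data() may be called on an empty vector; the pointer is simply
  // never dereferenced because the size passed alongside it is zero.
  assert(v.empty() || v.data() == &v[0]);
  return sum_buffer(v.data(), v.size());
}
```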

I threw in the data() idea after looking at the changes in the new standard, but I
don't think it's a really interesting transform (both the benefit and my
interest are limited). So I will not keep it in the final list.

Okay, I will maybe move some higher priority/interest things to the top and move
down or drop some of the following items:
- Transform to use free-function std::begin()/std::end()
- Transform to use uniform-initialization on return by calling a constructor
- Transform to remove call to std::string::c_str() when using std::fstream
- Transform to make use of new emplace functions for STL containers

I will give you my final list before I submit my proposal. It can be
further enhanced/modified if it doesn't fit the GSoC
documentation.

Thank you for your help.

"Vane, Edwin" <edwin.vane@intel.com> writes: