GSOC - Use more StringRef in clang.

Hello,

I was looking to tackle the "StringRef'ize APIs" suggestion from the
clang project page and just wanted to post a couple of thoughts and ask
a couple of questions.

First of all I am going to talk about how I see the goals of the
project. Basically, as far as I understand it I will be converting
existing functions that take std::string or char*'s to use
llvm::StringRef where applicable. Then I will be changing a number of
call sites to use this new function.

One major question I have is should the old version be removed? It
would be quite possible to keep the old version as a stub, and that may
make it easier for others when they have a string rather than a
StringRef (although conversion is simple anyway). Also, there may be
API and ABI implications if a function from the public API is converted.
What do you think the best approach for this is?

Another question I have is how you would define focus. A large part of
the project is hunting through the source to find and change these
functions, so how would "progress" be defined? GSoC requires solid
mid-term and final requirements. Should I choose a
number of functions that I expect to have converted in this time, or is
there a better criterion that you can think of?

cheers,
Kevin

> Hello,
>
> I was looking to tackle the "StringRef'ize APIs" suggestion from the
> clang project page and just wanted to post a couple of thoughts and ask
> a couple of questions.

Hi - welcome to the project!

> First of all I am going to talk about how I see the goals of the
> project. Basically, as far as I understand it I will be converting
> existing functions that take std::string or char*'s to use
> llvm::StringRef where applicable. Then I will be changing a number of
> call sites to use this new function.
>
> One major question I have is should the old version be removed?

The intention is to change APIs directly rather than introducing a new
API alongside the old one. (so, yes, the old one should be removed/not
exist)

> It
> would be quite possible to keep the old version as a stub, and that may
> make it easier for others when they have a string rather than a
> StringRef (although conversion is simple anyway).

This change should generally be API compatible (implicit conversions
to StringRef should fire in most/common cases) and fixing up a few
callers for which the extra user defined conversion is not accessible
shouldn't be too painful.
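
To make the point about implicit conversions concrete, here is a minimal sketch. It uses std::string_view as a stand-in for llvm::StringRef purely so that it builds without the LLVM headers (an assumption of this example; the two types are not identical). The names countSpaces and Name are hypothetical and not taken from clang.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Stand-in for llvm::StringRef (assumption: used only so the sketch is
// self-contained; real code would include "llvm/ADT/StringRef.h").
using StringRefLike = std::string_view;

// Hypothetical API after the change; imagine it used to take
// `const std::string &`.
std::size_t countSpaces(StringRefLike Text) {
  std::size_t N = 0;
  for (char C : Text)
    if (C == ' ')
      ++N;
  return N;
}

// Common callers keep compiling: one implicit conversion is enough.
std::size_t fromStdString() {
  std::string S = "a b c";
  return countSpaces(S);
}

std::size_t fromLiteral() { return countSpaces("hello world"); }

// Hypothetical wrapper type that only converts to std::string.
struct Name {
  std::string Value;
  operator std::string() const { return Value; }
};

// countSpaces(N) would not compile here, because it would need two
// user-defined conversions in a row. The caller is fixed up by making
// the first conversion explicit.
std::size_t fromWrapper(const Name &N) {
  std::string S = N;
  return countSpaces(S);
}
```

The first two callers show the "most/common cases"; fromWrapper shows the kind of caller that would need a small manual fix.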

> Also, there may be
> API and ABI implications if a function from the public API is converted.
> What do you think the best approach for this is?

The LLVM C++ API has no ABI stability guarantee; we break it
continuously and intend to keep doing so, so you're welcome to do the
same in this effort.

> Another question I have is how you would define focus. A large part of
> the project is hunting through the source to find and change these
> functions, so how would "progress" be defined? GSoC requires solid
> mid-term and final requirements. Should I choose a
> number of functions that I expect to have converted in this time, or is
> there a better criterion that you can think of?

I don't know enough about GSoC to know whether this would be a good
project, nor how it might be evaluated.

I had a few deeper issues when I started on the project & was working
on StringRef upgrades - I started looking at Twine and trying to
figure out whether Twine could be used more pervasively, but never
came to any good conclusions about that. I eventually just decided to
do ArrayRef work, which was less ambiguous.

You could search the codebase for particular idioms and see if the
number of instances is high enough for a reasonably sized project. (I
found ArrayRef opportunities by searching for "\.data().*\.length()",
I think, or idioms like that; if you wanted to do StringRef upgrades
you could search for "const std::string&" parameters, for example.)
Then use that metric to track your progress: run the same search each
day/week/whatever and demonstrate that you're approaching zero.
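
A rough sketch of how such a count could be automated is below. It is a purely textual match over a source buffer, so it will miss spellings such as "std::string const &" and will also count hits in comments; the function name countConstStringRefParams and the exact pattern are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Counts textual occurrences of `const std::string &` (with optional
// whitespace) in a source buffer. This is a crude progress metric, not
// a parser: it has no idea whether a hit is really a parameter.
std::size_t countConstStringRefParams(const std::string &Source) {
  static const std::regex Pattern(R"(const\s+std::string\s*&)");
  auto Begin = std::sregex_iterator(Source.begin(), Source.end(), Pattern);
  auto End = std::sregex_iterator();
  return static_cast<std::size_t>(std::distance(Begin, End));
}
```

Running this over each file and summing the results on a regular basis would give the number that is supposed to approach zero.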

- David

Thanks for the feedback David.

I have created a quick draft of my proposal and would appreciate any
feedback.

GSOC Proposal -- StringRef'ize APIs

> Thanks for the feedback David.
>
> I have created a quick draft of my proposal and would appreciate any
> feedback.

I don't enjoy being the bearer of bad news, but I don't think that this is a
very useful GSoC project.

1. It's an utterly boring and unrewarding task ;)

2. StringRef'ization is mostly done; the remaining cases may just be pieces
of code where StringRef is not obviously the best way to pass a string. Think
of passing a std::string by reference because it's modified in the callee. I highly
doubt that there are many cases in LLVM where someone just forgot to add
a "const".
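
To illustrate the distinction in point 2, here is a small sketch with hypothetical functions (appendSuffix and hasSuffix are not real LLVM APIs), again using std::string_view as a stand-in for llvm::StringRef so the example is self-contained.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// The callee modifies the string, so the parameter has to stay a
// mutable std::string reference. A non-owning, read-only view cannot
// replace it.
void appendSuffix(std::string &Name) { Name += ".o"; }

// The callee only reads the string, so a view type is a natural fit.
// This is the kind of parameter a StringRef conversion would target.
bool hasSuffix(std::string_view Name) {
  constexpr std::string_view Suffix = ".o";
  return Name.size() >= Suffix.size() &&
         Name.compare(Name.size() - Suffix.size(), Suffix.size(), Suffix) == 0;
}
```

Only the second signature is a conversion candidate; the first is an example of a std::string reference that is there on purpose.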

StringRef'ization may be a good task if you just want to kill a couple of hours
and get a deeper understanding of the relationships between LLVM/Clang
components; for GSoC you can/should pick something cooler. :)

You can take a look at the projects we had in the last couple of years (LLVM
has participated every year since 2006) to get an idea of what a project that is likely
to get accepted looks like.

- Ben