clang-format leading whitespace

I am interested in contributing to clang-format, to add an option to preserve and correct leading indentation.

Right now if you write code such as:

{
…int x = 10;

…return x;
}

The line in the middle will be truncated and there is no option to control it.

Some code prefers to have indented leading whitespace, indented to the same level as the previous line.

Would such a contribution be accepted and is anyone willing to help me through the process where necessary?

Thanks
Samuel

1 Like

I am interested in contributing to clang-format, to add an option to preserve and correct leading indentation.

Right now if you write code such as:

{
…int x = 10;

…return x;
}

The line in the middle will be truncated and there is no option to control it.

Some code prefers to have indented leading whitespace, indented to the same level as the previous line.

Would such a contribution be accepted and is anyone willing to help me through the process where necessary?

The current reasoning is that with an automated formatter the indented leading whitespace doesn’t help any more.

Generally, see here:
https://clang.llvm.org/docs/ClangFormatStyleOptions.html#adding-additional-style-options

I vaguely remember this question being brought up before and I think we decided against it.

Thanks for your feedback, and I appreciate the link and the logic behind it.

Unfortunately, by stripping all trailing whitespace, clang-format has made a policy decision, but with no way to control it via the configuration. I understand it’s not desirable for every option to be configurable, but this is something which significantly affects existing code. I propose that it would be good to have an option “TrailingWhitespace: Truncate|Preserve|Fix”. Truncate would be existing behaviour. Preserve would not alter leading whitespace on blank lines or touch trailing whitespace. Fix would fix leading whitespace on blank lines, fix trailing whitespace on code, but avoid making any changes in comments except if it was reflowed. It could be more granular but that’s where I’d start.

Some objective problems:

  • Existing code which is indented this way would require a lot of changes.
  • Markdown (which is useful in comments) is sensitive to trailing whitespace.

It’s possible to add a post process script to normalise leading whitespace but not trailing whitespace (e.g. in the case of markdown).

Some subjective problems:

  • It’s my preference when I start editing on a line, the indentation is correct already.
  • When moving cursor up and down it’s my preference that it doesn’t jump back and forward when crossing blank lines.

Some of these issues can be fixed by editor configuration/plugins, but it’s still not 100% correct.

Thanks for your time.

Kind regards,
Samuel

As an extra data point I want to tell that even with an automated formatter the indented leading whitespace does help. If you re-indent a block of code and leading whitespace is preserved, then you get a single chunk in the blame of the file. If you eat the leading whitespace then you have a million little pieces in the blame and it’s less clear just what happened. Something like

1111bc501498 (John Doe 2015-05-12 09:23:57 +0000 178) if (something) {
1111bc501498 (John Doe 2015-05-12 09:23:57 +0000 179) // do something
2222c3ae83d8 (Adam Smith 2014-12-10 19:00:42 +0000 180)
1111bc501498 (John Doe 2015-05-12 09:23:57 +0000 181) // do some more stuff
1111bc501498 (John Doe 2015-05-12 09:23:57 +0000 182) }

If leading whitespace was preserved after reindentation you would get a single blame chunk. The same applies to diff. Empty lines can make chunks too granular.

As for requirement

be used in a project of significant size

I want to point out that preserving leading whitespace on whitespace-only lines is Xcode default setting. And as the result many projects can end up using this style. For example, a quick check of the top 5 Objective-C repositories with the most stars on GitHub shows that 3 out of 5 are using predominantly this style (rs/SDWebImage, BradLarson/GPUImage, SnapKit/Masonry). It isn’t representative in any way, I just want to show that such style isn’t that obscure.

Thanks,
Volodymyr

As an extra data point I want to tell that even with an automated
formatter the indented leading whitespace does help. If you re-indent a
block of code and leading whitespace is preserved, then you get a single
chunk in the blame of the file. If you eat the leading whitespace then you
have a million little pieces in the blame and it’s less clear just what
happened. Something like

1111bc501498 (John Doe 2015-05-12 09:23:57 +0000 178) if (something)
{
1111bc501498 (John Doe 2015-05-12 09:23:57 +0000 179) // do
something
2222c3ae83d8 (Adam Smith 2014-12-10 19:00:42 +0000 180)
1111bc501498 (John Doe 2015-05-12 09:23:57 +0000 181) // do some
more stuff
1111bc501498 (John Doe 2015-05-12 09:23:57 +0000 182) }

If leading whitespace was preserved after reindentation you would get a
single blame chunk. The same applies to diff. Empty lines can make chunks
too granular.

This seems like a very specific and actually quite rare case. And I'd
personally be happy to see some hint that John Doe wasn't actually to blame
for all of those lines (so that I can remember to run "git blame" with
"-w").

As for requirement

be used in a project of significant size

I want to point out that preserving leading whitespace on whitespace-only
lines is Xcode default setting. And as the result many projects can end up
using this style. For example, a quick check of the top 5 Objective-C
repositories with the most stars on GitHub
<https://github.com/search?l=objective-c&o=desc&q=stars:>1&s=stars&type=Repositories> shows
that 3 out of 5 are using predominantly this style
(rs/SDWebImage, BradLarson/GPUImage, SnapKit/Masonry). It isn’t
representative in any way, I just want to show that such style isn’t that
obscure.

It totally makes sense for an automated indenter in an editor to add these
leading spaces, but I think the use case is different from a formatter. And
it makes sense for some projects coming from Xcode to not bother to modify
anything afterwards. It doesn't mean that this is the style they *want*.

Thanks,

Thanks for your feedback, and I appreciate the link and the logic behind
it.

Unfortunately, by stripping all trailing whitespace, clang-format has made
a policy decision, but with no way to control it via the configuration. I
understand it's not desirable for every option to be configurable, but this
is something which significantly affects existing code. I propose that it
would be good to have an option "TrailingWhitespace:
Truncate>Preserve>Fix". Truncate would be existing behaviour. Preserve
would not alter leading whitespace on blank lines or touch trailing
whitespace. Fix would fix leading whitespace on blank lines, fix trailing
whitespace on code, but avoid making any changes in comments except if it
was reflowed. It could be more granular but that's where I'd start.

## Some objective problems:

- Existing code which is indented this way would require a lot of changes.

That's true for any project starting to use clang-format. That's why we'd
suggest to use editor and version control system integrations and only
format the lines that are touched.

- Markdown (which is useful in comments) is sensitive to trailing
whitespace.

I haven't seen a project using markdown in comments and relying on the
trailing whitespace. However, even if some do, I think it'd be better for
clang-format to recognize this and not remove spaces for those comments
specifically.

It's possible to add a post process script to normalise leading whitespace

but not trailing whitespace (e.g. in the case of markdown).

## Some subjective problems:

- It's my preference when I start editing on a line, the indentation is
correct already.

How often do you even start editing in a currently empty line? Don't you
normally start with some existing code and have to insert a line break?
clang-format wouldn't help there..

(It's also a really easy habit to loose once you have gotten yourself to
rely on clang-format working in your editor).

- When moving cursor up and down it's my preference that it doesn't jump
back and forward when crossing blank lines.

But the proposal wouldn't really change much here, unless you happen to
have the cursor right at the start of the line and it would all be indented
the exact same level. Isn't that a really rare case?

Some of these issues can be fixed by editor configuration/plugins, but it's

## Some objective problems:

- Existing code which is indented this way would require a lot of changes.

That's true for any project starting to use clang-format. That's why we'd
suggest to use editor and version control system integrations and only
format the lines that are touched.

I'm not really sure the workflow you describe really makes sense.

I'd prefer to apply it to the whole project in one shot and then keep it
formatted that way using pre-commit hooks or something like that.

- Markdown (which is useful in comments) is sensitive to trailing
whitespace.

I haven't seen a project using markdown in comments and relying on the
trailing whitespace. However, even if some do, I think it'd be better for
clang-format to recognize this and not remove spaces for those comments
specifically.

clang-format simply can't know that the comments contain markdown or not,
so what you propose is impossible. One option would be to not reformat
comments.

It's possible to add a post process script to normalise leading whitespace

but not trailing whitespace (e.g. in the case of markdown).

## Some subjective problems:

- It's my preference when I start editing on a line, the indentation is
correct already.

How often do you even start editing in a currently empty line? Don't you
normally start with some existing code and have to insert a line break?
clang-format wouldn't help there..

I start editing on empty lines all the time.

(It's also a really easy habit to loose once you have gotten yourself to
rely on clang-format working in your editor).

- When moving cursor up and down it's my preference that it doesn't jump
back and forward when crossing blank lines.

But the proposal wouldn't really change much here, unless you happen to
have the cursor right at the start of the line and it would all be indented
the exact same level. Isn't that a really rare case?

It's not a rare case at all if you have lots of empty lines.

Some of these issues can be fixed by editor configuration/plugins, but

it's still not 100% correct.

Thanks for your time.

Kind regards,
Samuel

_______________________________________________
cfe-dev mailing list
cfe-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

I still think this is a good idea. and the fact that many top projects
prefer this style is a good indication that it would be a useful addition,
IMHO.

Kind regards,
Samuel

Do you have links to example projects? Do those projects have public style guides?

I still think this is a good idea. and the fact that many top projects prefer this style is a good indication that it would be a useful addition, IMHO.

For whatever my "sample size of one" is worth, I have never come across a codebase that uses this style.

If you look to an earlier email:

I came across this thread and I highly agree with the need of this necessary feature.
Is this option still not being supported in clang-format? Will a contribution to add this feature to clang-format will be accepted?

I wouldn’t be against it, but I think it was dismissed in this thread.

Before using clang-format I also had the blank lines on the indention level, but with that tool (and my IDE) I don’t need it anymore. So maybe think about if you really want that.

If the answer is yes, you can start making a patch and put it in review, even at an early state, with some reasoning and/or examples of styles why one would want or need such an option.