LLVM source file line endings

Hi all,

This seems like a basic question, but does LLVM source code require any specific line ending encoding? I recently committed 2 new files to the LLVM repo (ModRef.cpp and OptionStrCmp.cpp) and both seem to have DOS-style line endings (\r\n), whereas most other files seem to use Unix line endings. It's causing some of our internal builds to fail, and while the fix is simple (run dos2unix on the 2 files), I was wondering if there is a convention that LLVM uses and, if not, whether we should automatically convert to Unix line endings (see "Configuring Git to handle line endings" in the GitHub Docs), maybe with “text” for .c/.cpp/.h/.td files?

Alternatively, it seems anyone adding new files needs to manually make sure that the line endings are Unix style in order to prevent inadvertent build failures downstream.

Another possibility is a CI warning. But handling this automatically, if possible, seems preferable.
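To make the CI idea concrete, here is a minimal Python sketch of such a check. The function names and the extension list are my own assumptions for illustration (the extensions are just the ones mentioned above), not anything that exists in the LLVM tree.

```python
from pathlib import Path

# Hypothetical set of extensions to check; adjust as needed.
SOURCE_SUFFIXES = {".c", ".cpp", ".h", ".td"}


def has_crlf(data: bytes) -> bool:
    """Return True if the byte string contains at least one CRLF pair."""
    return b"\r\n" in data


def find_crlf_files(root: Path) -> list[Path]:
    """Walk root and list source files that contain CRLF line endings."""
    offenders = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            if has_crlf(path.read_bytes()):
                offenders.append(path)
    return offenders
```

A CI job could call find_crlf_files on the checkout and warn (or fail) when the returned list is non-empty.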

(I am still not sure how the DOS line endings sneaked in; I suspect it's VSCode used with WSL2 that defaults to DOS-style line endings.)

Thanks,
Rahul

Files in the LLVM project should almost always have Unix line endings. There are a small number of test files that purposely have DOS endings, but aside from that, no.

I have core.autocrlf=input in my global config, which means files get checked out with the original line endings. Modern Windows editors (even Notepad) can all tolerate LF endings and won’t randomly insert CR if the file starts out with LF endings. I also tend to copy an existing file if I need to create a new file, just to make sure I don’t screw up.
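To make the effect of the input setting concrete: as I understand it, it converts CRLF to LF when text files are added to the repository and does no conversion on checkout. A rough Python sketch of that commit-side step follows. The function name is made up for illustration, and real Git also has to decide whether a file is text at all, which this skips.

```python
def normalize_to_lf(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF; lone CR bytes are left alone."""
    return data.replace(b"\r\n", b"\n")


dos_text = b"int main() {\r\n  return 0;\r\n}\r\n"
unix_text = normalize_to_lf(dos_text)
assert unix_text == b"int main() {\n  return 0;\n}\n"

# Content that already uses LF passes through unchanged.
assert normalize_to_lf(unix_text) == unix_text
```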

Thanks. Yeah, that makes sense. My question then is really whether we want this to be enforced in git itself with the .gitattributes config option (assuming it works the way I think it will, so that any files with DOS endings get converted to Unix endings as a part of committing the file).

I took a quick look at llvm/.gitattributes and it explicitly specifies line endings on certain files (and ‘binary’ for some others). It doesn’t specify a general default, though.
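For anyone who wants to repeat that check, here is a small Python sketch that looks for a catch-all rule in a .gitattributes file. It is a simplification under stated assumptions: it only handles plain whitespace-separated lines, ignores quoted patterns and macro definitions, and the function name and sample contents are hypothetical (the sample is not the real llvm/.gitattributes).

```python
def find_default_rule(gitattributes: str):
    """Return the attributes of the first catch-all `*` rule, or None if there is none."""
    for line in gitattributes.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attrs = line.split()
        if pattern == "*":
            return attrs
    return None


# Hypothetical contents: per-file rules only, no general default.
sample = """\
# per-file rules only
docs/example.rst text eol=lf
test/example.bin binary
"""
assert find_default_rule(sample) is None

# Adding a catch-all line such as `* text=auto` gives the file a default.
assert find_default_rule(sample + "* text=auto\n") == ["text=auto"]
```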

I know this topic has come up before, you could look to see if there’s an old RFC and see what conclusions it came to (if any). You can always put up a new one, too. There are definitely people with Opinions on this topic.

See "Finally formalise our defacto line-ending policy" by ldrumm · Pull Request #86318 · llvm/llvm-project · GitHub for a related PR. Not sure why that never landed.

Thanks. Maybe @ldrumm can comment on why it did not land.

On Tue Sep 24, 2024 at 4:20 PM BST, Rahul Joshi via LLVM Discussion Forums wrote:

> Thanks. Maybe @ldrumm can comment on why it did not land.

It never landed because I got distracted. I should have some bandwidth to get
this over the line very shortly. This thread is good incentive to get back to
it, so thanks. Feel free to comment on that PR if there are issues with the
implementation.

Great, thanks! Given that this seems to be a common recurrence, it would be great to land this (I myself committed 2 of these line ending fixes in the last 2 days).

I’ve just merged.

To github.com:llvm/llvm-project.git
   8c60efe94ba3..9d98acb196a4  main -> main

Thanks for the push to get this done and please let me know of any unforeseen consequences!


Thanks @ldrumm. I think this might deserve a PSA and a mention in LLVM Weekly so that it does not catch folks by surprise. @asb
