Is "*- C++ -*-" in header files still relevant?

The LLVM coding standard recommends adding a marker to header files to tell Emacs that the file is a C++ header rather than a C header:

LLVM Coding Standards — LLVM 20.0.0git documentation

It seems that this is applied inconsistently in the code. We have .cpp files that use this tag (likely copy-paste) where it’s not needed, and header files that are missing it. My question is: is this recommendation still relevant for Emacs users? Within the LLVM repo, I get:

find . -name '*.h' | xargs head -q -n 1 | wc -l
14657

find . -name '*.h' | xargs head -q -n 1 | grep "\- C++ \-" | wc -l
8836

find . -name '*.cpp' | xargs head -q -n 1 | wc -l
32930

find . -name '*.cpp' | xargs head -q -n 1 | grep "\- C++ \-" | wc -l
1833

This is probably an overestimate because it includes clang and its test files. Within llvm, 2677 out of 3104 headers have the marker, and 727 .cpp files have it unnecessarily.

Yes, it’s still relevant and only needed in .h files.

I think if it’s important, then we should have some way of enforcing it. Maybe a GitHub action could help here.

I was also wondering whether clang-format could do it automatically under an option.
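
For illustration, the rule as stated above (marker in .h files, no marker in .cpp files) could be checked with something like the following. This is only a rough sketch: the function name `check_marker` and the messages are made up for this example, it is not an existing LLVM script, and it makes no attempt to exempt headers that really are C.

```python
import re
from typing import Optional

# Matches the Emacs mode marker "-*- C++ -*-" anywhere on a line.
MARKER = re.compile(r"-\*-\s*C\+\+\s*-\*-")


def check_marker(filename: str, first_line: str) -> Optional[str]:
    """Return a complaint, or None if the first line follows the rule.

    Hypothetical helper: assumes every .h file is C++ and that the
    marker, if present, is on the first line of the file.
    """
    has_marker = MARKER.search(first_line) is not None
    if filename.endswith(".h") and not has_marker:
        return f"{filename}: header is missing the C++ marker"
    if filename.endswith(".cpp") and has_marker:
        return f"{filename}: .cpp file has an unnecessary C++ marker"
    return None
```

A CI job could call this on the first line of each changed file and fail if any complaint comes back.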

I use Emacs as my primary editor and I would not mind if we got rid of these markers, because Emacs has a heuristic to detect whether a .h file is C or C++ and it seems to work the vast majority of the time. As a quick experiment I tried loading all the headers without a C++ marker (ignoring files in test/ and unittests/), and the only ones that were autodetected as C were:

cmake/unwind.h
include/llvm-c/*
include/llvm/WindowsDriver/MSVCSetupApi.h
lib/ExecutionEngine/IntelJITProfiling/ittnotify_config.h
lib/ExecutionEngine/IntelJITProfiling/ittnotify_types.h
lib/ExecutionEngine/IntelJITProfiling/jitprofiling.h
lib/Support/BLAKE3/blake3_impl.h
lib/Support/BLAKE3/llvm_blake3_prefix.h
lib/Support/regex2.h
lib/Support/regex_impl.h
lib/Support/regutils.h
lib/Support/rpmalloc/rpmalloc.h
lib/Support/rpmalloc/rpnew.h
lib/Target/AMDGPU/Utils/AMDKernelCodeTInfo.h
lib/Target/Hexagon/HexagonDepMask.h
lib/Target/NVPTX/cl_common_defines.h
tools/llvm-c-test/llvm-c-test.h
tools/llvm-objdump/ObjdumpOptID.h

Many of these really are C, so the number of misdetections seems tiny: these came from ~435 headers without a marker, out of ~3024 headers in total.

I wouldn’t mind getting rid of the filename in the first line, too. It is a constant source of copy & paste errors and doesn’t convey any information.

The first line should also contain a brief description of the contents of the file, but often a description just doesn’t fit in the space and ends up saying nothing useful.
Personally, I just don’t add the description and have never seen any complaints.

Thanks. So, should we propose that change in the coding standard doc (eliminate the C++ tag in .h files and the filename in the header line), so that newly added files do not need to follow it? The description can be optional.

//===--- <description>? -----------===//

I’ve never seen this work

Indeed it works! I wasn’t aware of this. It seems to have been introduced in 26.1 as the function c-or-c++-mode. Independent of that, I don’t really see why Emacs should be treated specially given how vastly configurable it is, although I myself use Emacs almost exclusively for editing code.
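
To give a feel for how a content-based guess like this can work, here is a rough sketch. It is not Emacs's actual regexp: the hint list and the name `guess_mode` are assumptions made up for this example, and the real c-or-c++-mode may look at different constructs.

```python
import re

# Hypothetical, simplified list of constructs that only occur in C++.
# Each alternative must appear at the start of a line (after whitespace).
CPP_HINTS = re.compile(
    r"^\s*(?:"
    r"using\s+(?:namespace\s+\w+|\w+(?:::\w+)+)\s*;"
    r"|namespace(?:\s+\w+)?\s*\{"
    r"|class\s+\w+"
    r"|template\s*<"
    r"|(?:public|private|protected)\s*:"
    r"|#\s*include\s*<(?:string|iostream|map|vector|memory)>"
    r")",
    re.MULTILINE,
)


def guess_mode(text: str) -> str:
    """Guess "c++" if any C++-only construct is found, else "c"."""
    return "c++" if CPP_HINTS.search(text) else "c"
```

A header with none of these constructs falls back to C, which matches the kind of files in the list above. The sketch does not skip comments, so a comment line that starts with a word like "class" would be misread as C++.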

I would prefer an empty line in the banner … no file name, no description, no mode indicator for Emacs.

I support this proposal.

This header line has been the source of a bunch of copy-paste errors on my part, and I would love to see it removed.

+1 to removing the header line altogether; it is annoying to keep up to date.

//===----------------------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

Just the license should be enough.

No. My Emacs (29.4, lsp-mode and clangd-20 if that matters) can figure it out.
But while working on all these files, would it not make sense to split them between, for example, .h and .hpp according to the language?

No. Changing the top line of the header files is already a fair amount of churn. Renaming the files and therefore changing the million places where they are #included would be pretty intolerable.

Yes, you are right, unfortunately.

Thanks all. Let me put up a PR with this change in the coding standard doc and see how it goes.

The coding standard amendment is now committed: [NFC] Eliminate need of Emacs tag and file name in file header (#118553) · llvm/llvm-project@e2c3d16

We can now update the file headers over time.
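
As a rough idea of what such an update could look like, the sketch below swaps an old-style first line for a plain 80-column rule like the one in the license-only banner quoted earlier in the thread. The name `modernize_first_line` is made up for this example, and the exact target format is an assumption; check the amended coding standard before using anything like this.

```python
# Plain rule line: "//===" + 70 dashes + "===//" = 80 columns (assumed).
PLAIN_RULE = "//===" + "-" * 70 + "===//"


def modernize_first_line(source: str) -> str:
    """Replace an old-style banner first line with a plain rule line.

    Hypothetical helper: any first line that starts with "//===" and
    ends with "===//" is treated as a banner line. Other files are
    returned unchanged.
    """
    first, sep, rest = source.partition("\n")
    stripped = first.rstrip()
    is_banner = stripped.startswith("//===") and stripped.endswith("===//")
    if is_banner and stripped != PLAIN_RULE:
        return PLAIN_RULE + sep + rest
    return source
```

Note that this drops whatever description was on the first line, so a real migration would probably want to move a useful description into the file's doc comment first.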
