[RFC] Specify a community policy for use of private links

AaronBallman · July 21, 2023, 5:04pm

For background information, please see the initial post on this topic and the discussion of the proposed wording.

tl;dr: this proposes a community policy on use of links to resources not accessible to the whole community, such as private bug trackers, private documentation, and other kinds of dead links. The proposal is to disallow such links in source code, test files, or documentation while allowing these links to be used for purposes of process automation, including for downstream consumers, in commit messages, code review comments, issue comments, or other forms of discussion.

Problem

The community has never had a documented policy on whether private links were permissible, and as such, an ad hoc policy arose allowing such links under certain circumstances because they were thought to be mostly harmless metadata. The “rule” was effectively to only use these links in commit messages (or other forms of discussion) or as comments within test files and only when the links add extra information about the test case. However, this policy was never official and was never enforced, which means there are a large number of these links that have no supporting context or that are present in source files, etc. Over time, this has caused several concrete issues:

The links introduce confusion where there should be clarity, which adds extra burden on the community. For example, code reviews where diagnostic behavior changes a test case and the only surrounding context in the test is a link to a private bug tracker. The patch author and the code reviewers then have to expend more effort determining the correct path forward but without a straightforward way to do so.
Private resources are often private for legal IP reasons; the community should not have a policy encouraging the release of private information. Further, because of the legal risks involved, the information behind the links is often not released upon community request.
Allowing links to private resources is a relatively novel activity for an open source project. So these links are especially problematic for new contributors who have not encountered this practice before. We should not introduce needless barriers to entry (however small they may be) for newcomers without significant benefit to the community as a whole.
It doesn’t scale. Currently, only one company has been adding private links and so the scaling issue hasn’t been a problem. However, we have dozens of corporate downstreams and thousands of individual contributors and forks – we cannot allow all of them the ability to add links to private resources without negatively impacting our ability to maintain projects and it would be inappropriate to allow only one company to introduce these links.

Solution

The proposal is that, moving forward, links to private resources should not be added to source code, test files, or documentation (basically, anything that gets committed to the repo). Instead, please add sufficient comments to obviate the need for a link in the first place. However, these links are still allowed in commit messages, code review, issue comments, discussions, etc when helpful. This can be useful for downstream process automation and isn’t significantly different from other forms of metadata we allow in these contexts.

No project is required to remove existing private links. Like any coding style change, link removal can happen organically or on an as-needed basis. It is acceptable to remove any such link as needed without getting prior authorization from the link owner: either there is sufficient surrounding comments that the link is superfluous or there is not sufficient surrounding comments and the link is not providing value to the community in that case. However, if removal of a link causes a specific problem in practice (e.g., breaks a script somewhere), the link should remain and the owner of the link should work to resolve the issue such that the link can be removed.

cor3ntin · July 21, 2023, 5:07pm

Sounds good to me, for all the reasons previously stated in the previous, comprehensive thread.

rjmccall · July 21, 2023, 5:49pm

I definitely agree that (1) we shouldn’t allow private links in source code and (2) we should allow private links in commits. However, I’m ambivalent about disallowing private links from tests. I understand the frustration about seeing a commit that changes a test and only adds a private link as explanation, but I don’t see how seeing the exact same commit without the private link is an improvement. I don’t think developers are actually going to think, “Oh, I can’t leave a private bugtracker link, I’d better leave a detailed textual comment explaining why I’m changing this test.” The general expectation is that commit messages explain why the change was made, and people rarely think about leaving long explanatory comments in tests; to the extent that people fall short of the former, they’re also going to fall short of the latter. I think we should focus on a single, more achievable goal of encouraging better commit messages instead of also having high expectations about having extensive comments in every test. With that said, though, if the expectation is that people will look at the rest of the commit to understand why a test change was made, they should be able to find a private link in the commit message instead of expecting to see it in the test, so I don’t have very strong opinions about this.

reinterpretcast · July 21, 2023, 8:01pm

Having a comment to say what about a test is actually being tested will save people time on running git blame on the test. Additionally, a comment on the test helps to ensure that changes are being made with the purpose of the test in mind. This comes into play in scenarios where there is a choice of forcing a test to run under certain options or modifying the code to be more generically compatible. If there is no comment and the test is doing something (not necessary integral to its purpose) now considered non-portable or outdated, then it is far more likely that the test will be updated to run as something “niche”/“outmoded”. Once the test starts collecting “option cruft”, the history of the file will encourage future maintainers to add even more “option cruft”.

Now, an extensive comment is not really the ask: A comment can be a link, but it should be to a resource that people can expect to access (even if it is not yet available due to timing).

reinterpretcast · July 21, 2023, 8:05pm

@AaronBallman, I think that C++ committee papers/issues are now available in a timely manner, right? Otherwise, we may want an exception to links to such resources that are not live yet.

shafik · July 21, 2023, 8:22pm

@AaronBallman, I think that C++ committee papers/issues are now available in a timely manner, right? Otherwise, we may want an exception to links to such resources that are not live yet.

I think we need an exception for wg21.link content that has not become available yet. The chance that someone will go back and edit the link once it becomes available is likely zero. That context is extremely valuable.

reinterpretcast · July 21, 2023, 9:21pm

@AaronBallman, any thoughts on what to do about committee reflector references?

clang/lib/Sema/SemaTemplateDeduction.cpp:      // FIXME: Per core-2016/10/1019 (no corresponding core issue yet), permit
clang/test/CXX/expr/expr.const/p2-0x.cpp:// bullet and instead prohibit casts from pointers to cv void (see core-20842
clang/test/CXX/expr/expr.const/p2-0x.cpp:// and core-20845).
clang/test/SemaCXX/cxx1y-variable-templates_in_class.cpp:    // that this is intentional. See PR17294 and core-24030.

Generally, I believe that such references are integral to explaining why the comment is there: The comment should provide enough information, but the fact that there was a specific discussion via committee channels means that whatever was done does not simply have to “stand on its own”.

rjmccall · July 21, 2023, 9:32pm

If you want to argue that we should adopt a policy of requiring explanatory comments in tests, that needs to be a separate policy RFC. But we shouldn’t start from a premise that comments like that will naturally arise if we disallow private links. In practice, people do not write them.

pogo59 · July 21, 2023, 10:48pm

Well, as software developers we generally don’t do so great at testing (e.g. ⚙ D155539 [CUDA][HIP] Use the same default language std as C++ managed to change a test in a way that keeps it from verifying exactly what default language std clang actually uses). So I say, whatever makes tests easier to figure out is a Good Thing.

AaronBallman · July 22, 2023, 12:22pm

I think I may have misspoke somewhat with the solution part of the proposal when I said “sufficient surrounding comments”; I intended that more as “sufficient surrounding context”. e.g., comments are great (and preferred when appropriate), but so long as the blame gets you back to a commit message with more details, which perhaps gets you back to a review thread with more details, which perhaps gets you back to an RFC with more details, etc – that’s what’s important. What I want to prevent is crumb trails with avoidable dead ends that cause frustration or gatekeeping exercises.

AaronBallman · July 22, 2023, 12:27pm

Good point! Yeah, I don’t think it’s an issue to have links which are expected to be publicly available in a timely manner (by the next scheduled committee meeting mailing, perhaps?). If the link ends up not being available for some reason (which would be pretty rare, I believe), it can be removed as a dead link once someone notices it.

AaronBallman · July 22, 2023, 12:42pm

Thank you for raising this, I hadn’t thought of those!

These are a bit harder in some ways. The committee reflectors are closed to non-committee members, as are committee meeting minutes, which makes them inaccessible to most of the community. From that perspective, it seems like they should probably follow the same guidance as any other private link.

However, the same is true the standard itself – the published document is not freely available (there are unofficial drafts that you can use to work around that, but they’re still unofficial sources of information). I don’t intend to prohibit standards citations.

I think standards citations are always fine to add to source, tests, etc. because there’s unofficial documents that are usually “close enough” that the information is sort of generally available to the community. I think reflector message numbers, mentions of meeting minutes to go look at, etc are perhaps a bit more situational but should typically follow this proposed policy. Just like we shouldn’t encourage people to release private IP from their bug tracker, we shouldn’t encourage committee members to share (e.g.) private meeting minutes when the committee’s expectation is that those are private. Still totally fine to mention them in review discussion, commit messages, etc. WDYT?

tschuett · July 22, 2023, 1:45pm

It is surprising to me that this is such a hot topic. Disallow is a bit harsh to me. I prefer SHOULD NOT, which is a well defined term.

reinterpretcast · July 22, 2023, 4:17pm

On further reflection^[1], I agree that committee reflector references are too opaque: A reference does not (in itself) indicate what the reflector reference represents. For example, it could represent agreement that a specific interpretation of the existing wording is correct. In another case, it could be just the reporting of the issue by the Clang developer.

I am fine with prose comments for some of these cases. I still believe, however, that some standardized marker in the source would be useful for things still up in the air, e.g., [committee clarification requested YYYY-MM-DD; issue pending].

(pun intended)^[2] ↩︎
On that (foot)note, the compile-time programming study group of the committee should really look into having a meta-reflector. ↩︎

cor3ntin · July 23, 2023, 7:32am

In addition to prose, I think it’s easy enough to track that in llvm issues, and now we can also reference cwg on githubs even if they don’t have a number associated to them. I agree that reflector or wiki links in the code do not do anything for the community

bogner · July 23, 2023, 8:51am

I didn’t comment on the first RFC since it seemed to me that everyone agreed this is a good policy to codify, but given that we apparently need a second RFC I’ll state for the record that I strongly agree with the direction here. Thank you for spending so much of your time on this.

AnnaZaks · July 23, 2023, 8:54pm

Creating an official policy to discourage private links in code and documentation is a great initiative as it encourages the codebase to more self-contained and better documented. Thank you for looking into ways to improve that!

Since test cases need to be reduced, they sometimes benefit from storing full context, which could very well be private. (As Doug explained, this is why the current policy allows these.) However, that link could be stored indirectly by referencing the private bug tracker in open source bug report or in a commit message. This indirection would help ensure that people do not rely on the link to be the sole source of documentation for the test and could lead to better documentation overall. The main exception are the tests that require specific platform support (such as LLDB, sanitizers) as they would be impacted much more than others by the proposed change in policy. Also, associating a radar link with a platform-specific XFAIL seems pretty harmless, as in unlikely to cause confusion to others.

My main concern with this proposal is that it can lead to bulk removal of the existing links, which contradicts the goal of having better understanding of the codebase. Most of the people who added the links and implemented large parts of the codebase no longer work on the project, so they are not around to answer questions. The radar links could be very valuable in providing insight to people trying to understand the past decisions (Louis described this experience for libc++). Ideally, the existing links are replaced by better documentation but the scale of it (2K links in clang alone) makes it infeasible to do in a short period of time.

Some portion of the community does have access to the information. Note, there are many newcomers in that subgroup as well and they currently benefit from these extra references the most. One alternative to removing existing links and loss of the information (to the community as a whole) is to establish a more formal and documented collaboration channel. The engineers having access to the additional information could assist those seeking help. Over time the community as a whole should strive to remove these links and ensure they are replaced with better documentation over time. I do not think this type of collaboration would lead to release of private information as it’s not something we’ve seen happen in practice.

The following sentence reads unfriendly and possibly unnecessary to me:

“It is acceptable to remove any such link as needed without getting prior authorization from the link owner: either there is sufficient surrounding comments that the link is superfluous or there is not sufficient surrounding comments and the link is not providing value to the community in that case.”

I assume it’s common sense that code and links can be removed to follow established LLVM policies without ability of their “owners” to block that. However, reaching out to people impacted and giving them time to react (allowing for vacations and sick time) would lead to both positive community experience for all and better codebase. The link owner could improve the codebase replacing it with more useful information and also use the opportunity to understand the established policy making them a stronger overall contributor.

AaronBallman · July 24, 2023, 11:56am

The other thread made me question whether that common sense is shared by all, so I wanted to be explicit that it’s fine (but no one is obligated!) to remove old links and that nobody should say “you didn’t ask me first” in these cases.

100% agreed; if removal of a link causes some actual problems, we need to work together to resolve the situation to everyone’s satisfaction. But I think the onus is on the link owner to say “hey, this caused me problems, can we do X instead?” as post-commit feedback instead of on the patch author to ask “is it okay if I remove this?” requiring pre-commit review. (The theory being that any project-wide problems will be identified during the RFC or quickly after landing a cleanup patch should one be committed, so any remaining problems are likely to be uncommon and highly situational.)

rnk · July 25, 2023, 8:34pm

+1 to the original proposal, I think allowing private links in commit messages, issue comments, and code reviews is a reasonable compromise.

tahonermann · July 26, 2023, 3:17pm

+1 to the original proposal from me as well. I recall being confused by the rdar links when I first started working on Clang and then dismayed when I learned I couldn’t follow them. To a new comer, it is far from obvious who to ask and the move away from mailing lists to Discourse with all of the categories present there doesn’t make it any easier.

Searching for information about rdar links is not particularly helpful. Searches might lead to Open Radar which at first glance appears like it might be helpful, but generally is not. Its FAQ is an interesting read; its existence is a testament to the frustration people are experiencing.

Open Radar does offer an alternative for Apple employees that would like to embed links to rdar issues in source and tests; they can link to a corresponding entry in Open Radar instead.

Other projects appear to be encountering similar confusion and/or frustration regarding rdar links. An example of this can be found in a post to a Swift forum.

Topic		Replies	Views
[Code Review] Reminder about links in code & commit messages LLVM Project clang	86	4139	July 21, 2023
[RFC] A proposal for project governance LLVM Project clang , llvm	1	813	December 9, 2023
RFC: Define policy on AI tool usage in contributions LLVM Project	35	1698	September 10, 2024
RFC: Introducing an LLVM Community Code of Conduct LLVM Dev List Archives	140	209	October 15, 2015
RFC #3: Improving license & patent issues in the LLVM community LLVM Dev List Archives	28	117	May 1, 2017

[RFC] Specify a community policy for use of private links

Problem

Solution

Related topics