Hey Reid, thank you for continuing to work on this policy; I think the current policy isn’t working as well as hoped, so clarification is greatly appreciated!
My personal position, based on PRs, comments, and issues I’ve seen in our community already, is that we should have a blanket ban on the use of AI. AI has already been extractive of our limited reviewer and triage resources, and I don’t think it provides enough value to be worth encouraging its use.

What’s more, at no point have my original concerns been addressed by our policies: there is no reasonable way for anyone involved in a PR or an issue to know whether AI-generated content has licensing issues, but we know for a fact that many AI tools are trained on inputs whose licenses are incompatible with our own. Yes, the same can be said of human-generated content, but we have decades of experience with humans, who don’t do this with the trivial ease that AI tools have been demonstrated to. IMO, it’s reasonable to trust a real human being to put thought into a contribution; that same trust cannot be placed in an AI tool when it comes to verifying that the tool did not produce something with copyright or license issues, because neither the reviewer nor the author has the information necessary to determine that.
> If a maintainer judges that a contribution is extractive (i.e. it is generated with tool-assistance or simply requires significant revision), they should copy-paste the following response, add the `extractive` label if applicable, and refrain from further engagement.
I like the ease of this approach – it’s a simple copy/paste plus adding a label. That’s a really nice property! However, I’m worried about the social aspects of this, in a few ways.

First, this has potentially significant friction, because it requires someone to make what amounts to an accusation. I think this leads to some contributors being comfortable with the guidance while other contributors won’t feel secure enough to make that accusation.

Second, this is going to be applied very inconsistently, and I think it may lead to unintended outcomes. Consider this (IMO) reasonable scenario: a company is paying someone to implement a feature it wants. They use an AI tool to generate a low-quality PR, it gets put up for review, and three reviewers are added to it: a maintainer for that area, the lead maintainer, and a coworker. One of the maintainers adds the extractive label to the review, both maintainers back off due to bandwidth… and now the coworker is the only one left and accepts the review, because they don’t find AI tools extractive (they’re required to use AI tools by their management). With our PR workflow, that PR is now ready to land, which seems like exactly the wrong outcome IMO.
+1 to this.
I think this is a case where having a dead-simple policy makes the most sense. I like the general idea of “extractive”, and I think that’s a good benchmark to measure against, but I think the policy itself should be more straightforward; I’d prefer a blanket ban on the use of AI.