We recently discovered a significant number of spam user accounts on Phabricator. Any ideas on how they could have been created, despite our requiring a GitHub/Google account, would be appreciated.
I don’t think a GitHub/Google account is actually required? Unless that changed… I know my phab account isn’t tied to GitHub, and I’m pretty sure it isn’t tied to a Google account. Honestly, it was so long ago that I don’t remember how I created it!
This changed because of the spam, back in October. But these accounts are Google accounts, so I have now turned on email verification for Google/GitHub accounts!
We should first look into existing contributors with a Google login. Otherwise, what may be more practical (but I don’t know if there is an option for this) would be to block only new users from registering with a Google account.
I just had a quick look and disabled another 5 (or so) Phabricator accounts that were created today. It seems we still have a significant influx of new spam accounts.
I’m less concerned about no longer letting new Google accounts register; I think that’s fine (but is it possible? Do we need to patch the code?).
My concern is more about removing Google auth entirely, because of the existing contributors who use it. So the data to pull from SQL would be “how many Google accounts are actively contributing”, to gauge the impact of removing Google auth entirely (for new and existing accounts).
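For what it’s worth, here is a rough sketch of the kind of query I have in mind. It does not use Phabricator’s real schema: the `external_account` and `activity` tables, their columns, and the sample rows are all made up for illustration, and it runs against an in-memory SQLite database rather than the production MySQL one. The idea is only to count distinct users who have a linked Google account and at least one recorded activity since some cutoff date.

```python
import sqlite3


def count_active_google_users(conn, since):
    """Count distinct users with a linked Google account and at least
    one activity row dated on or after `since` (ISO 'YYYY-MM-DD')."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT ea.user_id)
        FROM external_account AS ea
        JOIN activity AS a ON a.user_id = ea.user_id
        WHERE ea.account_type = 'google'
          AND a.created_at >= ?
        """,
        (since,),
    ).fetchone()
    return row[0]


def demo_connection():
    """Build a throwaway database with a hypothetical schema and data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE external_account (user_id TEXT, account_type TEXT);
        CREATE TABLE activity (user_id TEXT, created_at TEXT);
        INSERT INTO external_account VALUES
            ('alice', 'google'), ('bob', 'github'),
            ('carol', 'google'), ('dave', 'google');
        INSERT INTO activity VALUES
            ('alice', '2021-11-02'), ('alice', '2021-11-20'),
            ('bob',   '2021-11-05'), ('carol', '2020-01-15');
        """
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(count_active_google_users(conn, "2021-01-01"))
```

Whatever counts as “actively contributing” (reviews, comments, commits) would decide which real table stands in for `activity`, and the cutoff date is a judgment call.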
Is there an approval mode for Phabricator accounts? I hate to add more barriers, but we had to do this with Bugzilla and Mailman, basically because of spam.
Ideally, it would be good to get rid of the really bad ones on the site. It sounds like deletion is not straightforward?
Deletion is fairly easy, but it requires collecting all the usernames first; after that it is one command over SSH. I’m happy to run the command if someone provides the list of usernames!
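If it helps whoever collects the list, here is a small sketch that turns a list of usernames into commands that can be reviewed before anyone runs them. The thread does not name the command, so this assumes it is Phabricator’s `bin/remove destroy` with `@username` arguments; the install path, the username pattern, and the helper name are placeholders of mine. It only builds strings and never executes anything.

```python
import re
import shlex

# Assumed-safe username characters; adjust to whatever the instance allows.
USERNAME_RE = re.compile(r"^[A-Za-z0-9._-]+$")


def build_destroy_commands(usernames, phab_root="/path/to/phabricator"):
    """Return one shell command string per unique, well-formed username.

    Assumption: accounts are removed with `bin/remove destroy @name`.
    Nothing is executed here; the output is meant for human review.
    """
    commands = []
    seen = set()
    for raw in usernames:
        name = raw.strip().lstrip("@")
        if not name or name in seen:
            continue  # skip blank lines and duplicates
        if not USERNAME_RE.match(name):
            raise ValueError(f"suspicious username: {raw!r}")
        seen.add(name)
        tool = shlex.quote(phab_root + "/bin/remove")
        target = shlex.quote("@" + name)
        commands.append(f"{tool} destroy {target}")
    return commands


if __name__ == "__main__":
    for cmd in build_destroy_commands(["spam1", "@spam2", "spam1"]):
        print(cmd)
```

Rejecting anything that does not look like a plain username is deliberate, since the list would be pasted into a shell on the server.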
I also just found the setting to disable registration for Google accounts and turned it on. There shouldn’t be any new Google accounts now; we’re only gonna have registrations via GitHub accounts, if everything works as intended.
I expect these folks are doing this primarily for SEO benefit. Phabricator does not set rel="ugc" or the older rel="nofollow" on outgoing user-generated links (see Google’s docs). So, we’re giving our search-engine ranking credit to all these spam sites.
I don’t know whether we’ve gotten onto a specific list of “good sites to spam for SEO ranking”, or if it’s a more general “phabricator sites are good for this” kind of thing. I wouldn’t be surprised by either.
It looks like right now there’s not too much comment spam, but there are a TON of accounts that have profiles with spam links in them. (This, at least, doesn’t actively spam users of the site, but we’re still giving those spammers SEO-juice, unfortunately.)
Unfortunately, phab doesn’t have an option to add the rel=ugc attribute. I think we’ll likely need to continue removing spam accounts in any case, but hopefully it’ll help if we can patch the remarkup->html translator in phab to add the attribute for explicitly-specified link targets.
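To make the idea concrete, here is a minimal sketch of the transformation, written in Python purely for illustration. It is not Phabricator’s remarkup code (which is PHP), and the class name, function name, and host names are made up. It post-processes rendered HTML and sets rel="ugc nofollow" on anchors that point at another host, leaving relative and same-host links alone.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlparse


class UgcLinkRewriter(HTMLParser):
    """Re-emit HTML, marking external <a> tags as user-generated content."""

    def __init__(self, own_host):
        super().__init__(convert_charrefs=False)
        self.own_host = own_host
        self.parts = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        host = urlparse(href).netloc
        if tag != "a" or not host or host == self.own_host:
            # Not an external link: keep the tag exactly as written.
            self.parts.append(self.get_starttag_text())
            return
        # Drop any existing rel value and add our own.
        kept = [(k, v) for k, v in attrs if k != "rel"]
        kept.append(("rel", "ugc nofollow"))
        rendered = ""
        for key, value in kept:
            rendered += ' {}="{}"'.format(key, escape(value or "", quote=True))
        self.parts.append("<a" + rendered + ">")

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.parts.append("</" + tag + ">")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&" + name + ";")

    def handle_charref(self, name):
        self.parts.append("&#" + name + ";")

    def handle_comment(self, data):
        self.parts.append("<!--" + data + "-->")


def add_ugc_rel(html, own_host):
    """Return `html` with rel="ugc nofollow" set on external links."""
    rewriter = UgcLinkRewriter(own_host)
    rewriter.feed(html)
    rewriter.close()
    return "".join(rewriter.parts)


if __name__ == "__main__":
    sample = '<a href="https://spam.example/buy">cheap stuff</a>'
    print(add_ugc_rel(sample, "reviews.example"))
```

A real patch would more likely set the attribute where the link is generated rather than rewrite HTML afterwards, but the rule is the same: any link target a user typed gets the attribute.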
Well, we could require a manual review/approval step for new accounts. Phabricator has an option for that. However, I’m not sure how we would weed out spam accounts, or who would have time for that.
In parallel, I’m in contact with Google’s identity team. They offered some help in identifying the spam accounts that are based on Google accounts…