Updated polling logic in LockFileManager

Hi there. I submitted a patch I llvm that fixed polling logic in LockFileManager
https://reviews.llvm.org/D70563

This patch should fix
https://bugs.llvm.org/show_bug.cgi?id=20794

There were no activity so I decided to write directly to mail list

Is there anyone who can take a look? Thanks.

P.S. Are there some additional actions needed for patch to be reviewed? Thanks

Not much for it - it can take a while for folks to get around to reviews. Usually if you haven’t heard from anyone within a week it’s appropriate to ping the review thread (just reply with the word “ping” or similar) and/or see if you can find some other more appropriate/available reviewers you might be able to rope in. Though looks like you’ve probably got the right folks on that already

Hi there. I submitted a patch I llvm that fixed polling logic in LockFileManager
https://reviews.llvm.org/D70563

This patch should fix
https://bugs.llvm.org/show_bug.cgi?id=20794

There were no activity so I decided to write directly to mail list

Is there anyone who can take a look? Thanks.

P.S. Are there some additional actions needed for patch to be reviewed? Thanks

I recently reviewed https://reviews.llvm.org/D69575 which solves the same problem and seems simpler. Is there anything in your patch not covered by D69575?

  • Michael Spencer

Well, Yes.
There are a few differences with the https://reviews.llvm.org/D70563 solution.
First of all, in D70563, in case if file will be deleted between open and subscription to events, then there’ll be maximum timeout. it seems that kqueue is not validating file descriptor, so, in case if file descriptor is already invalid, listening for a kqueue won’t give any errors. It will just wait for a timeout and won’t receive any events.
This behavior can be simply simulated in the code sample, provided in https://reviews.llvm.org/D70563

This is why in D70563 registration to the queue and listening to are two different calls. So logic in the path is

  1. Register for events (now queue will receive events, but we haven’t setup listeners to it yet)
  2. Check if lock file was deleted
  3. Start receiving events

This logic allows preventing race condition if the file was deleted in between open/subscription calls.
It’s very unlikely to have this race condition, but 90 seconds is pretty long time.

Second. in D70563 there’s an additional event we’re listening to check if the process, lock owner has exited. So, in case if owner has died, but for some reason hasn’t deleted lock file, we’ll exit from the waiting queue.
D69575 solution will still wait for file to be deleted for a whole 90seoonds timeout.

I’m not sure, how often this lock owner dies without unlocking, but this was in the original code, so I implemented this part as well.

I seem that my patch is duplicate of https://reviews.llvm.org/D69575, It’s just I covered some corner cases.
I think the best way to solve this is to cover some race conditions in D69575 or simply ignore those corner cases if those are unlikely to happen.
I left some comments with explanations in https://reviews.llvm.org/D69575

There are seems no activity on this patch https://reviews.llvm.org/D69575
What is the best potential solution in this case?
Should I update own patch, or wait until the original will be merged in and then apply fix?

Thanks in advance