The reason is that when "FileCheck < %t %s" is called (line 2), the
lifetime of the handle to file "%t" is controlled by Windows. We don't
know exactly when it is released.
Then clang is called again (line 3) and because of revision 114187,
the clang frontend try to move a file to "%t" but Windows refuses
because the handle from the previous < redirection (line 2) is not
released.
Hence we get error from the bot like:
"error: unable to rename temporary
'D:/public/zorg/buildbot/osuosl/slave/clang-i686-xp-msvc9/llvm/build/tools/clang/test/CodeGenObjC/Output/image-info.m.tmp-000000'
to output file 'D:\public\zorg\buildbot\osuosl\slave\clang-i686-xp-msvc9\llvm\build\tools\clang\test\CodeGenObjC\Output\image-info.m.tmp':
'Can't move 'D:/public/zorg/buildbot/osuosl/slave/clang-i686-xp-msvc9/llvm/build/tools/clang/test/CodeGenObjC/Output/image-info.m.tmp-000000'
to 'D:/public/zorg/buildbot/osuosl/slave/clang-i686-xp-msvc9/llvm/build/tools/clang/test/CodeGenObjC/Output/image-info.m.tmp':
Access is denied."
is that clear?
well I can think of 1 solutions:
1: change all instances of
// RUN: FileCheck (blabla) < %t %s
by
// RUN: FileCheck (blabla.) -input-file %t %s
Wait, we can create a new %t (before r114187) but not move a file to it ?
Can you try the attached move_fix1.diff and, if it doesn't work, move_fix2.diff ?
Wait, we can create a new %t (before r114187) but not move a file to it ?
strangely yes.
none of your 2 patches make any difference. Your second patch doesn't
make sense. It delete the source file before renaming it. I suppose
you wanted to delete the destination file. I tried that, also doesn't
work.
Can you try the attached move_fix1.diff and, if it doesn't work, move_fix2.diff ?
The only way I found to make it work is to try to rename, then if it
fail, Sleep(50), then try again. Repeat 3-4 times. then it works.. But
clearly this is not a correct solution.
I just learned that lit has it own Shell parser/interpreter. The
problem might be fixed there I hope.
Wait, we can create a new %t (before r114187) but not move a file to it ?
strangely yes.
none of your 2 patches make any difference. Your second patch doesn't
make sense. It delete the source file before renaming it.
Oops!
I suppose
you wanted to delete the destination file.
Yup
I tried that, also doesn't
work.
Can you try the attached move_fix1.diff and, if it doesn't work, move_fix2.diff ?
The only way I found to make it work is to try to rename, then if it
fail, Sleep(50), then try again. Repeat 3-4 times. then it works.. But
clearly this is not a correct solution.
I just learned that lit has it own Shell parser/interpreter. The
problem might be fixed there I hope.
Can you look at the patch I posted on llvm-commit: doing a rename as a
copy+delete fixes the problem on Windows.
IMO, it is a better solution than disabling the temp code on Windows.
Disabling the new code means clang will behave differently on Windows
vs Unix. + introduce a platform #ifdef inside the driver.
The reason is that when "FileCheck < %t %s" is called (line 2), the
lifetime of the handle to file "%t" is controlled by Windows. We don't
know exactly when it is released.
I don't completely follow the sequence of events causing a failure. I
would need to look closer, but this sounds to me like either a clang
bug or a lit bug.
Presumably one of the two processes is either keeping a handle longer
than it should, or is creating a handle before it should.
I might find some time to investigate this today. Is it deterministic?
Is any particular test most likely to trigger it?
Can you look at the patch I posted on llvm-commit: doing a rename as a
copy+delete fixes the problem on Windows.
IMO, it is a better solution than disabling the temp code on Windows.
Disabling the new code means clang will behave differently on Windows
vs Unix. + introduce a platform #ifdef inside the driver.
I agree in principle, it is nice to have behavior be the same across
platforms. However, this is also intended to be a user-transparent QOI
improvement for Clang and it is closely tied to how Unix handles
files. If the underlying FS in Windows can't support the same QOI
tweak then it is reasonable to not do it.
Copy + delete seems suboptimal to me. It means Clang will be hitting
the FS twice as hard as need be.
My feeling is we should first figure out exactly why this fails, and
disabling the code is a good short term way to get the buildbot to
stop complaining while we investigate.
I might find some time to investigate this today. Is it deterministic?
Is any particular test most likely to trigger it?
The tests that trigger the failure are:
\tools\clang\test\CodeGenObjC\bitfield-access.m
\tools\clang\test\CodeGenObjC\image-info.m
\tools\clang\test\CodeGenObjC\metadata_symbols.m
\tools\clang\test\CodeGenObjC\ns-constant-strings.m
Some observation:
- It is not deterministic. Sometimes all the tests pass, most of the
time at least 1 of them will fail.
- The tests will always pass if run alone.
- The test will always pass if -j=1
- Doing a sleep long enough before renamePathOnDisk will fix the problem.
- changing "< %t %s" with "-input-file=%t %s" will fix the problems.
My theory is that lit, python or windows is holding a handle to '%t'
during the < redirect operation for too long which cause the second
run of clang to fail.
I committed the change (copy+delete) because Chris Lattner gave me the
ok and also to unblock the buildbot.
You are right this is suboptimal, I'll revisit the problem soon for a
clean solution.
Hi Daniel.. this lit patch fixes the problem.. I am still trying to
understand exactly why.. I moved the closing of the opened_files
before the procs[-1].communicate statement. I see a FIXME close by!!
any idea?