Yes, this is quite typical. The usual approach is for the writing system to write the file with a temporary filename extension, such as .tmp. The system that is polling the directory ignores files ending in .tmp. When the writing system is done writing the file, it renames the file to remove the .tmp portion. That indicates the file is ready to be picked up.
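For illustration, here is a minimal Python sketch of the writing side. The function name write_then_rename, the .tmp suffix default, and the file name report.dat are made up for this example; it also assumes the temporary name and the final name live in the same directory, so the rename stays on one file system.

```python
import os
import tempfile


def write_then_rename(final_path, data, tmp_suffix=".tmp"):
    """Write data under a temporary name, then rename it to final_path.

    Hypothetical helper: a poller that skips names ending in tmp_suffix
    only ever sees the file once the rename has happened.
    """
    tmp_path = final_path + tmp_suffix
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before publishing the name
    # os.replace renames the file, overwriting any existing final_path
    os.replace(tmp_path, final_path)
    return final_path


if __name__ == "__main__":
    # Demo in a throwaway directory (assumed layout, not from the thread)
    drop_dir = tempfile.mkdtemp()
    target = os.path.join(drop_dir, "report.dat")
    write_then_rename(target, b"example payload")
    print(os.listdir(drop_dir))
```

If the writer dies before reaching os.replace, only the .tmp name is left behind, which is exactly what the poller is told to ignore.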
Yes, I totally agree with you. This is something I need to check with the source system.
In the meantime, out of curiosity, I checked this behavior on Windows by copying a large 900 MB file into the FTP directory and running ftp mget against that directory. I found that Windows locks the file and does not allow access to it until it has been copied completely.
Different platforms manage files differently. The only mechanism that behaves the same way on all platforms is a file rename: as long as the temporary and final names are on the same file system, it is an atomic operation. The rename technique will work everywhere.
As a follow-up, even if the OS locks the file, a write-and-rename technique will prevent a partial file from being picked up. If the source system fails or is disconnected for some reason while writing the file, it will never do the rename. In that case the OS will release its lock on the file, but the file is incomplete. With the rename approach, your process will never see a partial file.
I’m not sure I understand which system you’re referring to. If you’re referring to the system that is writing the file, it knows when it is done and can then do a rename.
If you’re referring to the target system (the system that wants to pick up the file after it is ready), the indicator is that the file has been renamed to a pattern it is polling for. A rename is an atomic operation; once it completes, the file is ready to be opened and read by another process.
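On the target side, the check can be as simple as skipping anything that still carries the temporary suffix. The Python sketch below is only an illustration: the function name ready_files, the .tmp suffix, and the sample file names are assumptions for this example, not something taken from either system.

```python
import os
import tempfile


def ready_files(directory, tmp_suffix=".tmp"):
    """Return the sorted names of regular files that are safe to pick up.

    Hypothetical helper: a file still being written carries tmp_suffix
    and is skipped; it becomes visible here only after the writer's rename.
    """
    with os.scandir(directory) as entries:
        return sorted(
            entry.name
            for entry in entries
            if entry.is_file() and not entry.name.endswith(tmp_suffix)
        )


if __name__ == "__main__":
    # Demo: one finished file and one still "in progress" (made-up names)
    drop_dir = tempfile.mkdtemp()
    for name in ("done.dat", "inflight.dat.tmp"):
        with open(os.path.join(drop_dir, name), "wb") as f:
            f.write(b"x")
    print(ready_files(drop_dir))
```

A real poller would call something like this on a timer and then move or delete each file after processing it, so the same file is not picked up twice.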
My requirement is similar to what you described here. The source is Unix and the target is Unix, and I am following the same logic you suggested: the file is renamed to abc.xyz, and I pick it up only once the rename is complete.
The issue is that I sometimes still get partially completed files from the source folder.