How to prevent the webMethods scheduler from picking up a partially copied file

Hi,

We have a scheduler that picks up a file from a shared drive. The file is copied there by an external application. How can we prevent the scheduler from picking up a partially copied file, one that the external application is still writing?
Thanks in advance.

Probably the simplest way is to have the job that writes the file save it to a staging folder and then move it to the final folder. If the two folders are on the same drive, moving is a pretty quick operation.

You could also have the source system signal webMethods in some way, maybe by launching the job itself rather than having it scheduled.

Finally, maybe have the source system write a second, tiny file whose purpose is to signal webMethods that the big file is done.
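
To make the signal-file idea concrete, here is a minimal sketch in Python. It is only an illustration of the pattern, not webMethods code: the `.done` suffix and the `publish` and `ready_files` names are made up for this example. The writer creates the marker only after the data file is fully written, and the reader ignores any data file that has no marker yet.

```python
from pathlib import Path
from typing import List

DONE_SUFFIX = ".done"  # hypothetical naming convention for the signal file


def publish(folder: Path, name: str, payload: bytes) -> None:
    """Writer side: write the data file first, then create the tiny marker."""
    (folder / name).write_bytes(payload)
    (folder / (name + DONE_SUFFIX)).touch()


def ready_files(folder: Path) -> List[Path]:
    """Reader side: return only data files that have a matching marker."""
    ready = []
    for marker in sorted(folder.glob("*" + DONE_SUFFIX)):
        if marker.name == DONE_SUFFIX:
            continue  # a marker with no data file name in front of it
        data = marker.with_name(marker.name[: -len(DONE_SUFFIX)])
        if data.is_file():
            ready.append(data)
    return ready
```

The reader also has to clean up the marker after processing, which is one of the extra chores this approach brings with it.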

Good options.

The key for the move option to work is that the move MUST be on the same volume. A move (or rename) is an atomic operation only when the source and target names are on the same volume. If they are on different volumes, the OS will copy and then delete, which reintroduces the “picked up before complete” issue.

The move does not need to be a directory change. Any filename that the retrieving system is not looking for can be used as the first filename. Then rename it to what the retrieving system is looking for. Example: the retrieving system is looking for *.txt files. Write foo.tmp first, then rename it to foo.txt.
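
As a rough sketch of the writer side of the rename technique, assuming the writing application is something you can change and using Python just for illustration (`write_then_rename` and `same_volume` are hypothetical helper names):

```python
import os
from pathlib import Path


def write_then_rename(final_path: Path, payload: bytes) -> None:
    """Write under a name the reader ignores, then rename to the real name."""
    tmp_path = final_path.with_suffix(".tmp")  # foo.txt -> foo.tmp
    with open(tmp_path, "wb") as fh:
        fh.write(payload)
        fh.flush()
        os.fsync(fh.fileno())  # push the bytes to disk before the rename
    # Same directory, so same volume: the rename does not copy any data.
    os.replace(tmp_path, final_path)


def same_volume(a: Path, b: Path) -> bool:
    """Compare device ids; only worth checking if staging is a separate folder."""
    return os.stat(a).st_dev == os.stat(b).st_dev
```

Calling `write_then_rename(Path("foo.txt"), data)` leaves either no foo.txt or a complete one. How atomic the rename really is on a network share depends on the file server, so treat that part as an assumption to verify in your environment.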

We’ve done all 3. The order of our preference:

  • Call the transfer server to indicate a file is ready. Avoid polling altogether. Or have the source system put the file – wM IS supports a couple of different ways to accept a file being put to it and running a service to process it.
  • Use the rename technique.
  • Use a sentinel file. This usually introduces other headaches and we try to avoid it like the plague. :)

Another option is to configure the file stability check in your Find action. You can either skip files that are being updated or scan multiple times until the update completes. When this is configured, the file size is checked at a set interval, and only files whose size has not changed since the last scan are picked up.

Please note that this check carries a performance hit, and file processing is delayed while the stability check runs. The delay is even larger if you are picking up files from a remote server.
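
If you want to see the idea outside the product, a stability check boils down to something like the Python sketch below. This is not how the Find action is implemented internally, just the general technique. The `is_stable` name, the default interval and the extra modification-time comparison are assumptions for the example.

```python
import os
import time


def is_stable(path, interval_s=2.0, checks=2, sleep=time.sleep):
    """Return True if size and mtime stay unchanged across `checks` intervals."""
    last = os.stat(path)
    for _ in range(checks):
        sleep(interval_s)  # this wait is where the processing delay comes from
        cur = os.stat(path)
        if (cur.st_size, cur.st_mtime_ns) != (last.st_size, last.st_mtime_ns):
            return False  # still being written, try again on a later scan
        last = cur
    return True
```

With the defaults, every candidate file costs at least four seconds before it is picked up, which shows why the delay adds up when there are many files.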

Ah, forgot about that one. We’ve done that too. Keep in mind that “file stability” is an imperfect check. If the app/system that is writing the file fails abruptly, the file will be stable but incomplete. Most of the time it is fine, but if the system writing the file is not “reliable”, this approach can still result in incomplete files.

Adding to the earlier list of preferences:

  • Call the transfer server to indicate a file is ready. Avoid polling altogether. Or have the source system put the file – wM IS supports a couple of different ways to accept a file being put to it and running a service to process it.
  • Use encryption. The decrypt will fail if the file is incomplete.
  • Use the rename technique.
  • Check file stability, realizing that a file might still be incomplete due to interruption while the file was being written.
  • Use a header and/or trailer record in the file. Check that it is present/complete. (Not usually doable, as many flat file exchanges don’t use header/trailer records.)
  • Use a sentinel file. This usually introduces other headaches and we try to avoid it like the plague. :)
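
For the header/trailer option, a completeness check might look like the Python sketch below. The trailer layout is invented for the example: it assumes the last line is `TRL|` followed by the number of data records before it. A real exchange would use whatever format the two sides agree on.

```python
def has_valid_trailer(path: str, prefix: str = "TRL|") -> bool:
    """Return True if the last line is a trailer whose count matches the data."""
    with open(path, "r", encoding="utf-8") as fh:
        lines = fh.read().splitlines()
    if not lines or not lines[-1].startswith(prefix):
        return False  # no trailer: the file was cut off or is still being written
    try:
        declared = int(lines[-1][len(prefix):])
    except ValueError:
        return False  # the trailer line itself is truncated or garbled
    return declared == len(lines) - 1  # every line except the trailer is data
```

Unlike the stability check, this also catches a file that stopped growing because the writer died halfway through.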

I believe the best approach is to write to a temp/inprogress folder and then move the file to the final/work folder for downstream outbound processing. Set up your job to look only at that final folder.

HTH,
RMG

Hi,
There is an option in the ‘Find’ properties for checking file stability before transferring the file to the target.
Under File stability and scanning, check ‘exclude files that are being updated’. If there are multiple files, you can also check the ‘delay processing until all files are available for use’ option and specify the scan time in seconds or minutes. Note that a longer scan time affects performance, because the action has to wait until that time limit has elapsed.
Try this, it works.

Yes @Bhaskar_Bhattarai mentioned that earlier. But as also noted earlier, the issue with this approach is if/when the app/system that is writing the file fails or loses connectivity – the file is stable but is incomplete.