File Polling Port - Cluster aware?

Hi,

I have a query regarding File Polling Ports in webMethods (version 9.0 on Windows in this example).

I am polling a directory via a UNC path for files. Once a file arrives, IS picks it up and sends it via HTTP to a cloud SaaS app.

In a single IS environment it works fine.
In a clustered IS environment (2 nodes) it works fine IF there is only one file at a time.

If multiple files are dumped into the monitoring directory and I have a clustered IS environment, this is where I run into trouble. The SaaS provider will only accept one request at a time, and each request takes about 12 seconds to complete.

Say there are 3 files dumped: File1.csv, File2.csv and File3.csv

Node 1 picks up File1.csv at 15:00:00 = processed and sent OK
Node 2 picks up File2.csv at 15:00:05 = processing fails, as File1 is still being sent and the SaaS provider rejects the second request until the first is complete.
If File2.csv errors within 60 seconds, then File3.csv will be sent OK.

I have configured the port with (see image for all details):

Enable Clustering: Yes
Number of files to process per Interval (optional): 1
Maximum Number of Invocation Threads: 1

I was hoping that IS would be smart enough to be cluster-aware, so that only ONE node would pick up ONE file every 60 seconds. Instead, it appears that EACH node operates independently of the other. They also pick up files at different times, so the schedules are not in sync either.

Is this a bug / undocumented feature? I’ve scoured the web looking for clues but can’t find anything useful.

Hoping one of the geniuses out there can give me a hint!

Many thanks in advance.

Steve

This looks interesting and strange. Are you sure the file polling interval is the same on both nodes? Please confirm.

My two cents:

  • The file polling port configurations must be identical on both IS nodes (monitoring directory, Enable Clustering = Yes, processing service and other details).
  • The documentation says: "In a cluster of Integration Servers or non-clustered group of Integration Servers, file polling works much the same way as it does on an individual Integration Server. The only difference is that more than one Integration Server polls the monitoring directory. Once an Integration Server in a group retrieves a file from the monitoring directory, the file is not available to other Integration Servers in the group."

Unfortunately, that does not appear to be the case here. If so, I assume you will have to try a scheduler task or another custom solution. Let me know your comments. I would also suggest consulting the SAG experts on this.

Stephen: This is the default behaviour that we see with clustering. I don’t have a suggestion to offer; let’s see what others suggest. Please also check with SAG.

Thanks,

I recently ran into the same requirement, where only one file at a time may be processed against the back end in a clustered IS environment. Even though the file polling port has a clustering option (Yes/No), enabling it does not actually limit processing to a single file across the cluster (on a single node, yes). There are a few options I can think of to work around this product limitation:

Option 1:
Enable the file polling port on only one node in the webMethods IS cluster, and keep both the maximum number of invocation threads and the number of files to process per interval at 1.

Option 2:
Include locking logic in your processing service: keep the thread waiting until any previous request has finished processing, and release the lock once processing is complete. Even if the next file arrives and is picked up, the locking logic within the flow ensures that it does not call the back end until the lock is free.

I used option 2, as our back end response time is pretty quick and we need to use all nodes in the cluster to throttle requests and clear out files from the drop location.

In your case I would suggest option 1, since you have a long response time from the back end.
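
For anyone wanting to see what the locking in option 2 could look like, here is a minimal sketch in Python (in IS this would be a Java service or flow logic instead). It is an assumption on my part, not a webMethods feature: it uses a lock file created atomically in a directory that every node can see, such as the shared drop location. The name `cluster_lock` and the `timeout` and `poll` parameters are hypothetical, and atomic exclusive creation over a UNC/SMB share should be verified in your own environment before relying on it.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def cluster_lock(lock_path, timeout=60.0, poll=0.5):
    """Hold an exclusive lock represented by a file at lock_path.

    lock_path is assumed to live on storage shared by all nodes.
    Raises TimeoutError if the lock is not free within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only
            # one caller can succeed in creating the lock file.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError("lock still held: " + lock_path)
            time.sleep(poll)  # another node holds the lock; wait and retry
    try:
        os.close(fd)
        yield
    finally:
        # Release the lock so the next waiting node can proceed.
        os.remove(lock_path)


if __name__ == "__main__":
    import tempfile

    # Demo against a temporary directory instead of a real shared path.
    demo_lock = os.path.join(tempfile.mkdtemp(), "saas.lock")
    with cluster_lock(demo_lock, timeout=5.0):
        print("lock held; the back end call would go here")
```

One caveat with this approach: if a node dies while holding the lock, the lock file stays behind and has to be cleaned up, for example by checking its age before waiting on it.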

Stephen,

Yes, the two options mentioned above sound like the best way to work around the limitations of the file polling port in a clustered environment. Please test them and let us know whether they meet your needs.

HTH,
RMG

I’ll add one more option: use messaging.

After retrieving the file, map it to a publishable document type, and publish it to the bus (e.g. Broker or UM.) On the subscribing side, you can configure a serial trigger to ensure only one message is processed at a time across all servers in the cluster.

In addition to accomplishing the goal of the discussion, another big advantage to this approach is the decoupling of source and target, allowing new publishers and new subscribers to be easily added. It also prevents failures on the subscribing side from impacting the publishing side. Finally, it allows you to make use of the robust trigger retry capabilities that the IS provides.
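
To illustrate the idea only, here is a small in-process sketch in Python. It is not the IS trigger API: a `queue.Queue` stands in for the bus and a single worker thread stands in for the serial trigger, so messages are handled strictly one at a time in arrival order no matter how many publishers there are. The names `process_serially` and `handler` are hypothetical.

```python
import queue
import threading


def process_serially(messages, handler):
    """Pass messages through a queue to one consumer thread.

    Because there is only one consumer, handler() calls never overlap
    and messages are handled in the order they were published.
    """
    bus = queue.Queue()
    results = []
    stop = object()  # sentinel telling the consumer to shut down

    def consumer():
        while True:
            msg = bus.get()
            if msg is stop:
                break
            results.append(handler(msg))

    worker = threading.Thread(target=consumer)
    worker.start()
    for msg in messages:  # the "publish" side
        bus.put(msg)
    bus.put(stop)
    worker.join()
    return results


if __name__ == "__main__":
    sent = process_serially(
        ["File1.csv", "File2.csv", "File3.csv"],
        lambda name: "sent " + name,
    )
    print(sent)
```

In the real setup the queue would be the Broker or UM and the consumer would be the serial trigger, which is what gives you the one-at-a-time guarantee across the cluster rather than within one process.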

Percio

Great idea Percio!

I’ll use a Universal Messaging ‘Queue’ to line up the requests one by one.

Cheers,

Steve