Transactions taking longer to process via Broker, or sometimes getting lost

Hi All,

As part of a real-time implementation, documents are published from one IS to another IS via a Broker gateway. We call the publish-and-wait service to publish the document and wait up to 60 seconds for the reply. Both the publishing and the subscribing IS have two nodes.
We are seeing the error below in the server log of one node of the publishing IS when documents are published from there.
Most of the time the documents are received by the subscribing IS, but sometimes they are not received at all.
In a few instances they are received by the subscribing IS but the reply does not come back to the publishing IS. However, we don’t see any issue in the subscribing IS server log.
Please note that the document type is volatile, since the documents are published as part of a real-time implementation where the reply is expected within 30-60 seconds. Below is the error logged on one of the nodes of the publishing IS:

[b]2015-06-03 22:24:37 EDT [ISS.0098.0036E] webM IS RequestReplyHandler_1 encountered Transport Exception: com.wm.app.b2b.server.dispatcher.exceptions.RequestReplyException: [ISS.0098.9010] No waiting thread for Document Id: 23356. Requestor might have timed out.

2015-06-03 22:24:45 EDT [ISS.0098.0036E] webM IS RequestReplyHandler_2 encountered Transport Exception: com.wm.app.b2b.server.dispatcher.exceptions.RequestReplyException: [ISS.0098.9010] No waiting thread for Document Id: 23358. Requestor might have timed out[/b]

Any help is appreciated.

Thanks,
Ashish

To me, it looks like publish and wait is not implemented correctly.

Can you share screenshots of the code of your publishing service and your subscribing service?

Attached is the screenshot. Please note that this issue is sporadic; it does not happen every time.

The publishing service remains in a waiting state, and if no reply document is received before the wait time elapses, IS ends the request. A reply that arrives after that point has no thread left to receive it, which produces the log message you are getting. Try increasing the wait time, or check why the reply documents are taking so long to get back to the publishing service. Hope this helps.
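
To make the mechanism concrete, here is a minimal Java sketch of how a timed wait and a late reply interact. It is a toy model only, not webMethods or Broker code: the class `ReplyCorrelator` and its methods are hypothetical names invented for illustration, and the real dispatcher is certainly more involved.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model of request/reply correlation (hypothetical, not the IS implementation). */
class ReplyCorrelator {
    // One entry per requester that is still waiting, keyed by document id.
    private final Map<Long, CompletableFuture<String>> waiting = new ConcurrentHashMap<>();

    /** Waits up to waitMillis for a reply; returns null if none arrives in time. */
    String publishAndWait(long docId, long waitMillis) {
        CompletableFuture<String> slot = new CompletableFuture<>();
        waiting.put(docId, slot);
        try {
            return slot.get(waitMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return null; // the requester gives up
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag for the caller
            return null;
        } finally {
            waiting.remove(docId); // from here on, nobody waits for this id
        }
    }

    /** Called when a reply arrives; returns false if the requester is already gone. */
    boolean onReply(long docId, String reply) {
        CompletableFuture<String> slot = waiting.get(docId);
        if (slot == null) {
            System.out.println("No waiting thread for Document Id: " + docId);
            return false;
        }
        return slot.complete(reply);
    }
}
```

In this model, a reply that shows up after the wait has elapsed takes the `slot == null` branch, which has the same shape as the ISS.0098.9010 message in the log above.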

Hello Ashish,

This error happens when the request/reply (publish and wait) operation is disrupted.

Some scenarios:

(1) The thread that performs the request/reply is interrupted (e.g. IS suddenly restarts).

When IS restarts while a thread is performing a request/reply operation, that thread is lost. If the request has already reached the Broker, the Broker will try to send the reply to a thread that no longer exists, and fail.

(2) The reply is not sent in time, so the thread times out and stops waiting.

To fix this, you can enforce a document time-to-live (TTL) on the Broker documents used in request/reply. If the TTL on the reply document is shorter than the IS restart period, or shorter than the caller’s timeout setting, the document will be discarded by the Broker before delivery.

The key is to keep the requester’s timeout close to the expected maximum TTL for the request/reply round trip, because setting a TTL on both the request and the reply document affects whether the Broker can still pass documents to clients when it is loaded or when queues are long.

The request document can also have a TTL configured, to prevent the receiving service from doing unwanted work when the calling service has already given up waiting for the reply.

So in summary:
(1) Configure a maximum wait time for request/reply operations via the Broker.
(2) Request doc: set a TTL so that a stale request expires before the remote service is invoked.
(3) Reply doc: set a TTL so that a stale reply expires before it is delivered to the waiting service.
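
As a rough illustration of points (2) and (3), the sketch below models a document with a TTL and checks whether it would still be valid at a given moment. This is only an assumption-laden model: `TtlDocument` is a hypothetical class, and on a real system the TTL is a property of the document type, not something you evaluate in your own code.

```java
import java.time.Duration;
import java.time.Instant;

/** Toy model of a document with a time-to-live (hypothetical, not the Broker API). */
class TtlDocument {
    final Instant publishedAt;
    final Duration ttl; // null models "never discarded"

    TtlDocument(Instant publishedAt, Duration ttl) {
        this.publishedAt = publishedAt;
        this.ttl = ttl;
    }

    /** True once the TTL has fully elapsed; a document without a TTL never expires. */
    boolean isExpired(Instant now) {
        return ttl != null && !now.isBefore(publishedAt.plus(ttl));
    }
}
```

For example, a request published at 22:24:00 with a 60-second TTL is still valid at 22:24:30 but expired at 22:25:01, so in this model a subscriber that only gets to it after 61 seconds would never run the service for a caller that has already stopped waiting.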

To do this, you can configure the TTL of a Broker Document in Designer by clicking on its properties tab. The default setting for this is “Never Discarded”.

For reference, see the Error Messages Reference documentation, around page 1135:
http://techcommunity.softwareag.com/ecosystem/documentation/webmethods/wmsuites/wmsuite9-6/Cross_Product/9-6_Error_Messages_Reference.pdf

Hope that helps!

Ashish,
The error is clearly due to the reply document arriving after the specified wait period. You should verify what action the publisher service takes when the reply is not received in time (if any is implemented). If there is none, I suggest you implement that logic.

-Senthil
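
One possible shape for that publisher-side logic is sketched below in plain Java. Everything in it is an assumption for illustration: `TimeoutAwarePublisher`, `Outcome` and `requestWithRetry` are hypothetical names, the `Supplier` stands in for whatever performs the actual publish and wait, and a `null` result models "no reply before the wait time elapsed". Retrying is only one option, and it is only safe if the subscriber can tolerate receiving the same request twice.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of handling a missing reply explicitly on the publisher side. */
class TimeoutAwarePublisher {
    enum Status { REPLIED, TIMED_OUT }

    static final class Outcome {
        final Status status;
        final String reply;   // null when status is TIMED_OUT
        final int attempts;   // number of requests actually sent

        Outcome(Status status, String reply, int attempts) {
            this.status = status;
            this.reply = reply;
            this.attempts = attempts;
        }
    }

    /** Sends the request up to maxAttempts times and reports what happened. */
    static Outcome requestWithRetry(Supplier<String> request, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String reply = request.get(); // null models a timed-out wait
            if (reply != null) {
                return new Outcome(Status.REPLIED, reply, attempt);
            }
            System.err.println("Attempt " + attempt + ": no reply within the wait time");
        }
        // Make the failure explicit so the caller can alert, compensate or reject.
        return new Outcome(Status.TIMED_OUT, null, maxAttempts);
    }
}
```

The point of the explicit `TIMED_OUT` status is that the calling code has to decide what a missing reply means, rather than silently carrying on with an empty document.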