Publishing takes more time -> [ISS.0098.0064I] Publishing delayed while outbound store

We are seeing the error below in the IS server.log. We have 12 ISes in a cluster, and this error happens on only one of them. Whenever it happens, publishing takes longer (more than 30 seconds). I have verified the Broker, Trigger and IS settings, and they are the same across all the ISes. We have also seen publishing take longer even when the error below doesn't appear. Both scenarios happen daily for a while (at random times), and only on that one IS. I have gone through some of the earlier postings here but couldn't find anything conclusive.
Any help is appreciated.
Thanks

2012-05-07 09:55:53 MDT [ISP.0085.9998E] Exception → java.net.SocketTimeoutException: Read timed out
2012-05-07 09:55:59 MDT [ISS.0098.0036E] DefaultProducer encountered Transport Exception: com.wm.app.b2b.server.dispatcher.exceptions.EndpointUnavailableException: [ISS.0098.9014]
BrokerException: Timeout (112-1450): The request timed out.
2012-05-07 09:56:00 MDT [ISS.0098.0036E] DefaultProducer encountered Transport Exception: com.wm.app.b2b.server.dispatcher.exceptions.EndpointUnavailableException: [ISS.0098.9014]
BrokerException: Timeout (112-1450): The request timed out.
2012-05-07 09:56:00 MDT [ISS.0098.0064I] Publishing delayed while outbound store is draining. Service: wm.server.publish:publish
2012-05-07 09:56:00 MDT [ISS.0098.0064I] Publishing delayed while outbound store is draining. Service: wm.server.publish:publish
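
To see whether the timeouts and the "publishing delayed" messages cluster at particular times, the server.log can be summarised per minute. The following is only a rough sketch: it assumes every log entry starts with a timestamp, a time zone and a bracketed message code, as in the excerpt above, and it skips continuation lines such as the BrokerException text. The function name and the set of watched codes are illustrative choices, not part of any webMethods tooling.

```python
import re
import sys
from collections import Counter
from datetime import datetime

# Matches e.g. "2012-05-07 09:56:00 MDT [ISS.0098.0064I] ..."
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ \[([A-Z]{3}\.\d{4}\.\d{4}[A-Z]?)\]"
)

# Message codes taken from the excerpt above; extend as needed.
WATCHED = {"ISP.0085.9998E", "ISS.0098.0036E", "ISS.0098.0064I"}


def count_by_minute(lines):
    """Return a Counter keyed by (minute, message code)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # continuation line without a timestamp
        code = match.group(2)
        if code not in WATCHED:
            continue
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        counts[(ts.strftime("%Y-%m-%d %H:%M"), code)] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_publish_delays.py /path/to/server.log
    with open(sys.argv[1], errors="replace") as handle:
        for (minute, code), n in sorted(count_by_minute(handle).items()):
            print(minute, code, n)
```

Running the same summary against the server.log of a healthy IS in the cluster gives a baseline to compare with.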

Hi

It seems your IS for some reason loses the connection to the Broker (EndpointUnavailableException).
While the connection is down but the IS has not yet thrown an EndpointUnavailableException, the IS keeps trying to publish - the default timeout is 30s (I think).
During this time all documents are stalled until the connection is declared lost - after that, all documents are placed directly in the outbound document store.
When the connection is finally up again, the outbound document store must be “drained” to the Broker, and during this time every publish takes longer than usual (approx. 5s in our case).

Unfortunately, this is by design and really hard to avoid when you’re having connection issues.
I created a thread on wmusers a while ago regarding our issue: wmusers.com.

In our case the problem was the SAN (excessive IO) - and when that was fixed we never got these problems again.
Investigate things like CPU, IO, memory and network utilization on the affected IS and see if anything is out of the ordinary.
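
One cheap network check is to time plain TCP connects from the affected IS host to its Broker host and compare the numbers with a healthy IS. The sketch below uses only the Python standard library. The host name is a placeholder, and port 6849 is an assumption, so substitute whatever your Broker Server actually listens on. It measures TCP connection setup only, not Broker request latency, so a clean result does not rule out the Broker itself.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return None


def sample(host, port, count=10, interval=1.0, timeout=5.0):
    """Probe repeatedly and return the list of results (None = failed)."""
    results = []
    for i in range(count):
        results.append(probe(host, port, timeout))
        if i < count - 1:
            time.sleep(interval)
    return results


# Example (hypothetical host, assumed port):
# for r in sample("broker1.example.com", 6849):
#     print("failed" if r is None else f"{r * 1000:.1f} ms")
```

Left running over a day, occasional failures or multi-second connects that show up only from the one IS would point at the network path or the host rather than at the settings.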

There are a few settings you can play around with regarding broker transports:
watt.server.brokerTransport.dur=60
watt.server.brokerTransport.max=60
watt.server.brokerTransport.ret=3
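
Since only one IS is affected, it may be worth diffing these settings mechanically across the cluster instead of checking them by eye. A minimal sketch, assuming each IS keeps its extended settings as plain `key=value` lines (for example in its server.cnf) and that you have collected the file contents yourself. The IS names and function names here are made up for illustration.

```python
def parse_settings(text, prefix="watt.server.brokerTransport."):
    """Extract key=value pairs whose key starts with the given prefix."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key.startswith(prefix):
            settings[key] = value.strip()
    return settings


def find_differences(configs):
    """configs maps an IS name to its settings text.

    Returns {key: {is_name: value_or_None}} for every key whose value
    is not identical on all ISes (None means the key is missing).
    """
    parsed = {name: parse_settings(text) for name, text in configs.items()}
    all_keys = set().union(*parsed.values())
    diffs = {}
    for key in sorted(all_keys):
        values = {name: s.get(key) for name, s in parsed.items()}
        if len(set(values.values())) > 1:
            diffs[key] = values
    return diffs


# Example with made-up contents:
# configs = {"FE-IS1": open("is1/server.cnf").read(),
#            "FE-IS2": open("is2/server.cnf").read()}
# print(find_differences(configs))
```

An empty result only confirms that the brokerTransport settings match; passing a different prefix to `parse_settings` widens the comparison.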

Regards,
Mike

Thanks Mike for responding. We will look into what you have recommended.
Our Broker settings are the same as yours:
watt.server.brokerTransport.dur=60
watt.server.brokerTransport.max=60
watt.server.brokerTransport.ret=3

We are using IS 8.2, with the ISes in a cluster. We have 12 Front End ISes that connect to 12 Brokers, which connect to 12 Back End ISes. The mapping is one-to-one, i.e. FE IS1 - Broker 1 - BE IS1, FE IS2 - Broker 2 - BE IS2, etc. But this behavior is observed on only one Front End IS. Also, the "publishing delayed" message does not always appear in the server.log, and publishing still gets delayed sometimes. But whenever the "publishing delayed" message does appear in the server.log, publishing is always delayed during that time.
The rest of the time, that specific IS works fine. We have examined load, traffic, etc. and found nothing unusual, as there is a load balancer that distributes the incoming traffic across the 12 FE ISes.