WebSphere and WebMethods broker Connectivity Lost

We are using webMethods Broker as the JMS provider, and MDBs on WebSphere Application Server consume messages from WM Broker’s queues. The setup works and JMS communication behaves as expected. The issue we are seeing is that whenever WM Broker goes down (these are scheduled outages for maintenance), we have to restart the WebSphere application server because WebSphere is unable to reconnect to WM Broker. Are there any settings or configuration options on the WebSphere or WM Broker side that would let the WebSphere->WM Broker connection be re-established without restarting the WebSphere application server?

Ensure you are running the latest fix level for your wmjmsclient.jar and wmjmsnaming.jar. To do that, log in to Empower and check the fixes available for your Broker version.
If that still fails, take a thread dump from your WAS and check where the code is hanging.

If it points to wM, I would open a support request.
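
To illustrate the thread dump suggestion above, here is a minimal sketch in Java that lists the threads of the current JVM whose stack contains a frame from a given package. It uses only the standard java.lang.management API. The class name ThreadDumpSketch, the method threadsIn, and the default prefix "com.webmethods" are assumptions made for illustration, not something confirmed in this thread. On a real WAS you would more likely trigger a javacore with the server's own tooling; this only shows what to look for in the dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class ThreadDumpSketch {

    /**
     * Returns one line per live thread that has at least one stack frame
     * whose class name starts with classPrefix.
     */
    static List<String> threadsIn(String classPrefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> hits = new ArrayList<>();
        // false, false: skip lock details, we only need the stack traces
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            for (StackTraceElement frame : info.getStackTrace()) {
                if (frame.getClassName().startsWith(classPrefix)) {
                    hits.add(info.getThreadName()
                            + " [" + info.getThreadState() + "] at " + frame);
                    break; // report each thread once, at its topmost matching frame
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // "com.webmethods" is an assumed package prefix; pass your own as an argument
        String prefix = args.length > 0 ? args[0] : "com.webmethods";
        for (String line : threadsIn(prefix)) {
            System.out.println(line);
        }
    }
}
```

A thread that stays in the same Broker client frame across several dumps taken a few seconds apart is the kind of evidence worth attaching to a support request.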

Regards

DevNull43,

Thanks for your suggestions. I will check the Broker version and the Broker client JARs compatible with it.

fyi:

I see the following exception in WAS logs whenever there is a WM broker restart. Even after WM broker is started, WAS is not able to create a connection to WM broker.

[2/19/10 3:08:40:977 EST] 000001ce SystemOut O 3:08:40 AM LinkReader Broker1@lindrs001:6849 ERROR: [BRM.10.4100] JMS: Reconnecting to "Broker1@lindrs001:6849" in 15 seconds.
[2/19/10 3:08:59:979 EST] 000001cc SystemOut O 3:08:59 AM LinkReader Broker1@lindrs001:6849 ERROR: [BRM.10.4011] JMS: Unable to connect to Broker at "lindrs001:6849": A remote host refused an attempted connect operation

Yes, I can see that WAS is unable to recreate the connection, but if you restart WAS it works.
Maybe try reloading just the MDB rather than WAS itself.

Either the Broker client was hung (although, from your logs, it throws the exception correctly), or you should set a higher reconnect retry count, or even “-1” (retry forever):

This is a similar case for MQ:

Error: MQJCA4014

Message: Failed to reconnect one or more MDBs after a connection failure.

Reason: The connection supplying one or more MDBs failed, and the resource adapter was not able to reconnect within the number of attempts specified by the reconnectionRetryCount property.

Action: Make sure that the WebSphere MQ queue manager is running, and that any other required components such as a listener are also running. Examine the application server logs to determine which MDBs have failed and restart the MDBs manually.

I would say you are facing a configuration issue on the WAS side.
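
To make the retry count idea concrete, here is a minimal sketch in plain Java of what a bounded versus unbounded ("-1") reconnect policy means. It is not the WAS or Broker client implementation, and ReconnectSketch, connectWithRetry, maxAttempts and intervalMillis are hypothetical names invented for this example. The point is only that with a finite count the consumer gives up for good if the outage lasts longer than count x interval, which would match the "works again only after a restart" symptom.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class ReconnectSketch {

    /**
     * Calls connect until it succeeds.
     * maxAttempts < 0 means "retry forever"; otherwise the last failure is
     * rethrown once maxAttempts attempts have been made (at least one attempt
     * is always made).
     */
    static <T> T connectWithRetry(Callable<T> connect, int maxAttempts, long intervalMillis)
            throws Exception {
        int attempt = 0;
        while (true) {
            attempt++;
            try {
                return connect.call();
            } catch (Exception e) {
                if (maxAttempts >= 0 && attempt >= maxAttempts) {
                    throw e; // retries exhausted: the caller stays disconnected
                }
                Thread.sleep(intervalMillis); // wait before the next attempt
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated provider that refuses the first three connection attempts
        final int[] calls = {0};
        Callable<String> fakeBroker = () -> {
            if (++calls[0] < 4) {
                throw new IOException("connection refused");
            }
            return "connected";
        };
        // -1: keep trying until the simulated maintenance window is over
        String result = connectWithRetry(fakeBroker, -1, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With maxAttempts set to 2 the same simulated outage would end in an exception instead of a connection, so for scheduled maintenance windows the retry budget has to cover the whole outage or be unlimited.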