Integration Server 7.1.2 will not restart/shutdown from Admin screen

I have an Integration Server 7.1.2 instance that will not restart or shut down when we try from the Admin screen. There are no entries in server.log indicating that it is even trying to stop. It does go into a state where it stops responding and processing data, so we have to kill the process manually. The server is HP-UX on 64-bit Itanium.

On the same server we have another Integration Server instance that restarts and shuts down without issues. It is a mirror image of the problem instance: same code, same JDBC adapter connections, SAP connections, etc. They are not clustered. This is a production server. The non-prod Integration Servers restart fine, so they are no help in troubleshooting.

I was thinking about raising the logging level, but I am not sure which setting would give me more information without flooding the log with clutter. We currently have all logging set to Fatal errors only. If there are any other flows or extended settings that would help, I am willing to try them as well.

The restarts are done by a central ops team, but they page the webMethods support team (at 4 am!) when the instances do not restart. So I am trying to save my team some sleep by figuring out why the Admin screen restart will not work.

Thank you.

Neal Sabo

Did you check whether IS/Broker is stuck because a lot of threads are running or processing volumes are high, so that the server does not go down and the LOCKFILE stays in place?

When this happens, you need to collect the diagnostic data and check nohup to see what is going on behind the scenes at that moment:

/invoke/wm.server.admin/getDiagnosticData (to get all IS thread dumps)
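If the Admin screen itself is sluggish, the same service can be called from the command line. This is only a sketch: the host, port, user name, plain http, and the output file name are assumptions for illustration, so adjust them to your instance. The helper name diag_url is made up here.

```shell
#!/bin/sh
# Build the diagnostic-data URL for a given host and port.
# Only the /invoke/... path comes from the post above; host and port are
# whatever your IS instance actually listens on (assumed values in the usage).
diag_url() {
    printf 'http://%s:%s/invoke/wm.server.admin/getDiagnosticData\n' "$1" "$2"
}

# Usage (placeholders: localhost, 5555, Administrator, diagnostic_data.out).
# curl prompts for the password and saves whatever the server returns.
# curl --max-time 300 -u Administrator -o diagnostic_data.out "$(diag_url localhost 5555)"
```

The curl line is left commented out so that nothing is sent until you have filled in your own values.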

Useful info for your debug:

To find the process ID (PID) of an IS Java process, execute ps -ef | grep java.
To obtain the thread dump, execute one of the following:
kill -3 PID
kill -QUIT PID
If the thread dump information is not displayed in the console window, check the nohup.out or javacore*.txt files.
If you started the IS Java process in a regular command prompt, press CTRL+\ in the window in which the Java program is started.
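The PID lookup and signal steps above can be put together roughly like this. It is a sketch, not a tested procedure for HP-UX: the function names java_pids and request_thread_dump are made up for illustration, and the filter assumes the usual ps -ef layout with the PID in the second column.

```shell
#!/bin/sh
# Read `ps -ef` output on stdin and print the PID (second column) of every
# line whose command contains "java", skipping grep/awk helper processes.
java_pids() {
    awk '/java/ && !/awk/ && !/grep/ { print $2 }'
}

# Send SIGQUIT (signal 3) to the given PID. The JVM writes a thread dump
# to its stdout (for example nohup.out) and keeps running.
request_thread_dump() {
    kill -QUIT "$1"
}

# Usage (commented out so sourcing this file has no side effects):
# ps -ef | java_pids
# request_thread_dump 12345   # 12345 is a placeholder PID
```

With two IS instances on the same box, check the full command line of each PID before sending the signal so the dump comes from the instance that is stuck.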

It is also better to send all the dumps to SAG tech support and have them analyzed for your environment.

HTH,
RMG

I hope that it will allow me to run getDiagnosticData, since the server is in a not-completely-up, not-completely-down state.

Thanks for the info. As soon as I can get an OK from the business to restart the server, I will try your suggestions.

You can retrieve the diagnostic data as long as the server process has not been killed already.

HTH,
RMG

I was able to get the Diagnostic data as well as the nohup and sent it to SAG.

Thank you.

Please post back to the forum with SAG's outcome and the resolution.

Diagnostic data retrieval is easy to do! Try that.

SAG wants me to install IS Core Fix 43. I have read through the documentation on the fix and cannot find where it states that this fix cures the issue. I have asked SAG for more information on the fix. My management will not allow this fix to be installed, since we cannot duplicate the issue in a non-prod environment and we cannot be sure it will fix the issue.

Try installing it on the lower-level systems first, and once your team is fully satisfied by testing that nothing breaks, release it to prod, I would say.

HTH,
RMG

We have full regression testing of our systems, DEV and QA, by the business users in September. If I do need to install the fix, I will have to wait until then.

Since it is not causing issues with the interfaces, the business does not see this as a critical issue. They are not the ones getting paged in the middle of the night. :frowning:

Sure…you are right:

Well, we found there was a problem with one of the NIC cards in the server. Infrastructure replaced it and restarted the entire server.

Great…case closed:

Hello Neal,

Came across this thread just today.

In the original post it was mentioned that there is another IS on the same server and that it didn't have any issues. Any idea how it was working fine with the faulty NIC?

Jafar
