UM responds slowly and always times out when publishing a message from IS or creating a topic from IS or Enterprise Manager

Hi team,

My customer is using UM 9.12 and IS 9.12. They face an issue where UM responds slowly until an error like the one below appears when they publish a message using pub.jms:send:

[2023-10-19 10:57:55.245] - [2023-10-19 11:00:55.259] - (2) Retry reach max. Fail callback to VMS (to inform order is completed). System/wM error [javax.jms.IllegalStateException: MessageProducerImpl.send failed: The server failed to respond within the time out value:180000 last ID = 3275 class com.pcbsys.nirvana.base.events.nTXPublishCommit com.wm.app.b2b.server.jms.ResourceUnavailableException: javax.jms.IllegalStateException: MessageProducerImpl.send failed: The server failed to respond within the time out value:180000 last ID = 3275 class com.pcbsys.nirvana.base.events.nTXPublishCommit ]
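
As a quick sanity check, the two timestamps at the start of that log line are almost exactly the 180000 ms timeout apart, which suggests the send call blocked for the full timeout before failing. A minimal Python sketch of that arithmetic (the variable names are only illustrative, and reading the two timestamps as the start and end of the call is an assumption):

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"

# The two timestamps copied from the IS error line above.
start = datetime.strptime("2023-10-19 10:57:55.245", FMT)
end = datetime.strptime("2023-10-19 11:00:55.259", FMT)

# Timeout value reported in the exception text, in milliseconds.
timeout_ms = 180000

# Integer division by a 1 ms timedelta avoids floating-point rounding.
elapsed_ms = (end - start) // timedelta(milliseconds=1)
overshoot_ms = elapsed_ms - timeout_ms

print(f"elapsed={elapsed_ms} ms, overshoot={overshoot_ms} ms")
```

The gap is only a few milliseconds above the timeout, so the client appears to have received no response at all during that window rather than a slow one.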

We also found the following error recorded in nirvana.log:

[Thu Oct 19 10:57:25 MYT 2023] [UM Server Status Generator] [com.pcbsys.foundation] - ServerStatusLog> Memory=721, Direct=1024, EventMemory=0, Disk=146190, CPU=1.5, Scheduled=749, Queued=0, Connections=351, BytesIn=3369242, BytesOut=4203699, Published=0, Consumed=145
[Thu Oct 19 10:57:27 MYT 2023] [ClosePool:0] [com.pcbsys.nirvana.server] - UserManager: User wmapp@10.200.76.31 Logged Out using nhp, Reason : java.io.EOFException Connection closed, session established for 1142 seconds, Session Id = 821e67bb00000000 ID: 10.200.76.31:38382
[Thu Oct 19 10:57:27 MYT 2023] [ClosePool:0] [com.pcbsys.nirvana.server] - Connection closed
java.io.EOFException: Connection closed
at com.pcbsys.foundation.drivers.nio.fBufferInputStream.allocateWorkingBuffer(fBufferInputStream.java:142)
at com.pcbsys.foundation.drivers.nio.fBufferInputStream.available(fBufferInputStream.java:48)
at com.pcbsys.foundation.drivers.fHTTPDSession$AsyncListener.dataReady(fHTTPDSession.java:440)
at com.pcbsys.foundation.drivers.nio.fChannelDriver.notifyListener(fChannelDriver.java:164)
at com.pcbsys.foundation.drivers.nio.fChannelDriver.packetArrived(fChannelDriver.java:240)
at com.pcbsys.foundation.drivers.nio.handlers.PacketChannel.processInBuffer(PacketChannel.java:96)
at com.pcbsys.foundation.drivers.nio.handlers.PacketChannel.handleRead(PacketChannel.java:73)
at com.pcbsys.foundation.drivers.nio.handlers.PlainChannel.handleRead(PlainChannel.java:74)
at com.pcbsys.foundation.drivers.nio.io.SelectorThread.processKey(SelectorThread.java:296)
at com.pcbsys.foundation.drivers.nio.io.SelectorThread.processSelectKeysIterator(SelectorThread.java:272)
at com.pcbsys.foundation.drivers.nio.io.SelectorThread.run(SelectorThread.java:177)
at com.pcbsys.foundation.threads.fThread.localRun(fThread.java:106)
at com.pcbsys.foundation.threads.hThread.run(hThread.java:102)
at java.lang.Thread.run(Thread.java:748)
[Thu Oct 19 10:57:27 MYT 2023] [ReadPool:25] [com.pcbsys.nirvana.server.handler] - Client session requested clean session close : wmapp@10.200.76.31 ID 10.200.76.31:38382
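
To compare the server state across several of these ServerStatusLog entries, the key=value pairs can be pulled into a dictionary. This is only a rough sketch: `parse_status` is a hypothetical helper, and the meaning and units of each field are not stated in the log itself, so they should be checked against the UM documentation before drawing conclusions.

```python
import re

# The ServerStatusLog line copied from nirvana.log above.
line = (
    "[Thu Oct 19 10:57:25 MYT 2023] [UM Server Status Generator] "
    "[com.pcbsys.foundation] - ServerStatusLog> Memory=721, Direct=1024, "
    "EventMemory=0, Disk=146190, CPU=1.5, Scheduled=749, Queued=0, "
    "Connections=351, BytesIn=3369242, BytesOut=4203699, Published=0, "
    "Consumed=145"
)

def parse_status(log_line):
    """Return the key=value pairs after 'ServerStatusLog>' as a dict of floats."""
    _, _, payload = log_line.partition("ServerStatusLog>")
    return {key: float(value) for key, value in re.findall(r"(\w+)=([\d.]+)", payload)}

stats = parse_status(line)
print(stats["Connections"], stats["Published"], stats["Consumed"])
```

Running this over every status line in the log would show whether values such as Connections or Queued change in the minutes before the timeouts.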

The error above occurred after trying to create a topic directly from Enterprise Manager, and also when publishing a message from IS using pub.jms:send.

I have done a few things, such as:

- Increased EventTimeout in the realm config from 180000 to 300000
- Created a new JMS connection alias in IS Administrator that uses the UM IP directly instead of the load balancer IP
- Tried to create a new topic from Enterprise Manager

These are my current RAM stats with 5 UM instances running at the same time:
[wmadmin@sgbbitwmbrks03 bin]$ free -m
              total        used        free      shared  buff/cache   available
Mem:          15884        6655         436         657        8792        8236
Swap:          3967           7        3960
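
For reference, these numbers can be read programmatically. The sketch below parses the `free -m` output shown above and computes how much memory is still available; `parse_free` is a hypothetical helper written for this post, not part of any tool.

```python
# Output of `free -m` as posted above (values in MiB).
FREE_OUTPUT = """\
              total        used        free      shared  buff/cache   available
Mem:          15884        6655         436         657        8792        8236
Swap:          3967           7        3960
"""

def parse_free(text):
    """Map each row label (Mem, Swap) to a dict of column name -> MiB."""
    lines = text.strip().splitlines()
    headers = lines[0].split()
    rows = {}
    for row in lines[1:]:
        label, *values = row.split()
        # The Swap row has fewer columns; zip stops at the shorter sequence.
        rows[label.rstrip(":")] = dict(zip(headers, map(int, values)))
    return rows

mem = parse_free(FREE_OUTPUT)
available_pct = 100 * mem["Mem"]["available"] / mem["Mem"]["total"]
print(f"available: {mem['Mem']['available']} MiB ({available_pct:.1f}% of total)")
print(f"swap used: {mem['Swap']['used']} MiB")
```

Roughly half of the RAM is still reported as available and swap is barely touched, so at the operating-system level the host does not look short of memory. This says nothing about the heap configured for each UM instance.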

What is the root cause of this problem?
What is the solution?
Could this be caused by the network?

I really appreciate the help!

Note:
I have suggested applying the latest fixes for version 9.12, but the customer will consider that only as a last resort.

Thanks,
Dimas

Did you check the ACL configuration for the IS servers? The exception they get seems to be a red herring. If you can create the channel from Enterprise Manager but not from IS, it may be related to ACLs. Also, 9.12 is really old and should definitely be upgraded. Is it at least patched to the latest available fix level? Most of these errors were seen in the past and were addressed in later patches.

UM ACL configuration can be confusing. You set the permissions for “ldap_user_name”@“client_host_name”. You can use * as a wildcard for either part, like engin.sarlak@*. You should see an entry such as *@* or *@"hostname of your IS". Make sure the ACL for that IS is correct. Try not to use *@* though, since it matches practically everything. You can view the ACL configuration from here.
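
To illustrate the user@host idea, here is a small Python sketch that uses shell-style wildcards to check whether a client subject is covered by a list of ACL entries. It is only an analogy: UM's real matching rules may differ, and `is_covered` and the sample entries are hypothetical.

```python
from fnmatch import fnmatchcase

def is_covered(subject, acl_entries):
    """Return True if any ACL entry (with * wildcards) matches user@host."""
    return any(fnmatchcase(subject, entry) for entry in acl_entries)

# Hypothetical ACL entries: one exact IS subject and one user allowed from any host.
acl_entries = ["wmapp@10.200.76.31", "engin.sarlak@*"]

print(is_covered("wmapp@10.200.76.31", acl_entries))  # exact entry for the IS host
print(is_covered("wmapp@10.200.76.99", acl_entries))  # same user, different host
print(is_covered("anyone@anywhere", ["*@*"]))         # the catch-all entry
```

The last call shows why *@* is risky: it covers every user from every host, so a missing or wrong specific entry goes unnoticed until the catch-all is removed.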

Hi @engin_arlak

Thanks for your suggestion.

I tried removing the *@* ACL, but the error remains. Eventually I decided to recreate the UM instance, and the customer agreed.

Steps to recreate the instance:

  1. In EM, back up the realm by exporting it to XML
  2. Back up the DataFiles directory
  3. Shut down the old instance
  4. Delete the old instance with the ninstancemanager.sh delete command
  5. Create a new instance with the ninstancemanager.sh create command
  6. Start the new instance
  7. Test publishing a message from IS5010
  8. In EM, restore the realm by importing it from the XML backup
  9. In EM, configure the interface for port 60x2
  10. Test publishing a message as the wmapp user from IS5010 to UM [ : 60x2]

My reason for recreating the instance is that UM 9.12 fix 14 is very outdated, and the customer refuses to apply the latest fixes because they had a bad experience after patching.

Thanks,
Dimas

That was not part of the solution; in fact, it can lock you out if you do it wrong. It's a security precaution. If you remove *@* without having the necessary ACLs for your Integration Server and/or administrator user, you may lock yourself out of the UM servers. When you disable *@* and “Everyone”, you need to enable a user (preferably a user group) and create custom ACL settings for your Integration Servers before removing that ACL. Check the UM administration guide for a detailed explanation.

Did recreating the instance fix your problem? I don't understand why people are so scared of patching their environment. Leaving it unpatched usually leads to a bigger problem later (such as a cyber attack or a data leak), but I understand, since it's the customer's request. When I was a consultant, my customers were also reluctant to patch most of the time. If you ask them, they will still reject the idea, so I stopped asking when they called me in to fix issues. Patching usually fixes the issue right away, and if it goes wrong you can always roll back the fixes anyway.
