Here is how the issue began: one of the UM nodes went down because of a heap memory issue, and we brought the UM server back up immediately.
Ever since, duplicate document subscription has been happening. Our transaction audit tables show the same document being subscribed twice.
Note: No retry mechanism is configured at the trigger level, and document processing is “Serial”.
First Subscription: The document is delivered to the target successfully.
Second Subscription: The document is picked up by the wrong subscriber even though the filter conditions are specified correctly.
As it’s serial processing, documents are processed one by one, and after a couple of days it behaves normally again, meaning new documents arrive and are processed successfully.
Can you please suggest how we can identify the root cause of this issue?
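For anyone trying to confirm the duplicates from the audit data first, here is a minimal sketch in Python. It assumes the audit rows have already been exported into a list of dictionaries; the field names `doc_id`, `subscriber` and `received_at` and the sample values are hypothetical, not the real audit table layout.

```python
from collections import defaultdict

def find_duplicate_subscriptions(audit_rows):
    """Group audit rows by document ID and keep the IDs seen more than once."""
    by_doc = defaultdict(list)
    for row in audit_rows:
        by_doc[row["doc_id"]].append((row["subscriber"], row["received_at"]))
    return {doc_id: hits for doc_id, hits in by_doc.items() if len(hits) > 1}

# Hypothetical sample rows; replace with an export of the real audit table.
rows = [
    {"doc_id": "D-1", "subscriber": "triggerA", "received_at": "2020-01-01T10:00:00"},
    {"doc_id": "D-1", "subscriber": "triggerB", "received_at": "2020-01-01T10:00:02"},
    {"doc_id": "D-2", "subscriber": "triggerA", "received_at": "2020-01-01T10:01:00"},
]
duplicates = find_duplicate_subscriptions(rows)
for doc_id, hits in duplicates.items():
    print(doc_id, hits)
```

Listing which subscriber received each copy, and when, makes it easier to see whether the second delivery always goes to the same wrong subscriber.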
As Mahesh suggested, contact Software AG support for this issue.
If you want to try out a few things yourself, you can check the subscriber details on all 3 nodes. Maybe the filter got dropped on one node as part of recovery, so the 3 nodes are not really in sync and you see odd behaviour depending on which node you are connected to. If the issue does appear to be a cluster node inconsistency, you should report a defect. Or you can choose to just delete and recreate the Topic and subscribers.
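One rough way to do that comparison, once the subscriber filters have been noted down by hand from each node, is sketched below in Python. Nothing here talks to UM; the node names come from this thread, while the subscriber name and filter text are made-up placeholders.

```python
def filters_out_of_sync(filters_by_node):
    """Return subscribers whose filter text is not identical on every node."""
    subscribers = set()
    for per_node in filters_by_node.values():
        subscribers.update(per_node)
    drifted = {}
    for name in sorted(subscribers):
        # A subscriber missing on a node shows up as None for that node.
        seen = {node: per_node.get(name) for node, per_node in filters_by_node.items()}
        if len(set(seen.values())) > 1:
            drifted[name] = seen
    return drifted

# Hypothetical values copied manually from each node; None means no filter was found.
observed = {
    "umserver01": {"subA": "docType = 'Order'"},
    "umserver02": {"subA": "docType = 'Order'"},
    "umserver03": {"subA": None},
}
drifted = filters_out_of_sync(observed)
print(drifted)
```

Any entry in `drifted` is a subscriber worth mentioning when you report the defect.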
FIX LEVEL CHECK FOR NODES OF CLUSTER
WARN: Versions of the realms in cluster: ‘***’ does not match. Please update them to the same version.
umserver01 : Server Release Details = 9.10.0.16.100675
umserver02 : Server Release Details = 9.10.0.16.100675
umserver03 : Server Release Details = 10.1.0.4.102145
STORE MISMATCHES CHECK
ERROR: Could not find store (/RealmSpecific/NirvanaRepublishClusterChannel) on realm [umserver01] but it is present on realm [umserver03]
ERROR: Could not find store (/RealmSpecific/NirvanaRepublishClusterChannel) on realm [umserver02] but it is present on realm [umserver03]
I can see that the UM versions differ among the cluster nodes. Can this cause duplicate subscriptions and make UM selectors disappear?
Is there any impact if stores are mismatched between the cluster nodes?
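If the report needs to be checked regularly, the two kinds of lines shown above can be pulled out with a small script. This is only a sketch in Python that assumes the report text has the same line shapes as the output pasted in this post; it does not call the HealthCheck utility itself.

```python
import re
from collections import defaultdict

# Patterns modelled on the lines pasted above; other report layouts are not handled.
VERSION_RE = re.compile(r"^(\S+)\s*:\s*Server Release Details\s*=\s*(\S+)$")
STORE_RE = re.compile(
    r"Could not find store \((.+?)\) on realm \[(.+?)\] but it is present on realm \[(.+?)\]"
)

def summarize(report_text):
    """Return ({version: [realms]}, [(store, missing_on, present_on)])."""
    versions = defaultdict(list)
    missing = []
    for line in report_text.splitlines():
        line = line.strip()
        m = VERSION_RE.match(line)
        if m:
            versions[m.group(2)].append(m.group(1))
            continue
        m = STORE_RE.search(line)
        if m:
            missing.append((m.group(1), m.group(2), m.group(3)))
    return dict(versions), missing

report = """\
umserver01 : Server Release Details = 9.10.0.16.100675
umserver02 : Server Release Details = 9.10.0.16.100675
umserver03 : Server Release Details = 10.1.0.4.102145
ERROR: Could not find store (/RealmSpecific/NirvanaRepublishClusterChannel) on realm [umserver01] but it is present on realm [umserver03]
"""
versions, missing = summarize(report)
mixed = len(versions) > 1  # more than one distinct version means a mixed cluster
print(versions, missing, mixed)
```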
This will cause ‘non-deterministic’ behaviour. UM cluster nodes must all be on the same version and fix level. In addition, your clients must be at a lower or the same level as the server. So, depending on what level your IS is at, you can either take everything to the latest fix level of 10.1 (that includes IS) or take the 10.1 node back to 9.10 so it is the same as the other nodes.
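The two rules can be written down as a quick check. The Python sketch below assumes that comparing the dotted version components numerically reflects release order, and that the first two components identify the release (9.10 vs 10.1); both are assumptions for illustration, and the client versions passed in are hypothetical.

```python
def parse_version(text):
    """Turn '9.10.0.16.100675' into (9, 10, 0, 16, 100675)."""
    return tuple(int(part) for part in text.split("."))

def cluster_is_consistent(node_versions):
    """True only if every node reports exactly the same version and fix level."""
    return len({parse_version(v) for v in node_versions.values()}) == 1

def client_is_supported(client_version, node_versions):
    """True if the client release is not newer than the oldest server release."""
    lowest = min(parse_version(v) for v in node_versions.values())
    return parse_version(client_version)[:2] <= lowest[:2]

nodes = {
    "umserver01": "9.10.0.16.100675",
    "umserver02": "9.10.0.16.100675",
    "umserver03": "10.1.0.4.102145",
}
print(cluster_is_consistent(nodes))        # False for the cluster in this thread
print(client_is_supported("9.10", nodes))  # a 9.10 client against the oldest node
```

The versions are compared as integer tuples rather than strings because a plain string comparison would sort "10.1" before "9.10".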
When I run the HealthCheck utility, the environment versions show as below.
FIX LEVEL CHECK FOR NODES OF CLUSTER
WARN: Versions of the realms in cluster: ‘***’ does not match. Please update them to the same version.
umserver01 : Server Release Details = 9.10.0.16.100675
umserver02 : Server Release Details = 9.10.0.16.100675
umserver03 : Server Release Details = 10.1.0.4.102145
But if I check the UM installation directory:
C:\SoftwareAG\install\products\NUMRealmServer.prop clearly shows that the Realm version is 10.1.
All UM servers should be at the same version and fix level.
Adding one more point to your issue: check the mem files. If the mem files are very large and contain some non-subscribed messages, then when you restart your UM node all the other messages will be subscribed again as well.
The delivered data is not being removed from the mem files.
Please raise an incident with the SAG support team. You should get quick support (I guess a couple of other clients have faced the same issue and got fixes from SAG).
I reiterate my advice: all nodes in a cluster MUST be at the same version and fix level. This will be the first thing SAG Support will tell you. If you are running all nodes on the same machine from the same install, then I assume this is a test machine. Without knowing how you have installed and set things up, I do not know how you ended up with a mixed-release cluster. The 9.10 realms need to be upgraded, or the 10.1 realm downgraded.