I am trying to implement Universal Messaging clustering in a master/slave model. The steps I followed are listed below.
Two Universal Messaging servers are installed on different hosts (here, the hosts are Docker containers).
On each host, created one Realm Server instance (other than the default one, after shutting down the default instance).
The commands used were ./ninstancemanager.sh create umserver1 RS 0.0.0.0 9000 and
./ninstancemanager.sh create umserver2 RS 0.0.0.0 9000
Confirmed that there is no /naming/defaultContext directory in the instances created (as the UM clustering guide requires before the cluster creation and verification steps).
Next, connected to Enterprise Manager and selected the option to connect to realms, so both realms were connected one by one.
Selected the cluster creation option, provided a cluster name, and added the realms.
After adding the realms to the cluster, I can see a small ‘c’ tag on the connected realm servers.
Clicked on the cluster name created and then on the Sites option. Added primary and secondary sites and then set IsPrime on the primary site.
Now, the problem is on the cluster summary page. Each realm server shows itself as local but the other as either offline or disconnected. Screenshots are attached.
I tried telnet from one UM host to the other on the Docker public port, and the connection succeeded.
Why do the messages shown on the cluster summary page also differ, with one realm reporting the other as disconnected and the other reporting it as offline?
Have I missed anything in the cluster setup steps? Could you please validate the steps I followed and suggest whether any other configuration needs to be done?
The steps followed are correct. Did you see any log statements?
Based on the steps you have listed, it’s the right approach. However, make sure that the secondary node is active; this looks like it has nothing to do with clustering, but rather that umserver2 is not active.
To achieve automatic switchover in case of primary failure, it’s recommended to have at least 3 UM servers forming the cluster instead of two.
Thanks for the response.
@MR, I have checked the log files on both realm nodes.
The log on the node that is set as secondary keeps being populated with the message
Cluster> Setting potential master to umserver2 yet master count is only 1.0 while we we need more than 1.0 for quorum
while the other node has messages such as
[Fri Jun 24 04:42:24 EDT 2016],MemoryManager: Called System GC to force memory to be released
[Fri Jun 24 04:42:24 EDT 2016],MemoryManager: Monitor: memory free, Before 909 MB, After 965 MB
To narrow the problem down, I created another realm instance on the same node where the umserver1 realm server was created. This realm server, umserver3, was created with a different port (9010).
After that I modified the cluster by taking out umserver2 and bringing in umserver3 (hosted on the same node as umserver1). The master/slave state now shows the other node as online; a screenshot is attached. However, a channel created on umserver1 was not reflected on umserver3, so the cluster is not actually working, even though the cluster summary page shows the proper master/slave roles and all statuses as expected.
After this I reverted to the cluster with nodes on different hosts, as in the previous implementation. This time I deleted the sites and recreated them, assigning Primary to umserver2 (earlier, umserver1 was set as Primary), and set IsPrime.
The cluster summary page now shows a strange state, with both nodes as Master (screenshot attached). However, the above log message stopped appearing.
@Ramesh, both servers are active, as you can see from the realms being shown. Also, I have made sure the ports are listening by running the netstat command on both nodes.
$ netstat -an | grep 9000
I am wondering whether any communication needs to be opened between the servers when the realm servers are in two different Docker containers.
Although the cluster summary page shows master/slave properly, as in the attached screenshot, why are the channels not reflected across the realm cluster?
Are there any extra configurations or parameters that need to be set?
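One way to check this is to repeat the telnet test in both directions, from inside each container. The following is only a minimal Python sketch of such a check. The host names um-host-1 and um-host-2 in the usage comment are placeholders, not names from this setup, and a successful TCP connect only shows that the port is reachable, not that the realms can actually form a cluster.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example usage (placeholder host names, run once inside each container):
#   print(can_connect("um-host-1", 9000))
#   print(can_connect("um-host-2", 9000))
```

Running it from both containers, against the same address each realm uses to reach the other, shows whether the path is open in both directions rather than only from one side.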
Add another realm node to the cluster. It is not able to form a quorum, hence the issue.
Thanks for the suggestion. Of course, an odd number of RS is recommended. Here, the requirement is for a two-node UM cluster.
However, my cluster implementation uses the site concept, which overrides the quorum requirement (availability of more than 50% of the servers), so the cluster can operate with only 50%.
The cluster has two sites (a primary site and a secondary site) and is configured by setting the IsPrime flag on the primary site. In this case, the primary site with the IsPrime flag gets an extra vote.
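To make the vote counting concrete, here is a rough Python sketch of the rule as described in this thread: a node needs more than half of the cluster's votes, and the site flagged IsPrime counts for one extra vote. The function has_quorum and its parameters are made up for illustration. This is a simplified model, not Universal Messaging's actual election logic.

```python
def has_quorum(total_nodes: int, reachable_votes: int,
               holds_prime_site: bool = False) -> bool:
    """Simplified model: strict majority, plus one extra vote for the IsPrime site.

    total_nodes      -- number of realms configured in the cluster
    reachable_votes  -- realms this node can currently count, including itself
    holds_prime_site -- True if this node belongs to the site flagged IsPrime
    """
    votes = reachable_votes + (1 if holds_prime_site else 0)
    # Quorum requires strictly more than half of the configured nodes.
    return votes > total_nodes / 2


# Two-node cluster, node sees only itself, no prime site: no quorum.
print(has_quorum(2, 1))
# Same situation, but the node is on the IsPrime site: quorum under this model.
print(has_quorum(2, 1, holds_prime_site=True))
```

Under this model, a node in a two-node cluster that sees only itself has 1 vote where more than 1.0 is needed, which matches the log message quoted earlier. The same node on the IsPrime site would count 2 votes.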
The weird part is that one node shows the other as offline while the other shows it as disconnected. Later, after reversing the IsPrime flag, the message changed, but one node still shows the other as offline.
Also, when the cluster is established with RS (Realm Server) instances created on the same host, and I create a Connection Factory and test it from IS, the CF name is listed with a successful lookup.
However, channels and queues created on one RS are not synchronized across the cluster.
Which version are you using, and do you have all the latest fixes installed?
A channel created on the cluster should be reflected in all realms forming the cluster; when you browse for the channel’s .mem file, it should appear in both installation locations. Note that when a channel does not show the ‘c’ in a circle symbol, it is not participating in the cluster and is therefore a local channel.
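If it helps, the .mem check can be scripted. The sketch below is an illustration built on assumptions: it assumes the store file name contains the channel name, and the directory paths in the usage comment are placeholders for the two instances' data directories, which are not given in this thread.

```python
from pathlib import Path
from typing import List


def channel_store_files(data_dir: str, channel_name: str) -> List[str]:
    """Return sorted names of .mem files under data_dir whose name contains channel_name.

    Assumption: the store file for a channel carries the channel name in its
    file name. Adjust the match if your installation names files differently.
    """
    root = Path(data_dir)
    # rglob searches data_dir and all of its subdirectories.
    return sorted(p.name for p in root.rglob("*.mem") if channel_name in p.name)


# Example usage (placeholder paths, one per realm instance):
#   print(channel_store_files("/path/to/umserver1/data", "mychannel"))
#   print(channel_store_files("/path/to/umserver2/data", "mychannel"))
```

An empty list for one instance and a non-empty list for the other would suggest the channel exists only locally on one realm.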
I don’t recall any specific configuration requirement.