Integration Server 10.15 nodes clustered using TSA have unequal load distribution, so CPU utilisation is high on one of the nodes

Product/components used and version/fix level:

Integration Server 10.15, clustered using Big Memory Max TSA 4.4

Detailed explanation of the problem:

I have 2 nodes of Integration Server 10.15 clustered using Big Memory Max TSA 4.4.
When there is load on the cluster, it is distributed equally across both nodes, and CPU utilization increases by about 30% on each node.
No issues are reported on the Linux or Kubernetes service side. When there is no load on the nodes, the baseline CPU utilization of one node stays at 3%, while the other node constantly stays elevated at 30-35%. Kindly suggest where to look for the cause of this issue.
Note that we have the default extended settings (that is, the watt properties) set on the cluster. Kindly help and list the parameters or aspects I might be missing.

Thanks in advance.

Error messages / full error message screenshot / log file:

Question related to a free trial, or to a production (customer) instance?

This much CPU utilization is OK imo. I set my environment to scale automatically at 80% (the default value). If you are worried about the utilization, you can check again when there is high load. A lot of things can make you see a higher value on one node than on the other. IS clustering with Terracotta doesn’t have much effect on load balancing. Check your ingress and F5 configurations if you really want to find out what is causing that. You can set the load balancing algorithm to least CPU usage if you want. Where you need to make this change, though, will depend on your configuration. You mentioned Kubernetes, so I presume you have Integration Server deployed as containers to a Kubernetes cluster. If this is the case, you should have an ingress, and you need to check your ingress YAML file.

An IS cluster with Terracotta, in other words a stateful cluster, can load balance as well. But it usually does that when consuming messages from a queue, or if there is a service that can be split across different nodes.

Hello @4122-VAISHNAVI_FULSUNDAR

IS 2 nodes - I hope both nodes have the same number of CPU cores allocated; otherwise 30% would mean different things on each IS.
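To make the point about core counts concrete, here is a small sketch of the arithmetic. The helper name `busy_cores` and the core counts of 2 and 8 are made up for illustration; they are not taken from the poster's environment.

```python
def busy_cores(cpu_percent, allocated_cores):
    """Convert a utilisation percentage into an absolute number of busy cores."""
    return cpu_percent / 100.0 * allocated_cores


# Hypothetical allocations: the same 30% reading on two differently sized nodes.
small_node = busy_cores(30, 2)
large_node = busy_cores(30, 8)
print(f"30% of 2 cores: {small_node:.1f} cores busy")
print(f"30% of 8 cores: {large_node:.1f} cores busy")
```

With those assumed allocations, the same 30% reading corresponds to 0.6 of a core on one node and 2.4 cores on the other, so it is worth comparing the allocations before comparing the percentages.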

If the IS constantly runs with a certain CPU consumption, it could be due to some background processes:

  • Check the Service Stats in the IS admin page. You can click on ‘current running process’, which will explicitly show whether any flow/Java services are in progress.
  • Do you have connection pools defined in adapters? If so, check whether the min/max configs and timeout configs are the same across both nodes.
  • Best is to take a thread dump, where you can find the processes that are in a running state. In fact, take one from both IS (the one at 3% and the one at 30%), use a thread dump analyzer, and also navigate through it manually to see what kind of processes are shown.
  • If you have any sidecars attached to this running container instance, activity in any of those other processes will also push up CPU.
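If you don't have a thread dump analyzer at hand, the comparison of the two dumps can be roughly sketched in Python. This is only a minimal sketch: it assumes the dumps are in the usual HotSpot text format (a quoted thread name line, a `java.lang.Thread.State:` line, then `at ...` frames), and the function names are made up for illustration.

```python
import re
from collections import Counter

# Patterns for the common HotSpot thread dump layout (an assumption about the dump format).
THREAD_HEADER = re.compile(r'^"[^"]+"')
THREAD_STATE = re.compile(r'^\s+java\.lang\.Thread\.State: (?P<state>[A-Z_]+)')
FRAME = re.compile(r'^\s+at (?P<frame>[^(\s]+)')


def summarize_thread_dump(text):
    """Return (thread state counts, top-frame counts of RUNNABLE threads)."""
    states = Counter()
    runnable_frames = Counter()
    need_top_frame = False
    for line in text.splitlines():
        if THREAD_HEADER.match(line):
            # A new thread starts; forget any pending state from the previous one.
            need_top_frame = False
            continue
        m = THREAD_STATE.match(line)
        if m:
            states[m.group("state")] += 1
            need_top_frame = m.group("state") == "RUNNABLE"
            continue
        m = FRAME.match(line)
        if m and need_top_frame:
            # Only the first frame after a RUNNABLE state line is counted.
            runnable_frames[m.group("frame")] += 1
            need_top_frame = False
    return states, runnable_frames


def runnable_only_on_busy_node(busy_dump, idle_dump):
    """Top frames that are RUNNABLE more often on the busy node than on the idle one."""
    _, busy = summarize_thread_dump(busy_dump)
    _, idle = summarize_thread_dump(idle_dump)
    return busy - idle  # Counter subtraction keeps only positive differences
```

Feed it the text of the dump from the 30% node and the dump from the 3% node; whatever remains in the result is a candidate for the background activity. A single dump is only a snapshot, so taking a few dumps some seconds apart gives a more reliable picture.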

HTH
Senthil

HTTP traffic distribution to the nodes is controlled neither by TSA nor by the nodes themselves, as @engin_arlak notes. That is managed by whatever network load balancer is in front of them. LBs can be configured in a myriad of ways, and usually have an affinity setting to support SSL handshakes.

What are all the processes on each? Scheduled tasks? Notifications (which are also scheduled tasks)? Broker/UM subscriptions?

TSA is about storage of data shared by nodes of the cluster. It does not control where execution occurs.

Unless it is causing an explicit issue with processing, I wouldn’t worry about it.