Product/components used and version/fix level: IntegrationServer 10.11
Detailed explanation of the problem:
I was looking at the task manager of my server and I’ve seen that there are a lot of processes marked as “Zulu Platform” that take up the CPU (basically one for every running instance of the Integration Server and of the Universal Messaging). Is there a way to have a single “Zulu Platform” process that is shared among all the instances of the IS, and one used by all instances of the UM?
Also, are there any known issues related to these processes having a high CPU consumption even though there are few to no services running?
How many instances of IS and UM are you running on this particular box?
Even on the old Oracle Java Platform, each instance was running in a dedicated JVM process.
You should also consider the memory situation. One bigger JVM instance (if possible at all) takes quite long to initialize, while several smaller JVM instances might initialize faster.
Zulu is an open source JVM which Software AG uses for their processes. There should be one for the wrapper (service or daemon) and another for each process of each app. If you are running a Platform Manager, there will be another 2 for that, and so on. More info in the link below.
What do you mean by no services running? How many products are you running on this server? Your question is too generic. For every product you have installed on that server, there will be 1 or 2 Zulu processes attached. It’s basically a JVM that runs your apps, just not an Oracle JVM.
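To see which Java process is which, here is a minimal sketch that lists the Java launcher processes visible to the current user, sorted by accumulated CPU time. It only uses the standard java.lang.ProcessHandle API (Java 9 or later). The class and method names are made up for illustration, and it assumes the JVMs run as java, java.exe or javaw.exe. Note that the full command line of other processes is often not available, notably on Windows, so you may only get the executable path.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class JavaProcessLister {

    /** Snapshot of one process, taken once so that sorting sees stable values. */
    static final class Snapshot {
        final long pid;
        final Duration cpu;
        final String description;

        Snapshot(long pid, Duration cpu, String description) {
            this.pid = pid;
            this.cpu = cpu;
            this.description = description;
        }
    }

    /** True if the executable path ends in java, java.exe or javaw.exe (an assumption about the launcher name). */
    static boolean isJavaLauncher(String command) {
        String name = command.replace('\\', '/');
        name = name.substring(name.lastIndexOf('/') + 1).toLowerCase(Locale.ROOT);
        return name.equals("java") || name.equals("java.exe") || name.equals("javaw.exe");
    }

    /** Visible Java processes, highest accumulated CPU time first. */
    static List<Snapshot> javaProcessesByCpu() {
        List<Snapshot> result = new ArrayList<>();
        ProcessHandle.allProcesses().forEach(p -> {
            ProcessHandle.Info info = p.info();
            String command = info.command().orElse("");
            if (isJavaLauncher(command)) {
                result.add(new Snapshot(
                        p.pid(),
                        info.totalCpuDuration().orElse(Duration.ZERO),
                        // Fall back to the executable path when the command line is hidden.
                        info.commandLine().orElse(command)));
            }
        });
        result.sort(Comparator.comparing((Snapshot s) -> s.cpu).reversed());
        return result;
    }

    public static void main(String[] args) {
        for (Snapshot s : javaProcessesByCpu()) {
            System.out.printf("pid=%d cpu=%s %s%n", s.pid, s.cpu, s.description);
        }
    }
}
```

The PID printed here can be matched against the PID column in the task manager to tell the IS and UM instances apart.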
Practically this is not possible. First, the system would not be supported by Software AG anymore. Second, if possible (and that is not 100% certain), it would require a lot of work.
Besides, you really don’t want to do this. The isolation between JVM instances, and therefore OS processes, provides protection. For adapters that use JNI (e.g. SAP), it was even recommended to run them in a separate IS instance. That way, if the native library crashed IS, which sometimes happened because of bugs, only the connection to the external system was affected and not the entire integration.
That is a quantitative question but underpinned only with qualitative data. What is “high”, what is “few”? What about OS and hardware? When in the lifecycle of IS did you notice this behavior?
Thank you all for the answers, they were very useful regarding the usage of only one instance of the Zulu Platform.
Regarding the request for more specifics on my system, here is a list of answers to your questions:
On my server there are 4 instances of the IS and their respective 4 instances of the UM
With “no services running” I mean that the servers were idle, with no calls being made or flows being run (at most, just a couple of small scheduled flows)
Regarding other products running, these servers are basically dedicated to the IS and UM, only with some external (i.e. not provided by SAG) data analytics programs running in the background
With “high CPU consumption” I mean that, looking at the task manager, I see that the CPU is always at 100%. The main cause is the Zulu Platform instances, since there is always one instance taking, for example, 90% of the resources.
OS = Windows, Hardware = VM with 2 cores
Lifecycle: this behavior was noticed when turning on the IS instances, but it keeps going while they are running
In this case you might want to check which instance is affected and inspect its Java process with a JVM profiling tool to see what exactly is causing the high CPU usage.
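If no profiler is at hand, a rough first look is possible with the standard java.lang.management API. The sketch below ranks the threads of the JVM it runs in by consumed CPU time. It is an illustration only: the class name is made up, and the code would have to run inside the affected JVM (for example from a custom Java service), which is an assumption about your setup. From outside the process, a thread dump taken with the JDK tools jstack or jcmd gives similar hints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class HotThreads {

    /** One line per live thread of this JVM, highest consumed CPU time first, at most 'limit' lines. */
    static List<String> topThreads(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported()) {
            return lines; // this JVM cannot measure per-thread CPU time
        }
        mx.setThreadCpuTimeEnabled(true);

        // Sample every thread once so that sorting works on stable numbers.
        List<long[]> samples = new ArrayList<>();
        for (long id : mx.getAllThreadIds()) {
            long cpuNanos = mx.getThreadCpuTime(id); // -1 if the thread has already died
            if (cpuNanos >= 0) {
                samples.add(new long[] {id, cpuNanos});
            }
        }
        samples.sort((a, b) -> Long.compare(b[1], a[1]));

        for (long[] sample : samples) {
            if (lines.size() >= limit) {
                break;
            }
            ThreadInfo info = mx.getThreadInfo(sample[0]);
            if (info == null) {
                continue; // thread ended between sampling and lookup
            }
            lines.add(String.format("%-40s %-13s %d ms",
                    info.getThreadName(), info.getThreadState(), sample[1] / 1_000_000));
        }
        return lines;
    }

    public static void main(String[] args) {
        topThreads(10).forEach(System.out::println);
    }
}
```

The values are totals since each thread started, so calling this twice a few seconds apart and comparing the numbers shows which threads are busy right now.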
Does shutting down and restarting the instances (at least temporarily) solve the issue?
Please note that the instances are expected to use more CPU during startup. The usage should drop once initialization has completed, which takes a few minutes.
Without additional information I would consider the VM way too small for such a workload. Windows (which version?) alone will likely be able to occupy the 2 VCPUs. Depending on the workload and expectation on performance I would not start with less than 8 VCPUs.
In addition, you will need to find out the utilization of the underlying hardware (= bare metal) and whether the VCPUs have been overprovisioned. And of course the underlying hardware will always have the final say about performance.
BTW: What about memory and I/O utilization? The latter can also be responsible for driving up CPU load.
What is your hypervisor? What hardware is it running on?
Imo you are lacking hardware resources. If one of these instances runs out of memory, it will start doing mark and sweep garbage collections, which are CPU-intensive. That’s probably what you are seeing.
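One way to check the garbage collection theory is to look at how much time a JVM has spent collecting. The following is a minimal sketch using the standard GarbageCollectorMXBean. The class and helper names are made up, and as written it reports on the JVM it runs in, so it would have to be executed inside the IS or UM process in question (an assumption). From outside the process, the JDK tool jstat with the -gcutil option reports similar figures for a given PID.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {

    /** Share of JVM uptime spent in GC, in percent; 0 if the uptime is not positive. */
    static double gcPercent(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long totalGcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            long millis = gc.getCollectionTime(); // approximate accumulated time, -1 if undefined
            System.out.printf("%s: %d collections, %d ms%n", gc.getName(), count, millis);
            if (millis > 0) {
                totalGcMillis += millis;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC share of uptime: %.1f%%%n", gcPercent(totalGcMillis, uptimeMillis));
    }
}
```

A share that stays in the double digits on an idle server would support the memory explanation, while a share near zero points to something else.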