Integration Server high CPU usage

Aleksandrs_T · February 20, 2018, 12:20pm

Hello,

We are in the middle of an upgrade project from version 7.1.3 to 10 (currently the latest). During our analysis of the current solution, we found that the IS service (running on Windows Server 2008) keeps the CPU at 80%-85% load all the time. Since this is a showstopper for the upgrade project, could you please share best practices for profiling and identifying what is causing such high CPU load? Is there anything we should look at first?

Thank you,
Looking forward to your help and answers.

Holger_von_Thomsen · February 20, 2018, 5:21pm

Hi Aleksandrs,

can you describe your migration path as there is no officially supported direct path from 7.1.3 to 10.1 (latest general release)?

The latest version with a direct path from 7.1.3 is 9.5 SP1.
From there you need to migrate to 10.1, if this is possible.

See the Supported Upgrade Paths documentation for details.
Additionally, you will have to check the Built-In Services guides of all intermediate versions for deprecations and changes.

Regards,
Holger

Mahesh_K_Sreenivas · February 20, 2018, 6:12pm

In addition, plan to migrate from Broker to Universal Messaging (if Broker is used).

Gerardo_Lisboa · February 21, 2018, 8:29am

Hi,

You can request a free trial of Wrightia’s Service Profiler to analyse which of your services has the biggest impact on CPU usage.

The installation requires a server restart.

If that is not possible, check the Service Usage page to see which service is running most of the time, and take a thread dump to see which threads are running.

You can also analyse the logs for any relevant messages, and extract the stats.log and parse it elsewhere (I used Excel) to see whether, and how often, the IS is calling the GC.

You can also get a Diagnostics file from the IS and request an analysis from SAG.

Best regards,

Holger_von_Thomsen · February 21, 2018, 4:43pm

Hi,

you can add some parameters to JAVA_OPTS to enable GC logging into a file:

Under profiles/IS_default/bin in the setenv.sh:
“-verbose:gc -Xloggc:/IntegrationServer/instances/default/logs/gc.log”

You can change the filename (here: gc.log) to any name you want.

Restart the IS after changing the setenv.sh.
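
Once the gc.log has collected some data, a small script can give a first impression of how much time the JVM spends in GC. The following is only a rough sketch in Python: it assumes the simple one-line-per-collection output that -verbose:gc produces on older HotSpot JVMs (the exact format differs between JVM versions and GC flags), and the function name summarize_gc is made up for this example.

```python
import re

# Assumed line shape (plain -verbose:gc output; varies by JVM version/flags):
#   12.345: [GC 65536K->1234K(251392K), 0.0043210 secs]
#   98.765: [Full GC 200000K->150000K(251392K), 1.2345678 secs]
GC_LINE = re.compile(
    r'^(?P<ts>\d+\.\d+): \[(?P<kind>Full GC|GC)\b.*,\s*(?P<pause>\d+\.\d+) secs\]\s*$'
)

def summarize_gc(log_text):
    """Count collections and estimate the share of JVM uptime spent in GC pauses."""
    events = 0
    full_gcs = 0
    pause_seconds = 0.0
    last_ts = 0.0
    for line in log_text.splitlines():
        m = GC_LINE.match(line.strip())
        if not m:
            continue  # skip anything that does not look like a GC event
        events += 1
        if m.group("kind") == "Full GC":
            full_gcs += 1
        pause_seconds += float(m.group("pause"))
        last_ts = float(m.group("ts"))  # seconds since JVM start
    overhead = pause_seconds / last_ts if last_ts > 0 else 0.0
    return {
        "events": events,
        "full_gcs": full_gcs,
        "pause_seconds": pause_seconds,
        "overhead": overhead,
    }

if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as f:
        print(summarize_gc(f.read()))
```

Run it with the path to the gc.log as the only argument. Frequent Full GC entries or a large overhead value would point towards a memory problem rather than a badly behaving service.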

Regards,
Holger

Aleksandrs_T · February 21, 2018, 7:26pm

Hello Holger,

We are following the official path 7.1.3 → 9.5 → 10.1. Thank you for pointing out the deprecations and changes. I would like to let you know that the codebase analysis has been performed and there are no issues along this upgrade path.

KR,
Alex

Aleksandrs_T · February 21, 2018, 7:27pm

Hello,

According to our plan and platform, we don’t use Broker in the full cycle, so the switch from Broker to UM is planned for the first 6 months after go-live on 10.1.

KR,
Alex

Aleksandrs_T · February 21, 2018, 7:30pm

Hi Gerardo,

Thanks for the suggestion and the options. I really appreciate that. I will try to follow the option without a server restart, since our project has a high-availability requirement.

KR,
Alex

Holger_von_Thomsen · February 22, 2018, 2:31pm

Hi Aleksandrs,

if no issues were encountered during the codebase analysis, you were lucky.

But you should plan a thorough regression test on your code to make sure everything works as expected.
We have found some issues during our migration from 7.1 to 9.5 which were not detected during codebase analysis.

Regards,
Holger

Aleksandrs_T · February 22, 2018, 7:57pm

Hej Holger,

Many thanks for your input. Would it be possible for you to briefly describe the issues you experienced? Just to share the experience.

KR,
Alex

Percio_Castro1 · February 22, 2018, 10:50pm

Alex,

The trick to finding the root cause of CPU issues is to correlate information from the operating system to that of the JVM. By that I mean, you typically have to leverage tools/commands at the operating system level to identify the thread ID, within the JVM process, that is taking up most of the CPU. You can then generate a thread dump from the JVM and use that ID to find the specific thread that is causing the issue. Please note that the thread ID may be represented differently, so you may have to convert from decimal to hex, for example, to find the correct thread.
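
The matching step can be scripted. Below is a minimal sketch in Python, assuming a HotSpot-style thread dump saved as text, in which each thread header line carries the native thread ID in hex as nid=0x... (other JVMs may label it differently). The function name find_thread is hypothetical, chosen for this example only.

```python
import re

# Header line of a thread entry in a HotSpot-style dump, for example:
#   "HTTP Handler 5555" prio=6 tid=0x0a1b2c00 nid=0x1a2c runnable [0x0...]
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)".*\bnid=0x(?P<nid>[0-9a-fA-F]+)')

def find_thread(dump_text, os_thread_id):
    """Find the thread whose hex nid equals the decimal OS thread ID.

    Returns (thread_name, stack_lines), or None if no thread matches.
    """
    lines = dump_text.splitlines()
    for i, line in enumerate(lines):
        m = THREAD_HEADER.match(line)
        # Compare as integers so leading zeros and letter case do not matter.
        if m and int(m.group("nid"), 16) == os_thread_id:
            stack = []
            for frame in lines[i + 1:]:
                if not frame.strip():
                    break  # a blank line ends this thread's entry
                stack.append(frame.strip())
            return m.group("name"), stack
    return None

if __name__ == "__main__":
    import sys
    # Usage: python find_thread.py <thread dump file> <decimal thread ID>
    with open(sys.argv[1]) as f:
        print(find_thread(f.read(), int(sys.argv[2])))
```

For example, a thread that the operating system tool reports as ID 6700 corresponds to nid=0x1a2c in the dump.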

For Windows operating systems, I have typically used Process Explorer (from Microsoft Sysinternals) for these types of tasks.

Having said this, in my professional experience, the root cause of most (not all) CPU issues related to webMethods tends to be insufficient memory allocation (i.e. the heap is not large enough). Similarly, it’s not uncommon to run into heap sizing issues during upgrades because of the significant changes in the underlying software but often also in the environment. For example, many customers use the opportunity to upgrade (or change) operating systems or to move from 32-bit to 64-bit architectures.

Given this, you may want to hook up VisualVM to the JVM (or turn on GC logging, as Holger suggested above) to determine if the issue is memory. You could also just bump up the heap size and see if the problem goes away. If it turns out not to be memory-related and there’s a service behaving badly somewhere, then a profiler like the one suggested by Gerardo or Nibble Technologies’ Nanoscope (https://nibl.tech/nanoscope) could come in handy. Full disclosure: I work for Nibble Technologies.

Percio

Holger_von_Thomsen · February 23, 2018, 4:04pm

Hi Alex,

we had to adjust our process models regarding receive steps where a model uses several of them for joining running instances.

Regarding OR-Joins check the 9-5-SP1_BPM_Process_Development_Help.pdf, page 125.

There are undocumented changes of behaviour, e.g. for pub.xml:xmlNodeToDocument regarding the parameter makeArrays in combination with the parameter keepDuplicates.

When using pub.prt.log:logCustomId, make sure that the input for the customId is filled. Otherwise the service will fail instead of using “NO VALUE” as was done in wM 7.1.

Not all default values for parameters mentioned in the various Built-In-Services guides are really handled as default values.
These need to be set manually sometimes.

Remember to migrate the JDBC Adapter services with the script provided by the new Adapter if you are using and migrating the JDBC Adapter (from 6.5 [for wM 7.1] over 9.0 [possible for wM 9.5] to 9.10 [current version for wM 9.10 and newer]).

Regards,
Holger

Aleksandrs_T · February 25, 2018, 1:46pm

Hello Percio,

Many thanks for the comprehensive answer. Starting this week, my team and I will look into this issue so that we can solve it before going live with 10.1.

KR,
Alex

Aleksandrs_T · February 25, 2018, 1:49pm

Hi Holger,

For some reason, we don’t use business models in our project, but we have already noticed some issues with XML parsing and with wrapping content into XML. Mainly we faced a namespace issue: historically, the namespace had been passed incorrectly (hardcoded). Regarding JDBC, we didn’t find any issues at all.

KR,
Alex
