Out of memory while executing triggers IS 601

wayne.leishman.20059 · August 3, 2004, 3:02am

Hello,

We are running our integrations in a development environment (IS 6.0.1, Broker 6.0.1) and have started getting the following error in our server log:

2004-08-02 16:23:45:912 EDT [ISS.0098.0049C] Exception:java.lang.OutOfMemoryError while executing trigger. Rejecting Document for TriggerStore:<trigger_name_goes_here>

We DON’T have client-side queueing turned on, so according to the webMethods IS admin document, the persistent trigger document store would be in memory and not on disk. The Broker maintains the persistent document queue.

It appears we are running out of JVM memory? It seems that as soon as we get any out-of-memory errors, the entire Integration Server is toast and we have to reboot.
Our IS uses the following jvm min and max settings:

set JAVA_MIN_MEM=756M
set JAVA_MAX_MEM=1536M

Any comments on this ?

Regards,

Wayne

gupta_r.17495 · August 3, 2004, 3:31am

Depends on the disk space you have on that server box; increase the
setting to JAVA_MAX_MEM=2048M if it is feasible.

Guest · August 3, 2004, 3:42am

Hi Wayne,

Check the timestamp on your runserver.bat/sh file. This file is generated on every restart; however, if the update manager fails, the start script will not quit but will proceed with a previous version of runserver.bat/sh.

In any case you can check the effective MX setting in this file to be on the safe side.
Also the IBM JDK seems to be susceptible to OutOfMemory failures. I had a way better experience with Sun.
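
As a cross-check on the MX value in the start script, the JVM itself can report the limits it actually received. The following is only a minimal sketch: the class name HeapReport is made up for illustration, and Runtime.maxMemory() needs Java 1.4 or later, so on a 1.3.1 JVM only totalMemory() and freeMemory() would be available.

```java
// Sketch: report the heap limits the running JVM actually received.
// HeapReport is a hypothetical helper, not part of Integration Server.
class HeapReport {

    // Converts a byte count to whole megabytes (1 MB = 1024 * 1024 bytes).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Formats max heap, currently committed heap and used heap in megabytes.
    static String describe(long maxBytes, long totalBytes, long freeBytes) {
        long usedBytes = totalBytes - freeBytes;
        return "max=" + toMegabytes(maxBytes) + "M"
                + " total=" + toMegabytes(totalBytes) + "M"
                + " used=" + toMegabytes(usedBytes) + "M";
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the effective -Xmx value (Java 1.4+ only).
        System.out.println(describe(rt.maxMemory(), rt.totalMemory(), rt.freeMemory()));
    }
}
```

If the reported maximum does not match JAVA_MAX_MEM, the server is probably starting from a stale script.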

saurabhm · August 3, 2004, 10:34pm

Wayne,
what is the size of the document the trigger is associated with?

hari3 · August 3, 2004, 11:23pm

Hi all,
I have a similar situation here. In my case the size of the document the trigger is trying to process is around 50kb. These are the settings for the trigger :
Document store capacity is 10
Refill level is 4
Process documents concurrently is set and
Maximum number of threads is set to 5.
I have disabled client side queuing.

I noticed in the administrator page that there is enough free space (around 250MB) available to IS but still it crashed after throwing this exception “Exception java.lang.OutOfMemoryError: requested 64000 bytes”.
Any suggestions or comments please?

Regards,
Hari.

saurabhm · August 4, 2004, 12:01am

Hari,
Can you successfully reproduce the problem? If yes, then definitely we can work together to get this fixed…

Since I am yet to face this problem, I am not quite sure about the cause of it. But if you can reproduce the issue, I am sure we can nail it down.

BTW, do you know how many times the trigger was to be invoked when this condition occurred?

Also, you may try increasing the JAVA_MAX_MEM and ( I know this will get me a lot of flak ) possibly see if you can have gc scheduled !! ( this ensures added visibility/responses for the thread, Hari ;-))

Saurabh.

wMusers.Com1 · August 4, 2004, 12:22am

ARRRRGGGGHHHH. {Shoots flak at Saurabh}

Before resorting to remedies from old wives’ tales (scheduling garbage collection), there are a few other things to try.

Before getting into those, Hari, can you provide more info on your JVM vendor, JVM version, OS vendor and version, and available physical memory? Are you able to add the appropriate verboseGC parameter to your java startup command in server.bat or server.sh?

Mark

wayne.leishman.20059 · August 4, 2004, 3:00am

Hi Mark. Here’s some feedback on our JVM:

Vendor: Sun
Version: jdk1.3.1_10
OS Vendor: Microsoft Windows 2000 Server
OS Version: 5.0.2195 Service Pack 4
Available physical memory: 3,654,080 KB

In server.bat:
JAVA_MIN_MEM=1536M
JAVA_MAX_MEM=1536M (this appears to be the max size we can use with this JVM)

We don’t have the verboseGC in our server.bat but can add it to see what GC is doing.

Trigger Throttle info:

Maximum Document Retrieval Threads 100% of Server Thread Pool (400 Threads)
Trigger Document Store Capacity Throttle 50%

Maximum Trigger Execution Threads 100% of Server Thread Pool (400 Threads)
Trigger Execution Threads Throttle 50%

Client side queueing is OFF.

Any other tips ?

hari3 · August 4, 2004, 7:12am

Mark/Saurabh,
Here is the information:
JVM Vendor : Sun
Version : jdk1.3.1_10

OS Version : SunOS 5.1
Total Physical Memory : 8GB
Memory allocated to Integration Server : 1GB (Min and Max)
Broker also runs on the same box.

Maximum Document Retrieval Threads 100% of Server Thread Pool (100 Threads)
Trigger Document Store Capacity Throttle 50%

Maximum Trigger Execution Threads 100% of Server Thread Pool (100 Threads)
Trigger Execution Threads Throttle 50%

These are the settings for the trigger (This is the only trigger that was running):
Document store capacity is 10
Refill level is 4
Process documents concurrently is set and
Maximum number of concurrent threads is set to 5.
I have disabled client side queuing.

The publishing service publishes around 1000 to 1200 documents (each document is around 50k) to the Broker. Integration Server crashed after processing around 700 documents. In one instance, it crashed just after all the messages were processed.

We have not added the verboseGC parameter yet. I will add it if suggested.

Regards,
Hari.

Guest · August 4, 2004, 5:04pm

Hari,
Some suggestions for you: you need to change a couple of properties to overcome this problem.
You are publishing 1200 documents (1200 x 50 KB size) and asking your trigger to process 5 at a time. This is going to choke your wm infrastructure.

A couple of things you need to take care of when you are doing a mass publish:

In Trigger Settings
Document Processing> Document Store> Capacity: Increase the number (the default is 10, which is very low in your case), and the refill level as well.
Capacity is the number of documents the trigger is going to retrieve from the Broker for processing.
Refill Level is the number of unprocessed documents that remain in the store before IS retrieves more documents from the Broker.

Document Processing> Document Store> Document Dispatching: Increase the number of documents processed concurrently. The number 5 is very small for your 1200-document scenario.
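
To make the capacity and refill level definitions above concrete, here is a deliberately simplified model in Java. It is a sketch under stated assumptions, not the actual Integration Server algorithm: it assumes documents are processed one at a time and that the store is topped up to capacity whenever the unprocessed count falls to the refill level. The class name TriggerStoreModel is made up for illustration.

```java
// Simplified, hypothetical model of a trigger document store.
// Assumption: when unprocessed documents drop to the refill level,
// the store is topped up to capacity in one retrieval from the Broker.
class TriggerStoreModel {

    // Returns how many retrievals from the Broker are needed to drain totalDocs.
    static int countFetches(int totalDocs, int capacity, int refillLevel) {
        if (capacity <= 0 || refillLevel < 0 || refillLevel >= capacity) {
            throw new IllegalArgumentException("need 0 <= refillLevel < capacity");
        }
        int inBroker = totalDocs; // documents still queued on the Broker
        int inStore = 0;          // unprocessed documents held in IS memory
        int fetches = 0;
        while (inBroker > 0 || inStore > 0) {
            if (inStore <= refillLevel && inBroker > 0) {
                int batch = Math.min(capacity - inStore, inBroker);
                inBroker -= batch;
                inStore += batch;
                fetches++;
            }
            inStore--; // the trigger service processes one document
        }
        return fetches;
    }

    public static void main(String[] args) {
        System.out.println("capacity 10, refill 4:   " + countFetches(1200, 10, 4));
        System.out.println("capacity 100, refill 40: " + countFetches(1200, 100, 40));
    }
}
```

Under this model a larger capacity means fewer round trips to the Broker, but also up to that many documents held in IS memory at once, which matters if memory is already tight.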

In Publishable Document Properties:
Storage Type: Volatile/Guaranteed
Time to Live: Better to discard after some time than to keep it forever (default)
Also check the broker document type properties.

One more thing you can do to optimize your integration: do not try to publish all the documents at once (1200 in your case). You can use a loop to publish maybe 100 at a time, with a time delay in each iteration.
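
The batching idea above could look roughly like this in Java. This is a sketch only: DocumentSink is a hypothetical stand-in for whatever call actually publishes a document to the Broker, and the batch size and pause are illustrative values to tune.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: publish documents in fixed-size batches with a pause between batches.
class BatchPublisher {

    // Hypothetical stand-in for the real publish call.
    interface DocumentSink {
        void publish(String document);
    }

    // Splits the documents into consecutive batches of at most batchSize.
    static List<List<String>> partition(List<String> documents, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < documents.size(); start += batchSize) {
            int end = Math.min(start + batchSize, documents.size());
            batches.add(new ArrayList<>(documents.subList(start, end)));
        }
        return batches;
    }

    // Publishes batch by batch, sleeping pauseMillis between batches.
    // Returns the number of documents published.
    static int publishInBatches(List<String> documents, int batchSize,
                                long pauseMillis, DocumentSink sink) {
        int published = 0;
        List<List<String>> batches = partition(documents, batchSize);
        for (int i = 0; i < batches.size(); i++) {
            for (String document : batches.get(i)) {
                sink.publish(document);
                published++;
            }
            if (i < batches.size() - 1) {
                try {
                    Thread.sleep(pauseMillis); // give the subscriber time to catch up
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return published; // stop early if interrupted
                }
            }
        }
        return published;
    }

    public static void main(String[] args) {
        List<String> documents = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            documents.add("doc-" + i);
        }
        int count = publishInBatches(documents, 100, 10L, d -> { /* publish here */ });
        System.out.println("published " + count + " documents");
    }
}
```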

Also you can write your own GC program and schedule it to run every minute.

Let me know if you want to discuss more on this.
Thanks

ashok_bohra · August 4, 2004, 5:55pm

I wonder if changing the trigger settings this way will help, since the higher number of documents in IS memory resulting from the increased capacity will make the situation worse. It is better to leave them in the broker’s possession than to bring them into IS, if IS is not able to process them efficiently.

In my opinion, it would be advisable to scrutinise trigger processing service in this case, lest it has something that is causing the error. One simple test would be to short circuit the service (I mean just keep the shell) and test for same load.

HTH
Ashok

Guest · August 4, 2004, 6:51pm

Once documents are published by IS, they are in the Broker’s possession. I think this out-of-memory is coming from the broker only. These documents are staying for a longer time in the broker, because our IS is slow in retrieving and processing the documents from the Broker. And in the meantime the broker hits out-of-memory, as the broker is keeping all the documents in memory.
By changing the trigger property, we are going to process the documents faster by IS and at the same time make more room in Broker memory for other documents.
Thanks

wMusers.Com1 · August 5, 2004, 3:29am

Ashis,

On what do you base your statement “I think this out-of-memory is coming from the broker only”?

What does WmBrokerAdmin show for the utilization statistics for this broker server?

1200 documents x 50 KB each = 60,000 KB = approx. 60 MB. That should easily fit in most broker configurations and most IS JVM heaps, right?
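
The arithmetic above, as a small Java sketch. The 1024 MB heap figure is taken from the 1GB IS allocation mentioned earlier in the thread; the class name is made up for illustration. Note that the raw payload size is only a lower bound, since documents held as in-memory objects take more space than their serialized size.

```java
// Sketch: compare the raw payload of a mass publish with the JVM heap size.
class PayloadEstimate {

    // Total raw payload in kilobytes.
    static long totalKilobytes(int documentCount, int kilobytesPerDocument) {
        return (long) documentCount * kilobytesPerDocument;
    }

    // Payload as a percentage of the heap, with 1 MB = 1024 KB.
    static double percentOfHeap(long payloadKilobytes, long heapMegabytes) {
        return 100.0 * payloadKilobytes / (heapMegabytes * 1024.0);
    }

    public static void main(String[] args) {
        long payloadKb = totalKilobytes(1200, 50);
        System.out.println("payload: " + payloadKb + " KB");
        System.out.println("share of a 1024 MB heap: " + percentOfHeap(payloadKb, 1024L) + " %");
    }
}
```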

Now, if you have a high processing load on an Integration Server that is using a lot of memory and add this trigger processing load to it, you could run out of CPU, memory or both.

What else is happening on the server?

Mark

wMusers.Com1 · August 5, 2004, 4:09am

One free tool that is often useful for analyzing your verboseGC output is GCViewer. Supported verbose:gc formats are:

Sun JDK 1.3.1/1.4 with the option -verbose:gc
Sun JDK 1.4 with the option -Xloggc:<file> (preferred)
IBM JDK 1.3.0/1.2.2 with the option -verbose:gc

Add the appropriate verbose GC option to your JVM startup command and pipe the output to a file (if you are not using Sun JDK 1.4’s -Xloggc option). Note: Some OS/JVM combinations will require piping stderr to a file rather than stdout in order to capture the verbose GC data.

Launch GC viewer using the command “java -jar gcviewer.jar” and open the file containing your verbose GC output. You can refresh the file from within the tool to see the analysis in near real-time.

The HPjtune tool is also useful (and free).

Oh, by the way, for those of you who still think it is necessary to schedule System.gc(), just watch the verboseGC output for a period of time with a little processing load on the server.

You’ll note several GC’s at server startup followed by additional incremental or full GC’s as needed based on the processing load, heap size and memory utilization in most cases.
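
On newer JVMs the same point can be checked from inside the process. The sketch below counts collections the JVM has run on its own, without any System.gc() call. It assumes Java 5 or later (java.lang.management does not exist on 1.3.1), and the class name GcActivity is made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: observe that the JVM collects garbage on its own under load.
class GcActivity {

    // Sums the collection counts reported by all garbage collectors.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Allocates short-lived 64 KB arrays; returns the number of bytes allocated.
    static long churn(int rounds) {
        long allocated = 0;
        for (int i = 0; i < rounds; i++) {
            byte[] block = new byte[64 * 1024];
            allocated += block.length;
        }
        return allocated;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        churn(100000);
        long after = totalCollections();
        System.out.println("collections before=" + before + " after=" + after);
    }
}
```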

Mark

Guest · August 5, 2004, 4:21am

If you are sending large numbers of documents to the Broker, and those documents are sitting in the Broker queues rather than being processed right away, then you may be running into the same issue I faced recently. It sounds like this is your situation.

Our symptom was the same - Broker out of memory error. If we had a large number (10,000+) of documents sitting in Broker client queues for more than a few seconds, we would eventually get the out of memory error and the Broker would crash.

I highly recommend that you get webMethods Support involved. We received several patches to the Broker and the IS (version 6.1) to attempt to resolve the issue. I don’t know exactly which patches, sorry - we have a different group that does administration and they handled all of that. We developers eventually implemented some workarounds to reduce the number of documents, so the issue has not recurred for us.

wMusers.Com1 · August 5, 2004, 4:25am

Skip,

What error message told you (and WM Support) that the root cause was the broker running out of memory?

Mark

hari3 · August 5, 2004, 7:05am

All,
I do not think that Broker is running out of memory in our case. WmBrokerAdmin page shows that there are enough resources available for Broker. Also, it is the Integration server that crashes, not the Broker.

Ashis : we would prefer to have documents stored at Broker instead of bringing them quickly to IS by increasing number of concurrent threads and capacity level. The trigger service actually updates a database table and we don’t want more than 5 concurrent threads running at a time.(we want to reduce database contention too)

Mark - Thanks for providing valuable information about the GC analyser tool. I will use it as soon as I am done with the other tasks I am currently assigned to. What kind of information does this tool provide that can help us in resolving this issue?

Thanks,
Hari.

wMusers.Com1 · August 5, 2004, 7:01pm

Hari,

These tools will show you graphically how your JVM heap is being used while processing the work that is giving you problems. Specifically, the tools plot each full and incremental garbage collection event and show the amount of memory reclaimed and the resulting available memory. It will also show when the heap has to be expanded due to lack of memory to handle a particular allocation request.
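
For a feel of what the tools read, here is a minimal Java sketch that parses one line of Sun -verbose:gc output in the classic form "[GC 325407K->83000K(776768K), 0.2300771 secs]" and derives the reclaimed and available memory. Other JVMs and newer versions print different formats, so treat the pattern as an assumption to adapt; the class name VerboseGcLine is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: parse one classic Sun -verbose:gc line.
class VerboseGcLine {

    private static final Pattern LINE = Pattern.compile(
            "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final boolean full;   // true for a full collection
    final long beforeKb;  // heap in use before the collection
    final long afterKb;   // heap in use after the collection
    final long heapKb;    // total heap size at that moment
    final double seconds; // pause time

    private VerboseGcLine(boolean full, long beforeKb, long afterKb, long heapKb, double seconds) {
        this.full = full;
        this.beforeKb = beforeKb;
        this.afterKb = afterKb;
        this.heapKb = heapKb;
        this.seconds = seconds;
    }

    // Returns the parsed event, or null if the line is not in the expected format.
    static VerboseGcLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new VerboseGcLine(
                m.group(1).equals("Full GC"),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)),
                Double.parseDouble(m.group(5)));
    }

    // Memory freed by this collection.
    long reclaimedKb() {
        return beforeKb - afterKb;
    }

    // Free heap remaining after this collection.
    long availableKb() {
        return heapKb - afterKb;
    }

    public static void main(String[] args) {
        VerboseGcLine event = parse("[GC 325407K->83000K(776768K), 0.2300771 secs]");
        System.out.println("reclaimed " + event.reclaimedKb() + "K, available " + event.availableKb() + "K");
    }
}
```

A heap where the available figure keeps shrinking after each full GC is the pattern to look for when chasing an out-of-memory condition.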

You use this information to determine the correct initial and maximum (ms and mx) heap size and to learn whether some of the JVM’s advanced garbage collection features may be necessary either to increase processing throughput or to reduce pauses while the JVM is doing GC.

Another benefit is that you get to see graphically that there is no good reason to schedule System.gc(), since the JVM is already doing it quite well (at least in most modern JVMs).

Mark

hari3 · August 6, 2004, 6:17am

Mark : Thanks for the information.

I ran the trigger service manually (without involving broker or trigger) 1000 times in a loop and IS did crash again. So, out of memory error might not be because of publish/subscribe logic (may be too early to conclude this).
This service mainly deals with parsing a flat file, doing validations and uploading data to database. I have dropped all the variables in the pipeline whenever required. Also, I do not see any significant increase in memory while this service is running.
The service might be opening too many cursors which I need to double check. Can the file descriptor count (256 on our box) and number of open cursors cause out of memory errors?
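
When double checking for leaked cursors, the usual pattern is to close every statement and result set in the same iteration that opened it. The sketch below shows the difference with a made-up FakeCursor class standing in for a JDBC Statement or ResultSet; it uses try-with-resources, which needs Java 7 or later (on older JVMs a finally block does the same job). It does not claim that open cursors are the cause of the error here.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: leaked versus properly closed cursors in a processing loop.
class CursorHygiene {

    // Hypothetical stand-in for a JDBC Statement/ResultSet holding a database cursor.
    static final class FakeCursor implements AutoCloseable {
        private final AtomicInteger openCount;

        FakeCursor(AtomicInteger openCount) {
            this.openCount = openCount;
            openCount.incrementAndGet(); // cursor opened
        }

        @Override
        public void close() {
            openCount.decrementAndGet(); // cursor released
        }
    }

    // Opens a cursor per iteration and never closes it; returns cursors left open.
    static int leakyLoop(int iterations) {
        AtomicInteger open = new AtomicInteger();
        for (int i = 0; i < iterations; i++) {
            new FakeCursor(open); // never closed
        }
        return open.get();
    }

    // Opens a cursor per iteration and closes it before the next one.
    static int safeLoop(int iterations) {
        AtomicInteger open = new AtomicInteger();
        for (int i = 0; i < iterations; i++) {
            try (FakeCursor cursor = new FakeCursor(open)) {
                // do the database work for this record here
            }
        }
        return open.get();
    }

    public static void main(String[] args) {
        System.out.println("left open without close: " + leakyLoop(1000));
        System.out.println("left open with close:    " + safeLoop(1000));
    }
}
```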

Thanks
Hari.

Mike_Zhou1 · August 11, 2004, 9:26pm

Hari,

A couple of things you can try…
System wide:

Increase your file descriptor limit to 4096 if available.
Use JVM 1.4.2 if possible, which has a better GC algorithm, and
try the -server, -XX:+AggressiveHeap and -Xconcurrentio options. To decide if your heap
size is correct, use either Mark’s tool or jvmstat (from Sun)
to look at the young generation vs. old generation split in heap GC
collections. Setting either one of them wrong could create the
out-of-memory issue. Make your heap size 2G-3G if possible.
Threading:
Don’t starve your other threads; reduce the max retrieval threads to 75% of max.

App wide:
Flat file processing has a large memory footprint; look for
some other way, like a node iterator or partial load (refer to the large file
doc process), to reduce the memory consumption.
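
The partial-load idea, sketched in plain Java rather than with the webMethods flat file services: read and handle one record at a time so that only the current line is held in memory. RecordHandler is a hypothetical callback, and the delimiter handling is deliberately simple.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Pattern;

// Sketch: process a delimited flat file record by record instead of loading it whole.
class StreamingFlatFile {

    // Hypothetical callback that validates and stores one record.
    interface RecordHandler {
        void handle(String[] fields);
    }

    // Reads the source line by line, skipping empty lines; returns records handled.
    static int process(Reader source, char delimiter, RecordHandler handler) {
        String splitter = Pattern.quote(String.valueOf(delimiter));
        int count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue;
                }
                handler.handle(line.split(splitter, -1)); // -1 keeps trailing empty fields
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        Reader sample = new StringReader("1001|widget|5\n1002|gadget|12\n");
        int handled = process(sample, '|', fields -> System.out.println(fields[0] + " -> " + fields[1]));
        System.out.println(handled + " records handled");
    }
}
```

For a real file, a FileReader or InputStreamReader would take the place of the StringReader used in the sample.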