Need performance advice for translating very large EDI files

We are translating some very large EDI files and are seeing that when a large amount of EDI data hits our server at once, processing takes a long time to complete. During this time the memory usage stays stable but the CPU is heavily used.

To describe the EDI processing: we treat our EDI files as flat files, and each file that is translated has one interchange. These files are dropped in a directory, where a polling service picks them up for processing. The polling service has 10 threads and polls every 10 seconds. Once the required recognition parameters have been picked from the file, the file is passed to TN for processing. Most of the processing is then accomplished in the processing services. The mapping services that use heavy looping are written as Java services. We are also using the large file handling feature for TN and EDI. The threshold for the large file is set to 1 MB.

We can process two large files of 320 MB in 7.5 hours, but the CPU usage is at 87% during this time. The bigger issue is when we have four 120 MB files in process at the same time: it takes almost 15 hours to complete the translation, and the CPU usage is at 90%. Does anyone have any recommendations for improving the processing time and/or lowering the CPU usage? Any help on this issue will be highly appreciated.

Tom,

How is the server configured in terms of CPUs, memory and access to fast disk storage?

Have you been able to conduct any analysis to determine which steps in your process are consuming the most CPU?

Also, does your process involve communicating with back-end systems that may not be able to handle the load?

Mark

Mark,

Thanks for your reply. The programmer working on this has identified an area of the code that is a bottleneck and is rewriting it in Java. As for our configuration, here it is:

Environment:

  • Product and version: webMethods 6.1
  • Installed service packs: SP1
  • Hardware platform: Sun, 4 CPUs at 900 MHz each, 16 GB RAM
  • Operating system: Solaris 8
  • JVM version: Java 1.4.2
  • IS memory allocation: 500 MB min and 1 GB max

Concerning the back-end systems, this is not an issue, since we have separated out their processing from the translation.

Thanks,
Tom

Tom,

It is often helpful to collect verboseGC statistics during a load test to determine whether garbage collection is a bottleneck. With four CPUs and a current JVM version you should be able to add some arguments to the command that starts Java in the server.sh script such that GC performance is improved.

Note that this may not be necessary and should only be done after seeing that this is an issue by analyzing verboseGC output.

Other threads on wMUsers discuss some utilities such as HP JTune that are useful in analyzing GC output.
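
As a rough illustration of what analyzing verboseGC output can look like, here is a minimal Java sketch that sums the pause times in a GC log. It assumes the classic one-line format that Sun JVMs of this era print when started with -verbose:gc; the class name GcLogSummary is hypothetical and the sample lines in main are made up for illustration, not taken from Tom's server.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sums pause times from -verbose:gc lines (line format is an assumption). */
final class GcLogSummary {
    // Assumed shape: "[GC 325407K->83000K(776768K), 0.2300771 secs]"
    private static final Pattern LINE = Pattern.compile(
        "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    /** Total seconds spent in all collections found in the log lines. */
    static double totalPauseSeconds(List<String> lines) {
        double total = 0.0;
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.find()) {
                total += Double.parseDouble(m.group(5));
            }
        }
        return total;
    }

    /** Number of full collections, which are usually the expensive ones. */
    static int fullGcCount(List<String> lines) {
        int count = 0;
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.find() && "Full GC".equals(m.group(1))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Illustrative sample lines only.
        List<String> sample = Arrays.asList(
            "[GC 325407K->83000K(776768K), 0.2300771 secs]",
            "[Full GC 267628K->83769K(776768K), 1.8479984 secs]");
        System.out.printf("GC pause total: %.3f s, full GCs: %d%n",
            totalPauseSeconds(sample), fullGcCount(sample));
    }
}
```

Comparing the pause total with the wall-clock length of the load test gives a first indication of whether GC deserves any tuning at all.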

Mark

Mark,

Here is a description of the code rewrite. Please let us know if you have any insights/experiences/comments on this. As for GC, I do not think that it is an issue, but I will run it by our programmer.

Thanks,
Tom

The translation service processes the EDI document as a large document, so it iterates over the EDI segments during translation. Most of the segments occur once in the transmission, but one segment can occur hundreds of thousands of times if the file is large, so the loop is executed hundreds of thousands of times plus a few more iterations. This loop is written as webMethods flow steps. We are rewriting this webMethods flow service in Java. Will this help us to decrease the CPU usage and speed up the translation?
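
To make "iterating over segments" concrete, here is a minimal plain-Java sketch of reading one segment at a time from a stream instead of loading the whole file. It is not the webMethods EDI API: the class SegmentScan, the '~' segment terminator, the '*' element separator and the sample data are all assumptions for illustration (in real X12 data the delimiters come from the ISA header).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Walks an EDI stream one segment at a time (hypothetical helper). */
final class SegmentScan {
    /** Returns the next segment, or null once the input is exhausted. */
    static String nextSegment(Reader in, char terminator) {
        try {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                if (c == terminator) {
                    return sb.toString().trim();
                }
                sb.append((char) c);
            }
            String rest = sb.toString().trim();
            return rest.isEmpty() ? null : rest;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts segments by ID while holding only one segment in memory. */
    static Map<String, Integer> countById(Reader in, char terminator, char elementSep) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String seg;
        while ((seg = nextSegment(in, terminator)) != null) {
            if (seg.isEmpty()) {
                continue; // skip blank space between terminators
            }
            int sep = seg.indexOf(elementSep);
            String id = sep < 0 ? seg : seg.substring(0, sep);
            counts.merge(id, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample; element values are placeholders.
        String sample = "ST*203*0001~BGN*00*1~LX*1~RLT*A~AMT*1*10~RLT*B~AMT*1*20~SE*8*0001~";
        System.out.println(countById(new StringReader(sample), '~', '*'));
    }
}
```

For a real file the Reader would be a BufferedReader over the file, so memory use stays flat regardless of how many RLT segments there are.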

“We are rewriting this webMethods flow service in Java. Will this help us to decrease the CPU usage and speed up the translation?” -- Yes, it does help with memory use and performance, and you should see some difference in processing time.

Please check this thread discussion by Rob, especially on handling segment loops/iterations etc.
http://wmusers.com/forum/showthread.php?t=10540&highlight=appendtoDocumentlist

HTH,
RMG

Maybe. It depends on what the FLOW steps are doing and what the Java code will do.

“Iterate” can be interpreted a couple of ways. How exactly is the transaction set being translated? Node iterator? When the segment list is being looped over, how is the target list being created?

Simply converting a portion of the code to Java may provide performance improvements but there is likely more to it than that. Can you post the FLOW snippet?

Reamon,

Here’re some code details from our programmer. Please check it out and let us know what you think.

Thanks,
Tom

Here is a quick look into the actual translation service:

EDI (TransactionSet = 203)

ST
BGN
DTP
LX (one per ST)
DTP
AMT
RLT (Multiple per ST based on the file size)
AMT (min. 4 per RLT)
DTP (min. 4 per RLT)
INT (one per RLT)
SE

The above is translated to a flat file with the following format:

Header Record - (Gets information from ST, BGN, DTP and LX segments)
Loan Record - (Gets information from RLT and its sub segments)

The flow service doing this translation has the following steps:

Repeat
  wm.b2b.edi:convertToValues - (has iterator = true)
  Branch - “isValid”
    Branch - “EDIValues/ST” (Translates ST, BGN, DTP segments to create header fields)
    Branch - “EDIValues/LX” (Translates LX, AMT, DTP segments to create more header fields)
    Branch - “EDIValues/RLT” (Translates RLT, AMT, DTP and INT segments to create loan record fields)
    Branch - “EDIValues/SE” (Not significant from a translation standpoint)
  Branch - “ediObject” (Exits loop once ediObject is null)

We are concerned that, because this is a flow service, the loop is taking up too much CPU and time when iterating over the RLT segments, of which there are close to 250 thousand in our large files. By converting this loop into a Java service we hope to reduce the processor usage and processing time. Hope this provides more information about the situation.

Reamon,

Unfortunately the indentations did not take, so I’ll try to do them textually.

In the “Repeat” loop:

Indent 2 spaces for wm*
Indent 2 spaces for first Branch
Indent 4 spaces for the next four Branches
Indent 2 spaces for the last Branch

Let me know if you need indents for the EDI description.

Thanks,
Tom

Have you taken some measurements to determine that this is indeed the case? It may be a safe assumption but it’s probably wise to check.
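
One low-tech way to take such measurements is to accumulate elapsed time per step and see which one dominates. The sketch below is only an illustration: StepTimer is a hypothetical helper, the step names are placeholders, and System.nanoTime needs Java 5 or later (on a 1.4.2 JVM, System.currentTimeMillis would be the substitute).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Accumulates elapsed time per named step (hypothetical helper). */
final class StepTimer {
    private final Map<String, Long> totals = new LinkedHashMap<>();

    /** Adds an already measured duration to a step's running total. */
    void add(String step, long nanos) {
        totals.merge(step, nanos, Long::sum);
    }

    /** Runs the work and charges its elapsed time to the named step. */
    void time(String step, Runnable work) {
        long start = System.nanoTime();
        try {
            work.run();
        } finally {
            add(step, System.nanoTime() - start);
        }
    }

    long total(String step) {
        return totals.getOrDefault(step, 0L);
    }

    /** Name of the step with the largest accumulated time, or null if empty. */
    String slowest() {
        String best = null;
        long max = -1L;
        for (Map.Entry<String, Long> e : totals.entrySet()) {
            if (e.getValue() > max) {
                max = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        StepTimer timer = new StepTimer();
        // Placeholder work standing in for the real steps inside the loop.
        for (int i = 0; i < 1000; i++) {
            timer.time("convertToValues", () -> Math.sqrt(12345.678));
            timer.time("RLT branch", () -> String.valueOf(System.identityHashCode(timer)));
        }
        System.out.println("slowest step: " + timer.slowest());
    }
}
```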

Is the plan to replace the convertToValues call too? Or to call that from Java? What is being done in the RLT processing? Is that being handled in a “large document” way or are the resulting target records being collected in memory? Are you using appendToDocumentList or appendToStringList? Are many objects being created in the loop?
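
The reason the append question matters can be sketched in plain Java. The example assumes that an append-style service grows its list by allocating a new array and copying the old entries on every call; that copy-on-append behavior is an assumption here, not something confirmed for the built-in services. AppendCost and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Counts element copies made by a copy-on-append list (assumed behavior). */
final class AppendCost {
    static long copies; // elements copied so far by appendByCopy

    /** Appends by allocating an array one slot larger and copying the rest. */
    static String[] appendByCopy(String[] list, String item) {
        String[] grown = Arrays.copyOf(list, list.length + 1);
        copies += list.length;
        grown[list.length] = item;
        return grown;
    }

    public static void main(String[] args) {
        int n = 10000;

        String[] array = new String[0];
        for (int i = 0; i < n; i++) {
            array = appendByCopy(array, "record" + i);
        }

        // An ArrayList grows geometrically, so appends are amortized constant time.
        List<String> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add("record" + i);
        }

        System.out.println("records: " + array.length + " / " + list.size());
        System.out.println("element copies with copy-on-append: " + copies);
    }
}
```

With copy-on-append the number of element copies grows as n(n-1)/2, so a list built up over hundreds of thousands of iterations would spend most of its time copying.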

Reamon,

Here’s some more info from our programmer concerning the flow service that is being rewritten in Java:

=================================

“We are pretty certain that the flow service loop which is doing the translation is where we are spending most of the time. We are planning to call convertToValues from the Java service. The theory is that the flow loop “Repeat” is taking up resources once RLTs start getting processed.
The RLT loop is only 10 segments at the max. So one RLT is processed in one iteration of the “Repeat” loop once the translation gets to RLTs.
We are using appendToStringList but this service is called only when we process the LX segment, which is only one per transaction set.”

=================

Also …

We are as much concerned about bringing down the CPU usage as speeding up the translation time. One of the things noted was that the memory usage was less than 500 MB, even though the Integration Server is configured to use 1 GB max. Any ideas on how to use more memory? I would think that more memory usage would mean less physical I/O and therefore possibly less CPU usage. We have configured tn.BigDocThreshold to be 1 MB. Is this having an impact on the memory usage? Would a larger threshold cause more memory usage? Any ideas on the above are appreciated.

Thanks again for your help,
Tom

Yes, that makes sense since it is the top level loop that is processing the transaction set. All of your time translating the transaction set will be in this loop.

What resources? The overhead of the repeat itself is next to meaningless. It has no more overhead than a Java for loop. What you do within that loop determines entirely how fast/slow things will go. Focus on the steps within the repeat–I’m confident in saying it’s not the repeat that’s causing the issue.

Right. But earlier you said there are some 250,000 RLT segments. Where are you storing the translated results? With 10 segments within each RLT loop (worst case) you could have 2.5 million data elements. Where are you putting them? I think you mentioned that they are loan record fields. To know how this impacts your IS, I need to know what the steps are in the “Branch - “EDIValues/RLT” (Translates RLT, AMT, DTP and INT segments to create loan record fields)” section of the service. What these steps do will have significant impact on performance when the data set is large.
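
One answer to "where are you putting them" is nowhere: each loan record can be written out as soon as it is built, so nothing accumulates across 250,000 iterations. The sketch below is a plain-Java illustration of that write-through idea, not the actual service; LoanRecordWriter, the '|' field separator and the sample values are hypothetical.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

/** Writes each translated record immediately instead of collecting them. */
final class LoanRecordWriter {
    /**
     * Formats each group of fields as one flat-file line and writes it right away.
     * Returns the number of records written.
     */
    static int writeRecords(Iterable<String[]> groups, Writer out) {
        int written = 0;
        try {
            for (String[] fields : groups) {
                out.write(String.join("|", fields));
                out.write('\n');
                written++;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }

    public static void main(String[] args) {
        // Placeholder data standing in for fields pulled from RLT/AMT/DTP/INT.
        List<String[]> groups = Arrays.asList(
            new String[] {"L001", "100.00", "5.25"},
            new String[] {"L002", "200.00", "6.00"});
        StringWriter out = new StringWriter();
        int count = writeRecords(groups, out);
        System.out.print(out);
        System.out.println(count + " records written");
    }
}
```

In a real service the Writer would wrap a buffered file stream, and the Iterable would be fed by the segment iterator rather than an in-memory list.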

Cool.

Why are you concerned about CPU? Are there other applications being hosted on the box? Are you using TN? Is there a DB server on the same box?

If the JVM never needs more than 500M, then it will never ask the OS for more. Another possibility is that the JVM asked for more but the OS couldn’t give it more contiguous memory. I think you’d see out of memory errors though if this were the case. I’m an advocate of setting the min and max mem settings to be the same. In your case, set both to 1G (or more).
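
To see the difference between what the JVM is using, what it has claimed from the OS, and what it is allowed to claim, a few calls on java.lang.Runtime are enough. This is a minimal sketch with a hypothetical class name, HeapReport; the three Runtime methods it calls are standard.

```java
/** Prints used, committed and maximum heap for the running JVM. */
final class HeapReport {
    private static final long MB = 1024L * 1024L;

    /** Heap currently occupied by objects (committed minus free). */
    static long usedBytes(Runtime rt) {
        return rt.totalMemory() - rt.freeMemory();
    }

    /** One-line summary in megabytes. */
    static String describe(Runtime rt) {
        return "used=" + usedBytes(rt) / MB + "MB"
            + " committed=" + rt.totalMemory() / MB + "MB"
            + " max=" + rt.maxMemory() / MB + "MB";
    }

    public static void main(String[] args) {
        System.out.println(describe(Runtime.getRuntime()));
    }
}
```

When the minimum and maximum heap settings are equal (for example -Xms1024m -Xmx1024m), the committed figure should start out close to the maximum instead of growing on demand.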

How big is the incoming file? How many groups and transaction sets does a typical file have? I wonder if the transaction set splitting is what’s chewing up time. Do you have logic in place that can handle docs that are large and docs that are not?

Rob,

By rewriting the flow service in Java, we were able to improve the processing time by 30%. But CPU usage came down by only 5%.
Since our application is on a shared box we might have to recommend that the service should be deployed on a separate box.

Tom