Large XML handling - Issue

Hi,

We have a requirement to process a huge (1 GB) XML file in webMethods.
Below are a sample XML and the wM code that processes the file using the 'nodeIterator' approach.

Sample XML (element names other than OrderItem are illustrative; the original markup was stripped in this copy):

<Orders>
   <OrderItem>
      <Name>Item 1</Name>
      <Quantity>100</Quantity>
      <Price>100.00</Price>
   </OrderItem>
   <OrderItem>
      <Name>Item 2</Name>
      <Quantity>200</Quantity>
      <Price>200</Price>
   </OrderItem>
   ...
   <OrderItem>
      <Name>Item 100000</Name>
      <Quantity>700</Quantity>
      <Price>700.00</Price>
   </OrderItem>
</Orders>
wM Code:

1.11 pub.file:getFile (Input → loadAs = "stream")
1.12 pub.xml:xmlStringToXMLNode (Input → $filestream)
1.13 pub.xml:getXMLNodeIterator (Input → criteria = "OrderItem")
1.14 SplitProcess: REPEAT
1.141 pub.xml:getNextXMLNode
1.142 BRANCH on '/next'
1.1421 $null: EXIT "SplitProcess"
1.1422 $default: SEQUENCE
1.14221 pub.xml:xmlNodeToDocument
1.14222 pub.xml:documentToXMLString
1.14223 processOrderItem

When we tested the code with a 25 MB file, processing took about 13 seconds.
However, when we gradually increased the size to 250 MB, processing took about 1 hour 20 minutes and severely degraded server performance.
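For readers outside webMethods, the nodeIterator approach above is the generic streaming-parse pattern: pull one record at a time instead of building the whole tree. Here is a minimal Python sketch of the same idea using xml.etree.ElementTree.iterparse; only the OrderItem element name comes from the flow's iterator criteria, the other element names are assumptions based on the sample:

```python
import xml.etree.ElementTree as ET
from io import BytesIO

# Hypothetical sample mirroring the thread's structure; only the
# OrderItem tag is confirmed by the flow's iterator criteria.
SAMPLE = b"""<Orders>
  <OrderItem><Name>Item 1</Name><Quantity>100</Quantity><Price>100.00</Price></OrderItem>
  <OrderItem><Name>Item 2</Name><Quantity>200</Quantity><Price>200.00</Price></OrderItem>
</Orders>"""

def process_order_items(stream):
    """Process one OrderItem at a time without building the full tree."""
    total = 0.0
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "OrderItem":
            total += float(elem.findtext("Price", "0"))
            # Analogue of movingWindow=true: drop the node once processed
            # so memory stays bounded regardless of file size.
            elem.clear()
    return total

print(process_order_items(BytesIO(SAMPLE)))  # prints 300.0
```

The key point is the clear() call after each record: without it, even an event-driven parse keeps every processed node alive and memory grows with file size.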

Any inputs on the questions below would be a great help:

  1. Does this approach load the entire document into memory?
  2. If yes, what would be an alternate way to process this data?
  3. Similar to 'LargeFileHandling' in EDI, can we process this data in chunks by writing to an alternate hard-disk location?

We referenced the link below while deciding on the approach.
http://www.wmusers.com/ezine/2002dec_pupadhya_1.shtml

Thanks,
Bot

Are you using the moving window property? If not, set movingWindow to "true" in the pub.xml:getXMLNodeIterator service. It will discard old nodes from memory.

Regards,
Saravanan.E

Hi Saravanan,

Thanks for the reply.

Currently, we have the movingWindow property set to true.

Earlier, we processed a 250MB file without setting the property and the processing took about 1hr 20mins.

After setting the property to true, a 125MB file took about 1min 2secs to process.

There is definitely a noticeable difference in performance.

However, we tested it while server memory usage was already around 90%, consumed by other processes.

Will run further tests with larger files and provide an update.

Meanwhile, any inputs on the questions below are highly appreciated:

  1. Does this approach load the entire document into memory?
  2. If yes, what would be an alternate way to process this data?
  3. Similar to 'LargeFileHandling' in EDI, can we process this data in chunks by writing to an alternate hard-disk location?

Regards,
Bot

  1. Does this approach load the entire document into memory?

With movingWindow = true, only the current node is loaded in memory; old nodes are discarded.

  2. If yes, what would be an alternate way to process this data?
  3. Similar to 'LargeFileHandling' in EDI, can we process this data in chunks by writing to an alternate hard-disk location?

Another way is to split the file node by node, publish each node to the Broker, and subscribe to the documents on multiple Integration Servers.
Regards,
Saravanan.E
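Saravanan's Broker fan-out is essentially "split the file into records, then process records in parallel". As a rough sketch of that shape outside webMethods (the record format and the processOrderItem stand-in are assumptions, and a thread pool stands in for Broker publish/subscribe):

```python
from concurrent.futures import ThreadPoolExecutor
import xml.etree.ElementTree as ET
from io import BytesIO

# Hypothetical sample; element names are illustrative.
SAMPLE = b"""<Orders>
  <OrderItem><Name>Item 1</Name><Price>100.00</Price></OrderItem>
  <OrderItem><Name>Item 2</Name><Price>200.00</Price></OrderItem>
  <OrderItem><Name>Item 3</Name><Price>300.00</Price></OrderItem>
</Orders>"""

def process_order_item(xml_fragment):
    """Stand-in for the thread's processOrderItem service."""
    item = ET.fromstring(xml_fragment)
    return item.findtext("Name"), float(item.findtext("Price"))

def split_and_process(stream, workers=4):
    fragments = []
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "OrderItem":
            # Serialize each record (as documentToXMLString does in the flow),
            # then free the node to keep memory bounded.
            fragments.append(ET.tostring(elem))
            elem.clear()
    # Hand the fragments to workers: the publish/subscribe analogue.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_order_item, fragments))

print(split_and_process(BytesIO(SAMPLE)))
```

In a real Broker setup the fragments would be published as documents and consumed by trigger services on multiple Integration Servers; the structure (split, distribute, process independently) is the same.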

We moved our service to WebLogic because webMethods couldn't handle 25 MB XML.

As you can see above, IS can handle large XML documents, if the integration is designed and implemented correctly.

Two key concepts are needed: 1) don't load the entire document into memory; iterate over the nodes instead, and 2) implement a mechanism for processing individual records/documents in parallel.

Generic statements such as “X couldn’t handle Y” are usually incorrect if X is in the hands of a person with the right skills and experience.

Mark

Hi Saravanan,

Thanks for the confirmation!

Regards,
Bot.

Not a true statement. I have seen XML files almost 200 times that size processed in wM with no issues, in under 30 seconds.

As Mark mentioned above, if you have the right person with the skills and experience, they will make it happen.

Cheers,
Akshith

Interesting. Any other solutions?

I have previously implemented large-file XML processing using the open-source StAX API in a Java service (receiver service).

http://docs.oracle.com/javase/tutorial/jaxp/stax/why.html
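StAX is a Java pull-parsing API (the application asks the parser for the next event, rather than the parser pushing callbacks). To keep the examples in this thread in one language, here is a rough Python analogue of the same feed-and-pull idea using xml.etree.ElementTree.XMLPullParser; the sample markup and chunk size are illustrative:

```python
import xml.etree.ElementTree as ET
from io import BytesIO

# Hypothetical sample; element names are illustrative.
SAMPLE = b"""<Orders>
  <OrderItem><Name>Item 1</Name></OrderItem>
  <OrderItem><Name>Item 2</Name></OrderItem>
</Orders>"""

def names_from_stream(stream, chunk_size=64):
    """Feed the parser small chunks and pull completed records,
    the way a StAX cursor advances through a stream."""
    parser = ET.XMLPullParser(events=("end",))
    names = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if elem.tag == "OrderItem":
                names.append(elem.findtext("Name"))
    parser.close()
    return names

print(names_from_stream(BytesIO(SAMPLE)))  # prints ['Item 1', 'Item 2']
```

Because the file is read in fixed-size chunks, memory use depends on the record size, not the total file size, which is the property that matters for 1 GB inputs.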

I did not have a chance to look at Ehcache, but maybe you could use it for large-file processing too.

Cheers,
Akshith

Hi Saravanan.E and Akki,

This wmusers thread helped me create a flow service for large-file handling.
Thanks a lot!

Regards,
Jeevan