Finding a better way to process a large flat file

I have a pipe-delimited file that contains about 115k+ records. I use the “convertToValues” service and it takes a very long time to process. I even used the recommended approach for handling large flat files, which is to set “iterator” to true and use a REPEAT step with “convertToValues”, exiting when “ffIterator” is $null. It still takes a while to process. I’m looking for ways to optimize this process. If anyone has any suggestions, please let me know. Thanks.

How long is “a while”? What processing is done for each record?

An approach I’ve used for simple DB loading is to create a FLOW service that reads a group of records from the flat file at a time (using REPEAT and ffIterator), then uses a batch insert to write them to the DB. This cuts down on the per-record overhead; a rough sketch of the batching idea is below.
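This isn’t Flow code, just the same batching pattern in plain Java/JDBC to illustrate the idea. The connection URL, file name, table and column names are made up, and the file is assumed to have at least two pipe-delimited fields per line:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchLoad {
    public static void main(String[] args) throws Exception {
        int batchSize = 1000;  // tune to your DB and memory budget

        try (Connection con = DriverManager.getConnection(
                     "jdbc:yourdb://dbhost/mydb", "user", "password");   // placeholder URL
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO target_table (col1, col2) VALUES (?, ?)"); // placeholder table
             BufferedReader in = new BufferedReader(new FileReader("large_file.txt"))) {

            String line;
            int count = 0;
            while ((line = in.readLine()) != null) {       // stream one record at a time
                String[] fields = line.split("\\|", -1);   // pipe-delimited record
                ps.setString(1, fields[0]);
                ps.setString(2, fields[1]);
                ps.addBatch();                             // queue the insert
                if (++count % batchSize == 0) {
                    ps.executeBatch();                     // flush a full batch to the DB
                }
            }
            ps.executeBatch();                             // flush the remainder
        }
    }
}
```

The point is that only one batch of records is in memory at any time, and the DB sees one round trip per batch instead of one per record.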

I’m running this via Developer. I ran the flow service that does the “convertToValues” and got tired of waiting for it to complete, which was in the 15-30 minute range. I thought I must be doing something wrong. I’m not inserting into a database; I just need to parse the flat file so I can use it in the next step of the process. I’m also using “appendToDocumentList” after the “convertToValues”.

There are several things to consider when using the FFIterator:

You need to parse the file as a stream, otherwise the iterator doesn’t change the fact that the whole file is loaded. Switching to a stream means only part of the file is held in memory at a time. Please see the documentation on how to configure the FFIterator correctly.
The above also means you need to do the further processing inside the loop, so each processed part can be dropped before the next iteration. If your processing requires the complete contents in memory, the FFIterator won’t help. If you append all the data to a document list, you still end up with the complete file contents in memory, so you may need to rethink your processing approach altogether (see the sketch below).
Running this from Developer won’t work well, as it transfers all the data to Developer. You should use Developer only with small test data and run on the IS via invoke when parsing the complete file.
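As a plain-Java analogy to the first two points (the file name, field positions and the “KEEP” filter below are invented for illustration): streaming only pays off if each record is handled and then dropped inside the loop, and only the few records the next step really needs are kept.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;

public class StreamDontAccumulate {
    public static void main(String[] args) throws Exception {
        List<String> keepers = new ArrayList<>();   // holds only the records we actually need later

        try (BufferedReader in = new BufferedReader(new FileReader("large_file.txt"))) {
            String line;
            while ((line = in.readLine()) != null) {        // one record in memory at a time
                String[] fields = line.split("\\|", -1);

                // process the record here, then let it go out of scope;
                // append only the (hopefully rare) records the next step needs
                if ("KEEP".equals(fields[0])) {
                    keepers.add(line);
                }
                // appending *every* record to keepers would rebuild the whole
                // file in memory and defeat the purpose of streaming
            }
        }
        System.out.println("Records kept: " + keepers.size());
    }
}
```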

Regards

Martin

Martin,

Thanks for the suggestion. I didn’t realize I was loading it as “bytes” instead of a “stream”. I will make that change and see how it goes. However, based on what you said, it would be pointless to use “appendToDocumentList” on everything, since that would load all the data into memory. What I’ll do is parse only the records I need and append just those. When you say don’t use Developer to test, do you mean I should use the IS invoke from the admin console?

Yes, invoking from Developer is fine for debugging but highly inefficient, as Developer controls the complete flow. Invoking from the admin console, either via package management or by simply adding the invoke path to the URL, is far better when dealing with a lot of data; a placeholder example of such a URL follows.
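Something like the following, where the host, folder and service name are placeholders for your own and 5555 is just the common default IS port:

```
http://<is-host>:5555/invoke/myFolder.mySubfolder/parseLargeFile
```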