Large file handling

Hi all,

Does anyone know how to process large files (flat files) into EDI?

Right now I am using the following approach:

Getting the file into Integration Server, splitting it into multiple files based on a mod value, then processing each file and converting it into EDI (multiple STs).

But here we are getting an OutOfMemoryError, even though we are dividing the file into 8 MB files.

Please let me know if there is any other way to process a large file into EDI.

When you say “Getting the File into Integration Server…” does that mean you’re loading the entire file into memory? How big is the file?

The docs from wM provide information about large file handling.

Hi,

Here is my process…

The file is FTP'd to the wM unprocessed directory. The wM scheduler triggers a service which, on startup, checks the unprocessed directory and selects all files for pickup. The picked files are moved to a temporary directory and split into subfiles. Each subfile is loaded into hash memory on the server and passed to the parsing service, where it is converted to IData based on the flat file schema. Each record is validated for required fields, and if all records pass validation, the data is moved to the conversion service (to EDI).
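To make the memory profile clearer, the subfile-loading step behaves roughly like the plain Java sketch below (illustration only, not our actual code; the directory path is made up and the real service is IS Flow). Every subfile's contents sit on the heap at the same time, plus the IData documents built from them.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class SubfileLoader {
    // Sketch of the "all subfiles in hash memory" step described above.
    // The directory path is made up; the real service is IS Flow, not Java.
    public static Map<String, byte[]> loadAll(String tempDir) throws IOException {
        Map<String, byte[]> subfiles = new HashMap<>();
        try (DirectoryStream<Path> dir = Files.newDirectoryStream(Paths.get(tempDir))) {
            for (Path p : dir) {
                // every subfile's full contents stay on the heap at the same time
                subfiles.put(p.getFileName().toString(), Files.readAllBytes(p));
            }
        }
        return subfiles;
    }
}
```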

I’ll rephrase my questions: Are you loading the entire file into memory? How big is the file?

How many threads are configured on the flat file polling port?

“…each subfile will be loaded into hash memory on the server” Are multiple subfiles loaded at the same time?

  1. Yes, the file size is 31,737 KB.
  2. All subfiles are loaded into hash memory at the same time.

The service checks the unprocessed directory for input files. It is triggered by the webMethods scheduler on a weekly basis.
If files are present and ready for processing, the service takes one file at a time (a rough sketch follows below):
move the original file to the processing directory,
split the original file into subfiles based on the MOD provided by the Trading Networks (TN) Extended Fields parameter,
construct a Transport Document and submit it to TN.
If no files are ready for processing, the service logs an entry to the service log file and exits until the next run.
This service has no input or output parameters.
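Very roughly, the pickup/move step behaves like the sketch below (plain Java for illustration only; the directory names are made up, and the real implementation is a Flow service invoked by the webMethods scheduler).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WeeklyFilePoller {
    // Sketch of the scheduler-triggered pickup step described above.
    // Directory names are hypothetical.
    public static void run() throws IOException {
        Path unprocessed = Paths.get("/data/edi/unprocessed");
        Path processing  = Paths.get("/data/edi/processing");

        try (Stream<Path> files = Files.list(unprocessed)) {
            for (Path f : files.filter(Files::isRegularFile).collect(Collectors.toList())) {
                // take one file at a time: move it out of the pickup directory
                // before splitting on the MOD value and submitting to TN
                Files.move(f, processing.resolve(f.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
                // splitting and TN submission would happen here
            }
        }
    }
}
```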

A 31 MB file isn’t all that big, however…

If you’re loading it completely into memory (31 MB)
and converting it to a String (don’t know if you’re doing this) (62 MB)
and splitting it into 4 × 8 MB copies, all in memory (96 MB)
and processing each of those at the same time
to create multiple IS documents (>127 MB; in-memory representations are much larger than the strings they were built from)
and submitting those to TN
and the processing rule is synchronous
and each rule is creating a target document (>158 MB)

Still not outrageously big, but it depends on what else is going on within IS, how much heap is available, how fragmented it may be, etc.
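For reference, here is the rough arithmetic behind those figures as a runnable sketch. It assumes an older JVM where String characters are UTF-16 and that each step keeps its copy on the heap until the end; the rounding differs slightly from the numbers above.

```java
public class HeapEstimate {
    // Rough arithmetic only; the real footprint depends on heap layout,
    // IData overhead, and what else is running inside IS.
    public static void main(String[] args) {
        long fileBytes  = 31L * 1024 * 1024;     // ~31 MB of raw file bytes
        long asString   = fileBytes * 2;         // a String copy roughly doubles it: ~62 MB
        long withSplits = asString + fileBytes;  // the 8 MB subfile copies on top: ~93 MB

        // IData/document representations are typically several times larger than
        // the strings they were built from, so the running total passes 128 MB
        // well before the TN submissions and target documents are counted.
        System.out.printf("bytes=%d MB, +string=%d MB, +splits=%d MB%n",
                fileBytes >> 20, asString >> 20, withSplits >> 20);
    }
}
```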

When you say subfiles, do you literally mean you’re writing the contents to disk? Or are they just kept in memory?

The following is all speculation, so if I guess wrong on your scenario, please let me know.

I infer from your posts that you’re getting one big file from somewhere. That file contains a bunch of documents for the week. Each document within will be translated into a single EDI transaction set. You want the resulting transaction sets to be batched.

One approach is to not treat the original file as a single entity. Don’t split it by size (on an 8 MB boundary); split out each and every document/transaction instead. Process each individually, using stream techniques to read through the original file. Post each transaction to TN for validation and translation. Queue each resulting EDI transaction set for batching. At a given time, run the service to batch the waiting transaction sets into interchanges and send them.
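Here’s a minimal illustration of the streaming idea in plain Java. It is not IS Flow, and it assumes one logical document per line, which is almost certainly not your exact layout; inside IS the same effect comes from pub.file:getFile (loadAs=stream) plus pub.flatFile:convertToValues with iterate set to true.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class StreamingSplitter {
    // Read the weekly file one logical document at a time instead of
    // loading it whole; only a single document is ever in memory here.
    public static void process(String path) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(Paths.get(path))) {
            String document;
            while ((document = in.readLine()) != null) {
                handleOneDocument(document);
            }
        }
    }

    private static void handleOneDocument(String document) {
        // Placeholder: post the single document to TN for validation and
        // translation, and queue the resulting 834 transaction set for the
        // scheduled batching service.
    }
}
```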

If you try to program solutions in IS the way you would in classic programming, it’s not going to work very well. Don’t assume that the big flat file needs to be processed atomically. It probably has a bunch of documents within it just because it can (and is mainframe driven?). If there is no business-process reason that the documents within the file need to be kept together as a group, don’t keep them together.

Just a suggestion to consider.

Hi,

Thank you for your suggestion… what you understood is right.

What I understand from your reply is: use the getFile service to stream the file, then use convertToValues (validating against the schema here), do the mapping, then use wm.tn.queuing:deliverBatch for queuing. Is that right?

I am doing it the same way, but first getting the data file and converting it into IData… here I am constructing the 834 EDI… then convertToString, then wm.tn:receive.

Use the iterate feature and don’t collect all the documents into memory. Read one, convert it, give it to TN.
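If it helps, here is a rough sketch of that iterate loop as it might look inside an IS Java service. The parameter names (ffData, ffSchema, iterate, ffIterator, ffValues) are from the WmFlatFile built-in services documentation, but the schema name is made up and the com.wm.data / com.wm.app.b2b.server imports are assumed, so verify against your IS version.

```java
// Inside an IS Java service; "pipeline" is the service's IData input and
// ffData should be a stream from pub.file:getFile (loadAs=stream).
IDataCursor pc = pipeline.getCursor();
Object ffData = IDataUtil.get(pc, "ffData");
pc.destroy();

try {
    Object ffIterator = null;
    do {
        IData in = IDataFactory.create();
        IDataCursor ic = in.getCursor();
        if (ffIterator == null) {
            IDataUtil.put(ic, "ffData", ffData);          // first call: hand over the stream
        } else {
            IDataUtil.put(ic, "ffIterator", ffIterator);  // later calls: resume the iterator
        }
        IDataUtil.put(ic, "ffSchema", "my.folder:myFlatFileSchema"); // made-up schema name
        IDataUtil.put(ic, "iterate", "true");
        ic.destroy();

        IData out = Service.doInvoke("pub.flatFile", "convertToValues", in);
        IDataCursor oc = out.getCursor();
        IData ffValues = IDataUtil.getIData(oc, "ffValues");
        ffIterator = IDataUtil.get(oc, "ffIterator");     // null when no records remain
        oc.destroy();

        if (ffValues != null) {
            // map and validate this single record, build its 834, and hand it
            // to TN (e.g. wm.tn:receive) instead of keeping it in a growing list
        }
    } while (ffIterator != null);
} catch (Exception e) {
    throw new ServiceException(e);
}
```

The key point is that only one record’s ffValues is in the pipeline at any time, so memory stays flat regardless of how big the weekly file is.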

That’s for after the docs have been queued.

How? Using convertToValues?

That’s good. Just change your rules to queue the 834s and then schedule a task to batch and deliver your interchange.

Thanks Rob. I appreciate your help.

–Raj

Good morning,

If this is still a problem, the DataFeedr companion software lets you feed a file of any format and any size into IS without having to worry about memory or other system resources. Its governance is fully configurable and gives you strict control over how a file is processed and how much memory and how many CPU threads are assigned to each file integration.

Contact me for more information at christian.schuit@centipod.nl or visit https://datafeedr.io

Thanks, Christian
