Does anyone know how to process large files (flat files) into EDI?
Right now I am approaching it with the following process:
getting the file into Integration Server, splitting it into multiple files based on a mod value, then processing each file and converting it into EDI (multi-ST).
But here we are getting an OutOfMemory exception, even though we are dividing the file into 8 MB pieces.
Please let me know if there is any other way to process a large file into EDI.
The file will be FTP'd to the webMethods Unprocessed directory. The webMethods scheduler will trigger the service, which on startup will check the Unprocessed directory and select all files for pickup; the picked-up files are then moved to a temporary directory and split into subfiles. Each subfile is loaded into memory on the server and passed to the parsing service, where it is converted to IData based on the flat file schema. Each record is validated for required fields, and if all records pass validation, the data is moved to the conversion service (to EDI).
The service checks the unprocessed directory for input files. It is triggered by the webMethods scheduler on a weekly basis.
If files are present and ready for processing, the service takes one file at a time:
move the original file to the processing directory,
split the original file into subfiles based on the MOD provided by the Trading Networks (TN) Extended Fields parameter,
construct a Transport Document and submit it to TN.
If no files are ready for processing, the service logs an entry in the service log file and exits until the next run.
This service has no input or output parameters.
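For illustration, here is a rough plain-Java sketch of the pick-up-and-split step described above. It is plain JDK I/O rather than an actual IS flow service, and the directory paths and records-per-subfile count are hypothetical; the point is that the original file is streamed record by record instead of being loaded whole.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.*;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FilePickupAndSplit {

    // Hypothetical directories and split size; adjust to your environment
    // and to the value held in the TN Extended Fields parameter.
    private static final Path UNPROCESSED = Paths.get("/data/wm/Unprocessed");
    private static final Path PROCESSING  = Paths.get("/data/wm/Processing");
    private static final int  RECORDS_PER_SUBFILE = 5000;

    public static void main(String[] args) throws IOException {
        // Pick up every file waiting in the Unprocessed directory.
        List<Path> files;
        try (Stream<Path> listing = Files.list(UNPROCESSED)) {
            files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        if (files.isEmpty()) {
            System.out.println("No files ready for processing; exiting until next run.");
            return;
        }
        for (Path original : files) {
            // Move the original file into the processing directory before splitting.
            Path working = PROCESSING.resolve(original.getFileName());
            Files.move(original, working, StandardCopyOption.REPLACE_EXISTING);
            splitByRecordCount(working);
        }
    }

    // Stream the file record by record; at no point is more than one record
    // held in memory, regardless of the size of the original file.
    private static void splitByRecordCount(Path source) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(source)) {
            String record;
            int recordsInCurrent = 0;
            int subfileIndex = 0;
            BufferedWriter out = null;
            while ((record = in.readLine()) != null) {
                if (out == null || recordsInCurrent == RECORDS_PER_SUBFILE) {
                    if (out != null) out.close();
                    Path subfile = source.resolveSibling(
                            source.getFileName() + ".part" + (++subfileIndex));
                    out = Files.newBufferedWriter(subfile);
                    recordsInCurrent = 0;
                }
                out.write(record);
                out.newLine();
                recordsInCurrent++;
            }
            if (out != null) out.close();
        }
    }
}
```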
If you’re loading it completely into memory (31M)
and converting it to a String (don’t know if you’re doing this) (62M)
and splitting it into 4 8MB copies, all in memory (96M)
and processing each of those at the same time
to create multiple IS documents (>127M, in memory representations are much larger than the strings they were built from)
and submitting those to TN
and the processing rule is synchronous
and each rule is creating a target document (>158M)
Still not outrageously big but it depends on what else is going on within IS, how much heap is available and how fragmented it may be, etc.
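Whether those numbers fit depends on the heap the IS JVM actually has available. As a quick sanity check (plain JVM calls, nothing IS-specific), you can log the rough heap headroom before loading a large file:

```java
public class HeapHeadroom {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long maxMb   = rt.maxMemory()   / (1024 * 1024); // -Xmx ceiling
        long totalMb = rt.totalMemory() / (1024 * 1024); // currently committed heap
        long freeMb  = rt.freeMemory()  / (1024 * 1024); // free within the committed heap
        long headroomMb = maxMb - (totalMb - freeMb);    // roughly what a large load could claim
        System.out.printf("max=%dMB committed=%dMB used=%dMB headroom=%dMB%n",
                maxMb, totalMb, totalMb - freeMb, headroomMb);
    }
}
```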
When you say subfiles, do you literally mean you’re writing the contents to disk? Or are they just kept in memory?
The following is all speculation, so if I guess wrong on your scenario, please let me know.
I infer from your posts that you’re getting one big file from somewhere. That file contains a bunch of documents for the week. Each document within will be translated into a single EDI transaction set. You want the resulting transaction sets to be batched.
One approach is to not treat the original file as a single entity. Don't split it by size (8 MB boundary); split each and every document/transaction out of it and process each individually. Use stream techniques to split the original file. Post each transaction to TN for validation and translation. Queue each resulting EDI transaction set for batching. At a given time, run a service to batch the waiting transaction sets into interchanges and send them.
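A minimal sketch of that kind of streaming split, assuming a hypothetical record code (here "HDR") marks the start of each document within the flat file; the callback is where the post-to-TN call would go:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.*;
import java.util.function.Consumer;

public class PerDocumentSplitter {

    // Hypothetical marker: the record code that starts each document in the flat file.
    private static final String DOC_HEADER_PREFIX = "HDR";

    // Stream the big file and hand each complete document to a callback
    // (for example, a call that posts the document to TN), one at a time.
    public static void split(Path bigFile, Consumer<String> postToTN) throws IOException {
        StringBuilder current = new StringBuilder();
        try (BufferedReader in = Files.newBufferedReader(bigFile)) {
            String record;
            while ((record = in.readLine()) != null) {
                if (record.startsWith(DOC_HEADER_PREFIX) && current.length() > 0) {
                    postToTN.accept(current.toString()); // previous document is complete
                    current.setLength(0);
                }
                current.append(record).append('\n');
            }
        }
        if (current.length() > 0) {
            postToTN.accept(current.toString()); // last document in the file
        }
    }

    public static void main(String[] args) throws IOException {
        // Example usage: print each document's size instead of posting to TN.
        split(Paths.get("/data/wm/Processing/weekly.dat"),
              doc -> System.out.println("document of " + doc.length() + " bytes"));
    }
}
```

Memory use then scales with the size of one document rather than with the size of the weekly file.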
If you try to program solutions in IS the way you would in classic programming, it’s not going to work very well. Don’t assume that the big flat file needs to be processed atomically. It’s probable that it has a bunch of documents within it just because it can (and is mainframe driven?). If there is no business process reason that the documents within the file need to be kept together as a group, don’t.
Thank you for your suggestion. What you understood is right.
What I understand from your reply is: use the getFile service to stream the file, then use convertToValues (validating against the schema here), do the mapping, then use wm.tn.queuing:deliverBatch for queuing. Is that right?
I am doing it the same way, but first I get the data file and convert it into IData. Here I construct the 834 EDI, then call convertToString, then queue it via wm.tn:receive.
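For reference, a hedged Java-service sketch of the streaming/iterate pattern being discussed, assuming the standard WmFlatFile contract for pub.flatFile:convertToValues (ffData, ffSchema, iterate in; ffValues, ffIterator out). The class and method names here are made up, and the parameter names should be verified against the Flat File Schema Developer's Guide for your IS version.

```java
import com.wm.app.b2b.server.Service;
import com.wm.data.IData;
import com.wm.data.IDataCursor;
import com.wm.data.IDataFactory;
import com.wm.data.IDataUtil;
import java.io.FileInputStream;
import java.io.InputStream;

public final class LargeFlatFileToEDI {

    // Hypothetical helper: iterate over a large flat file one top-level record at a time.
    public static void process(String filePath, String ffSchema) throws Exception {
        try (InputStream ffData = new FileInputStream(filePath)) {
            Object iterator = null;
            while (true) {
                // Build the input pipeline for pub.flatFile:convertToValues.
                IData input = IDataFactory.create();
                IDataCursor ic = input.getCursor();
                IDataUtil.put(ic, "ffSchema", ffSchema);
                if (iterator == null) {
                    IDataUtil.put(ic, "ffData", ffData);   // pass the stream, not a byte[]/String
                    IDataUtil.put(ic, "iterate", "true");  // ask for one record per call
                } else {
                    IDataUtil.put(ic, "ffIterator", iterator); // continue from the previous call
                }
                ic.destroy();

                IData output = Service.doInvoke("pub.flatFile", "convertToValues", input);
                IDataCursor oc = output.getCursor();
                IData ffValues = IDataUtil.getIData(oc, "ffValues");
                iterator = IDataUtil.get(oc, "ffIterator");
                oc.destroy();

                if (ffValues == null) {
                    break; // iterator exhausted: the whole file has been processed
                }
                // Validate the record, map it to the 834 structure, and hand it to TN here
                // (e.g. wm.tn:receive or wm.tn.queuing:deliverBatch), one record at a time.
            }
        }
    }
}
```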
If this is still a problem, the DataFeedr companion software lets you feed a file of any format and any size into IS without having to worry about memory or other system resources. The software governance is fully configurable and gives you strict control over how a file is processed and over how much memory and how many CPU threads are assigned to each file integration.
Contact me for more information at christian.schuit@centipod.nl or visit https://datafeedr.io