Best Practice for Determining a Large Document

I am trying to find out at what size we need to begin thinking about large document handling for both TN 4.6 & 6. Is there a best practice or an equation that is used to figure this out? Anyone have suggestions?

Thanks

MCB,

There is no specific number defined that says: starting at this size, consider the document large.

It all depends on the size of the transaction documents that are routed to your TN. Check the maximum size you currently see and configure the threshold accordingly, for example so that TN considers documents above 4 MB as large documents.

That way it will not affect your existing integration code that extracts the content from the bizdoc directly, since for large docs the content part extraction is handled differently in the processing service.
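To illustrate what that difference looks like, here is a minimal Java sketch of retrieving bizdoc content as a stream rather than as bytes. It assumes the TN built-in service wm.tn.doc:getContentPartData; the content part name, the parameter names, and the tn.BigDocThreshold property should all be verified against your TN 4.6/6.x documentation, and it only compiles with the webMethods IS/TN jars on the classpath.

```java
// Sketch only: pulling bizdoc content as a stream for large documents instead of
// loading the whole payload into memory as bytes.
// Assumptions to verify against your TN docs: the built-in service
// wm.tn.doc:getContentPartData, its "getAs" input, the "partContent" output, and
// the content part name ("xmldata" here). The large-doc cutoff itself is usually
// configured via the TN property tn.BigDocThreshold (value in bytes).
import com.wm.app.b2b.server.Service;
import com.wm.data.*;
import java.io.InputStream;

public final class LargeDocContent {

    public static InputStream getPayloadAsStream(IData bizdoc) throws Exception {
        IData input = IDataFactory.create();
        IDataCursor c = input.getCursor();
        IDataUtil.put(c, "bizdoc", bizdoc);
        IDataUtil.put(c, "partName", "xmldata"); // assumed part name (e.g. "edidata" for EDI)
        IDataUtil.put(c, "getAs", "stream");     // "bytes" would load the full payload into memory
        c.destroy();

        IData output = Service.doInvoke("wm.tn.doc", "getContentPartData", input);
        IDataCursor oc = output.getCursor();
        InputStream content = (InputStream) IDataUtil.get(oc, "partContent");
        oc.destroy();
        return content;
    }
}
```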

Just my thoughts,

Also of note, especially for EDI documents: TN considers any document that is 1/3 the size of BigDocThreshold to be a large doc. That is, if you set BigDocThreshold to 999999, then any document that is 333333 or larger will be considered a large doc. That is because TN splits the docs by Envelope, Group and then Transaction Set.
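In other words, the effective EDI cutoff ends up at one third of the configured value. A tiny worked example, taking the 1/3 factor from the post above (verify it against your TN version):

```java
public class EdiThresholdExample {
    public static void main(String[] args) {
        // Per the post above: TN treats an EDI document as "large" once it reaches
        // one third of BigDocThreshold, because it splits by Envelope, Group, and
        // Transaction Set.
        long bigDocThreshold = 999_999L;               // configured threshold in bytes
        long effectiveEdiCutoff = bigDocThreshold / 3; // 333,333 bytes
        System.out.println("EDI documents of " + effectiveEdiCutoff
                + " bytes or more are handled as large documents");
    }
}
```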

My take on this is a bit different.

Large document handling is usually invoked because managers discover their crashed systems smoking and scratch their heads to determine why.

It’s usually because of out-of-memory errors: the “normal” methods used to manipulate data take up roughly 15 times the document size in memory.

A 100 KB document will temporarily use roughly 1.5 MB of memory. These are generalizations; other factors such as the number of processors, the amount of memory, and the specific processing steps should also be taken into account.

With this in mind, I usually configure my systems to process documents with an average size of 250k. I consider anything larger than that outside of the norm and invoke large document handling procedures.
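As a back-of-the-envelope check, here is a small Java sketch using Ray's numbers; the 15x multiplier and the 250 KB cutoff are his generalizations from above, not fixed product limits.

```java
public class MemoryEstimate {

    // Rough multiplier from the discussion above: manipulating a document in memory
    // can take roughly 15x its raw size. A generalization, not a product rule.
    private static final int MEMORY_MULTIPLIER = 15;

    // Example cutoff from the discussion: treat anything above ~250 KB as large.
    private static final long LARGE_DOC_CUTOFF_BYTES = 250L * 1024;

    public static void main(String[] args) {
        long docSizeBytes = 100L * 1024; // a 100 KB document

        long estimatedPeakBytes = docSizeBytes * MEMORY_MULTIPLIER; // ~1.5 MB
        boolean treatAsLarge = docSizeBytes > LARGE_DOC_CUTOFF_BYTES;

        System.out.printf("Doc size: %d KB, estimated peak memory: ~%.1f MB, large doc: %b%n",
                docSizeBytes / 1024, estimatedPeakBytes / (1024.0 * 1024.0), treatAsLarge);
    }
}
```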

HTH

Ray

Different applications will have different requirements. We’re often involved in RosettaNet projects where the data is inherently “large”. Large enough that built-in services such as HTTP, recordToDocument (4.6), etc. will fail intermittently. Without the largeDoc facilities, these data cannot be processed at all.

Just in case people start objecting, let me give an example. We have a lot of semiconductor customers, and they need to send periodic updates of all the work in progress (called Work-In-Process) to their customers. That’s probably hundreds of die on a silicon wafer, and hundreds (or thousands) of wafers per customer. And they need to be transmitted as a single document (breaking the data up creates a lot more problems). Another example is PC manufacturers sending Shipment Notices (for container loads of PCs) to customers, where the customer wants a list of all serial numbers in the shipment. Easily over 10 MB. In fact, for our semiconductor customers, we’re frequently asked to test wM processing of data approaching 100 MB. With or without inefficiencies in handling the data in memory, these are not possible to handle without the “largeDoc” facilities.

A general guideline that we use:

  1. As RMG said, 4 MB is a good size. We’ve tested Java handling up to somewhere around 10-12 MB before it starts to give intermittent errors.
  2. More importantly, largeDocs are handled quite differently than “normal” docs. Validation is different. Mapping is different. Even sending to the backend might be different. The worst case is having a process that needs to handle both large and normal docs. That is, you want to pick a largeDocThreshold which, if possible, will separate your processes into either largeDoc (always) or normal (always); see the sketch after this list.
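One way to apply point 2 is to have a single dispatcher choose exactly one of two separate paths based on the threshold, rather than one path that tries to do both. A hedged sketch; all class and method names below are hypothetical, not TN built-ins:

```java
import java.io.InputStream;

// Hypothetical dispatcher illustrating point 2 above: pick a threshold so that a
// given document type always lands on the same side of it, then keep the large-doc
// and normal-doc paths completely separate. All names here are illustrative only.
public class DocDispatcher {

    private final long largeDocThresholdBytes;

    public DocDispatcher(long largeDocThresholdBytes) {
        this.largeDocThresholdBytes = largeDocThresholdBytes;
    }

    public void process(long contentLengthBytes, InputStream content) {
        if (contentLengthBytes >= largeDocThresholdBytes) {
            processLargeDoc(content);   // stream-based: validate/map/deliver in chunks
        } else {
            processNormalDoc(content);  // in-memory: simpler validation and mapping
        }
    }

    private void processLargeDoc(InputStream content) {
        // stream-oriented validation, iterator-style mapping, chunked delivery, etc.
    }

    private void processNormalDoc(InputStream content) {
        // conventional in-memory parsing, validation, and mapping
    }
}
```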