Do not use the same list as the in-array and the out-array in a LOOP. It won’t work.
When creating a list using more than one loop (e.g. the first loop adds several entries and the second loop adds more), do not simply make it the out-array of both loops. The second loop will overwrite the existing elements. Instead, do not specify an out-array for the second (and subsequent) loops. Create an individual entry (string or record) and use pub.list:appendToStringList or pub.list:appendToRecordList to append entries to the list.
It is sometimes useful to know when you are processing the last item in a list during a loop. Before entering the loop, invoke pub.list:sizeOfList to get the size and store it in a variable (e.g. “sizeVar”). Within the loop, use two sequences: one that tests for $iteration = %sizeVar% and another that is $default. The iteration sequence will execute only when the last item is being processed by the loop.
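For anyone who thinks better in code, the last-item test above can be sketched in plain Java. This is only an analogue of the Flow steps, not Flow itself: the class and method names are made up for illustration, the array length stands in for pub.list:sizeOfList, and the counter is assumed to start at 1, as the $iteration = %sizeVar% test implies.

```java
import java.util.ArrayList;
import java.util.List;

class LastItemSketch {
    // Mirrors the Flow pattern: read the list size once before the loop,
    // then compare a 1-based iteration counter to it on every pass.
    static List<String> labelItems(String[] items) {
        int sizeVar = items.length;                // stands in for pub.list:sizeOfList
        List<String> out = new ArrayList<>(sizeVar);
        for (int i = 0; i < sizeVar; i++) {
            int iteration = i + 1;                 // assumed 1-based, like $iteration
            if (iteration == sizeVar) {
                out.add(items[i] + " (last)");     // taken only for the final item
            } else {
                out.add(items[i]);                 // plays the role of $default
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(labelItems(new String[] {"a", "b", "c"}));
    }
}
```

Reading the size once, outside the loop, keeps the comparison cheap on every pass.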
I would really advise against using pub.list:appendToRecordList or pub.list:appendToStringList inside a loop. If you are dealing with large lists, this will have a severe negative effect on performance. Each invocation of appendTo…List creates a new list (array) of size sizeOfList+1 and then copies all entries of the old list to the new list before adding the last element.
If you need to create a list using more than one loop, you should instead use two separate out-arrays, one for each loop, and then use only one invocation of appendTo…List to join both output lists, after both loops have finished. This will result in a significant performance boost.
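To make the copy-on-append behaviour described above concrete, here is a small Java sketch. It is a hypothetical stand-in, not the actual code behind appendToStringList or appendToRecordList: the class and method names are invented, and a String array is used in place of the real list type.

```java
import java.util.Arrays;

class AppendByCopySketch {
    // One append as described in the post: allocate an array one slot larger,
    // copy every existing entry across, then store the new item at the end.
    static String[] appendOne(String[] list, String item) {
        String[] grown = Arrays.copyOf(list, list.length + 1); // copies list.length entries
        grown[list.length] = item;
        return grown;
    }

    public static void main(String[] args) {
        String[] list = new String[0];
        for (String s : new String[] {"x", "y", "z"}) {
            list = appendOne(list, s);  // a fresh array is allocated on every pass
        }
        System.out.println(Arrays.toString(list));
    }
}
```

If appends really work this way, each pass through the loop pays for a copy of everything added so far, which is why the cost climbs so quickly on large lists.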
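The two-out-arrays approach can be sketched in Java as follows. Again this is only an analogue under assumptions: the names are hypothetical, each pre-sized array plays the role of a loop's out-array, and joinOnce stands in for the single appendTo…List call made after both loops finish.

```java
class JoinOnceSketch {
    // Joins two finished arrays with a single allocation and two bulk copies.
    static String[] joinOnce(String[] first, String[] second) {
        String[] joined = new String[first.length + second.length];
        System.arraycopy(first, 0, joined, 0, first.length);
        System.arraycopy(second, 0, joined, first.length, second.length);
        return joined;
    }

    public static void main(String[] args) {
        // First loop fills its own output array.
        String[] firstOut = new String[3];
        for (int i = 0; i < firstOut.length; i++) {
            firstOut[i] = "header-" + i;
        }
        // Second loop fills a separate output array.
        String[] secondOut = new String[2];
        for (int i = 0; i < secondOut.length; i++) {
            secondOut[i] = "detail-" + i;
        }
        // Only one join, after both loops have finished.
        System.out.println(String.join(",", joinOnce(firstOut, secondOut)));
    }
}
```

Each element is written once inside its loop and copied once in the join, instead of being recopied on every append.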
ingolfur:
Your response implies knowledge of the inner workings of appendToStringList and appendToRecordList. Are you privy to this? Do benchmarks show this will indeed provide a “significant performance boost”?
List operations often have a “capacityIncrement” factor that would extend the capacity of the list by more than 1. I’m not sure this is the case here but perhaps these grow by more than 1.
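How much a capacity increment matters can be estimated by counting element copies. The Java sketch below is a simplified model, not a measurement of the pub.list services: the class name, method name, and the increment values tried are all assumptions chosen for illustration.

```java
class GrowthCostSketch {
    // Counts the element copies needed to append n items one at a time when
    // the backing array grows by `increment` slots each time it fills up.
    static long copiesWithIncrement(int n, int increment) {
        long copies = 0;
        int capacity = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;         // existing entries move to the new array
                capacity += increment;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        System.out.println("increment 1:    " + copiesWithIncrement(10000, 1));
        System.out.println("increment 1000: " + copiesWithIncrement(10000, 1000));
    }
}
```

In this model, growing by one slot costs 49,995,000 copies for 10000 items, while growing by 1000 slots costs 45,000, so the answer to whether a capacity increment is in play changes the picture considerably.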
Not trying to be defensive…just making sure I have all the info.
The original tip was primarily intended to present something that works. Efficiency is another issue and is best addressed through benchmarking.
Some testing has revealed that appendTo… is indeed slower than using an out-array on a loop, as ingolfur stated. For 10000 items, appendTo… was 60 seconds slower than an equivalent loop with an out-array.
Interestingly, for loops over lists with 100 entries, appendTo was slightly faster. A strange anomaly.
Again, I can share the service and my results with anyone who’d like to see them.
The lesson in both this and the invoke vs. transformer guidelines is to measure to see what makes a difference–never assume!
I found out about the performance issues with appendTo…List when I was working on a complicated map between 2 Invoice formats. While using appendTo…List, the performance of the mappings was always unacceptable when dealing with large invoices.
I then found out that the “lists” are implemented as IData in Java. The only way to expand the IData would be to create a “new IData[list.length + 1]” and then copy all the elements.
Knowing this enabled me to reduce the time for mapping very large invoices (several thousand lines) from 30-60 minutes down to 1-3 minutes. I don’t remember the exact times, but the difference was very significant.
Since then I’ve always been careful about using appendTo…List, and I only use it if I’m dealing with small lists.
Thanks for the advice. I do have this issue. If you look at the sample package inside b2b, you will see an integration that takes a flat file and crunches it into a b2brecord. How would you replace the appendTo…List calls to improve the performance of this sample? I’m using this integration to dynamically bring any CSV file into a record list.