Flow problem: reduce a list to uniques

john.hartnup · February 6, 2009, 4:33pm

Has anyone achieved the following neatly in Flow?

I have a list of documents, each of which contains an “id” tag. The same “id” may be repeated in more than one document.

I want to reduce this to a list of “ids” with no repeats.

So input:
Doc[0] → id=“john”
Doc[1] → id=“paul”
Doc[2] → id=“john”
Doc[3] → id=“george”
Doc[4] → id=“ringo”
Doc[5] → id=“paul”

Would give output: “john”,“paul”,“george”,“ringo”.

Obviously it’s a trivial Java service, but it feels like something that ought to play to Flow’s strengths.

reamon · February 7, 2009, 1:39am

If order doesn’t need to be retained, sort the list, then loop over the sorted list and append to a new list each time the current list item differs from the previous.
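
For comparison, here is a minimal Java sketch of the sort-then-compare idea described above. The class name `SortedUniques` and method name `uniques` are made up for illustration, and the sketch assumes the input array contains no null entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a webMethods built-in service.
class SortedUniques {
    // Sorts a copy of the input, then keeps each item that differs from the one before it.
    // Assumes no null entries (Arrays.sort would fail on them).
    static String[] uniques(String[] unfiltered) {
        String[] sorted = unfiltered.clone();
        Arrays.sort(sorted);
        List<String> out = new ArrayList<String>();
        String previous = null;
        for (String current : sorted) {
            if (!current.equals(previous)) {
                out.add(current);
            }
            previous = current;
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] ids = {"john", "paul", "john", "george", "ringo", "paul"};
        // prints [george, john, paul, ringo]
        System.out.println(Arrays.toString(uniques(ids)));
    }
}
```

Note that the result comes back in sorted order, not in order of first appearance.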

john.hartnup · February 7, 2009, 1:58pm

Hmm, I’ve never found a built-in service to sort String lists.

I think what you’re telling me here is that this problem does not play to Flow’s strengths as much as I imagined it would, inasmuch as you have to write an algorithm (however simple) with temporary variables and so forth. Since Flow is good at mapping, it felt as if it might be good at folding too, and that I’d missed a feature.

Just for reference the Java version is:

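// Needs java.util.Arrays, java.util.Set and java.util.TreeSet imported.
Set<String> set = new TreeSet<String>(); // sorted, no duplicates
set.addAll(Arrays.asList(unfiltered));
String[] filtered = set.toArray(new String[0]); // toArray() with no argument returns Object[]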

So I put that in a Java service and put it in my “things that should be BIS” package.

mwarden · February 10, 2009, 8:18am

Here you create a temporary TreeSet instance, which is by definition sorted in ascending order, and you use a method which only adds if not already present. This is almost exactly what was suggested in the Flow solution, which, frankly, would have been more readable.

john.hartnup · February 10, 2009, 3:55pm

I could equally have used a HashSet, which is not sorted. The important ‘by definition’ here is that a Set is “A collection that contains no duplicate elements.”

True, although it’s not important whether this method “only adds if not already present” or “always adds, replacing the previous instance”. The point is that any method that adds to a set will not result in a duplicate item.
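
To illustrate the point that the ordering is incidental, here is a minimal sketch using `LinkedHashSet`, which keeps items in the order they were first added. The class name `OrderedUniques` and method name `uniques` are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not a webMethods built-in service.
class OrderedUniques {
    // Throws everything into a Set, then gets it out again.
    // LinkedHashSet drops duplicates and preserves first-insertion order.
    static String[] uniques(String[] unfiltered) {
        Set<String> set = new LinkedHashSet<String>(Arrays.asList(unfiltered));
        return set.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] ids = {"john", "paul", "john", "george", "ringo", "paul"};
        // prints [john, paul, george, ringo]
        System.out.println(Arrays.toString(uniques(ids)));
    }
}
```

Swapping in `HashSet` or `TreeSet` changes only the order of the result, not the absence of duplicates.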

I can’t – and I’ve tried – make this brief or readable in Flow. Here’s my best attempt:

1. INVOKE sort inList (presupposes a sort service)
2. MAP (fabricate lastItem variable)
3. LOOP over inList
3.1 BRANCH (evaluate labels)
3.1.1 inList L_NOT_EQUALS lastItem: SEQUENCE
3.1.1.1 INVOKE pub.list:addToList (inList, outList)
3.1.1.2 MAP (inList -> lastItem)
4. MAP (drop lastItem variable)

This says to me “here’s an algorithm: read it”, whereas the Set example says to me “throw everything into a Set, then get it out again”.

I have not found a BIS to sort a list.

I’d hoped this was a common enough requirement that Flow had anticipated it. For example, if there were an option to addToList that suppressed duplicate items, the Flow could have been:

1. LOOP over inList
1.1 pub.list:addToList (inList, outList, noDuplicates=true)

That’s much more readable to me. An efficient implementation would require some cached indexing, but if it were a BIS that would be WM’s problem.
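
As a rough idea of what such a duplicate-suppressing append might look like internally, here is a sketch in Java. Everything in it is hypothetical: `UniqueAppender`, `append` and `toArray` are invented names, not part of any webMethods API, and the "cached index" is assumed to be a plain `HashSet` kept alongside the output list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of an append with duplicate suppression; not a real built-in service.
class UniqueAppender {
    private final List<String> items = new ArrayList<String>(); // output, in arrival order
    private final Set<String> seen = new HashSet<String>();     // index of values already appended

    // Appends the item unless it has been seen before. Returns true if it was appended.
    boolean append(String item) {
        if (seen.add(item)) { // Set.add returns false when the value is already present
            items.add(item);
            return true;
        }
        return false;
    }

    String[] toArray() {
        return items.toArray(new String[0]);
    }
}
```

With the index in place each append costs roughly constant time, where a naive version would rescan the whole output list for every item.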
