Java service to append document list using Vectors

Hi,

I’m trying to optimize my code by rewriting the AppendDocumentList Flow service in Java so that, instead of copying everything in large document lists, it only appends elements.

I’m working off code I found on this forum to append a document to a document list via a vector. What I want to accomplish is appending a document list to another document list via a vector. Here’s the code I have (no error handling for now):


IDataCursor pipelineCursor = pipeline.getCursor();

// get "vector" and "inList" from pipeline 
Vector documents = (Vector)IDataUtil.getIData( pipelineCursor, "inList" );
Vector vector = (Vector)IDataUtil.get( pipelineCursor, "vector" );
pipelineCursor.destroy();

//append "documents" to "vector"
if ( documents != null ) {	
  vector.add(documents);
}

IDataCursor pipelineCursor1 = pipeline.getCursor();
IDataUtil.put( pipelineCursor1, "vector", vector );
pipelineCursor1.destroy();

But this does not seem to work for some reason. Anyone have any idea why?

Thanks!

The vector.add(documents) is adding the vector object referenced by documents rather than the elements within documents. Use vector.addAll(documents) instead.
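
To illustrate the difference, here is a small standalone sketch. Plain strings stand in for documents, and the class and method names are made up for the example; it is not the service code itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

class AddVsAddAll {
    // add(): the whole collection is stored as ONE nested element.
    static Vector<Object> withAdd(List<String> target, List<String> extra) {
        Vector<Object> v = new Vector<>(target);
        v.add(extra);
        return v;
    }

    // addAll(): each element of the collection is appended individually.
    static Vector<Object> withAddAll(List<String> target, List<String> extra) {
        Vector<Object> v = new Vector<>(target);
        v.addAll(extra);
        return v;
    }

    public static void main(String[] args) {
        List<String> target = Arrays.asList("a", "b");
        List<String> extra = Arrays.asList("c", "d");
        System.out.println(withAdd(target, extra).size());    // 3 elements, the last one is a list
        System.out.println(withAddAll(target, extra).size()); // 4 elements
    }
}
```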

Hi reamon, thanks for the reply. Changing the add to addAll did not fix the problem. When I trace the code, I see that the list I am trying to append to the vector is not empty, but after the service is called, it is not appended to the list.

I also tried changing the documents declaration as an IData variable but the code does not compile like this as add or addAll expects a Vector I think.

I did find a workaround in which I use a loop and call a service that adds single documents of my list to the vector, but this seems like unnecessary overhead. I’m also intrigued as to how I can get this to work!! :)

Thanks in advance for any help!

Olografix,

I’m a little confused by what you’re trying to accomplish with the code above. I see a few things that don’t look right to me:

  1. “inList” is a Document List, right? If so, you should grab it from the pipeline via IDataUtil.getIDataArray and not getIData.
  2. If inList is really a document list (ie. an IData array), why would you cast it to a Vector object? I don’t see how that would work.
  3. At the end of the service, you’re putting a Vector object in the pipeline. How do you plan on using this object? Are you going to write other Java services to operate on it? Sounds to me like what you actually want to do is to put an IData array in the pipeline and not a Vector.
- Percio
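
As a rough illustration of point 2, the sketch below shows that an array can never be cast to a Vector and has to be copied into one instead. It uses a String array as a stand-in for an IData array so it runs without the Integration Server libraries, and the class and method names are hypothetical. It says nothing about what IDataUtil.getIData itself returns for a document list.

```java
import java.util.Arrays;
import java.util.Vector;

class ArrayIsNotVector {
    // Returns true if casting the value to Vector fails at runtime.
    static boolean castFails(Object pipelineValue) {
        try {
            Vector<?> v = (Vector<?>) pipelineValue;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Copies the array elements into a new Vector, which is the conversion actually needed.
    static Vector<String> toVector(String[] docs) {
        return new Vector<>(Arrays.asList(docs));
    }

    public static void main(String[] args) {
        String[] docList = { "doc1", "doc2" };
        System.out.println(castFails(docList));       // true: an array is not a Vector
        System.out.println(toVector(docList).size()); // 2
    }
}
```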

Let’s back up a little and make sure we understand what we’re trying to do.

The primary goal is to use Vector instead of a document list (IData array) for creating and building a collection of documents. This is to avoid the performance issues associated with appending a document one at a time to an IData array.

The desired functions for this are:

  1. Create a new Vector
  2. Add a document to a Vector
  3. Add multiple documents (an IData array) to a Vector
  4. Add one Vector to another Vector
  5. Transform the Vector into a document list (an IData array)

Code for 1, 2 and 5 is in this post.

For 3 and 4, slight tweaks to the code for 2 would do the trick.

[highlight=java]
// Code for adding a document list to the Vector
IDataCursor pipelineCursor = pipeline.getCursor();

// get Vector and docList from pipeline
IData[] docList = IDataUtil.getIDataArray( pipelineCursor, "docList" );
Vector vector = (Vector)IDataUtil.get( pipelineCursor, "vector" );
pipelineCursor.destroy();

// append all documents in docList to Vector
if ( docList != null && vector != null ) {
  for ( int i = 0; i < docList.length; i++ ) {
    vector.add( docList[i] );
  }
}

// Don't need to do the following; vector is already in the pipeline
//IDataCursor pipelineCursor1 = pipeline.getCursor();
//IDataUtil.put( pipelineCursor1, "vector", vector );
//pipelineCursor1.destroy();

///////////////////////////////////
// Code for adding the contents of a Vector to another Vector
// (a separate service, so the cursor variable name is reused)
IDataCursor pipelineCursor = pipeline.getCursor();

// get the vectors from pipeline
Vector v1 = (Vector)IDataUtil.get( pipelineCursor, "vector1" );
Vector v2 = (Vector)IDataUtil.get( pipelineCursor, "vector2" );
pipelineCursor.destroy();

// append contents of first vector to second vector
if ( v1 != null && v2 != null ) {
  v2.addAll( v1 );
}
[/highlight]

Hope this clarifies things.

Thanks for the clarification Reamon. Everything you mention is perfectly clear and what I was looking for. My only problem is that when compiling I get an error saying “cannot resolve symbol”, “symbol : variable size” at “docList.size”.

ideas?

Should be docList.length. Sorry for the mistake.

Rob,

What you mentioned makes sense. It looks like some of the distinct functions were mixed up in the original post.

Olografix,

I believe Rob intended to type docList.length and not docList.size. Try that.

- Percio

There you go. He was quicker on the trigger than I was. :)

Thanks guys, you were more than very helpful!

Have a good one!

Hi,

I was the one who posted the code in the original thread on this subject. I just have some comments on this, also relevant to the original thread, that might be useful (although I’m not sure if this will affect performance in any noticeable way).

  1. Using an ArrayList instead of a Vector as the list type might be better (or any other collection class, as mentioned in reamon’s post below). Vector is synchronized, which isn’t necessary here (as far as I can tell) because the flow runs in a single thread. An ArrayList (or other unsynchronized collection class) should therefore have better performance, at least in theory.

  2. In the case of using the ArrayList one could also pass the size of the document list as input to the appending service, setting that value as initialCapacity (default value 10), to avoid repeated re-sizing of the ArrayList.

This might be overkill, but it is worth looking into to fully optimize the service.
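
A minimal sketch of the pre-sizing idea, assuming the expected number of documents is known when the list is created. The class and method names are invented for the example, and strings stand in for documents.

```java
import java.util.ArrayList;
import java.util.List;

class PresizedList {
    // Creates an empty list whose backing array can already hold expectedSize elements.
    // The hint only affects capacity; size() is still 0 until elements are added.
    static <T> List<T> create(int expectedSize) {
        return new ArrayList<>(Math.max(expectedSize, 0));
    }

    public static void main(String[] args) {
        List<String> docs = PresizedList.create(1000);
        for (int i = 0; i < 1000; i++) {
            docs.add("doc" + i); // no internal re-sizing needed for these 1000 adds
        }
        System.out.println(docs.size()); // 1000
    }
}
```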

Regards,
dezer

[Edit: Wrong information originally put in option no. 1. In option 2 the list size should be used as input to the creation service]

  1. Any collection class can be used. I’ve used LinkedList typically.

  2. Did you mean passing an initialCapacity to the vector creation service, not the appending service? That would be a nice addition. Also, in the appending service one could call vector.ensureCapacity with the sum of the vector’s current size + the length of the docList, so that the vector capacity grows at most once per append call.
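
A rough sketch of the ensureCapacity idea for the appending step. It uses the vector's current element count as the base for the new minimum capacity. The helper name is hypothetical and the element type is generic so the example runs without the IS libraries.

```java
import java.util.Vector;

class GrowOnce {
    // Appends every element of docList to vector, growing the backing array at most once.
    static <T> void appendAll(Vector<T> vector, T[] docList) {
        if (vector == null || docList == null) {
            return;
        }
        // Reserve room for all new elements up front.
        vector.ensureCapacity(vector.size() + docList.length);
        for (T doc : docList) {
            vector.add(doc);
        }
    }

    public static void main(String[] args) {
        Vector<String> vector = new Vector<>(2);
        vector.add("a");
        appendAll(vector, new String[] { "b", "c", "d" });
        System.out.println(vector.size()); // 4
    }
}
```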

Yes, I pass the size of the documentList to the service that creates my ArrayList.

Does the code above transform the vector into a document list?

No, it does not.
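
For reference, the conversion step (function 5 in the list earlier in the thread) could look roughly like the sketch below. String stands in for IData so the example runs on its own; in a real service the array type would be IData[] and the result would be put back into the pipeline. The class and method names are made up.

```java
import java.util.Vector;

class VectorToArray {
    // Copies the vector contents into a new array; a null vector gives an empty array.
    static String[] toDocList(Vector<String> vector) {
        if (vector == null) {
            return new String[0];
        }
        // toArray allocates an array of the right size when the one passed in is too small.
        return vector.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Vector<String> vector = new Vector<>();
        vector.add("doc1");
        vector.add("doc2");
        System.out.println(toDocList(vector).length); // 2
    }
}
```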

I understand this is an old thread, but I’d like to add my 2 cents on an alternative to appendToDocumentList using a Java service, based on the code reamon posted.
Building on reamon’s solution and Siddiraju’s question, this can be done with an ArrayList that is converted back into an IData array (document list).


		// pipelineInput
		// IDataMap is alternative to Interface IDataCursor
		IDataMap iMap = new IDataMap(pipeline);
		
		// Instantiate List<IData> toList
		List<IData> toList = new ArrayList<>();
		
		// toList IData ? null : add to toList ArrayList
		IData[] toListIData = iMap.getAsIDataArray("toList");
		if (toListIData != null) {
			for (IData iData : toListIData) {
				toList.add(iData);
			}
		}
		
		// fromList
		IData[] fromList = iMap.getAsIDataArray("fromList");
		// fromList ? null : append toList ArrayList
		if (fromList != null) {
			for (IData iData : fromList) {
				toList.add(iData);
			}
		}
		
		// fromItem
		IData fromItem = iMap.getAsIData("fromItem");
		// fromItem ? null : append toList ArrayList
		if (fromItem != null) {
			toList.add(fromItem);
		}
		
		/**
		 * Create IData[] array template for converting ArrayList into IData[]
		 * object
		 */
		IData[] template = new IData[toList.size()];
		
		// pipelineOutput
		/**
		 * for sending pipeline output toList after converting List into IData[]
		 * template object
		 */
		iMap.put("toList", toList.toArray(template));

The objective is to:

  1. Create an ArrayList named toList to hold the existing document list plus any incoming document list and/or document.
  2. Read the incoming “toList” IData array from the pipeline (variable toListIData; this is the destination and holds the existing document list, if any) and loop over it, adding each element to the toList ArrayList.
  3. Read the incoming “fromList” IData array from the pipeline and loop over it, appending each element to the toList ArrayList.
  4. Read the incoming “fromItem” IData from the pipeline and append it to the toList ArrayList.
  5. Create an IData array “template” with the same size as the toList ArrayList, used to convert the ArrayList into an IData array for the pipeline output. (I suppose you could reuse an existing IData array such as “fromList” as the template, but I suggest creating a new one to keep the code cleaner and easier to read. It’s your call; this step can be skipped.)
  6. Convert the toList ArrayList into an IData array based on “template” and put it in the pipeline output.

DISCLAIMER
There are some other things to consider: the time complexity of this operation is linear, O(n), because it loops over the full length of “toList” and “fromList”.
I haven’t found a way to convert directly from an IData array to a List/Vector/Set so that I could append more easily without looping.
I also haven’t compared the performance against pub.list:appendToDocumentList, so I don’t know whether my code performs better. I just modified reamon’s code to use a List instead of a Vector and then convert the result into an IData array.
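
On getting array elements into a List without a hand-written loop: the standard library offers Collections.addAll and Arrays.asList. The sketch below is one possible way to use them, with String standing in for IData so it runs without the IS libraries; the class and method names are hypothetical. The copy is still linear in the number of elements, so this only shortens the code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class AppendWithoutLoops {
    // Builds a new array from toList, then fromList, then fromItem; any input may be null.
    static String[] append(String[] toList, String[] fromList, String fromItem) {
        int hint = (toList == null ? 0 : toList.length)
                 + (fromList == null ? 0 : fromList.length) + 1;
        List<String> result = new ArrayList<>(hint);
        if (toList != null) {
            Collections.addAll(result, toList);      // varargs form accepts the array directly
        }
        if (fromList != null) {
            result.addAll(Arrays.asList(fromList));  // fixed-size List view over the array
        }
        if (fromItem != null) {
            result.add(fromItem);
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] out = append(new String[] { "a" }, new String[] { "b", "c" }, "d");
        System.out.println(Arrays.toString(out)); // [a, b, c, d]
    }
}
```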

Hope this helps.

Thanks

Hey, I forgot to mention that I’m compiling the code in Integration Server 9.12 (IS_9.12_Core_Fix21); Java Version: 1.8.0_202 (52.0).

I’ve also attached the input/output pipeline.

Thanks