Most efficient way to search a large record list for a specific key

Background

During the course of this B2B integration I build a record list (list A) of telecom circuit components. Each entry has a key called componentID. I use this list to submit a request to an internal system that provides a price for each circuit component. Its results are also stored in a record list, but not every component can be successfully priced, so there may not be an entry in the pricing list (list B) for every circuit component in list A.

Once the pricing service completes I need to build a response document that combines the telecom circuit details from list A with the pricing from list B. I loop over list A and need to find the corresponding pricing for each component in list B.

Problem

What is the most efficient way to look up these prices? Obviously, I can loop over list B until I find the componentID that matches the one from list A. However, this is very inefficient and gets progressively worse as the number of elements in the list grows.

Current Solution

My current approach uses a simple custom Java service (recordListToSearchableIData) to convert list B into an IData. Each item in the list becomes a key/value pair in which the key name is the value of the componentID field and the value is the entire record from the original record list. Another very simple Java service (searchableIDataLookup) then takes advantage of the IData type’s hashtable-like properties to locate the desired entry in the pricing “list” using the IDataCursor’s first() method.

The net effect is that I loop over list B once to build what I’m calling a “searchableIData” and then locate each element with a single call to the IDataCursor.first() method followed by a call to the getValue() method.
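The idea can be sketched in plain Java. This is not the code of the recordListToSearchableIData or searchableIDataLookup services; it assumes java.util.Map as a stand-in for IData, and the class name, method names and the "price" field used in testing are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in sketch: each record is a Map, playing the role of an IData document.
class SearchableIndexSketch {

    // One pass over list B: index every pricing record by its componentID.
    static Map<String, Map<String, Object>> buildIndex(List<Map<String, Object>> listB) {
        Map<String, Map<String, Object>> index = new HashMap<>();
        for (Map<String, Object> record : listB) {
            index.put((String) record.get("componentID"), record);
        }
        return index;
    }

    // Single keyed lookup; returns null when the component could not be priced.
    static Map<String, Object> lookup(Map<String, Map<String, Object>> index,
                                      String componentID) {
        return index.get(componentID);
    }
}
```

With n entries in list A and m entries in list B, this replaces up to n x m comparisons in the nested-loop approach with roughly n + m operations.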

This works, and appears to be fairly fast but is it the best way?

Mark

I would say yes. As I was reading through your description, I immediately started thinking of using a java.util.Hashtable object. I’ve done this for a couple of different integrations and it works well, though the size of the table was never a concern.

Is there any reason that a java.util.Hashtable solution would be preferable to using an IData (which is just a hashtable on steroids after all)?

-mdc

I don’t think so. Hashtable has some constraints but those may not be applicable for you. If you’ve got an implementation using IData and its performance is okay then I’d stick with it.
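The reply does not say which constraints it has in mind. Two that are well known from the JDK are that java.util.Hashtable rejects null keys and null values, and that it keeps only one value per key. The sketch below shows both; the class name and sample values are made up for illustration.

```java
import java.util.Hashtable;

class HashtableConstraintsSketch {

    // Returns true if Hashtable refuses to store a null value.
    static boolean rejectsNullValue() {
        Hashtable<String, Object> table = new Hashtable<>();
        try {
            table.put("C1", null);   // Hashtable throws NullPointerException here
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // A second put with the same key replaces the first entry.
    static int sizeAfterDuplicatePut() {
        Hashtable<String, Object> table = new Hashtable<>();
        table.put("C1", "10.00");
        table.put("C1", "12.50");
        return table.size();
    }
}
```

Neither point matters for the pricing lookup as long as every record has a unique, non-null componentID.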

Here are the utils I built to create the “searchableIData” from a recordlist and to find a single entry. The “test” service shows how the two services can be used together.

Mark

ConnevaUtils package containing the two “searchableIData” java services
ConnevaUtils.zip (13.1 k)

You can do this in another way.

  1. Convert the record to document
  2. Using XQL, extract the portion of document using your criteria.
    e.g. //Components[componentID/text()='test']
  3. Convert this extracted information to record.

I think this will be faster.
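For readers without an Integration Server at hand, the filter in step 2 can be tried with the JDK's javax.xml.xpath package. This is only a sketch under assumptions: it uses XPath 1.0 rather than webMethods XQL (this particular expression is also valid XPath), and the sample document, the price element and the class name are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class ComponentQuerySketch {

    // Hypothetical sample data standing in for the converted record list.
    static final String XML =
        "<list>"
      + "<Components><componentID>test</componentID><price>10.00</price></Components>"
      + "<Components><componentID>other</componentID><price>20.00</price></Components>"
      + "</list>";

    // Returns the price of the component whose componentID equals "test".
    static String priceOfTest() {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate("//Components[componentID/text()='test']/price",
                                  new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that this sketch re-parses the XML on every call; a fair speed comparison with the hashed lookup would parse the document once and reuse it.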

Abdul

Abdul,

I had thought of that at one time, but did not carry it forward into a prototype. I’ll give it a shot and maybe run some performance tests.

Thanks for the suggestion,

Mark

OK, that works and appears to be faster, but only if I hardcode the componentID string value. How do I create an XQL query that uses the contents of a variable (in this case componentID) as the search criteria?

My current XQL query is:
//serviceOrderComponents[componentID/text()=%componentID%]/pricingInfo

But that doesn’t work because it’s interpreting the ‘%’ as part of a regular expression.

A brief perusal of the Developer docs did not yield an answer.

Hmmm. Maybe there’s a way to dynamically generate the “fields” recordlist and pass that to the queryDocument service…

Mark

Yep, you can build the “fields” recordlist setting the query to something like the following (note the quotes around the variable name):

//serviceOrderComponents[componentID[0]/text()='%componentID%']/pricingInfo

It’s a little tricky since the variable produced by the queryDocument service is not in the “serviceOut” section of the pipeline editor, but it does in fact work.
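What the substitution produces can be illustrated with plain string building. This is a sketch, not what the Integration Server does internally: the class and method names are hypothetical, and the check for embedded quotes is an added precaution that the thread does not mention.

```java
class QueryBuilderSketch {

    // Mirrors the %componentID% substitution: the value is spliced in between quotes.
    static String pricingQuery(String componentID) {
        // A single quote inside the value would end the string literal early.
        if (componentID.indexOf('\'') >= 0) {
            throw new IllegalArgumentException("componentID must not contain a single quote");
        }
        return "//serviceOrderComponents[componentID[0]/text()='"
             + componentID
             + "']/pricingInfo";
    }
}
```

Without the quotes around the substituted value, the query would compare the componentID text against a bare token rather than a string literal, which is why the quoted form works.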

Nice tip, Abdul, I’ll add that one to my “bag o tricks”.

Mark

Try creating your XQL query string with your variable earlier in the flow and check the checkbox with ‘Perform variable substitution’. Then map this string into the query field of the pub.xml:queryXMLNode service. I know I got variables to work in the XQL before…

I have looked at the wM Building Your TN .pdf documentation regarding XQL query syntax but still can’t seem to find a query that works with pub.xml:queryXMLNode. Perhaps someone can find my mistake. Here is what my XML looks like:

<IDataXMLCoder version="1.0">
<array>
<record javaclass="com.wm.data.ISMemDataImpl">
<value name="@record-id">GLB</value>
<value name="gcin">052035185</value>
<value name="src_sys_cust_idntn">HESSNY </value>
</record>
<record javaclass="com.wm.data.ISMemDataImpl">
<value name="@record-id">GLB</value>
<value name="gcin">053382495</value>
<value name="src_sys_cust_idntn">FEDHOME </value>
</record>
</array>
</IDataXMLCoder>

What I would like to extract are all the <value> elements whose @name attribute = ‘gcin’ and whose ‘src_sys_cust_idntn’ = ‘FEDHOME’

The XQL query I’m using in the service is:
//value[(@name = 'src_sys_cust_idntn') and (value = 'FEDHOME')]

I think the following does what you’re looking for:

//record[value[@name='src_sys_cust_idntn']/text() = 'FEDHOME']/value[@name='gcin']/text()
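The same selection can be checked outside the Integration Server with the JDK's XPath 1.0 engine. This is a sketch with two assumptions worth flagging: XPath compares strings exactly, so it wraps the comparison in normalize-space() to tolerate the trailing space in the sample values (the XQL form is reported later in the thread to work as written), and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class GcinQuerySketch {

    // The sample from the post, including the trailing space after each customer id.
    static final String XML =
        "<IDataXMLCoder version=\"1.0\"><array>"
      + "<record javaclass=\"com.wm.data.ISMemDataImpl\">"
      + "<value name=\"@record-id\">GLB</value>"
      + "<value name=\"gcin\">052035185</value>"
      + "<value name=\"src_sys_cust_idntn\">HESSNY </value>"
      + "</record>"
      + "<record javaclass=\"com.wm.data.ISMemDataImpl\">"
      + "<value name=\"@record-id\">GLB</value>"
      + "<value name=\"gcin\">053382495</value>"
      + "<value name=\"src_sys_cust_idntn\">FEDHOME </value>"
      + "</record>"
      + "</array></IDataXMLCoder>";

    // Collects the gcin of every record whose src_sys_cust_idntn is FEDHOME.
    static List<String> fedhomeGcins() {
        String query =
            "//record[normalize-space(value[@name='src_sys_cust_idntn']) = 'FEDHOME']"
          + "/value[@name='gcin']";
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                query, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The key point is the same as in the query above: filter on the record first, then step down to the value element you want, instead of putting both conditions on a single value element.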

One question though: any reason why you don’t reconstruct the IData from this and use the cursor traversal or just query the document Node directly instead of persisting as an IData XML? Assuming the document comes in like:

<document>
<GLB>
<gcin>052035185</gcin>
<src_sys_cust_idntn>HESSNY</src_sys_cust_idntn>
</GLB>
<GLB>
<gcin>053382495</gcin>
<src_sys_cust_idntn>FEDHOME</src_sys_cust_idntn>
</GLB>
</document>

the query gets rid of the attribute complications e.g.
//GLB[src_sys_cust_idntn/text() = 'FEDHOME']/gcin/text()

hth,
Ed

Thanks Ed - your XQL query worked! The reason I am not using the XML structure you described above is that the XML generated by one of my flow services (pub.document.documentToXMLString) lost my root node. I would have thought it would have created one by default (e.g., <document> or something), but it did not.

So the result was that I had 3 root nodes and thus could not query all of them with one service call. I noticed that by using pub.document.documentToXMLValues, I got a built-in root node of <IDataXMLCoder>.

Hi all,

I’m trying to use “queryXMLNode” to extract only the data that I need from my input XML document (in fact, the XML format of an SAP IDoc). I’ve been able to get elements using WQL or XQL, but I’m unable to return a node with its child elements and convert it to a document type, or even to a document list.

As an example, suppose that’s my XML file:

<root>
<element1>
<element2>
<var_a>00</var_a>
<var_b>01</var_b>
<var_c>02</var_c>
</element2>

<element2>
<var_a>03</var_a>
<var_b>04</var_b>
<var_c>05</var_c>
</element2>
.... 

</element1>

<element1>

</element1>
</root>

How can I get all “element1” nodes and convert them into a document list? Or even the “element2” nodes? Is this possible using queryXMLNode?

Any idea?

Thanks in advance,

Ignasi
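As a point of comparison outside webMethods, the node-to-document-list step can be sketched with the JDK's XPath API. This does not show how queryXMLNode behaves; it assumes java.util.Map as a stand-in for a document, handles only flat children such as those of element2, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeListSketch {

    // Turns every node matched by the query into a map of child element name -> text.
    static List<Map<String, String>> toDocuments(String xml, String query) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                query, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<Map<String, String>> documents = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Map<String, String> document = new LinkedHashMap<>();
                NodeList children = nodes.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    // Skip whitespace text nodes between the elements.
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        document.put(child.getNodeName(), child.getTextContent());
                    }
                }
                documents.add(document);
            }
            return documents;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling toDocuments(xml, "//element2") on the sample above yields one map per element2 node. Nested structures such as element1 would need a recursive variant, because getTextContent() concatenates the text of all descendants.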

Hi all again,

Is there some way to obtain only some of the child nodes from a parent one? The scenario is that I don’t need all the data included in the XML document, and I’m trying to convert it to a “lighter” version, so I’m omitting some fields from the structure I need.

I’ve already obtained the list of main nodes that I need with queryXMLNode, and I then use xmlNodeToDocument to transform it to a document type. The main problem is that even if I don’t define some fields in the data type that I’m passing as documentTypeName, they still appear in my final document! How is this possible?

Thanks in advance,

Ignasi
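Independently of how xmlNodeToDocument treats documentTypeName, one way to get the lighter version is to copy only the wanted fields explicitly after the conversion. The sketch below assumes a java.util.Map as a stand-in for the document; the class name, method name and field names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class FieldFilterSketch {

    // Copies only the wanted fields, preserving their original order.
    static Map<String, String> keepOnly(Map<String, String> document, Set<String> wanted) {
        Map<String, String> lighter = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : document.entrySet()) {
            if (wanted.contains(field.getKey())) {
                lighter.put(field.getKey(), field.getValue());
            }
        }
        return lighter;
    }
}
```

In a flow service the equivalent would be to map the needed fields into a new document and drop the original from the pipeline.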

The problem that I have is related to a very common requirement: joining two document lists using a key, like querying two database tables joined by a foreign key. I searched the forum and found numerous threads hovering around the same topic, just like this one, and the solution seems to lie with the mysterious (for me) queryXMLNode. I read the BISReference Guide and tried to use queryXMLNode but to no avail… I just couldn’t make it work. Populating the input (fields) beats me. I even tried to read the TN User guide for XQL reference, but that didn’t help either.

I guess I need some sample service to demonstrate how queryXMLNode works and how I can combine fields from two different document list on the basis of a key.

In my case, I have a string list ‘appIds’ (consisting of keys called appId) and a document list that looks like this:

I finally want to create a document list with one document for every ‘appId’ in the string list ‘appIds’. Each of these documents should contain ‘appId’ and the corresponding ‘appName’ taken from the document list shown above. Guys, I would greatly appreciate any help.

Thanks in advance,
Rohit
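A keyed join like this does not strictly need queryXMLNode; the hashed-lookup approach from the start of this thread covers it. The sketch below is plain Java under assumptions: documents are represented as java.util.Map, the document list is assumed to carry appId and appName fields as described, an appId with no match gets a null appName, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AppJoinSketch {

    // Builds one output document per appId, in the order of the appIds list.
    static List<Map<String, String>> join(List<String> appIds,
                                          List<Map<String, String>> apps) {
        // Index the document list once by its key.
        Map<String, String> nameById = new HashMap<>();
        for (Map<String, String> app : apps) {
            nameById.put(app.get("appId"), app.get("appName"));
        }
        List<Map<String, String>> result = new ArrayList<>();
        for (String appId : appIds) {
            Map<String, String> document = new LinkedHashMap<>();
            document.put("appId", appId);
            document.put("appName", nameById.get(appId));   // null when no match exists
            result.add(document);
        }
        return result;
    }
}
```

Inside the Integration Server the same shape would be a small Java service that takes the string list and the document list as inputs and returns the joined document list.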

Please don’t post duplicate entries.

Mark