Very slow motion ...

Hi,
I have a result set of about 41300 documents, like the following…













I am using JSP and servlets, with the following code based on the org.w3c.dom API, to get the distinct values of position.sequence (e.g. A02, A04, etc.)…

public String[] getDistinctAttributeValues(TXMLObjectIterator iterator, String tag)
        throws Exception {
    int posTag = 0;
    int itemNo = 1;
    ArrayList list = new ArrayList();
    while (iterator.hasNext()) {
        Document doc = (Document) iterator.next().getDocument();
        String attribValue = doc.getElementsByTagName(tag).item(posTag++).getAttributes().item(itemNo).getNodeValue();
        if (!list.contains(attribValue))
            list.add(attribValue);
    }
    String[] sortlist = (String[]) list.toArray(new String[list.size()]);
    Arrays.sort(sortlist);
    return sortlist;
}

However, it takes about 2000 seconds, which is very slow. Does the code have errors, and how should I modify it?

Thanks

Hi,

Have you considered using a SAX parser instead of a DOM parser?
To me, it looks as if the overhead of building a DOM tree out of these small instances could be enormous.
(This is more than likely only one part of the problem.)

regards,
andreas f.
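
To illustrate the SAX suggestion, here is a minimal sketch using only the standard JAXP SAX parser, not the Tamino API. It assumes the values live in an attribute named sequence on an element named position (inferred from "position.sequence" in the question; the real document structure is not shown in this thread), and that each document is available as an XML string. The class and method names are made up for the example.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the distinct values of one attribute
// on one element while the parser streams through the documents.
class DistinctAttributeHandler extends DefaultHandler {
    private final String elementName;   // assumed: "position"
    private final String attributeName; // assumed: "sequence"
    private final Set<String> values = new TreeSet<>(); // distinct and sorted

    DistinctAttributeHandler(String elementName, String attributeName) {
        this.elementName = elementName;
        this.attributeName = attributeName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // No DOM tree is built; we only look at the start tags we care about.
        if (elementName.equals(qName)) {
            String value = attrs.getValue(attributeName);
            if (value != null) {
                values.add(value);
            }
        }
    }

    Set<String> getValues() {
        return values;
    }

    // Parses every document with the same handler and returns the collected values.
    static Set<String> collect(Iterable<String> xmlDocuments, String element, String attribute)
            throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        DistinctAttributeHandler handler = new DistinctAttributeHandler(element, attribute);
        for (String xml : xmlDocuments) {
            parser.parse(new InputSource(new StringReader(xml)), handler);
        }
        return handler.getValues();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample documents, only for demonstration.
        List<String> docs = Arrays.asList(
                "<doc><position sequence=\"A04\"/></doc>",
                "<doc><position sequence=\"A02\"/></doc>",
                "<doc><position sequence=\"A04\"/></doc>");
        System.out.println(collect(docs, "position", "sequence")); // prints [A02, A04]
    }
}
```

The handler never materialises a tree, so the per-document cost is a single pass over the start tags.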

Hi there,

I loaded 41300 documents, and with a Java program that uses the Tamino API for Java with SAX event-based parsing I can read all 41300 documents in 13 seconds. My test data had 50 distinct values, so sorting took less than 1 second.

So I would follow Andreas’s advice and use SAX.

hope this helps, Stuart
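
For the deduplication and sorting step, one possible sketch is to collect the values in a TreeSet, which keeps them distinct and sorted as they arrive, instead of checking an ArrayList with contains() and sorting afterwards. This is plain java.util code and makes no assumption about how the values are read; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical helper: returns the distinct input values in sorted order.
class DistinctValues {
    static String[] distinctSorted(Iterable<String> values) {
        TreeSet<String> set = new TreeSet<>(); // rejects duplicates, keeps natural order
        for (String value : values) {
            set.add(value);
        }
        return set.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] result = distinctSorted(Arrays.asList("A04", "A02", "A04"));
        System.out.println(Arrays.toString(result)); // prints [A02, A04]
    }
}
```

With only a few dozen distinct values either approach is cheap, so this mostly simplifies the code rather than changing the overall running time.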

How do I use SAX in my situation?

Actually, I am new to this area. Where can I find related information? Thanks a lot.

The following link has an example of how to use SAX with the Tamino API for Java. Hope this helps.

http://tamino.forums.softwareag.com/viewtopic.php?p=2322

Hi, I’ve used the following code to try to follow the example given in the zip file…

TConnectionFactory connectionFactory = TConnectionFactory.getInstance();

TConnection connection= connectionFactory.newConnection(dbconnect);

MessageDefaultHandler messageDefaultHandler = new MessageDefaultHandler();

DocumentDefaultHandler docDefHandler = new DocumentDefaultHandler(messageDefaultHandler);

ElementDefaultHandler elDefHandler = new ElementDefaultHandler(messageDefaultHandler);

TSAXObjectModel saxObjectModel = new TSAXObjectModel("MessageSAXObjectModel", Message.class, Message.class, docDefHandler, elDefHandler);

TXMLObjectModel.register(saxObjectModel);

TXMLObjectAccessor accessor = connection.newXMLObjectAccessor(TAccessLocation.newInstance("cnews"), saxObjectModel);

TResponse resp = accessor.query(TQuery.newInstance(xql));

where cnews is the collection name and xql is the query '/NewsML', which selects the root node of the XML documents…
I have XML documents in the database and they are in the following format…












However, it throws the following exception:

com.softwareag.tamino.db.api.accessor.TQueryException Response could not be built NestedException:Response could not be built for XML access. NestedException:Interpreting the input stream for JDOM failed! NestedException:Error in building: null

How can I solve this? Thanks.

Actually, my aim is to retrieve the value of element A and the attribute values of C. I hope I can do this quickly using SAX.
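
As a rough sketch of that goal with the standard JAXP SAX parser (not the Tamino handler classes used above), a handler can buffer the text inside A and copy the attributes of C as their start tags are seen. The element names A and C come from the post; the sample XML, the attribute names in it, and the class name are assumptions made only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records the text of every <A> and the attributes of every <C>.
class ElementAAndAttributesOfCHandler extends DefaultHandler {
    final List<String> textsOfA = new ArrayList<>();
    final List<Map<String, String>> attributesOfC = new ArrayList<>();
    private StringBuilder currentText; // non-null only while inside an <A> element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("A".equals(qName)) {
            currentText = new StringBuilder();
        } else if ("C".equals(qName)) {
            Map<String, String> copy = new LinkedHashMap<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                copy.put(attrs.getQName(i), attrs.getValue(i));
            }
            attributesOfC.add(copy);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so append rather than assign.
        // Text of any elements nested inside <A> is included as well.
        if (currentText != null) {
            currentText.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("A".equals(qName) && currentText != null) {
            textsOfA.add(currentText.toString().trim());
            currentText = null;
        }
    }

    // Parses one XML string and returns the filled handler.
    static ElementAAndAttributesOfCHandler parse(String xml) throws Exception {
        ElementAAndAttributesOfCHandler handler = new ElementAAndAttributesOfCHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        // Made-up document shape; the real format is not shown in this thread.
        ElementAAndAttributesOfCHandler h =
                parse("<root><A>headline</A><B><C id=\"1\" type=\"x\"/></B></root>");
        System.out.println(h.textsOfA);
        System.out.println(h.attributesOfC);
    }
}
```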