Hi,
when I have established a connection to Tamino (V4.1.1.1) and run a query against a certain collection, which contains about 20,000 documents, my Java program just hangs (because of the huge amount of returned data).
I have tried to limit the number of returned documents with the PAGE_SIZE parameter of the query method, but as I read the documentation, PAGE_SIZE only determines how many documents are fetched at a time (as opposed to always fetching one document at a time; fetching several documents at once can improve performance):
>>>>>>>>>>>>>>
TResponse response = accessor.query(query, PAGE_SIZE);
TXMLObjectIterator xmlObjectIterator = response.getXMLObjectIterator();
// Write the XML content of every returned document into a StringWriter
StringWriter stringWriter = new StringWriter();
while (xmlObjectIterator.hasNext()) {
    try {
        TXMLObject xo = xmlObjectIterator.next();
        xo.writeTo(stringWriter);
    } catch (Exception e) {
        e.printStackTrace(); // don't swallow errors silently
    }
}
<<<<<<<<<<<<<<<<
So the iterator automatically fetches the next PAGE_SIZE documents until all of them have been fetched.
Can I limit the number of returned documents without having to read all of them?
Thanks for any help,
Robert
Have a nice day,
Rob
Hello Rob.
I’m not sure that I understand your question properly.
If you set the page size to “10” and then query for all 20,000 documents in the collection, you could get the first 5 from the iterator and then simply stop.
In this case 10 documents will have been sent from the Tamino Server to your application.
The iterator’s hasNext() method will return true until the 20,000th object has been returned, and the next() method will automatically retrieve the next page as required.
So, in the code you posted the use of PAGE_SIZE does not gain you much: all 20,000 documents are still retrieved by the loop; they are just sent from the server in chunks of PAGE_SIZE rather than all 20,000 at once.
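As a minimal sketch of the "just stop" approach, reusing accessor, query and PAGE_SIZE from your snippet (MAX_DOCS is a made-up limit, not part of the API):
>>>>>>>>>>>>>>
TResponse response = accessor.query(query, PAGE_SIZE);
TXMLObjectIterator xmlObjectIterator = response.getXMLObjectIterator();

StringWriter stringWriter = new StringWriter();
final int MAX_DOCS = 50; // made-up limit: read at most this many documents
int count = 0;

// Stop as soon as the limit is reached; only the pages that were actually
// touched are transferred from the Tamino Server to the application.
while (xmlObjectIterator.hasNext() && count < MAX_DOCS) {
    try {
        TXMLObject xo = xmlObjectIterator.next();
        xo.writeTo(stringWriter);
        count++;
    } catch (Exception e) {
        e.printStackTrace();
    }
}
<<<<<<<<<<<<<<<<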
Of course, performance can also be improved by carefully designing queries so that only the required documents (or fragments thereof) are returned, and placing indexes appropriately so that the queries execute as fast as possible.
It is faster to query for 20 documents and process them than it is to query for 20000 documents and process them…
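For example (just a sketch - the query expression and element names below are invented, and I am assuming the query object is built with TQuery.newInstance as in the usual API examples):
>>>>>>>>>>>>>>
// Hypothetical, more selective query: ask only for the documents you
// actually need instead of the whole collection.
TQuery selectiveQuery = TQuery.newInstance("document[header/@status='open']");
TResponse response = accessor.query(selectiveQuery, PAGE_SIZE);
<<<<<<<<<<<<<<<<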
Hope that helps,
Trevor.
Hi Trevor!
Thanks a lot. I tried to break out of the iterator's while loop once I had read the amount I wanted - and that works fine for small collections (with my big collection of 20,000 documents the program still hangs).
But that may have other causes, which I will try to track down.
The limitation basically works, so thanks,
Rob
Hi Rob,
if it would help you, perhaps you could post the code and schema (for the big collection) and we can have a look at the problem with you.
Thanks,
Trevor.
Hi Trevor!
Thanks for the offer, but I refined my query some more and now I get a smaller number of fields back, so that is OK.
Thanks,
Rob