How to retrieve a lot of documents with ino:id

HC_Hammerstoft · January 23, 2002, 10:15pm

Hi,

I am trying to retrieve a large number of documents together with their respective ino:id values.

Querying through a browser or the “Interactive Interface” cannot handle this volume.

Propagator ignores ino:id.
Data loader creates an ino:object tag but ignores ino:id.

Any suggestions?

Regards
HC Hammerstoft
Software engineer

system · January 24, 2002, 7:21pm

http://servername/tamino/dbname/collection?_xql=doctype should serve you all the data from the doctype within an ino:response xql:result-envelope, with each document carrying the ino:id attribute.

If you’re doing this from within a programming environment, you could use a SAX parser to split the big chunk into individual documents.
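
A minimal sketch of that SAX idea in Python, for illustration only. It assumes the response wraps the documents directly inside an xql:result element and that each document's root element carries ino:id, as described above. The sample response, the placeholder namespace URIs, and the names DocumentIdHandler and extract_ids are made up for this example; check them against a real response from your server.

```python
import xml.sax

# Hypothetical sample response; the namespace URIs are placeholders.
SAMPLE = b"""<ino:response xmlns:ino="urn:example:ino" xmlns:xql="urn:example:xql">
  <xql:result>
    <doctype ino:id="1"><title>first</title></doctype>
    <doctype ino:id="2"><title>second</title></doctype>
  </xql:result>
</ino:response>"""


class DocumentIdHandler(xml.sax.ContentHandler):
    """Collects (root element name, ino:id) for each document in xql:result."""

    def __init__(self):
        super().__init__()
        self.depth_in_result = None  # None while outside xql:result
        self.documents = []

    def startElement(self, name, attrs):
        if self.depth_in_result is None:
            if name == "xql:result":
                self.depth_in_result = 0
            return
        self.depth_in_result += 1
        if self.depth_in_result == 1:
            # A direct child of xql:result is the root of one document.
            self.documents.append((name, attrs.get("ino:id")))

    def endElement(self, name):
        if self.depth_in_result is None:
            return
        if self.depth_in_result == 0:
            self.depth_in_result = None  # closing xql:result itself
        else:
            self.depth_in_result -= 1


def extract_ids(xml_bytes):
    handler = DocumentIdHandler()
    # Namespace processing is off by default, so names arrive with prefixes.
    xml.sax.parseString(xml_bytes, handler)
    return handler.documents


print(extract_ids(SAMPLE))
```

Because SAX streams the input, the whole response never has to be held as a tree in memory, which is the point when the result is many megabytes.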

Does that help?

Best regards, Andreas

HC_Hammerstoft · January 24, 2002, 8:19pm

The problem is the size of my collection.

If I use Propagator, the resulting file is approx. 14 MB (1600 documents) and that works fine.

If I use a direct URL like
http://server/tamino/db/col?_xql=doctype
my browser works very hard and then gives up.

If I split it up like:
http://server/tamino/db/col?_xql(1,100)=doctype
http://server/tamino/db/col?_xql(101,100)=doctype
etc.
can I then be sure to get all documents?

(I need the value of the ino:id to introduce a new attribute id in my documents to obtain persistence.)

Regards
HC Hammerstoft

Guest · January 24, 2002, 9:11pm

Why don’t you write a program that does a loop reading the documents (say) 10 or 100 at a time using one of the APIs and writing them to the new doctype?
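
A rough sketch of such a loop in Python, using the _xql(start,quantity) URL form from the previous post rather than one of the APIs. It only builds the request URLs. The function name window_urls is made up, and the total of 1600 documents is taken from the post above and assumed to be known in advance. It says nothing about whether the windows stay consistent if documents are added or deleted between requests.

```python
def window_urls(base_url, doctype, total, page_size=100):
    """Build one query URL per window of page_size documents.

    base_url is the collection URL, e.g. http://server/tamino/db/col
    total is the number of documents, assumed to be known beforehand.
    """
    urls = []
    for start in range(1, total + 1, page_size):
        urls.append(f"{base_url}?_xql({start},{page_size})={doctype}")
    return urls


urls = window_urls("http://server/tamino/db/col", "doctype", total=1600)
for url in urls[:2]:
    print(url)
```

Each URL could then be fetched in turn, for instance with urllib.request.urlopen, and each response processed before the next request is made, so that only one window is in memory at a time.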

HC_Hammerstoft · January 24, 2002, 9:43pm

Yes, I guess that is the solution, but I just wanted to make sure that such a program had not already been written, e.g. in the inoxmld program.

Thanks for your input.

Regards
HC Hammerstoft

Hermann_Gundel · January 29, 2002, 8:48pm

One way to do it: set up a CURSOR with a quantity of, let’s say, 100 using the following query:
…tamino/mydb/mycoll?_cursor=open&_xql=/mydoctype/@ino:id

Then you can FETCH the cursor, loop over the ino:ids, and do an HTTP GET for each id (e.g. http://myhost/tamino/mydb/mycoll/@10).

With this you get back the plain XML documents one by one and can manipulate them (e.g. strip the prolog, add the ino:id and (perhaps) a wrapper). Repeat until finished. That’s the way it’s done in inoxmld.
Using CURSORs is a convenient way for tasks like yours.
All the best,
Hermann Gundel
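
The per-document step described above could look roughly like this in Python. This is a sketch under assumptions, not what inoxmld actually does: it takes the ino:id values as already obtained from the cursor (the FETCH request itself is not shown), and it writes the value into a plain id attribute on the root element, which is what the original question asked for. The function names and the sample document are made up.

```python
import xml.etree.ElementTree as ET


def document_url(collection_url, ino_id):
    """URL for a single document, following the .../mycoll/@10 pattern."""
    return f"{collection_url}/@{ino_id}"


def add_id_attribute(xml_bytes, ino_id):
    """Parse one plain XML document and return it with id set on the root.

    Serializing with encoding="unicode" leaves out the XML declaration,
    so the prolog is stripped as a side effect.
    """
    root = ET.fromstring(xml_bytes)
    root.set("id", str(ino_id))
    return ET.tostring(root, encoding="unicode")


# Hypothetical document as it might come back from the HTTP GET.
raw = b'<?xml version="1.0" encoding="UTF-8"?><mydoctype><title>x</title></mydoctype>'

print(document_url("http://myhost/tamino/mydb/mycoll", 10))
print(add_id_attribute(raw, 10))
```

Taking bytes as input matches what an HTTP client returns and lets the parser honour the encoding named in the prolog.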

HC_Hammerstoft · January 30, 2002, 3:32pm

Thank you Hermann, that was exactly what I was looking for.

Regards HC Hammerstoft
