ino:id's and re-indexing

The ino:id attributes assigned by Tamino seem to be added sequentially, i.e. 1, 2, 3, etc. However, when a record or a set of records in a collection is deleted, the ino:ids don’t seem to be reused.

Is there a way, through either the Java API or the HTTP protocol, to prompt a collection to re-index, or does this occur automatically within a set period, if at all?

Hi,

ino:id is associated with one document for life. This is good, because currently we have no other means of a unique identifier under the control of the database server. Although we do not recommend using it as a unique property, it is often used as just that. Now, imagine re-indexing, assigning each document a new id… all your associations based on this ino:id are gone!

ino:ids of deleted documents are eventually recycled, i.e. assigned to new documents.

ino:ids are assigned on a per-doctype basis, not on a per-collection basis (as in earlier versions of Tamino, anyone remember that?).

Okay?

Best regards,

Andreas

Thanks again Andreas, but I don’t think I explained the problem too well.

I’ll give an example . . .

I have 5 XML documents, with ino:id 1, 2, 3, 4 and 5, then delete one of these, say 4, leaving 4 XML documents in my collection, which have ino:id 1, 2, 3 and 5.

Then I add another couple of documents, which, for argument’s sake, are assigned ino:id 6 and 7. If I were to query the whole database now, I would get a result of count 6, and the ino:ids of the returned documents would be 1, 2, 3, 5, 6 and finally 7.

Say I had not deleted document 4, and still added two more documents, I would have 1,2,3,4,5,6 and 7. Note that the last document has the ino:id the same as the count, which is what I am after, as this provided a simple way of identifying the next ino:id to be assigned, and I could use the ino:id as a unique identifier on its own, rather than generate another unique identifier to work alongside. I just figured that the best way of doing this would be to reindex the collection.
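To make the arithmetic concrete, here is a toy Java model of the scenario. It is only a sketch of the numbering as I described it (ids handed out sequentially and not reused); the class `InoIdGapDemo` and its methods are made up for illustration and have nothing to do with the Tamino API or with how the server really assigns ino:ids.

```java
import java.util.TreeSet;

// Toy model only: assumes ids are handed out sequentially and never reused.
// This is NOT Tamino code and makes no claim about the server's real behaviour.
public class InoIdGapDemo {
    private final TreeSet<Integer> liveIds = new TreeSet<>();
    private int nextId = 1;

    // Store a new "document" and return the id it was given.
    public int insert() {
        int id = nextId++;
        liveIds.add(id);
        return id;
    }

    // Delete the "document" with the given id; its id is not handed out again.
    public void delete(int id) {
        liveIds.remove(id);
    }

    // Number of documents currently stored.
    public int count() {
        return liveIds.size();
    }

    // Highest id currently in use.
    public int highestId() {
        return liveIds.last();
    }

    public static void main(String[] args) {
        InoIdGapDemo docs = new InoIdGapDemo();
        for (int i = 0; i < 5; i++) {
            docs.insert();          // ids 1..5
        }
        docs.delete(4);             // leaves 1, 2, 3, 5
        docs.insert();              // id 6
        docs.insert();              // id 7
        System.out.println("ids = " + docs.liveIds);
        System.out.println("count = " + docs.count() + ", highest id = " + docs.highestId());
    }
}
```

After a single delete the count (6) and the highest id (7) no longer agree, so the count cannot be used to predict the next id.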

Also, is there an alternate way of finding what ino:id will be assigned next, other than this (rather dodgy/unstable) solution that I have described above?

======
John K

Sorry, but you’re down a dangerous path!

quote:
Originally posted by Jay Kay:
Thanks again Andreas, but I don’t think I explained the problem too well.

I’ll give an example . . .

I have 5 XML documents, with ino:id 1, 2, 3, 4 and 5, then delete one of these, say 4, leaving 4 XML documents in my collection, which have ino:id 1, 2, 3 and 5.


Just a minor wording issue: what you’re talking about is most probably a doctype, whereas a collection is a, uh, well, collection of doctypes. ino:ids are assigned per doctype. So, say, your doctype is “email”, based in collection “groupware”, and you have five emails stored with ino:id 1…5, of which you delete the fourth. Okay?

quote:

Then I add another couple of documents, which, for argument’s sake, are assigned ino:id 6 and 7. If I were to query the whole database now, I would get a result of count 6, and the ino:ids of the returned documents would be 1, 2, 3, 5, 6 and finally 7.

Say I had not deleted document 4, and still added two more documents, I would have 1,2,3,4,5,6 and 7. Note that the last document has the ino:id the same as the count, which is what I am after, as this provided a simple way of identifying the next ino:id to be assigned, and I could use the ino:id as a unique identifier on its own, rather than generate another unique identifier to work alongside. I just figured that the best way of doing this would be to reindex the collection.



Ok, I get your intention. Sadly, I can only say:

If you need the number of documents matching a query, use the X-Query function count(), e.g. count(email).
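As a rough sketch, this is how such a count query might be put together for the HTTP interface from Java. It assumes the X-Query expression is passed in the `_xql` request parameter on the collection URL; the host, the database name `mydb` and the helper `countUrl` are made up for the example, so please check the parameter and URL layout against the documentation of your Tamino version.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class CountQueryUrl {

    // Builds the request URL for an X-Query count() over one doctype.
    // Assumption: the query goes into the "_xql" parameter of the collection URL.
    public static String countUrl(String collectionUrl, String doctype) {
        String query = "count(" + doctype + ")";
        // URL-encode the expression so the parentheses survive in the query string.
        return collectionUrl + "?_xql=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical server, database and collection names.
        String url = countUrl("http://localhost/tamino/mydb/groupware", "email");
        System.out.println(url);
    }
}
```

The resulting URL can then be fetched with any HTTP client; the number of matching documents comes back in the XML response.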

If you need a truly unique identifier, please generate one yourself. The problem with using ino:id is that it is not necessarily available. For example, in conjunction with X-Node (mapping SQL content into XML documents), no ino:id is assigned to content that resides completely outside of Tamino.

ino:id is Tamino’s identification mechanism for documents, and it’s under the sole control of the server. There is no re-index function, because it is not an index but rather a record identifier.

The ino:id is brought to you for the sole purpose of identifying a previously stored document. In fact, the presence of the ino:id attribute is the only thing that lets Tamino perform an update on the existing doc, instead of inserting a new one.

quote:

Also, is there an alternate way of finding what ino:id will be assigned next, other than this (rather dodgy/unstable) solution that I have described above?

======
John K


Sorry, not that I am aware of. Please consider generating your own unique identifier (such as the one java.rmi.server.UID provides) as best practice when dealing with such issues, and treat the use of ino:id for this purpose as deprecated. Please do not use ino:id in any way other than for performing updates. Leave ino:id to the server. It’s its!
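A minimal sketch of what I mean by generating the identifier yourself, using only JDK classes. The class `DocumentKeys`, the attribute name `key` and the `email` element are just illustrations, not anything Tamino prescribes. Note that a `java.rmi.server.UID` is only unique with respect to the host that generated it, so `java.util.UUID` may be the safer choice if several machines store documents.

```java
import java.rmi.server.UID;
import java.util.UUID;

public class DocumentKeys {

    // Identifier that is unique across machines for all practical purposes.
    public static String newKey() {
        return UUID.randomUUID().toString();
    }

    // Identifier that is unique only with respect to the generating host.
    public static String newHostScopedKey() {
        return new UID().toString();
    }

    // Puts the application-generated key into the document itself
    // (hypothetical "key" attribute), so it never depends on ino:id.
    public static String emailWithKey(String key) {
        return "<email key=\"" + key + "\"/>";
    }

    public static void main(String[] args) {
        String key = newKey();
        System.out.println(emailWithKey(key));
        System.out.println(newHostScopedKey());
    }
}
```

The key travels with the document, so you can query on it like on any other attribute, and ino:id is left to the server for updates.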

Best regards,

Andreas