Document content indexing

Hello Community!

I have several questions about indexing document content with the Non-XML Indexer.

1. The schema (template.tsd) supplied with the utility has an element named “content” for storing document content. So if I want to find all documents containing the word “tamino”, I guess I should use a query like this:

nonxml[properties/content ~= 'tamino']

But after a closer look at this schema, I found no index on the element “content”. The analysis function in TII confirmed that the query would not be optimized. Is it correct that, if I want fast searches on document content, I should define a text index on the element “content”?

2. Does the option “With text index” for a Non-XML doctype help in any way to index the content of binary documents? Or can it only be used for text, independently of the Non-XML Indexer?

3. I also haven’t found any text index for document content in xdav_nonXML.tsd. Should I define one to optimize DASL queries on document content, and if so, where?

Thanks in advance,
Alexander

Hello Alexander,

1. Yes, you are right, your query should find all documents with “tamino” in their content. template.tsd is what the name says, a template, so you are free to tailor the indexes to your specific needs.

2. A text index for binary docs only makes sense in conjunction with the NonXMLIndexer.

3. xdav_nonXML.tsd is exactly the same as template.tsd, except that the schema and doctype names are set to xdav_nonXML, as required by Tamino WebDAV Server.
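
For point 1, a text index on “content” might be declared roughly as shown here. This is only a sketch: it assumes the usual TSD annotation layout with the xs and tsd prefixes bound as in template.tsd, and the name/type attributes on the element are placeholders, so keep whatever declaration your copy of the schema already has and add just the tsd:index part.

```xml
<!-- Sketch only: the xs:element declaration shown here is assumed;
     keep the existing declaration of "content" from template.tsd. -->
<xs:element name="content" type="xs:string">
  <xs:annotation>
    <xs:appinfo>
      <tsd:elementInfo>
        <tsd:physical>
          <tsd:native>
            <tsd:index>
              <!-- text index, so that ~= searches on content can be optimized -->
              <tsd:text/>
            </tsd:index>
          </tsd:native>
        </tsd:physical>
      </tsd:elementInfo>
    </xs:appinfo>
  </xs:annotation>
</xs:element>
```

After updating the schema, running the analysis function in TII again should show whether the query on properties/content is now optimized. Since xdav_nonXML.tsd has the same structure, the same change would apply there.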

Regards,
Martin