Document content indexing

system · June 23, 2003, 6:21pm

Hello Community!

I have several questions on indexing document content with the Non-XML Indexer.

1. The schema (template.tsd) supplied with the utility has an element named “content” for storing document content. So if I want to find all documents containing the word “tamino”, I guess I should ask something like this:

nonxml[properties/content ~= 'tamino']

But after a closer look at this schema, I found no index on the element “content”. The analysis function in TII confirmed that the query would not be optimized. Is it correct that if I want fast search on document content, I should specify a text index on the element “content”?

2. Does the option “With text index” for a Non-XML doctype help in any way to index the content of binary documents? Or can it only be used for text, independently of the Non-XML Indexer?

3. I also haven’t found any text index for document content in xdav_nonXML.tsd. Should I specify one to optimize DASL queries on document content, and if so, where?

Thanks in advance,
Alexander

Guest · June 24, 2003, 4:58pm

Hello Alexander,

1. Yes, you are right: your query should find all documents with “tamino” in their content. template.tsd is what the name says, a template, so you are free to tailor the indexes to your specific needs.

2. A text index for binary documents only makes sense in conjunction with the Non-XML Indexer.

3. xdav_nonXML.tsd is exactly the same as template.tsd, except that the schema and doctype names are set to xdav_nonXML, as required by Tamino WebDAV Server.
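
To illustrate point 1, a text index on “content” could look roughly like the fragment below. This is only a sketch: it assumes the usual TSD annotation layout, and the type xs:string is a guess, since the actual declaration of “content” in template.tsd is not shown in this thread. Compare it with the schema shipped with the utility before redefining anything.

```xml
<!-- Sketch only: the element type (xs:string) is an assumption,
     not taken from template.tsd. -->
<xs:element name="content" type="xs:string"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:tsd="http://namespaces.softwareag.com/tamino/TaminoSchemaDefinition">
  <xs:annotation>
    <xs:appinfo>
      <tsd:elementInfo>
        <tsd:physical>
          <tsd:native>
            <!-- Text index, intended to let ~= queries on this element be optimized -->
            <tsd:index>
              <tsd:text/>
            </tsd:index>
          </tsd:native>
        </tsd:physical>
      </tsd:elementInfo>
    </xs:appinfo>
  </xs:annotation>
</xs:element>
```

After updating the schema, the analysis function in TII should show whether the query on properties/content is now optimized.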

Regards,
Martin
