How does Tamino store non-XML objects?

Guest · December 5, 2001, 10:40pm

Hello,
What kind of structure does Tamino use to store non-XML objects in the database: a tree, or a file?
Can I build an index on those non-XML objects?
How does Tamino organize its indexes? Does Tamino optimize indexes for queries?

Regards

Guest · December 6, 2001, 3:42pm

Currently Tamino doesn’t build indexes on non-XML objects. You can only query by ino:id or document name.
What sort of non-XML data would you like to index?

Guest · December 6, 2001, 8:58pm

Thanks, Nigel.

If Tamino cannot build an index on non-XML objects, how can it speed up query operations, especially when an XML document contains those objects?
By the way, could you tell me the physical storage structure of non-XML objects?

Guest · December 7, 2001, 2:28pm

Currently Software AG doesn’t describe how non-XML data is stored, but essentially you can think of it as a blob.

What I do to index HTML is as follows.

The tool that loads the non-XML data reads each HTML document, extracts its metadata and content words, and creates a metadata record. Both the metadata and the non-XML document are stored.
The metadata references the documents.
The application queries the metadata and gets Tamino URLs back. The application then uses these URLs to retrieve the original documents.

The metadata implementation is based on RDF and implements the Dublin Core document metadata standard.
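
As a rough illustration of the loading step described above (this is not the implementation offered in this post), the extraction could be sketched in Python as follows. The mapping from HTML `<meta>` names to Dublin Core elements, the `ex:words` element with its `example.invalid` namespace, and the document URL passed in are all assumptions made for the example.

```python
from html.parser import HTMLParser
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
EX_NS = "http://example.invalid/meta#"  # hypothetical namespace for content words

# Assumed mapping from HTML metadata names to Dublin Core elements.
DC_MAP = {"title": "title", "description": "description",
          "keywords": "subject", "author": "creator"}


class MetaExtractor(HTMLParser):
    """Collects the <title>, <meta name=... content=...> pairs and content words."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.words = set()
        self._in_title = False
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip += 1
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._skip:
            return  # ignore script and style text
        if self._in_title:
            self.meta["title"] = self.meta.get("title", "") + data.strip()
        # Keep alphabetic words of three or more letters, lower-cased.
        self.words.update(w.lower() for w in re.findall(r"[A-Za-z]{3,}", data))


def build_metadata(html, doc_url):
    """Return an RDF/XML string describing the HTML document stored at doc_url."""
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("ex", EX_NS)
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    # rdf:about is the reference from the metadata back to the stored document.
    desc = ET.SubElement(rdf, f"{{{RDF_NS}}}Description",
                         {f"{{{RDF_NS}}}about": doc_url})
    for html_name, dc_name in DC_MAP.items():
        if html_name in parser.meta:
            ET.SubElement(desc, f"{{{DC_NS}}}{dc_name}").text = parser.meta[html_name]
    ET.SubElement(desc, f"{{{EX_NS}}}words").text = " ".join(sorted(parser.words))
    return ET.tostring(rdf, encoding="unicode")
```

The resulting RDF record would be stored as ordinary XML next to the HTML blob, so a query against the record can return the `rdf:about` URL of the matching document.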

The method is also applicable to other non-XML formats: you just need to build a component that extracts the data for each document type.

The indexing method is uniform for all possible metadata vocabularies so there is no fiddling about with Tamino Schemas.

You can also extend the metadata for individual documents manually, for example when the extracted metadata is incomplete or the non-XML document carries no metadata at all.

This should also be implementable as a server extension but I haven’t done that yet.

If you want a copy of the implementation, just ask.

Fernando_Ito1 · June 30, 2004, 1:26am

Hi Nigel,

As far as I can see, Tamino’s search engine is based on a Dublin Core and RDF implementation. I’ve been studying how to create an application that retrieves nixe documents based on semantics.
I wonder whether it is necessary to create an additional schema in Tamino to do that, because nixe already has all the information we use to query (description, subject, etc.). My question is how I can use loadlists and stoplists to store every word in the “doctype/properties/content” element. Should I configure something in Tamino, or do I have to write an application to do this kind of indexing?

Thanks in advance, Ito

Juliane · July 14, 2004, 4:15pm

Hi Ito,

Please note that Nigel’s answer in this thread dates from December 2001. Things have changed in the meantime. You mention two different topics in your post.

Loadlists and stoplists are terms from the realm of text retrieval. You can define loadlists in Tamino to declare certain terms as crucial to your application and thus enhance querying with text retrieval. This is documented with Tamino 4.2 and works for any Tamino collection, nixe or other. Tamino 4.2 does not document the concept of stoplists, though if you need them, we can do something for you.

The concept of nixe is different. Tamino offers the possibility to store non-XML documents accompanied by so-called shadow documents. These are XML documents that allow better querying. The query results are then excerpts of the shadow document. This information can then be used to obtain the original data requested. The user must provide both a schema for the shadow documents and an SXS function that maps the non-XML data into an XML instance adhering to that schema. The nixe project now offers such schema-SXS pairings for certain well-known non-XML data formats such as Word, PDF, etc. With Tamino 4.2 the nixe sources are packaged with Tamino.
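
To make the shadow-document idea concrete, here is a minimal sketch in Python. It is not the nixe code and does not use the SXS API: the element names (`shadow`, `properties`, `content`), the `source` attribute, and both function names are hypothetical, and plain text stands in for a real format such as Word or PDF.

```python
import xml.etree.ElementTree as ET


def make_shadow(filename, data, source_url):
    """Map a plain-text document to a hypothetical shadow XML instance."""
    text = data.decode("utf-8", errors="replace")
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # The source attribute points back at the stored non-XML original.
    shadow = ET.Element("shadow", {"source": source_url})
    props = ET.SubElement(shadow, "properties")
    ET.SubElement(props, "filename").text = filename
    ET.SubElement(props, "size").text = str(len(data))
    # Assumption for the example: the first non-empty line serves as the title.
    ET.SubElement(props, "title").text = lines[0] if lines else ""
    ET.SubElement(props, "content").text = " ".join(lines)
    return ET.tostring(shadow, encoding="unicode")


def find_sources(shadows, word):
    """Return the source URLs of all shadow documents whose content has the word."""
    hits = []
    for xml_text in shadows:
        root = ET.fromstring(xml_text)
        content = root.findtext("properties/content", default="")
        if word.lower() in content.lower().split():
            hits.append(root.get("source"))
    return hits
```

The query runs only against the XML shadow; the URL it returns is what the application would then use to fetch the original non-XML document.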

Regards,
Juliane.
