I am wondering whether non-XML-compliant HTML (pre-XHTML) could be stored and indexed in Tamino using the non-XML indexer. If so, what would the structure of the resulting XML document look like, and what queries would it support (e.g. what metadata is available)?
If you index an HTML file, you will get the flat content in the “content” element of . You can then run text searches on that content. No other metadata is currently extracted.
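To give a rough feel for what "flat content" means here, the following is a minimal Python sketch, not Tamino's actual indexer. It strips the markup from a non-well-formed HTML page, puts the remaining text into a "content" element, and runs a simple word search on it. The wrapper element name `nonXmlDocument` and all function names are made up for illustration; only the "content" element comes from the answer above.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class FlatTextExtractor(HTMLParser):
    """Collects text nodes and ignores markup; tolerates unclosed tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep only non-empty text fragments.
        if data.strip():
            self.chunks.append(data.strip())


def flatten_html(html):
    """Return the text of an HTML page as one flat string."""
    parser = FlatTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def wrap_as_xml(flat_text, wrapper="nonXmlDocument"):
    # "nonXmlDocument" is a hypothetical wrapper name (assumption);
    # only the "content" child is taken from the answer.
    root = ET.Element(wrapper)
    ET.SubElement(root, "content").text = flat_text
    return root


def content_contains(doc, word):
    """Case-insensitive whole-word search on the flat content."""
    text = doc.findtext("content") or ""
    return word.lower() in text.lower().split()


# Pre-XHTML style input: unclosed <p> and <br> tags.
html = "<html><body><p>Tamino stores<br>flat text<p>for searching</body></html>"
doc = wrap_as_xml(flatten_html(html))

print(ET.tostring(doc, encoding="unicode"))
print(content_contains(doc, "flat"))
```

Since the tags are discarded, a search like this can only match words in the text. Element names such as `body` or `p` are not searchable, which matches the statement that no further metadata is extracted.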