nixe, html and charset

I’m using the non-XML indexer (version 1.0).
In the XML properties of my database, the default encoding is UTF-8 (I can’t change it).

I have to add HTML documents to my WebDAV server. More precisely, they are HTML documents generated from Word.
The indexer doesn’t recognize the correct MIME type of my documents. It assigns “application/octet-stream” rather than “text/html” and so

I have done some tests and concluded that the indexer works properly only if the charset of the document is UTF-8 and every character in the document is a valid UTF-8 character.
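
For anyone who wants to reproduce this kind of test, here is a minimal Python sketch that checks whether a document's bytes are valid UTF-8. It is only an illustration, not part of the indexer: the function name is_valid_utf8 and the sample HTML are assumptions, and cp1252 is used only as an example of a non-UTF-8 charset that a Word export might produce.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical sample: the same HTML text saved in two different charsets.
html = "<html><body>caf\u00e9</body></html>"
utf8_bytes = html.encode("utf-8")
cp1252_bytes = html.encode("cp1252")  # assumed example of a non-UTF-8 export

print(is_valid_utf8(utf8_bytes))    # True
print(is_valid_utf8(cp1252_bytes))  # False
```

For a real file, the bytes can be read with open(path, "rb").read() and passed to the same function.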

What can I do?

Thanks for your answers.

What platform is this on?

The reason I ask is that I had a problem uploading XML files with non-XML extensions using Mozilla on a Unix platform.

It turned out that the mime type “application/octet-stream” was used when an unknown file extension was used. To override the behaviour one had to populate /etc/mime.types or ~/.mime.types with a suitable extension to mime type mapping.
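
To illustrate the idea, here is a rough Python sketch of extension-based lookup with a fallback. It uses Python's standard mimetypes module as a stand-in, so it is not how Mozilla or the indexer actually resolves types. The extension .nixedemo, the file name, and the helper content_type are made up for the example. In a mime.types file, the equivalent entry would be a line of the form "text/xml nixedemo".

```python
import mimetypes

# Private registry, so the global mimetypes tables are not modified.
registry = mimetypes.MimeTypes()


def content_type(filename: str) -> str:
    """Guess a MIME type from the extension, falling back to octet-stream."""
    guessed, _encoding = registry.guess_type(filename)
    return guessed or "application/octet-stream"


# Hypothetical file with an extension the registry does not know.
before = content_type("report.nixedemo")

# Register the mapping, as a mime.types entry would.
registry.add_type("text/xml", ".nixedemo")
after = content_type("report.nixedemo")

print(before)  # application/octet-stream
print(after)   # text/xml
```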

Of course this may be totally unrelated to your problem.

The database is installed on Unix.
The WebDAV server runs on either Linux or Windows.

But my problem has nothing to do with the file extension. I use the same file with only the charset changed, and the result depends solely on the charset used.