full text search optimization

hello,

I have an optimization problem. To search a node and all of its children, I used the tf:containsText() function like this:

let $ensemble:=(
for $a in input()/document where tf:containsText($a/root, "paris")
return $a)

A search against the following document did not return the expected result:



paris


To correct this, I changed my query:

let $ensemble:=(
for $a in input()/document where $a/root//*[tf:containsText(., "paris")]
return $a)

I now get all the expected results, but this kind of query takes more than 15 seconds. All text nodes are indexed with ‘text’ indexes and all attributes with ‘standard’ indexes. My database holds 13,000 documents, each between 10 and 20 KB in size.

Did I make a mistake when indexing the documents? Is there a better way to run a full-text search over a complete document?

Arnaud

Hi Arnaud,

Without knowing your data I can only guess, but I think the problem with your first query is that you did not set the XML property “markup as delimiter” to “yes” or “mixed”. Without it, all text nodes are concatenated, so adjacent text nodes “City” and “Paris” become “CityParis”, which does not match your query. Setting the property changes this. Unfortunately, you will have to recreate the text index afterwards. Normally it should not be necessary to have a text index on each node; a text index on root should be sufficient for your purpose. Having many nested indexes causes considerable effort for insertion and update.
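
The concatenation effect can be illustrated outside of Tamino with a minimal Python sketch. It is not Tamino's implementation and does not use its API; it only mimics word tokenization over text that is joined with or without a separator at markup boundaries. The sample document and its element names (`kind`, `name`) are hypothetical, as are the helper functions.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are made up for illustration.
SAMPLE = "<root><kind>City</kind><name>Paris</name></root>"


def tokens(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def node_text(xml, markup_as_delimiter):
    """Collect all text below the root element.

    With markup_as_delimiter=False the text nodes are glued together,
    with True a space is inserted at every markup boundary.
    """
    root = ET.fromstring(xml)
    separator = " " if markup_as_delimiter else ""
    return separator.join(root.itertext())


def contains_word(xml, word, markup_as_delimiter):
    """Return True if `word` occurs as a whole token in the collected text."""
    return word.lower() in tokens(node_text(xml, markup_as_delimiter))


if __name__ == "__main__":
    print(contains_word(SAMPLE, "paris", markup_as_delimiter=False))
    print(contains_word(SAMPLE, "paris", markup_as_delimiter=True))
```

Without a delimiter the only token is "cityparis", so a word search for "paris" finds nothing; with a delimiter, "city" and "paris" are separate tokens and the search matches.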

Regards

Harald

Hello Harald,

You guessed exactly right. I corrected both things (setting “markup as delimiter” to “yes”, and recreating the index only on the target nodes), and it now works perfectly.

Best regards,
Arnaud