full text search optimization

arnaud_colin · January 20, 2006, 11:05pm

hello,

I have an optimization problem. To search in a node and all its children, I used the tf:containsText() function this way:

let $ensemble:=(
for $a in input()/document where tf:containsText($a/root,"paris")
return $a)

A search in the following document didn’t give me the expected results:

paris

To correct this, I changed my query:

let $ensemble:=(
for $a in input()/document where $a/root//*[tf:containsText(.,"paris")]
return $a)

With this query I get all the expected results. The problem is that this kind of request takes more than 15 seconds. All text nodes have been indexed using ‘text’ indexes and all attributes using ‘standard’ indexes. My database has 13,000 documents, each between 10 and 20 KB in size.

Did I make a mistake when indexing the documents? Is there a better way to perform a full text search across a complete document?

Arnaud

Dr_Harald_Schoening · January 24, 2006, 6:49pm

Hi Arnaud,

Without knowing your data I can only guess, but I think the problem with your first query is that you did not set the XML property “markup as delimiter” to “yes” or “mixed”. Without that setting, all text nodes are concatenated, so adjacent elements containing “City” and “Paris” become “CityParis”, which does not match your query. Setting the property changes this. Unfortunately, you will have to recreate the text index afterwards. Normally it should not be necessary to have a text index on each node; a text index on root should be sufficient for your purpose. Having many nested indexes causes considerable effort for insertion and update.
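
To make the concatenation effect concrete, here is a small sketch in Python. It is only an illustration of the idea, not how Tamino tokenizes text internally; the element names `type` and `name`, the sample document and the `words` helper are all made up for the example.

```python
import re
import xml.etree.ElementTree as ET


def words(xml_text, markup_as_delimiter):
    """Return the lowercase word tokens found in a document's text content."""
    root = ET.fromstring(xml_text)
    # itertext() yields every text node in document order.
    # Joining with "" mimics markup NOT acting as a word delimiter;
    # joining with " " mimics markup acting as a delimiter.
    separator = " " if markup_as_delimiter else ""
    text = separator.join(root.itertext())
    return re.findall(r"\w+", text.lower())


# Hypothetical document: the element names are assumptions for illustration.
doc = "<root><type>City</type><name>Paris</name></root>"

print(words(doc, markup_as_delimiter=False))  # ['cityparis']
print(words(doc, markup_as_delimiter=True))   # ['city', 'paris']
```

In the first case a word search for "paris" on root finds nothing, because the only token is "cityparis". In the second case "paris" is a token of its own, so a single text index on root is enough to match it.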

Regards

Harald

arnaud_colin · January 24, 2006, 7:14pm

Hello Harald,

You guessed perfectly. I corrected these two things (setting “markup as delimiter” to “yes”, and recreating the index only on the target nodes). It now works perfectly well.

Best regards,
Arnaud
