xquery performance

Pamina · May 10, 2006, 9:00pm

Hi all, we have a collection with 18766 very small and simple documents: <?xml version="1.0" encoding="UTF-8"?> <xs:schema xmlns:tsd=“http://namespaces.softwareag.com/tamino/TaminoSchemaDefinition” xmlns:xs=“XML Schema”> xs:annotation xs:appinfo <tsd:schemaInfo name=“Index”> <tsd:collection name=“Index”/> <tsd:doctype name=“Index”> tsd:logical tsd:contentclosed</tsd:content> </tsd:logical> </tsd:doctype> tsd:adminInfo tsd:versionTSD4.2</tsd:version> tsd:created2005-07-28T09:15:52.912+01:00</tsd:created> tsd:modified2006-03-24T09:22:38.980+01:00</tsd:modified> </tsd:adminInfo> </tsd:schemaInfo> </xs:appinfo> </xs:annotation> <xs:element name=“Index”> <xs:complexType mixed=“true”> <xs:choice minOccurs=“0” maxOccurs=“unbounded”> <xs:element ref=“subIndex”/> </xs:choice> <xs:attribute name=“id” type=“xs:string”> xs:annotation xs:appinfo tsd:attributeInfo tsd:physical tsd:native tsd:index tsd:standard/ </tsd:index> </tsd:native> </tsd:physical> </tsd:attributeInfo> </xs:appinfo> </xs:annotation> </xs:attribute> <xs:attribute name=“type” type=“xs:string” use=“required”> xs:annotation xs:appinfo tsd:attributeInfo tsd:physical tsd:native tsd:index tsd:standard/ </tsd:index> </tsd:native> </tsd:physical> </tsd:attributeInfo> </xs:appinfo> </xs:annotation> </xs:attribute> <xs:attribute name=“subType” type=“xs:string”/> <xs:attribute name=“order” type=“xs:string” use=“required”> xs:annotation xs:appinfo tsd:attributeInfo tsd:physical tsd:native tsd:index tsd:standard/ </tsd:index> </tsd:native> </tsd:physical> </tsd:attributeInfo> </xs:appinfo> </xs:annotation> </xs:attribute> <xs:attribute name=“available” type=“xs:integer” use=“required”> xs:annotation xs:appinfo tsd:attributeInfo tsd:physical tsd:native tsd:index tsd:standard/ </tsd:index> </tsd:native> </tsd:physical> </tsd:attributeInfo> </xs:appinfo> </xs:annotation> </xs:attribute> <xs:attribute name=“tot” type=“xs:integer” use=“required”> xs:annotation xs:appinfo tsd:attributeInfo tsd:physical tsd:native tsd:index tsd:standard/ </tsd:index> </tsd:native> </tsd:physical> </tsd:attributeInfo> </xs:appinfo> </xs:annotation> </xs:attribute> <xs:attribute name=“note” type=“xs:string”/> <xs:attribute name=“rif” type=“xs:string”/> </xs:complexType> </xs:element> <xs:element name=“subIndex”> xs:complexType xs:simpleContent <xs:extension base=“xs:string”> <xs:attribute name=“id” type=“xs:string”> xs:annotation xs:appinfo tsd:attributeInfo tsd:physical tsd:native tsd:index tsd:standard/ tsd:text/ </tsd:index> </tsd:native> </tsd:physical> </tsd:attributeInfo> </xs:appinfo> </xs:annotation> </xs:attribute> <xs:attribute name=“subType” type=“xs:string”/> <xs:attribute name=“available” type=“xs:integer” use=“required”> xs:annotation xs:appinfo tsd:attributeInfo tsd:physical tsd:native tsd:index tsd:standard/ </tsd:index> </tsd:native> </tsd:physical> </tsd:attributeInfo> </xs:appinfo> </xs:annotation> </xs:attribute> <xs:attribute name=“tot” type=“xs:integer” use=“required”> xs:annotation xs:appinfo tsd:attributeInfo tsd:physical tsd:native tsd:index tsd:standard/ </tsd:index> </tsd:native> </tsd:physical> </tsd:attributeInfo> </xs:appinfo> </xs:annotation> </xs:attribute> <xs:attribute name=“note” type=“xs:string”/> <xs:attribute name=“rif” type=“xs:string”/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> </xs:schema> but this query: declare namespace xs=“XML Schema” declare default collation “collation?language=it;strength=secondary” let $q := for $a in input()/Index[@type=“author” and @tot>0] sort by (./@order ascending) return $a takes about 15 seconds (and involves 6736 documents). This is too long! I guess this is not normal and we missed something! Any kind of tuning suggestion is welcome. Thanks in advance, Pam

Finn_the_Dane1 · May 10, 2006, 9:40pm

Hi Pam, I think the collation declaration is the culprit ! The purpose of having indexes is to optimize search and sort. If you then specify an alternate language and strength from when the index was generated, I guess this will force Tamino NOT to use the know indexes. The solution, I think, would be to remove the collation etc from the query and instead set these special properties in the schema. regards Finn

Pamina · May 11, 2006, 2:17pm

Hi Finn! I have removed the collation etc from the query and this query: declare namespace xs=“XML Schema” for $a in input()/Index[@type=“author” and @tot>0] sort by (./@order ascending) return $a takes only four seconds! Please can you explain how to set these special properties in the schema? Thanks in advance! Pam

Finn_the_Dane1 · May 11, 2006, 2:28pm

Hi Pam, These properties are in the “collation” properties in the locical properties of each field if you use the schema editor. BTW I’m almost willing to make a bet that the four seconds responsetime is mainly cause by your IE-browser building up a DOM tree with the 6736 docs So - do you really need the full resultset ? Finn

Pamina · May 11, 2006, 3:19pm

Hi Finn! We need the full ordered resultset in our application because we have to browse it. Perhaps, in future, we’ll change method! Thank you very much for your precious help! Pam

Finn_the_Dane1 · May 11, 2006, 3:36pm

Why not ask for bits of the result ie. 100 at the time ? Tamino has excellent support for this with the filter [position() >100 and position()<200] Finn

Pamina · May 11, 2006, 4:59pm

Hi Finn, what do you mean? Our really xquery is like this: declare namespace xs=“XML Schema” let $q := for $a in input()/Index[@type=“author” and @tot>0] sort by (./@order ascending) return $a let $name := “BONIFACIUS” let $offset := 10 let $termPosition := 0 let $pos := xs:integer(count($q[./@order < $name])) + 1 let $delta1 := min(($pos - $termPosition, count($q))) let $startPos := min (( max((1, $delta1)),count($q)-$offset+1 )) let $delta2 := max(($startPos,$pos + $offset -1 - $termPosition )) let $endPos := max( (min((count($q),$delta2)),$offset) ) return $q[position() >= $startPos and position() <= $endPos] It orders all document Index[@type=“author” and @tot>0] and returns 10 of these; the first is that whose @order is the nearer to “BONIFACIUS”. Have you any suggest? Thanks in advance! Pam

Finn_the_Dane1 · May 11, 2006, 6:01pm

OK - I lost that bet Isn’t there a way of filtering the the “vicinity” of BONIFACIUS (is that a constant???)directly in the query ? I would guess that the post-selection done with the 6700 documents would be relatively timeconsuming ! Another question why do you use the let $q := for $a … instead of just let $q := input()/Index … you don’t appear to be using $a for anything ? Finn

Pamina · May 11, 2006, 7:31pm

Hi Finn, your observation is right! “BONIFACIUS” is not a costant. Excuse me: do you want suggest that there is another way of filtering “vicinity” of “BONIFACIUS”, and of the other following ordered nine elements, directly in the query? Regards, Pam PS: Can I explain another question? I have set collation directly in the schema but the order of the result is changed!!! For example the terms that begins with "

Finn_the_Dane1 · May 11, 2006, 7:40pm

Hi Pam, In danish the letter "

Pamina · May 11, 2006, 8:32pm

Hi Finn, you have right, but the result must displayed exactly 10 record so, even if I search for the last term in Index collection, the result must be the last 10 records (or all the record if lower than 10). For example, if I search for “ZY”, the result is: [color=“darkblue”] [size=“9”]Zillio, Francesco, f. 1682-1692 Zmajevic, Andrija, arciv. di Antivari, m. 1694 Zuccoli, Ludovico, sec. XVI-XVII Zuccolo, Vitale, O.S.B., 1556-1630 <Index available=“0” id=“@7665” note=“Conte di Miranda [IAM].” order="ZUNIGA, JUAN DE, VICER

Finn_the_Dane1 · May 11, 2006, 8:59pm

Hi Pam I still believe the sorting can be fixed in the schema - if you take the time to have a deep look in the documentation. Regarding BONAFACIUS I came to think of an example the XQUERY guru Harald gave for finding the hits before and after if there was no exact match. If you search for “hong” but only “hang” and “hung” is present. Perhaps this kind of approach can be used ? Finn let $searchterm := “hong” let $match := input()//a[.= $searchterm] return if ($match) then $match else let $lower := max (input()//a/text()[. < $searchterm]) let $upper := min (input()//a/text()[. > $searchterm]) return input()//a[. =($lower, $upper)]

Pamina · May 18, 2006, 6:28pm

Hi Finn! I have not been yet able to find in the documentation what you suggest. …I have some problems for the definition of the collation (see my new topic if you want!). Can you help me? Thanks for all, Pam

Topic		Replies	Views
xquery performance Tamino	3	5623	April 2, 2021
about collation Tamino	20	13714	April 2, 2021
property collation Tamino	2	9666	April 2, 2021
Query performance Tamino	2	3211	April 2, 2021
Q: Xquery how to improve performance? Tamino	4	3232	April 2, 2021

xquery performance

Related topics