about collation

Hi,
Can you please help me solve this problem?

  • The result of the XQuery

declare default collation "collation?language=it;strength=secondary"
for $a in input()/Index[@type="author" and @tot>0] sort by (./@order ascending)
return $a

  • is different from the result of the XQuery

for $a in input()/Index[@type="author" and @tot>0] sort by (./@order ascending)
return $a

where the collation is defined in the Index schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:tsd="http://namespaces.softwareag.com/tamino/TaminoSchemaDefinition" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:annotation>
    <xs:appinfo>
      <tsd:schemaInfo name="Index">
        <tsd:collection name="Index"/>
        <tsd:doctype name="Index">
          <tsd:logical>
            <tsd:content>closed</tsd:content>
          </tsd:logical>
        </tsd:doctype>
        <tsd:adminInfo>
          <tsd:version>TSD4.2</tsd:version>
          <tsd:created>2005-07-28T09:15:52.912+01:00</tsd:created>
          <tsd:modified>2006-05-11T15:49:53.062+01:00</tsd:modified>
        </tsd:adminInfo>
      </tsd:schemaInfo>
    </xs:appinfo>
  </xs:annotation>
  <xs:element name="Index">
    <xs:annotation>
      <xs:appinfo>
        <tsd:elementInfo>
          <tsd:logical>
            <tsd:collation>
              <tsd:language value="it"/>
              <tsd:strength value="secondary"/>
              <tsd:caseFirst value="off"/>
            </tsd:collation>
          </tsd:logical>
          <tsd:physical>
            <tsd:native>
              <tsd:index>
                <tsd:standard/>
                <tsd:text/>
              </tsd:index>
            </tsd:native>
          </tsd:physical>
        </tsd:elementInfo>
      </xs:appinfo>
    </xs:annotation>
    <xs:complexType mixed="true">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="subIndex"/>
      </xs:choice>
      <xs:attribute name="id" type="xs:string">
        <xs:annotation>
          <xs:appinfo>
            <tsd:attributeInfo>
              <tsd:physical>
                <tsd:native>
                  <tsd:index>
                    <tsd:standard/>
                  </tsd:index>
                </tsd:native>
              </tsd:physical>
            </tsd:attributeInfo>
          </xs:appinfo>
        </xs:annotation>
      </xs:attribute>
      <xs:attribute name="type" type="xs:string" use="required">
        <xs:annotation>
          <xs:appinfo>
            <tsd:attributeInfo>
              <tsd:physical>
                <tsd:native>
                  <tsd:index>
                    <tsd:standard/>
                  </tsd:index>
                </tsd:native>
              </tsd:physical>
            </tsd:attributeInfo>
          </xs:appinfo>
        </xs:annotation>
      </xs:attribute>
      <xs:attribute name="subType" type="xs:string"/>
      <xs:attribute name="order" type="xs:string" use="required">
        <xs:annotation>
          <xs:appinfo>
            <tsd:attributeInfo>
              <tsd:logical>
                <tsd:collation>
                  <tsd:language value="it"/>
                  <tsd:strength value="secondary"/>
                  <tsd:caseFirst value="off"/>
                </tsd:collation>
              </tsd:logical>
              <tsd:physical>
                <tsd:native>
                  <tsd:index>
                    <tsd:standard/>
                    <tsd:text/>
                  </tsd:index>
                </tsd:native>
              </tsd:physical>
            </tsd:attributeInfo>
          </xs:appinfo>
        </xs:annotation>
      </xs:attribute>
      <xs:attribute name="available" type="xs:integer" use="required">
        <xs:annotation>
          <xs:appinfo>
            <tsd:attributeInfo>
              <tsd:physical>
                <tsd:native>
                  <tsd:index>
                    <tsd:standard/>
                  </tsd:index>
                </tsd:native>
              </tsd:physical>
            </tsd:attributeInfo>
          </xs:appinfo>
        </xs:annotation>
      </xs:attribute>
      <xs:attribute name="tot" type="xs:integer" use="required">
        <xs:annotation>
          <xs:appinfo>
            <tsd:attributeInfo>
              <tsd:physical>
                <tsd:native>
                  <tsd:index>
                    <tsd:standard/>
                  </tsd:index>
                </tsd:native>
              </tsd:physical>
            </tsd:attributeInfo>
          </xs:appinfo>
        </xs:annotation>
      </xs:attribute>
      <xs:attribute name="note" type="xs:string"/>
      <xs:attribute name="rif" type="xs:string"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="subIndex">
    <xs:annotation>
      <xs:appinfo>
        <tsd:elementInfo>
          <tsd:logical>
            <tsd:collation>
              <tsd:language value="it"/>
              <tsd:strength value="secondary"/>
              <tsd:caseFirst value="off"/>
            </tsd:collation>
          </tsd:logical>
          <tsd:physical>
            <tsd:native>
              <tsd:index>
                <tsd:standard/>
                <tsd:text/>
              </tsd:index>
            </tsd:native>
          </tsd:physical>
        </tsd:elementInfo>
      </xs:appinfo>
    </xs:annotation>
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="id" type="xs:string">
            <xs:annotation>
              <xs:appinfo>
                <tsd:attributeInfo>
                  <tsd:physical>
                    <tsd:native>
                      <tsd:index>
                        <tsd:standard/>
                        <tsd:text/>
                      </tsd:index>
                    </tsd:native>
                  </tsd:physical>
                </tsd:attributeInfo>
              </xs:appinfo>
            </xs:annotation>
          </xs:attribute>
          <xs:attribute name="subType" type="xs:string"/>
          <xs:attribute name="available" type="xs:integer" use="required">
            <xs:annotation>
              <xs:appinfo>
                <tsd:attributeInfo>
                  <tsd:physical>
                    <tsd:native>
                      <tsd:index>
                        <tsd:standard/>
                      </tsd:index>
                    </tsd:native>
                  </tsd:physical>
                </tsd:attributeInfo>
              </xs:appinfo>
            </xs:annotation>
          </xs:attribute>
          <xs:attribute name="tot" type="xs:integer" use="required">
            <xs:annotation>
              <xs:appinfo>
                <tsd:attributeInfo>
                  <tsd:physical>
                    <tsd:native>
                      <tsd:index>
                        <tsd:standard/>
                      </tsd:index>
                    </tsd:native>
                  </tsd:physical>
                </tsd:attributeInfo>
              </xs:appinfo>
            </xs:annotation>
          </xs:attribute>
          <xs:attribute name="note" type="xs:string"/>
          <xs:attribute name="rif" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>

Why is there that difference?

Every reply is appreciated!
Thanks in advance,

Pam

Hi Pam,
Did you remember to recreate the indexes (or reload the data) after you changed the schema?
Finn

Hi Finn,
I have recreated the indexes:

http://…/tamino/bav/Index?__encoding=utf-8&_admin=ino:RecreateIndex("Index","Index")

but without success!

I also defined the schema with the collation but without indexes, and afterwards redefined it with both the collation and the indexes. Again without success.

Finally, as you suggested, I reloaded the data, but the result is the same: it seems as if the collation were not defined in the schema at all! (I have Tamino Server 4.2.1.)

Thank you for all!

Pam

PS: Just out of curiosity: it is only necessary to define the collation for the @order attribute, isn't it?

Hi Pam,
Odd!
Before answering you, I ran a small test with some documents that have the same Danish special characters in two fields, one with a Danish and one with a French collation.
An XQuery that sorts on one of the fields yields different (and correct!) results than one that sorts on the other field, so it works for me! :wink:
BTW, I'm using v4.4.1, but this should also work in 4.2.1.

Could you give a small example with two docs that sort the wrong way?

Finn

Hi Finn!

For example, if I execute the xquery:

declare default collation "collation?language=it;strength=secondary"
for $a in input()/Index[@type="author" and @tot>0] sort by (./@order ascending)
return $a/text()

and the collation is not defined in the schema, the first and the last results displayed are:

? Asinius Pollio, C.
? Assemani, Giuseppe Luigi, m. 1782
? Guillelmus de Malliaco, O. P. ?, Sec. XIIImed.
'Abd al-Malik I (M

Hi Pam,
From my Danish point of view, both sort orders look wrong!
In the first example "

Hi Finn,
Our db contains manuscript data, and the alphabetical ordering must not take diacritics and accents into account.
We hope that it will soon be available on the Internet! I will let you know when it is published :slight_smile:

The problem with the collation defined in the Tamino schema remains: it seems to be completely ignored (at least in Tamino 4.2.1).

Thanks for everything!
best regards,

Pam

Hi Pam,

You are right! The collation defined in the schema is ignored by the XQuery expression. This means that whoever writes an XQuery has to specify the required collation in the query itself. That explains why the two example queries return different results.

Best Regards,

Thorsten

Hi Thorsten,
Thank you for your authoritative confirmation.

What causes this problem? Has it already been fixed in a Tamino version later than 4.2.1?

thanks in advance,

Pam

Hi Pam,

The collation specified in the schema is ignored on purpose. Let me explain the reason for this by the following example query:

for $a in collection("mycoll")/a
for $b in collection("mycoll")/b
where $a/v1 = $b/v2
return $a

Assume that the elements v1 and v2 have different collations. Which collation should be used for the comparison?

I think the best way to avoid this kind of problem is to follow the approach suggested by the XQuery spec. This means the collation to be used has to be stated in the query.

Regards,

Thorsten
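
The ambiguity Thorsten describes can be illustrated outside Tamino with a minimal Python sketch. The two key functions are simplified stand-ins for two different collations (they are not Tamino or ICU implementations and do not reproduce language=it;strength=secondary), and the sample values for v1 and v2 are hypothetical. The same pair of strings is equal under one collation and different under the other, so a comparison has no well-defined result until the query names the collation to use.

```python
import unicodedata


def loose_key(s: str) -> str:
    """Simplified stand-in for a collation that ignores case and accents."""
    decomposed = unicodedata.normalize("NFD", s)
    # Drop combining marks (accents), then fold case.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


def exact_key(s: str) -> str:
    """Simplified stand-in for a collation that distinguishes every character."""
    return unicodedata.normalize("NFC", s)


v1 = "Città"  # hypothetical content of element v1
v2 = "citta"  # hypothetical content of element v2

equal_under_loose = loose_key(v1) == loose_key(v2)
equal_under_exact = exact_key(v1) == exact_key(v2)
print(equal_under_loose, equal_under_exact)
```

If v1 were declared with the first collation and v2 with the second, the schema alone could not say which of the two answers `$a/v1 = $b/v2` should give.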

Hi Thorsten,
Could you please elaborate on how existing indexes are used for sorting when the collation is specified in the XQuery?

regards Finn

Hi Thorsten,
Could you please explain in which cases the collation specified in the schema is not ignored?

Thanks in advance,

Pam

Hi Pam,

From XQuery's point of view, the collation defined in the schema is only relevant when the query processor decides which indexes can be exploited. The collation defined for a node determines how that node is indexed. Therefore the collation specified in a query has to correspond to the collation in the schema to allow index-based processing.

You can also access the collation information defined in a schema via the tf:getCollation() function. This function is applied to a node and retrieves the collation information that the schema definition associates with this node.

Best Regards,

Thorsten
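
Why the query collation has to match the schema collation before an index can be used may be easier to see in a toy model. The Python sketch below is not how Tamino implements its indexes; the class `CollatedIndex`, the collation names "loose" and "codepoint", and the data are all hypothetical. It only assumes that an index stores keys computed under one collation, so a lookup under any other collation cannot rely on those keys and has to fall back to scanning.

```python
import bisect
import unicodedata


def collation_key(s: str) -> str:
    """Simplified sort key that ignores case and accents (not a real ICU collation)."""
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


class CollatedIndex:
    """Toy sorted index whose keys were built with one named collation."""

    def __init__(self, values, collation_name, key_fn):
        self.collation_name = collation_name
        self.key_fn = key_fn
        self.entries = sorted((key_fn(v), v) for v in values)
        self.keys = [k for k, _ in self.entries]

    def lookup(self, value, query_collation):
        """Return matches via binary search, or None if the index cannot be used."""
        if query_collation != self.collation_name:
            return None  # keys were built under other rules; caller must scan
        k = self.key_fn(value)
        lo = bisect.bisect_left(self.keys, k)
        hi = bisect.bisect_right(self.keys, k)
        return [v for _, v in self.entries[lo:hi]]


# Hypothetical data and collation names, for illustration only.
index = CollatedIndex(["Rossi", "rossi", "Verdi", "Bianchi"], "loose", collation_key)
hit = index.lookup("ROSSI", "loose")       # same collation: index is usable
miss = index.lookup("ROSSI", "codepoint")  # different collation: index is not usable
print(hit, miss)
```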

Hi Thorsten, hi all!
I have observed that the execution time of the query:

declare default collation "collation?language=it;strength=secondary"
for $a in input()/Index[@type="author" and @tot>0] sort by (./@order ascending)
return $a

depends on the number of Index documents in the db (it increases with that number), while the execution time of this other query:

for $a in input()/Index[@type="author" and @tot>0] sort by (./@order ascending)
return $a

is independent of the number of Index documents (it stays the same).

So my question is: does the declaration declare default collation "collation?language=it;strength=secondary" cause all documents (of doctype Index, or of the whole collection) to be sorted, and not only those selected by the filter in the XQuery?

Thanks in advance

Hi Pam,

Your observation is right. The declared default collation affects the ordering of all documents of the "Index" doctype. But I guess the runtime problems result from the fact that the collation also affects the filter predicate

@type="author".

This means the predicate cannot be evaluated via an index.

Best Regards,

Thorsten

Hi Thorsten,
Could you please suggest an alternative?

Best regards and many thanks,

Pam

Hi Pam,

In order to use the given collation just for the sorting I would state the query in the following way:

for $a in input()/Index[@type="author" and @tot>0]
sort by (./@order ascending collation "collation?language=it;strength=secondary")
return $a

As the rewritten query shows, you can add a collation to the sort specifier. The advantage is that the string comparison in the filter predicate of your query is not affected by any default collation declaration. As a result, the query processor should be able to exploit the index defined on the @type attribute.

Best Regards,

Thorsten
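
The separation Thorsten suggests, plain comparison for the filter and a collation only for the sort, can be sketched in Python as follows. This is an illustration under assumptions, not Tamino behaviour: the records are invented, and `sort_key` is a simplified stand-in that ignores case and accents rather than the real language=it;strength=secondary rules.

```python
import unicodedata


def sort_key(s: str) -> str:
    """Simplified collation key: ignore accents and case (not a real ICU collation)."""
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


# Hypothetical records standing in for Index documents.
records = [
    {"type": "author", "tot": 3, "order": "Égidio"},
    {"type": "author", "tot": 1, "order": "dante"},
    {"type": "Author", "tot": 2, "order": "Cicero"},   # different case in @type
    {"type": "author", "tot": 0, "order": "Boezio"},   # filtered out by @tot>0
    {"type": "author", "tot": 5, "order": "Abelardo"},
    {"type": "author", "tot": 4, "order": "Zeno"},
]

# Filter with exact comparison, untouched by any collation.
selected = [r for r in records if r["type"] == "author" and r["tot"] > 0]

# Apply the collation key only when sorting the selected records.
ordered = sorted(selected, key=lambda r: sort_key(r["order"]))
names = [r["order"] for r in ordered]

# For contrast: the same names sorted by raw code points.
codepoint_names = sorted(r["order"] for r in selected)
print(names)
print(codepoint_names)
```

Because the filter uses exact comparison, the record with type "Author" is not selected, and only the ordering of the remaining records depends on the collation key.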

Hi Thorsten,
Thank you for your suggestion, but the TII response is:

<?xml version="1.0" encoding="UTF-8" ?>

for $a in input()/Index[@type="author" and @tot>0]
sort by (./@order ascending collation "collation?language=it;strength=secondary")
return $a

]]>
</xq:query>

<ino:message ino:returnvalue="6352">
<ino:messagetext ino:code="INOXQE6352">XQuery parsing error</ino:messagetext>
<ino:messageline>Syntax Error at line 2, column 29: found QName when expecting any of: ")", ","</ino:messageline>
</ino:message>
</ino:response>

???

Best regards,

Pam

Hi Pam,

I assume that you are using 4.2. Specifying a collation in a sort-by expression is only supported in Tamino 4.4, so I would suggest upgrading your Tamino version.

Best Regards,

Thorsten

Yes Thorsten,
it’s true.

Thank you very much for your help,

Pam