starts-with and ~=

The 2.3.1.1 Tamino documentation claims that the query

code:
/patient/name[surname~='At*']

will return all name nodes whose value starts with “At”. However, I have found in a similar query

code:
/Journal[/Journal/@JournalName~='Ya*']

that I don’t just get JournalName values that start with “Ya”; I also get “Abya Yala News” and “TEB Yatirim”, as if the contains operator (~=) matches any word beginning with “Ya” anywhere in the value, contrary to the documentation. By the way, I have created both text and standard indexes for JournalName, because I’d like to be able to do TRS-like searches on the field.
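To make the difference concrete, here is a minimal Python sketch (not Tamino code) of the behaviour I seem to be seeing. It assumes that ~= compares the pattern against each whitespace-separated word of the value, which is only my guess at what the text index does. The function names and the “Yale Review” entry are made up for illustration; the other two names come from my result set.

```python
import re


def word_wildcard_match(value: str, pattern: str) -> bool:
    """Return True if ANY whitespace-separated word of value matches pattern.

    '*' in the pattern stands for any run of characters; everything else is
    matched literally. This only imitates the behaviour observed with ~=.
    """
    regex = re.compile(
        "".join(".*" if ch == "*" else re.escape(ch) for ch in pattern)
    )
    return any(regex.fullmatch(word) for word in value.split())


def starts_with(value: str, prefix: str) -> bool:
    """Anchored match: the whole value must begin with prefix."""
    return value.startswith(prefix)


# "Yale Review" is a hypothetical entry; the other two are from the thread.
names = ["Yale Review", "Abya Yala News", "TEB Yatirim"]

word_hits = [n for n in names if word_wildcard_match(n, "Ya*")]
prefix_hits = [n for n in names if starts_with(n, "Ya")]

print(word_hits)    # all three names contain a word beginning with "Ya"
print(prefix_hits)  # only the name whose first characters are "Ya"
```

The second list is what I actually want from the query.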

It looks like the starts-with XPath function would solve this problem, but it is not implemented in the version of Tamino I’m running. Does this mean I’m out of luck for the time being? Is there any way to anchor the string in the contains clause, à la regular expressions (e.g. ^Ya)?

thanks

Have you tried the BETWEEN operator? e.g.

yacht[name between 'E','S']

In your case it would be something like

/Journal[/Journal/@JournalName between 'Ya','Yazzzzz']

This should return the documents whose JournalName starts with ‘Ya’.
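The idea of turning a prefix search into a range search can be sketched in Python as below. This is only an illustration under assumptions: it uses Python's code-point string ordering, which may not be the collation the Tamino index uses, and the helper names are hypothetical. Bumping the last character of the prefix gives a tighter upper bound than padding with z's.

```python
def prefix_range(prefix: str) -> tuple[str, str]:
    """Return (low, high) so that low <= s < high exactly when s starts with prefix.

    Assumes a non-empty prefix whose last character is not the highest
    possible code point.
    """
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)


def in_prefix_range(value: str, prefix: str) -> bool:
    """Range test equivalent to value.startswith(prefix) under code-point order."""
    low, high = prefix_range(prefix)
    return low <= value < high


print(prefix_range("Ya"))
for name in ["Yatirim", "Yellow", "Abya Yala News"]:
    print(name, in_prefix_range(name, "Ya"))
```

Note that the sketch uses a half-open range (upper bound excluded). If between includes both bounds, a range of 'Ya','Yb' would also match a value that is exactly “Yb”.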

rpn…I had indeed tried the query:

/Journal[/Journal/@JournalName between 'Ya','Yz']

and I was getting “XQL Request processed, no object returned”.

However, your tip led me to remember the “string” XPath function, so I tried

/Journal[string(/Journal/@JournalName) between 'Ya','Yz']

and I am getting much better results. However, I now have some other problems.

1. the query is slower now.
2. the query is now case sensitive, so “between ‘y’, ‘z’” returns a different result set from “between ‘Y’, ‘Z’”.
3. the count of records being returned is inconsistent. If I specify a result size that is greater than the actual result size, I get an accurate count. But if the result size is lower, I get a wildly inaccurate and inconsistent overcount. I don’t need the count on this particular query; I just find it rather strange that this is happening.

any thoughts on this?!?
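For point 2, the effect is the same as in any plain string comparison where upper and lower case sort apart. A small Python sketch (not Tamino code) shows it; the between helper and its ignore_case flag are hypothetical, and the ordering is Python's code-point order rather than whatever the index uses.

```python
def between(value: str, low: str, high: str, ignore_case: bool = False) -> bool:
    """Inclusive range test on strings, optionally folding case first."""
    if ignore_case:
        value, low, high = value.casefold(), low.casefold(), high.casefold()
    return low <= value <= high


# Case-sensitive: lower-case letters sort after all upper-case letters,
# so a lower-case value falls outside an upper-case range.
print(between("yacht", "Y", "Z"))
print(between("yacht", "y", "z"))

# Folding case on both the value and the bounds makes the two ranges agree.
print(between("yacht", "Y", "Z", ignore_case=True))
```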

1. the query is slower now.

    I can believe this, as you seem to be doing data conversion, which I guess involves the post-processor.

    How is the attribute defined on the schema (Map type, indexes etc)?

2. the query is now case sensitive, so “between ‘y’, ‘z’” returns a different result set from “between ‘Y’, ‘Z’”.

    Again, I can believe this, as the indexing used (STANDARD) is case sensitive.

3. the count of records being returned is inconsistent. If I specify a result size that is greater than the actual result size, I get an accurate count. But if the result size is lower, I get a wildly inaccurate and inconsistent overcount. I don’t need the count on this particular query; I just find it rather strange that this is happening.

    No idea on this one. I’d have to look at the schema and data.

How is the attribute defined on the schema (Map type, indexes etc)?

Map-type: InfoField
Data-type: WCHAR

These are the defaults from the DTD import into the Schema editor.

Try

/Journal[@JournalName between 'Ya','Yz']

Does it work? It works for me with a similar structure.

I created another schema and I was able to get the between clause to work on an attribute with the same structure as the Journal/@JournalName. When I examined the difference between the journal schema and the new one, the only difference was in the indexes…I didn’t create any for the new one. So I removed the indexes from the journal document type, and voil