X-Application Version: 3.1.2
Tamino Version : 3.1.1
Platform : Win2k
WebContainer : Tomcat 3.3
JDK Version : 1.3.1
Hi,
I would like to get a result set that is ranked according to certain rules (best “fitting” string first). I’m thinking of using n-grams to do this. Where would I start looking? Will I have to add a new type of tag, or does Tamino offer query parameters that help me do what I want?
Hello,
I had a short look at n-grams, but did not have the time to understand the concept behind them. If you have an external ranking tool that can handle a set of XML documents, I see the following options:
(A) I would propose invoking it within the directcommand tag if it is an application-specific command.
I made a similar proposal in the following topic:
Expanding Search and query capability → extending the directcommand tag
As described there, you can add your ranking as a special form of query which, for instance, is not passed to the workspace but uses the basics of the Tamino API for Java to get a stream object of the queried data. This query would be based on XQL expressions (see the Tamino documentation). The resulting stream could be passed to the ranking tool, and the result displayed as a page object.
(B.1) I would integrate it into the TaminoStore class if it is general functionality you want to use for each query. The query is processed within this class (the method name is query). Add the invocation of the tool there to post-process the queried documents.
(B.2) I would implement it as a Tamino Server Extension (SXS). In this way you could extend the query capabilities of Tamino.
Bye,
Christian.
Thanks for your help so far!
You mentioned SXS as a way to change the way Tamino handles queries “at the source” of the data. All I found is:
“The XSLT Server Extension provides Tamino 3.1”
Do you mean this, or is there something else?
Back to the n-gram stuff:
What I want to do is send a query string to the server, e.g. “CBT installation problem”.
The server should now find all the existing answers in the database that “somehow” fit the query string and return them sorted according to some rules. I would like to test several ranking algorithms, one of them would be n-grams:
1st naive try: chop the string to be compared into pieces of n characters (e.g. n=3: CBTinstallationproblem -> CBT, BTI, TIN, INS, NST, …) and find out how many of the fragments are found in the field that’s queried. If all fragments are found, give it 100%; if none is found, give it 0%. Then return every “hit” that is above a certain threshold (like 40%) in the order of the %-value.
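To make the naive scheme concrete, here is a minimal sketch in Java. It is only an illustration under several assumptions that are not part of Tamino or X-Application: the class name NGramRanker and all method names are hypothetical, whitespace is stripped and case is ignored in both the query and the field, each distinct fragment is counted once, and a hit qualifies when its score is at or above the threshold. It is written for a current JDK; on JDK 1.3.1 the generics, the for-each loops and the lambda would have to be rewritten with raw collections and an explicit Comparator.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class, not part of Tamino or X-Application.
final class NGramRanker {

    /** Strips whitespace and upper-cases, so "CBT installation" becomes "CBTINSTALLATION". */
    static String normalize(String text) {
        return text.replaceAll("\\s+", "").toUpperCase(Locale.ROOT);
    }

    /** All distinct fragments of n characters, in order of first appearance. */
    static Set<String> fragments(String text, int n) {
        String s = normalize(text);
        Set<String> result = new LinkedHashSet<>();
        for (int i = 0; i + n <= s.length(); i++) {
            result.add(s.substring(i, i + n));
        }
        return result;
    }

    /** Share of the query's fragments that occur in the field, from 0.0 to 1.0. */
    static double score(String query, String field, int n) {
        Set<String> grams = fragments(query, n);
        if (grams.isEmpty()) {
            return 0.0; // query shorter than n characters: nothing to compare
        }
        String haystack = normalize(field); // assumption: field is normalized like the query
        int found = 0;
        for (String gram : grams) {
            if (haystack.contains(gram)) {
                found++;
            }
        }
        return (double) found / grams.size();
    }

    /** Fields scoring at or above the threshold, best match first. */
    static List<String> rank(String query, List<String> fields, int n, double threshold) {
        List<String> hits = new ArrayList<>();
        for (String field : fields) {
            if (score(query, field, n) >= threshold) {
                hits.add(field);
            }
        }
        // Re-scores during sorting for brevity; cache the scores for large result sets.
        hits.sort(Comparator.comparingDouble((String f) -> score(query, f, n)).reversed());
        return hits;
    }
}
```

A call such as NGramRanker.rank("CBT installation problem", fields, 3, 0.4) would then return the field values that reach 40%, ordered by score. Where the field values come from (a post-processing step over the query result, or a server extension) is left open here.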
I guess the right place to do this is as close to the original data as possible - as you suggested. But - how?
thanks again,
Ralf.
Hello Ralf,
Yes, you are right. When looking at the Tamino documentation, you will have difficulty finding the entry point to the server extensions.
It is part of the topic
“Installation and Getting Started”
→ Advanced Concepts
→ Utilizing Server Extensions
Another source of information about SXS is the SXS forum. Have a look at
Tamino X-Tension
There, the SXS experts will help you if you run into problems with SXS.
Bye,
Christian.