Exponential time for query processing / avoiding Tamino Post

Michael_Pollecker · January 30, 2002, 7:07pm

Hi Tamino fans,

we are concerned about a Tamino query, its processing time, and the interpretation of the ino:explain command:

The query expression:
FpML[trade/swap/swapStream/calculationPeriodAmount/calculation
[notionalSchedule/notionalStepSchedule/currency='USD' and
floatingRateCalculation/floatingRateIndex='USD-LIBOR-BBA' and
dayCountFraction='ACT/360']]

When we test the query in steps of 100,000 instances, the processing time grows linearly for the first 400,000 instances, but from 500,000 to 900,000 instances it increases exponentially. The results:

instances   time [msec]
    20000          1309
    40000          2445
    60000          3579
   100000          6541
   200000         11793
   300000         17906
   500000         43124
   700000         98195
   900000        127128
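
To make the growth easier to judge, the measurements can be normalized to a cost per 1000 instances. The following is only a small Python sketch over the numbers in the table; the helper name `msec_per_1000` is made up for illustration and has nothing to do with Tamino itself.

```python
# Timings copied from the table above: (instances, time in msec).
timings = [
    (20000, 1309), (40000, 2445), (60000, 3579),
    (100000, 6541), (200000, 11793), (300000, 17906),
    (500000, 43124), (700000, 98195), (900000, 127128),
]

def msec_per_1000(instances, msec):
    """Average processing time per 1000 instances for one measurement."""
    return msec / instances * 1000

# One (instances, cost) pair per measurement, rounded for readability.
per_1000 = [(n, round(msec_per_1000(n, t), 1)) for n, t in timings]

for n, cost in per_1000:
    print(f"{n:>7} instances: {cost:6.1f} msec per 1000 instances")
```

With these figures the cost per 1000 instances stays at roughly 59 to 65 msec up to 300,000 instances, and is roughly 86 msec at 500,000 and about 140 msec at 700,000 and 900,000.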

All fields we use within the filter expression are indexed with the 'standard' option, but the ino:explain command always returns ino:postprocessing="TRUE".

Does anyone have an idea why Tamino uses postprocessing, or why the processing time increases exponentially?

We are using a Sun Fire 880 with 4 GB RAM and two processors; the buffer pool size is 1 GB.

Enclosed you will find the schema, the result of ino:explain and a sample instance.

Thanks in advance
Michael

Michael Pollecker

SAG Systemhaus GmbH
Niederlassung Darmstadt
Professional Services

Alsfelder Str. 15-19, D-64289 Darmstadt
Telefon +49 (6151) 92 31 28, Fax +49 (6151) 92 31 11
E-Mail: Michael.Pollecker@softwareag.com
Michael.Pollecker@partner.commerzbank.com
ino_explain.zip (7.6 KB)

system · February 1, 2002, 3:51pm

Standard index means that Tamino remembers in which documents a particular value occurs.

So if your query is something like this

path[node1='xxx' and node2='yyy']

Tamino finds two lists of document IDs, one for the documents where node1 has the given value and one for node2, and returns only the documents that appear in both lists.

This technique should be fast enough (anywhere from O(n) to constant time, depending on the type of index used; unfortunately I don't know which).
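
The idea of intersecting two lists of document IDs can be sketched in a few lines of Python. This is only an illustration of the principle described above, not Tamino's actual implementation; the `index` dictionary and the functions `index_document` and `query_and` are hypothetical names invented for the example.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a document-level ("standard") index:
# (node name, value) -> set of IDs of the documents containing that value.
index = defaultdict(set)

def index_document(doc_id, pairs):
    """Record which (node, value) pairs occur in the given document."""
    for node, value in pairs:
        index[(node, value)].add(doc_id)

def query_and(*conditions):
    """Return the IDs of documents that satisfy every (node, value) condition."""
    id_sets = [index.get(cond, set()) for cond in conditions]
    return set.intersection(*id_sets) if id_sets else set()

index_document(1, [("node1", "xxx"), ("node2", "yyy")])
index_document(2, [("node1", "xxx"), ("node2", "zzz")])
index_document(3, [("node1", "aaa"), ("node2", "yyy")])

# Only document 1 has both node1='xxx' and node2='yyy'.
print(query_and(("node1", "xxx"), ("node2", "yyy")))
```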

Your query is a little more difficult because of the nested conditions. Standard indices cannot be applied directly here, which is why the postprocessor is involved.

The only advice I can give is to restructure your documents so that the queries you run most often take less time.

Alexander

Michael_Pollecker · February 4, 2002, 4:04pm

Hi Alexander,

thanks for your answer! In the meantime we found out why Tamino uses the postprocessor: the cause is a cardinality > 1 of a node we query. In detail:

A query like
/A/B/C[D/E[F='xxx' and G='yyy']]
causes no (!) postprocessing if the cardinality of the nodes D and/or E is zero or one. If the cardinality of these nodes is > 1 the postprocessor is invoked.

Your suggestion to change the schema is not feasible in our case because the schema is standardized.

regards
Michael

system · February 4, 2002, 6:15pm

Yes, that is what I wanted to explain. Standard indices can only answer the question: are there DOCUMENTS that have this value in this node or not? The word “document” is crucial.

Your query is more difficult precisely because of the cardinality > 1 of the nodes you mentioned (and, by the way, of B and C as well). Otherwise the query could have been simplified to

/A/B/C[D/E/F='xxx' and D/E/G='yyy']
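
Why a document-level index is not enough once a node can repeat can be shown with a small Python sketch. It is only a model of the reasoning in this thread, with made-up data and hypothetical function names (`candidates_from_index`, `postprocess`); it does not reproduce Tamino's internals.

```python
# Each document is modeled as a list of E elements, each with an F and a G value.
# Hypothetical data, chosen only to show the effect of cardinality > 1.
docs = {
    1: [{"F": "xxx", "G": "yyy"}],                                 # one E matches both
    2: [{"F": "xxx", "G": "other"}, {"F": "other", "G": "yyy"}],   # split over two E's
    3: [{"F": "other", "G": "yyy"}],                               # only G matches
}

def candidates_from_index(docs, f_value, g_value):
    """Document-level check: both values occur somewhere in the document."""
    has_f = {d for d, es in docs.items() if any(e["F"] == f_value for e in es)}
    has_g = {d for d, es in docs.items() if any(e["G"] == g_value for e in es)}
    return has_f & has_g

def postprocess(docs, candidate_ids, f_value, g_value):
    """Keep only documents in which a single E satisfies both conditions."""
    return {d for d in candidate_ids
            if any(e["F"] == f_value and e["G"] == g_value for e in docs[d])}

candidates = candidates_from_index(docs, "xxx", "yyy")
matches = postprocess(docs, candidates, "xxx", "yyy")
print(sorted(candidates), sorted(matches))
```

Document 2 passes the document-level check although no single E element has both values, so every candidate has to be inspected again. If E could occur at most once per document, the two result sets would always be identical and the second step could be skipped.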

If you are not permitted to change the schema itself, you could rely on the open schema concept and add a small auxiliary node with a standard index. For example, an auxiliaryNode under calculation that contains a combination of all the values you need in the query.

The following query will then be faster:

FpML[trade/swap/swapStream/calculationPeriodAmount/calculation/auxiliaryNode='USD;USD-LIBOR-BBA;ACT/360']

You can add the node when loading documents into Tamino and, if necessary, delete it again when retrieving them.
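
Adding and removing such a node outside the database could look roughly like the following Python sketch, using the standard library's xml.etree.ElementTree. The element names come from the query in this thread; the sample instance is a simplified, namespace-free assumption rather than a real FpML document, and the function names are hypothetical. The sketch takes the first value found under each calculation, so it does not handle repeated step schedules.

```python
import xml.etree.ElementTree as ET

AUX_TAG = "auxiliaryNode"  # node name as suggested in this post

def _text(parent, path):
    """Text of the first element found at path, or an empty string."""
    node = parent.find(path)
    return node.text.strip() if node is not None and node.text else ""

def add_auxiliary_nodes(root):
    """Add an auxiliaryNode to every calculation element (before loading)."""
    for calc in list(root.iter("calculation")):
        values = [
            _text(calc, "notionalSchedule/notionalStepSchedule/currency"),
            _text(calc, "floatingRateCalculation/floatingRateIndex"),
            _text(calc, "dayCountFraction"),
        ]
        aux = ET.SubElement(calc, AUX_TAG)
        aux.text = ";".join(values)

def strip_auxiliary_nodes(root):
    """Remove the auxiliaryNode elements again (after retrieval)."""
    for calc in list(root.iter("calculation")):
        for aux in calc.findall(AUX_TAG):
            calc.remove(aux)

# Simplified sample instance (an assumption, not the enclosed one).
sample = """<FpML><trade><swap><swapStream><calculationPeriodAmount><calculation>
<notionalSchedule><notionalStepSchedule><currency>USD</currency></notionalStepSchedule></notionalSchedule>
<floatingRateCalculation><floatingRateIndex>USD-LIBOR-BBA</floatingRateIndex></floatingRateCalculation>
<dayCountFraction>ACT/360</dayCountFraction>
</calculation></calculationPeriodAmount></swapStream></swap></trade></FpML>"""

root = ET.fromstring(sample)
add_auxiliary_nodes(root)
print(root.find(".//calculation/auxiliaryNode").text)
strip_auxiliary_nodes(root)
```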

Of course, this approach is only defensible if you have a very limited number of queries that must be fast.
