Very often I read in programming guides: FIND … SORTED BY is forbidden!
Also, in the thread "READing a Range by Multi Valued Descriptor Field", Steve Robinson(!) wrote:
What astonishes me is: “(but only for a small number of records)”. This is not true. Whether SORTED BY is appropriate does not depend on the number of records to be sorted but on the selectivity of the descriptor the records are sorted by.
Here is an example (extreme to show the result):
Suppose you have a unique descriptor Order-Num and 60 million records in your file. The highest Order-Num is 60000000. If you do a
FIND file WITH ORDER-NUM = 60000000 SORTED BY ORDER-NUM
you will get the answer in minutes (not milliseconds!). What’s the reason for this?
The resulting set is created in milliseconds.
Then Adabas has to sort the one-element set. It does this by climbing the inverted list and comparing each ISN to the result set. If it finds an ISN that is in the result set, it moves it into the sorted result set, and it continues until the unsorted result set is empty. In the case above, Adabas reads all 60 million descriptor values to compare them, because the only matching ISN belongs to the highest value. That is what takes so much time.
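To make the cost visible, here is a minimal Python sketch of the walk described above. It is only a toy model of the behaviour as described in this post, not Adabas internals: the function name `sort_by_descriptor`, the list-of-tuples layout of the inverted list, and the scaled-down size of 100,000 values are all assumptions made for illustration.

```python
def sort_by_descriptor(inverted_list, result_isns):
    """Toy model: walk the inverted list in ascending value order and
    move every ISN found in the result set into the sorted output,
    stopping as soon as the unsorted set is empty."""
    remaining = set(result_isns)      # the unsorted result set
    sorted_isns = []                  # the sorted result set
    values_visited = 0                # descriptor values we had to read
    for _value, isns in inverted_list:
        if not remaining:
            break                     # nothing left to place, stop early
        values_visited += 1
        for isn in isns:
            if isn in remaining:
                sorted_isns.append(isn)
                remaining.discard(isn)
    return sorted_isns, values_visited


# Hypothetical unique descriptor: value v is held by exactly one ISN (v).
n = 100_000
unique_index = [(v, [v]) for v in range(1, n + 1)]

# One-element result set containing the record with the highest value.
sorted_isns, values_visited = sort_by_descriptor(unique_index, [n])
print(sorted_isns, values_visited)
```

Even though the result set holds a single ISN, the walk has to read every descriptor value before it reaches the match. Scale the 100,000 values up to 60 million and you get the minutes mentioned above.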
The higher the selectivity of a descriptor, i.e. the fewer ISNs there are per descriptor value, the longer the sort may take.
On the other hand, it is very simple for Adabas to sort a huge set by a descriptor with low selectivity (many ISNs per value). Remember that the ISNs for each descriptor value are stored in ISN order, and the ISNs in an unsorted result set are in ISN order, too!
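Why the shared ISN order helps can be sketched in Python as well. Again, this is only an illustration under assumptions, not how Adabas is actually implemented: `intersect_sorted`, `sort_low_selectivity`, and the three-value descriptor with its ISNs are made up for the example. The idea is that two lists in the same order can be matched in one forward pass each.

```python
def intersect_sorted(a, b):
    """Two-pointer merge: return the ISNs present in both ascending lists.
    Each list is read once, front to back."""
    i = j = 0
    common = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return common


def sort_low_selectivity(inverted_list, result_isns):
    """Toy model: for each descriptor value in ascending order, merge its
    ISN list with the (ISN-ordered) result set and append the matches."""
    ordered = []
    for _value, isns in inverted_list:
        ordered.extend(intersect_sorted(isns, result_isns))
    return ordered


# Hypothetical descriptor with only three values and many ISNs per value.
status_index = [
    ("A", [2, 5, 8]),
    ("B", [1, 4, 7]),
    ("C", [3, 6, 9]),
]
result_set = [1, 2, 3, 5, 7, 9]       # unsorted result set, in ISN order

ordered = sort_low_selectivity(status_index, result_set)
print(ordered)  # [2, 5, 1, 7, 3, 9]
```

Only three descriptor values have to be visited here, however large the result set is, and each visit is a simple merge of two lists that are already in ISN order.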
So keep in mind that the efficiency of SORTED BY depends first on the selectivity of the descriptor to be sorted by and second on the number of records in the result set.
Steve, I am expecting your reply 8)