In the Software AG Adabas basic tutorial, I found the following in the description of what a subdescriptor is for:
As I see it, the number of ISNs in the inverted list for values 11000 - 11999 will be exactly the same as the number for value 11. Hence, there are 2 advantages of using a subdescriptor here:
We can use FIND instead of READ.
In the case of the subdescriptor (11), there will be one entry in the inverted list, whereas in the case of the descriptor (11***) there can be up to 1,000 entries. So reading the inverted list itself will take less time (though the difference may be negligible).
Is there any other advantage of using a subdescriptor? Please suggest.
Is there any other scenario that shows a different use for a subdescriptor?
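To make the comparison concrete, here is a toy Python model of the two inverted lists. It is only a sketch: the sample records, the `build_inverted_list` helper and the in-memory dictionaries are my own assumptions for illustration and have nothing to do with how Adabas stores its indexes or with Natural syntax.

```python
from collections import defaultdict

# Hypothetical sample data: ISN -> 5-digit field value (not from the tutorial).
records = {1: "11000", 2: "11042", 3: "11042", 4: "11999", 5: "12000", 6: "10999"}

def build_inverted_list(records, start=0, length=None):
    """Map each indexed value to the list of ISNs that carry it.

    With the defaults the whole field is indexed (a plain descriptor);
    a start/length pair indexes only part of it (a subdescriptor).
    """
    inverted = defaultdict(list)
    for isn, value in sorted(records.items()):
        end = len(value) if length is None else start + length
        inverted[value[start:end]].append(isn)
    return dict(inverted)

descriptor = build_inverted_list(records)            # full 5-digit value
subdescriptor = build_inverted_list(records, 0, 2)   # first 2 bytes only

# Descriptor: range scan over every distinct value from 11000 to 11999.
range_entries = [v for v in sorted(descriptor) if "11000" <= v <= "11999"]
isns_via_range = sorted(isn for v in range_entries for isn in descriptor[v])

# Subdescriptor: one equality lookup on the value "11".
isns_via_sub = subdescriptor["11"]

print(len(range_entries), "descriptor entries visited, 1 subdescriptor entry")
print("same ISNs:", isns_via_range == isns_via_sub)
```

Both routes return the same ISNs; the only difference in this model is how many index entries have to be visited to collect them.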
Along with FIND and READ we can use HISTOGRAM too, which could be more efficient when we only need the descriptor values themselves (no updates!)
General view on descriptors (super, sub, hyper etc):
Any field can be used within a selection criterion. A field that is used extensively as a search criterion should be defined as a descriptor (key); a descriptor is thus a search key. The selection process is then considerably faster, since Adabas is able to access the descriptor’s values directly from the inverted list without reading any records from Data Storage.
Because the inverted list requires disk space and update overhead, the descriptor option should be used judiciously, particularly if the file is large and the field that is being considered as a descriptor is updated frequently.
One thing you can do with subdescriptors is invert part of a field NOT starting at position 1. We have a subdescriptor called UPC-NUMBER on one of our fields (NAED-UIC-NUM); it inverts positions 2-14 of the NAED-UIC-NUM field.
That way it matters not what the initial digit of the source field is when searching for the UPC number part.
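A rough Python sketch of that idea, assuming a 14-character source field. The sample values and the `upc_number` helper are made up for illustration; only the field names and the positions 2-14 come from the description above.

```python
# Hypothetical 14-character NAED-UIC-NUM values keyed by ISN.
records = {
    1: "07812345678905",
    2: "17812345678905",   # same UPC part, different leading digit
    3: "00012345678912",
}

def upc_number(naed_uic_num):
    """Return positions 2-14 (1-based, inclusive) of the source field."""
    return naed_uic_num[1:14]

# Build the UPC-NUMBER inverted list: indexed value -> list of ISNs.
upc_index = {}
for isn, value in sorted(records.items()):
    upc_index.setdefault(upc_number(value), []).append(isn)

# Searching on the UPC part finds both records, whatever the first digit is.
matches = upc_index.get("7812345678905", [])
print(matches)
```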
As I understand your explanation, it stands as a disadvantage of using a subdescriptor:
Taking this as a reference, I can think of an advantage as well:
Say, a superdescriptor has 5 fields A, B, C, D, E, of which D is null suppressed.
In this case, if A and B are populated and D is NULL, the superdescriptor can never find the record. But a subdescriptor with A,B can find it within the range.
Not quite. A subdescriptor is only ever on one field. More than one and it is a superdescriptor. So you can define a second super on A,B that in your scenario would index your example.
The case Brian cites is not a disadvantage. It is a facet of the indexing to be aware of. You don’t always want null values to be indexed - if you are unlikely to select on them then they don’t take up space in the inverted list (which is an advantage!). This is the beauty of including certain fields in a superdescriptor that may be null: when they’re null, they’re not indexed.
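Here is a small Python sketch of the A,B,C,D,E scenario as described in this thread. It is a toy model under my own assumptions (empty string stands for a null value, and `build_super` and `isns_with_prefix` are hypothetical helpers), not a statement of how Adabas builds superdescriptor values internally.

```python
# Hypothetical records keyed by ISN; "" stands for a null value.
records = {
    1: {"A": "AA", "B": "BB", "C": "CC", "D": "DD", "E": "EE"},
    2: {"A": "AA", "B": "BB", "C": "CC", "D": "", "E": "EE"},  # D is null
}

def build_super(records, fields, null_suppressed=()):
    """Concatenate the given fields into one key per record.

    A record gets no entry when any null-suppressed component is null,
    which is the behaviour described in the posts above.
    """
    index = {}
    for isn, rec in sorted(records.items()):
        if any(rec[f] == "" for f in fields if f in null_suppressed):
            continue
        key = "".join(rec[f] for f in fields)
        index.setdefault(key, []).append(isn)
    return index

def isns_with_prefix(index, prefix):
    """Collect the ISNs of every entry whose key starts with prefix."""
    return sorted(isn for key, isns in index.items()
                  if key.startswith(prefix) for isn in isns)

super_abcde = build_super(records, "ABCDE", null_suppressed={"D"})
super_ab = build_super(records, "AB")   # the second super on A,B

print(isns_with_prefix(super_abcde, "AABB"))   # ISN 2 is missing here
print(isns_with_prefix(super_ab, "AABB"))      # ISN 2 is found here
```

The five-field index stays smaller because ISN 2 is left out, and the second super on A,B is what makes that record reachable.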
Brian’s example alludes to the subtle difference null suppression has on subdescriptors versus superdescriptors. For both super and subdescriptors, a range of bytes is specified to select from out of the source field. For subdescriptors, the entry is treated as null if the source BYTES are null. For superdescriptors, the entry is null if the source FIELD is null.
Thus if you have a superdescriptor that includes the same two blank bytes that Brian’s example has (" XYZ"), it will index the field, but the subdescriptor on the same 2 bytes will not create an index entry.
(A superdescriptor can include any range of bytes from the source fields, which allows such oddities as including bytes beyond the “end” of the field; the presence of the entry indicates that the source field is not null - HISTOGRAM will show those byte(s) in the superdescriptor to be blank for an alpha field.)
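The BYTES versus FIELD rule can be written down as two small Python functions. This is a sketch of the rule exactly as stated in this post, for a single alpha field; `is_null`, `sub_entry` and `super_entry` are hypothetical names, the sample values are made up, and padding with blanks is my assumption for how bytes past the end of the field appear.

```python
def is_null(alpha):
    """An alpha value counts as null here when it is empty or all blanks."""
    return alpha.strip(" ") == ""

def sub_entry(field, start, end):
    """Subdescriptor rule: no entry if the selected BYTES are null."""
    piece = field[start - 1:end]          # positions are 1-based, inclusive
    return None if is_null(piece) else piece

def super_entry(field, start, end):
    """Superdescriptor rule: no entry if the source FIELD is null.

    Bytes past the end of the field are padded with blanks (assumption).
    """
    if is_null(field):
        return None
    return field.ljust(end)[start - 1:end]

value = "  XYZ"                           # two leading blanks, field not null
print(repr(sub_entry(value, 1, 2)))       # bytes 1-2 are blank: no entry
print(repr(super_entry(value, 1, 2)))     # field is not null: blank entry
print(repr(super_entry("XYZ", 4, 5)))     # bytes beyond the end of the field
```

In this model the superdescriptor entry made of blanks carries one piece of information: the source field itself is not null.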