Regarding FIND NUMBER???

rahul_mf · April 17, 2006, 11:14am

Hi all,

What is the difference in using a “FIND NUMBER” statement and using a
“HISTOGRAM with *NUMBER” ???

Waiting for reply.

Rahul

system · April 17, 2006, 6:52pm

FIND NUMBER will return just the overall number of records a FIND with the same search criteria would return, it doesn’t open a processing loop.

HISTOGRAM will open a processing loop and present a descriptor value along with the number of records containing that value on every iteration.

rahul_mf · April 18, 2006, 10:42am

Thanks Wolfgang.

Wilfried_B · April 18, 2006, 1:12pm

If you want the number of records for a specific key value FIND NUMBER is faster than HISTOGRAM(1). If you want the number of records for each occurrence in a specific key range a HISTOGRAM is the solution. If you only want the number of records within the key range you use FIND NUMBER. If it is a complex criterion, you use FIND NUMBER, too.

Another application of FIND NUMBER is the provision of the record list for later processing: FIND NUMBER … RETAIN AS. You can also use FIND NUMBER … RETAIN AS for union, intersection and difference of retained sets.:

FIND NUMBER view WITH set1 OR set2 RETAIN AS set-union
FIND NUMBER view WITH set1 AND set2 RETAIN AS set-intersection
FIND NUMBER view WITH (set1 OR set2) AND NOT (set1 AND set2) RETAIN AS set-difference

Steve_Robinson · April 18, 2006, 6:38pm

In Report Mode, FIND NUMBER permits a WHERE clause involving other fields from a record. There is no such capability for HISTOGRAM.

steve

rahul_mf · April 19, 2006, 10:28am

But Steve, in HISTOGRAM also we have a WHERE clause ( as per Natural programming guide , P-142).
But I guess there is no use of that WHERE clause because it is mentioned there → “The field specified in the WHERE clause must be the same as in the main clause of the HISTOGRAM statement”

Could you please clarify this???

Wilfried_B · April 19, 2006, 1:12pm

In a HISTOGRAM statement you always have access only to the descriptor field the histogram uses, i.e. your view definition may only contain this descriptor and no other field from the ddm. You can of course redefine that field as you wish and write a where clause on any of this redefinition fields.

I think, your problem is understanding the WHERE clause itself. It has nothing to do with the WITH clause in a FIND statement. The WITH is evaluated by ADABAS, the WHERE is evaluated by NATURAL, i.e. after the record has been passed from ADABAS to NATURAL but before the record is available to the body of the processing loop. The WHERE clause effects *COUNTER but not *NUMBER (in a FIND), as *NUMBER is an ADABAS provided value, *COUNTER counts how often the body of the processing loop has been entered. Similar statements are ACCEPT/REJECT except that these statements are coded within the body of the processing loop and thus don’t effect *COUNTER (the body of the processing loop has been entered!)

Matthias1 · April 19, 2006, 2:04pm

I think there’s at least one more possiblity than using View-Fields.
Example:

define data local
*
01 VIEW-NAME view of DDM-NAME
  02 AE
*
end-define
*
histogram VIEW-NAME for AE
    where *NUMBER > 10
*
  display *NUMBER VIEW-NAME
*
end-histogram
*
end

A Read or Find for that purpose would be slower!

Wilfried_B · April 19, 2006, 10:41pm

To become a quibbler: *number ist not a field, it is a system variable.

Of course you can use any variable in the where clause. But if you want to compare to anything you have read (i.e. “field”), this must be defined in the view. And in a histogram you can only use a view that contains no other “field” than the descriptor.

Now I stop quibbling.

rahul_mf · April 20, 2006, 9:40am

Wilfried, does it mean that the above line of code " histogram VIEW-NAME for AE where *NUMBER > 10 " is not valid???

system · April 20, 2006, 3:16pm

It is valid and it does work

Curtis_Pew · April 24, 2006, 10:56pm

I think some of the confusion here arises because Natural does such a good job of abstracting what Adabas is doing, but to understand the difference between FIND NUMBER and HISTOGRAM you need to understand the Adabas back end.

First, remember that Adabas stores records in its DATA container files, and also builds an index for each descriptor in the ASSO container files. A descriptor may be a field in the record (a normal descriptor), a part of a field (a subdescriptor), a concatenation of parts or all of two or more fields (a superdescriptor), or a value calculated from one or more fields (a hyperdescriptor.) A descriptor index logically consists of a sorted list of values with each value having an associated list of the ISNs of the records matching that value.

To satisfy any FIND (Sx) command, Adabas creates a list of the ISNs of all records that meet the find criteria. In the simplest case this is just those records with a particular value of a single descriptor, but in other cases it may involve a range of values, multiple descriptors, and even fields that have not been indexed as descriptors. Normally your program then retrieves each record in the constructed ISN list in turn (using L1/L4 commands), but in the case of FIND NUMBER Adabas merely returns the size of the ISN list, which Natural places in the *NUMBER system variable.

For the HISTOGRAM (L9) command, Adabas reads through a single descriptor index and returns each value with the count of its associated ISN list, which NATURAL again places in the *NUMBER system variable. (The Adabas command manual calls the L9 command “Read Descriptor Values”, which is probably clearer than “HISTOGRAM”.)

If you understand what Adabas is doing “behind the scenes”, the difference between FIND NUMBER and HISTOGRAm is quite clear.

rahul_mf · April 25, 2006, 11:55am

Thanks Curtis

John_Smith · January 9, 2008, 6:28am

Hi folks,

I would like to think on performance, but according to Curtis said, which conclusion we can take?

In this issue, we are not going to consider Histogram needs a specific View, Find number can be used with AND or OR criteria. Let consider only performance.

1 ) S1 command is the same, doesn’t matter if you are using FIND, FIND NUMBER or FIND (1), Natural handles it for you, so, only ISQ is retuned on System Variable (*number);
2 ) S1 command is not the same, Adabas handles it according to the information that you pass by Control Block. Find Number doesn’t have format buffer, and there is no loop to generate RC command;
3 ) L9 is cheaper than S1, because L9 uses only inverted list; so, no AC and DS are used. S1 uses inverted list, depend on the amount of the records, Work part 2 can be used, AC and DS are used - the first record is returned on S1 command;
4 ) Find number is S1 command, even if Natural handles the command. So, no format buffer (FB), loop (hence, no RC command is issued), and ISQ is placed on *number (system variable);
5 ) To check only if the record exist - descriptor with 1 or no record:
a. Find number should be used instead of Histogram (1);
6 ) To check only if the record exist - descriptor with large records:
a. Histogram (1) should be used instead of Find Number;
7 ) If you use Find Number, only S1 is generated, since there is no loop (no RC)
8 ) If you use Histogram (1), L9 and RC are generated;
9 ) Find number is always the best choose, only if the amount of ISN is enough on NSISN parameter;

After all, which statement is better to use? Find Number or Histogram (1)?

Thanks.

John_Smith · January 12, 2008, 3:08am

Hi folks,

Does anyone have any suggestion, comments or any explanation about the issue above?

Thanks.

Ralph_Zbrog · January 12, 2008, 5:31am

There are several reasons for using FIND NUMBER versus HISTOGRAM, but, in my experience, there are no files whose contents apply solely to one of those criteria. There is always some subset of key values that would be processed better by the “other” statement. But you know only after the fact, during execution, that the other statement would have been better.

To simplify the decision, I base it on the answer to one question: Is your descriptor unique? If so, then avoid HISTOGRAM’s RC call and code

FIND NUMBER EMPLOYEES WITH NAME = 'ZBROG'

otherwise use

HISTOGRAM EMPLOYEES FOR NAME FROM 'ZBROG'
                               TO 'ZBROG'
END-HISTOGRAM

If you’re not yet at Natural 4, change the HISTOGRAM to use a THRU clause.

HISTOGRAM EMPLOYEES FOR NAME FROM 'ZBROG'
                             THRU 'ZBROG'
END-HISTOGRAM

And if it’s not easy to provide the THRU value, use

HISTOGRAM EMPLOYEES FOR NAME FROM 'ZBROG'
  ESCAPE BOTTOM
END-HISTOGRAM

John_Smith · January 12, 2008, 5:00pm

Hi Ralph,

I understand that you agreed with me about the item 5.

Thanks, does anyone want to add comments, suggestion about the other itens?

Douglas_Kelly · January 13, 2008, 12:44am

point 6 is irrelevant to the discussion: the size of the record will not impact the performance of either command.

If there was a blanket answer “use command x rather than command y” for all cases, Adabas wouldn’t need to support “command y” or Natural would be modified to generate “command x” in all cases where it now uses “command y”.

As Ralph alludes to - even if you run a test against a set of data and programs, the performance would vary for the selection criteria or data. A test against today’s scenario may yield a different performance result when run against tomorrow’s scenario.

For scenarios other than the simple existence check (counting, complex criteria, multiple-millions of records, etc, etc) you would have to define what you are trying to achieve more specifically. My general answer would be “it depends” - it depends on what your environment - data, processing, application, etc - and what the “question” is.

For a simple existence check, I generally use the FIND NUMBER as a first choice:

it is easier to code (as a developer, a significant consideration)
it allows me to add complex criteria (AND, OR, etc) readily
the intention of it is clear to the person reading the program

I will use a HISTOGRAM in an existence check to

check a range of values such as might be found with a reverse date index (find most recent record for this key starting from datevalue)
count a range of values
check values involving PE’s (that is descriptors on a PE element or superdescriptors with a PE element component)
if the number of values to be counted is likely to be large that it would impinge on NSISN. This is seldom an issue for a simple existence check.

Performance for both commands is nearly always “good enough”. I will almost always get more bang for my performance and tuning efforts looking elsewhere.

Matthias1 · January 14, 2008, 6:31pm

I think point 6 is relevant. John Smith wrote about “large records”. I would say: “large number of records”.

For example imaging the question: Is there any descriptor-value starting with “AB”? I would use HISTOGRAM here…

define data local
01 #search (A15) init <'AB'>
01 #low    (A15) init full length <H>
01 #high   (A15) init full length <H>
*
01 hist-id view of example-file
  02 id-field   (A15)
*
end-define
*
COMPRESS #search #low  into #low  LEAVING NO
COMPRESS #search #high into #high LEAVING NO
HIST-1.
HISTOGRAM (1) hist-id for id-field FROM #low THRU #high
END-HISTOGRAM
IF *COUNTER(HIST-1.) = 0
  write "there is no value starting with" #search
else
  write "there is a value starting with" #search
END-IF

Topic		Replies	Views
Find Number vs Histogram Adabas-Natural , Natural , Natural-Code-Samples , Inside-Natural-and-Performance-Hints	2	3505	April 2, 2021
which one is efficient... Find NUMBER or FIND (1) Adabas-Natural , Natural , Natural-on-Mainframes	9	2246	April 2, 2021
Read and Histogram Adabas-Natural , Natural , Natural-on-Mainframes	8	6529	April 2, 2021
Performance READ + FIND NUMBER vs. READ + IF ARRAY(*) Adabas-Natural , Adabas , Natural	44	2742	December 2, 2022
Counter & Number Adabas-Natural , Natural , Natural-on-Mainframes	9	2859	April 2, 2021

Regarding FIND NUMBER???

Related topics