Hi *,
I have an interesting performance question.
There is a file A with over 1 million records to process, and a file B with ~20 records. I have to process all records of file A, but for each record I have to check whether a matching record exists in file B, and if so, do some additional work. In Natural you would code:
READ FILE_A BY PRIMARY-KEY
  /* do some stuff here
  FN-B. FIND NUMBER FILE_B WITH FOREIGN-KEY = FILE_A.PRIMARY-KEY
  IF *NUMBER(FN-B.) NE 0
    /* do some additional stuff here
  END-IF
END-READ
So far so good. But in my case I suspect this code performs badly, because it issues one million FINDs. So I tried the following:
READ FILE_B BY FOREIGN-KEY
  ADD 1 TO #I4
  EXPAND ARRAY #ARRAY TO (1:#I4)
  #ARRAY(#I4) := FOREIGN-KEY
END-READ
*
READ FILE_A BY PRIMARY-KEY
  /* do some stuff here
  IF PRIMARY-KEY = #ARRAY(*)
    /* do some additional stuff here
  END-IF
END-READ
Here are the timings (where 1486 means 148.6 seconds):
search array 1486 #FOUND: 20
find number 2527 #FOUND: 20
→ Searching the array (of size 20) is faster than FIND NUMBER.
But now I want to go the next step.
Question #1 is: at which array size does the array technique become slower than FIND NUMBER?
→ I wrote a test program to try several sizes. The break-even point seems to be around 500, though this will of course depend on hardware etc.
search array 2503 #FOUND: 500
find number 2513 #FOUND: 500
search array 3547 #FOUND: 1000
find number 2443 #FOUND: 1000
Question #2 is: for huge arrays there must be a better way to search than IF PRIMARY-KEY = #ARRAY(*), which scans the whole array linearly. Which way is best? I already have an idea, but what do you think?
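One idea (sketched here, untested; it assumes additional local variables #LOW, #HIGH, #MID (I4) and #FOUND (L), and relies on #ARRAY being sorted ascending, which it is when filled via READ FILE_B BY FOREIGN-KEY) would be a binary search instead of the linear scan:

  /* binary search in the sorted #ARRAY(1:#I4)
  #LOW := 1
  #HIGH := #I4
  RESET #FOUND
  REPEAT UNTIL #LOW > #HIGH
    COMPUTE #MID = (#LOW + #HIGH) / 2   /* integer division truncates
    DECIDE FOR FIRST CONDITION
      WHEN #ARRAY(#MID) = PRIMARY-KEY
        #FOUND := TRUE
        ESCAPE BOTTOM
      WHEN #ARRAY(#MID) LT PRIMARY-KEY
        #LOW := #MID + 1                /* key must be in upper half
      WHEN NONE
        #HIGH := #MID - 1               /* key must be in lower half
    END-DECIDE
  END-REPEAT
  IF #FOUND
    /* do some additional stuff here
  END-IF

This needs at most log2(n) comparisons per record of file A, so for 1,000 entries roughly 10 comparisons instead of up to 1,000.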
Regards
Matthias