Any opinions/studies regarding advantages of one over the other concerning performance? Which one is faster?
Thanks
I guess it depends on whether you have Natural Optimizer.
In plain Natural, the REPEAT method makes you control manually when to leave the loop, e.g. by adding ADD 1 TO #COUNTER and IF #COUNTER > #MAXCOUNT ESCAPE BOTTOM. Using REPEAT WHILE #CONTINUE = TRUE can also be fast if your logic for controlling the #CONTINUE boolean is efficient.
Hmmm - I had better let the experts reply, since I guess you are asking about the internal efficiency, not the part that depends on program logic…
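The two REPEAT styles described above could look roughly like this. It is an untested sketch: the field names #COUNTER, #MAXCOUNT and #CONTINUE come from the description, while their formats and the limit of 10 are assumptions made only for illustration.

```natural
DEFINE DATA LOCAL
1 #COUNTER  (I4)
1 #MAXCOUNT (I4) INIT <10>       /* assumed limit, illustration only
1 #CONTINUE (L)  INIT <TRUE>
END-DEFINE
*
* Style 1: plain REPEAT, left explicitly with ESCAPE BOTTOM
REPEAT
  ADD 1 TO #COUNTER
  IF #COUNTER > #MAXCOUNT
    ESCAPE BOTTOM
  END-IF
END-REPEAT
*
* Style 2: REPEAT WHILE, driven by a logical field
RESET #COUNTER
REPEAT WHILE #CONTINUE = TRUE
  ADD 1 TO #COUNTER
  IF #COUNTER >= #MAXCOUNT
    RESET #CONTINUE              /* sets the logical field to FALSE
  END-IF
END-REPEAT
END
```

In both styles the loop bookkeeping is ordinary program code, so its cost depends on how the exit logic is written.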
First, here is a program to compare times. The program compares logic that offers the same capabilities (a subscript indicating the iteration number, and an escape based on the iteration number).
DEFINE DATA LOCAL
1 #LOOP (P9)
1 #CPU-START (P9)
1 #CPU-ELAPSED (P9)
END-DEFINE
*
INCLUDE AATITLER
INCLUDE AASETC
*
MOVE *CPU-TIME TO #CPU-START
SETA. SETTIME
FOR #LOOP = 1 TO 800000
IGNORE
END-FOR
COMPUTE #CPU-ELAPSED = *CPU-TIME - #CPU-START
WRITE 3/10 'FOR LOOP TIME==>' *TIMD (SETA.) #CPU-ELAPSED
*
RESET #LOOP
MOVE *CPU-TIME TO #CPU-START
SETB. SETTIME
REPEAT
ADD 1 TO #LOOP
IF #LOOP = 800000
ESCAPE BOTTOM
END-IF
END-REPEAT
COMPUTE #CPU-ELAPSED = *CPU-TIME - #CPU-START
WRITE 3/10 'REPEAT LOOP TIME==>' *TIMD (SETB.) #CPU-ELAPSED
*
END
Here is the mainframe output (without the Natural Optimizer):
MORE
PAGE # 1 DATE: Oct 02, 2008
PROGRAM: FORREP01 LIBRARY: XSTRO
FOR LOOP TIME==> 8 48
REPEAT LOOP TIME==> 5 35
Interestingly enough, here is the PC output:
PAGE # 1 DATE: Oct 02, 2008
PROGRAM: FORREP01 LIBRARY: INSIDE
FOR LOOP TIME==> 45 446
REPEAT LOOP TIME==> 49 480
On the mainframe, the REPEAT is clearly more efficient, by quite a margin (if memory serves, for older versions of Natural the difference was not as pronounced).
On the PC, the FOR is more efficient, but the difference is less than 10 %.
Realize that if the logic does not require an iteration counter, the REPEAT will definitely outperform the FOR, since the incrementing of the FOR variable is then quite superfluous.
steve
Thank you all.
I am on the mainframe and have never used a FOR that did not include a counter validation.
Before my posting, I tried similar code with a limit of 90000000, and it appeared that REPEAT was slightly faster.
Glad to see that we are still correct on that.
Carlos
For an apples-to-apples comparison, you need to include ESCAPE logic in the REPEAT loop, as was done in the benchmark. But in that benchmark the FOR loop was “executing” an IGNORE. Insert an IGNORE into the REPEAT loop, and FOR is marginally faster.
Each of the following will improve execution and elapsed times:
. change #LOOP to I4
. change the TO value to a named constant with the same format as #LOOP (eg #MAX (I4) CONST <800000>)
. compile with Optimizer Compiler (as were all my tests)
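Applied to the benchmark loops, the first two suggestions could look roughly like this. It is an untested sketch: #MAX is the named constant proposed above, and the timing statements are left out for brevity.

```natural
DEFINE DATA LOCAL
1 #LOOP (I4)                     /* I4 instead of P9
1 #MAX  (I4) CONST <800000>      /* named constant, same format as #LOOP
END-DEFINE
*
FOR #LOOP = 1 TO #MAX
  IGNORE
END-FOR
*
RESET #LOOP
REPEAT
  ADD 1 TO #LOOP
  IF #LOOP = #MAX
    ESCAPE BOTTOM
  END-IF
  IGNORE
END-REPEAT
END
```

Giving the counter and the limit the same format is presumably meant to spare a format conversion on every comparison; that reading is an assumption, not something measured here.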
I ran several variations of the benchmark, changing the FROM and STEP values from literals to named constants or variables (changing the values in both the FOR and the REPEAT). In all cases this resulted in REPEAT being faster.
Hi Ralph;
What version of Natural did you run on? (I was on 4.2) Reason I ask is I added the IGNORE to the REPEAT loop and got the following output without the Optimizer:
mainframe i4
PAGE # 1 DATE: Oct 02, 2008
PROGRAM: FORREP02 LIBRARY: XSTRO
FOR LOOP TIME==> 6 48
REPEAT LOOP TIME==> 4 35
REPEAT IGNORE LOOP TIME==> 4 41
mainframe with p9
MORE
PAGE # 1 DATE: Oct 02, 2008
PROGRAM: FORREP03 LIBRARY: XSTRO
FOR LOOP TIME==> 6 48
REPEAT IGNORE LOOP TIME==> 4 41
REPEAT LOOP TIME==> 4 35
Two interesting things to note. First, although the REPEAT-IGNORE is slower than the REPEAT, it is still faster than the FOR loop. Second, I4 and P9 were the same. I expected the I4 would be faster.
Did you compare REPEAT and REPEAT-IGNORE with the optimizer? If so, and if the IGNORE is slower, one wonders what the IGNORE is being compiled as. The optimizer should not have any machine instructions to correspond to an IGNORE.
Also interesting, here is the PC output
PAGE # 1 DATE: Oct 02, 2008
PROGRAM: FORREP02 LIBRARY: INSIDE
FOR LOOP TIME==> 45 442
REPEAT LOOP TIME==> 48 472
REPEAT IGNORE LOOP TIME==> 48 474
The difference w/wo IGNORE is basically minimal.
steve
My tests were with Natural 4.2.3 with NOC. I changed the REPEAT loop to
REPEAT
ADD 1 TO #LOOP
IF #LOOP = 8000000
ESCAPE BOTTOM
END-IF
IGNORE
END-REPEAT
I had to increase the number of iterations to get any sort of reading.
My timings show REPEAT and FOR to be almost identical. When there is a difference, it’s marginal and in FOR’s favor. Moving the IGNORE immediately after the REPEAT sways the timings in FOR’s favor.
I think you folks have proven my point when this performance discussion comes up (another similar one is FIND NUMBER vs HISTOGRAM): it doesn’t matter.
What usually does matter is which is easier to read and maintain. If you do have something that is incrementing a counter, a FOR loop makes sense - it keeps track of the index, incrementing it and exiting. It is clear what it is intended for.
A REPEAT is ideal for more complex exit conditions, for loops that don’t have a straightforward index, and for conditional counting (e.g. if color = “red” add 1 to #ctr).
Using a REPEAT for a “natural” FOR loop just increases the possibility of a) stating the exit condition incorrectly b) omitting or misplacing the increment c) confusing the next maintainer as to why REPEAT was used instead of FOR.
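A small side-by-side sketch of risks (a) and (b). It is untested, and the names #I, #MAX, #TABLE and #SUM are hypothetical, introduced only for illustration.

```natural
DEFINE DATA LOCAL
1 #I     (I4)
1 #MAX   (I4) CONST <10>
1 #TABLE (I4/1:10)
1 #SUM   (I4)
END-DEFINE
*
* FOR: increment and exit test are part of the statement itself
FOR #I = 1 TO #MAX
  ADD #TABLE (#I) TO #SUM
END-FOR
*
* REPEAT: the same loop with the bookkeeping written by hand
RESET #I #SUM
REPEAT
  ADD 1 TO #I                    /* (b) easy to omit or misplace
  IF #I > #MAX                   /* (a) '=' or '>=' here would skip the last element
    ESCAPE BOTTOM
  END-IF
  ADD #TABLE (#I) TO #SUM
END-REPEAT
WRITE 'SUM:' #SUM
END
```

The placement matters in practice: in the array benchmark posted later in this thread, the REPEAT loop tests #LOOP = 800000 before the MOVE, so it appears to perform 799,999 MOVEs against 800,000 in the FOR loop.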
I am already using I4.
Good point, Douglas. I agree about the confusion that might occur, but I am still looking for a combination of performance and clear code.
Just to add to the point made by Douglas;
The times that FOR and REPEAT are likely to be considered as alternatives are when there is a need for a “subscript” (the FOR variable, and a separate variable for the REPEAT).
Here is a minor change to the program I originally posted. I added a MOVE statement that includes a subscripted array.
DEFINE DATA LOCAL
1 #LOOP (P9)
1 #CPU-START (P9)
1 #CPU-ELAPSED (P9)
1 #ARRAY (A1/1:1000000)
1 #A (A1)
END-DEFINE
*
INCLUDE AATITLER
INCLUDE AASETC
*
MOVE *CPU-TIME TO #CPU-START
SETA. SETTIME
FOR #LOOP = 1 TO 800000
MOVE #ARRAY (#LOOP) TO #A
END-FOR
COMPUTE #CPU-ELAPSED = *CPU-TIME - #CPU-START
WRITE 3/10 'FOR LOOP TIME==>' *TIMD (SETA.) #CPU-ELAPSED
*
RESET #LOOP
MOVE *CPU-TIME TO #CPU-START
SETB. SETTIME
REPEAT
ADD 1 TO #LOOP
IF #LOOP = 800000
ESCAPE BOTTOM
END-IF
MOVE #ARRAY (#LOOP) TO #A
END-REPEAT
COMPUTE #CPU-ELAPSED = *CPU-TIME - #CPU-START
WRITE 3/10 'REPEAT LOOP TIME==>' *TIMD (SETB.) #CPU-ELAPSED
*
END
And here is the output:
PAGE # 1 DATE: OCT 06, 2008
PROGRAM: FORREP03 LIBRARY: INSIDE
FOR LOOP TIME==> 69 678
REPEAT LOOP TIME==> 70 694
So, just by adding a single statement, the total CPU time is up 50% over the barebones program I originally posted (see earlier post). Here, the difference is less than 3% with regard to CPU time. My guess (will run it later) is that on the mainframe, the difference will be similarly small, and even smaller when run with a “real” loop with more than one statement.
steve
Quite some single statement you added!
Handling tables and elements with 800,000 occurrences MUST take a lot of CPU as memory allocation is a big part of the statement, 800,000 times.
Edit: Okay, I see that the table is allocated from the beginning, but don’t big arrays cost a great amount of CPU?
Looks like a topic for a future Inside Natural article.
I have never bothered to look at the generated code for an array MOVE. My guess is that it is relatively independent of array size. MANY years ago I worked with a group within IBM that was doing the maintenance for the Fortran IV compiler. An array reference generated a base address, to which an offset (width times subscript) was added. The resultant address was used in the same manner as any other address. There would have been almost no difference between a large and small array reference.
When I get some time, I will play with this in Natural.
However, the point I was really trying to make is quite independent of the efficiency/inefficiency of a single statement. My guess is that the average number of statements within a FOR or REPEAT statement is quite a bit larger than one. Being a lazy typist, I did not want to type five or ten statements. I felt it was important to include a statement that would likely be part of any FOR/REPEAT loop, namely one that used the FOR variable (or a REPEAT counterpart) as a subscript. The MOVE was the simplest to type.
By adding five or ten statements, even statements that do not access an array, the “overhead” of the loop “bookkeeping” becomes small relative to the “real code” of the loop. Suppose the overhead is 10% of the total cost of the loop. Now suppose REPEAT is 5% more efficient than FOR. In terms of the total loop “cost”, the REPEAT is only one half of one percent (.005) faster than the FOR.
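Written out, the arithmetic above is a simple product. The symbols f (the share of total loop cost spent on bookkeeping) and g (REPEAT's relative advantage on that bookkeeping) are introduced here only for illustration.

```latex
\[
\text{overall gain} = f \cdot g = 0.10 \times 0.05 = 0.005 = 0.5\%
\]
```

If the loop body grows so that f drops to 0.01, the same g gives 0.01 × 0.05 = 0.0005, or 0.05%.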
The point Douglas made, which I was merely trying to reinforce, is that such a small difference is basically not worth worrying about. Add in the possibility that someone might code the REPEAT incorrectly (put the increment and test in the wrong place) and the “nod” would have to go to FOR.
steve