eXtensible Array Syntax problems

I have a problem with two-dimensional eXtensible arrays.

For the purposes of this query, assume I have 2 PDAs that look the same:

1 PDA-1
2 COUNT-1 (I2)
2 ARRAY-1 (1:*)
3 FIELD-1 (A10)
3 COUNT-2 (I2)
3 ARRAY-2 (A10/1:*)

1 PDA-2
2 COUNT-1 (I2)
2 ARRAY-1 (1:*)
3 FIELD-1 (A10)
3 COUNT-2 (I2)
3 ARRAY-2 (A10/1:*)

Moving PDA-1 to PDA-2 uses the following code

ASSIGN PDA-2.COUNT-1 = PDA-1.COUNT-1
EXPAND ARRAY PDA-2.ARRAY-1 TO (1:PDA-2.COUNT-1)
ASSIGN PDA-2.FIELD-1 (*) = PDA-1.FIELD-1 (*)
ASSIGN PDA-2.COUNT-2 (*) = PDA-1.COUNT-2 (*)

Now things get ugly.
What I would like to do is simply …
EXPAND ARRAY PDA-2.ARRAY-2 TO (*,*)
ASSIGN PDA-2.ARRAY-2 (*,*) = PDA-1.ARRAY-2 (*,*)

but this gets a NAT1317.
What I have to do is not nice, but it seems to be the only code that works:

FOR #A1 = 1 TO PDA-2.COUNT-1
FOR #A2 = 1 TO PDA-2.COUNT-2 (#A1)
EXPAND ARRAY PDA-2.ARRAY-2 TO (*,1:#A2)
ASSIGN PDA-2.ARRAY-2 (#A1,#A2) = PDA-1.ARRAY-2 (#A1,#A2)
END-FOR
END-FOR

Can anyone suggest cleaner and probably more efficient code?

Peter

You are executing far too many EXPANDs. You need only 1 for ARRAY-1 and a total of COUNT-1 iterations for ARRAY-2.

EXPAND ARRAY PDA-2.ARRAY-1 TO (1:PDA-1.COUNT-1) 
FOR #I = 1 TO PDA-1.COUNT-1 
  EXPAND ARRAY PDA-2.ARRAY-2 TO (*, *:PDA-1.COUNT-2 (#I)) 
END-FOR

Rather than repeatedly updating COUNT-1 and COUNT-2 in the PDA, you might be able to save some CPU by computing these values only when they are needed, as in

#C1 := *OCC (PDA-1.FIELD-1) 
EXPAND ARRAY PDA-2.ARRAY-1 TO (1:#C1) 
FOR #I = 1 TO #C1 
  #C2 := *OCC (PDA-1.ARRAY-2, 2) 
  EXPAND ARRAY PDA-2.ARRAY-2 TO (*, *:#C2) 
END-FOR

Once you’ve assigned the memory, you can move the PDA values.

EXPAND ARRAY PDA-2.ARRAY-1 TO (1:PDA-1.COUNT-1) 
FOR #I = 1 TO PDA-1.COUNT-1 
  EXPAND ARRAY PDA-2.ARRAY-2 TO (*, *:PDA-1.COUNT-2 (#I)) 
END-FOR 
ASSIGN PDA-2.COUNT-1        = PDA-1.COUNT-1 
ASSIGN PDA-2.FIELD-1 (*)    = PDA-1.FIELD-1 (*) 
ASSIGN PDA-2.COUNT-2 (*)    = PDA-1.COUNT-2 (*) 
ASSIGN PDA-2.ARRAY-2 (*, *) = PDA-1.ARRAY-2 (*, *) 

One thing that you may not realize is that with a two-dimensional X-array, the second dimension does not vary between occurrences. Try this, using the same data definition as above:

EXPAND ARRAY PDA-1.ARRAY-1 TO (1:10)    
FOR #I = 1 TO 10                        
  MOVE #I TO PDA-1.FIELD-1 (#I)         
  EXPAND ARRAY PDA-1.ARRAY-2 TO (*,1:#I)
  MOVE #I TO PDA-1.ARRAY-2 (#I,*)       
END-FOR                                 
*                                       
FOR #I = 1 TO 10                        
  DISPLAY #I                            
   'occ/fld1'    *OCC(PDA-1.FIELD-1)    
   'occ1/array2' *OCC(PDA-1.ARRAY-2,1)  
   'occ2/array2' *OCC(PDA-1.ARRAY-2,2)  
END-FOR

You might expect each occurrence of ARRAY-2 to have one more occurrence than the previous one, but what you get is:

    #I          occ        occ1        occ2      
               fld1       array2      array2     
----------- ----------- ----------- -----------  
                                                 
          1          10          10          10  
          2          10          10          10  
          3          10          10          10  
          4          10          10          10  
          5          10          10          10  
          6          10          10          10  
          7          10          10          10  
          8          10          10          10  
          9          10          10          10  
         10          10          10          10  

So, even Ralph is doing far too many expands. All you need to do is:

MOVE *OCC(PDA-1.FIELD-1)   TO #I           
EXPAND ARRAY PDA-2.ARRAY-1 TO (1:#I)       
MOVE *OCC(PDA-1.ARRAY-2)   TO #I           
EXPAND ARRAY PDA-2.ARRAY-2 TO (*,1:#I)     
*                                          
MOVE BY NAME PDA-1         TO PDA-2

Or, if you don’t like move by name:

MOVE PDA-1.COUNT-1         TO PDA-2.COUNT-1      
MOVE PDA-1.FIELD-1 (*)     TO PDA-2.FIELD-1 (*)  
MOVE PDA-1.COUNT-2 (*)     TO PDA-2.COUNT-2 (*)  
MOVE PDA-1.ARRAY-2 (*,*)   TO PDA-2.ARRAY-2 (*,*) 

Thank you, Jerome!

I expected the memory allocation to be consistent for the second dimension of ARRAY-2 (similar to a Periodic Group), but I also expected *OCC (ARRAY-2) to match Peter’s COUNT-2 value (ARRAY-2 acting as a set of MUs). Jerome’s posting has helped me to see that the latter was completely ridiculous, as the syntax doesn’t allow it.

*OCC (PDA-1.ARRAY-2, 2)       /* valid 
*OCC (PDA-1.ARRAY-2 (#I) ,2)  /* invalid

I’m disappointed that X-arrays act like PEs instead of MUs. For the first dimension, *UBOUND is the memory allocation and *OCC is the number of used positions. For the second and third dimensions, *OCC will always match *UBOUND - not very useful.

To populate the first array, Peter will need to maintain the COUNT-2 field to use it as an index. *OCC should be used for the EXPAND, as it represents a maximum count for the second dimension. Otherwise, he would need to calculate the maximum value of COUNT-2 (*) for the EXPAND. (Or use the (more expensive) FOR loop from my previous post.)

I wanted to compare COUNT-2 to *OCC, so I populated the table with

EXPAND ARRAY PDA-1.ARRAY-1 TO (1:5)
FOR #I = 1 TO 5
  MOVE #I TO PDA-1.FIELD-1 (#I)
  MOVE #I TO PDA-1.COUNT-2 (#I)
  EXPAND ARRAY PDA-1.ARRAY-2 TO (*,1:#I)
  MOVE #I TO PDA-1.ARRAY-2 (#I, 1:#I)
END-FOR

Then displayed the contents

  DISPLAY *OCC (PDA-1.FIELD-1)    (NL=2)
                PDA-1.FIELD-1 (*)
          *UBOUND (PDA-1.ARRAY-2, 2) (NL=2)
          *OCC (PDA-1.ARRAY-2, 1) (NL=2) /* 1st dimension
          *OCC (PDA-1.ARRAY-2, 2) (NL=2) /* 2nd dimension
                PDA-1.COUNT-2 (1)
                PDA-1.ARRAY-2 (1, *)
                PDA-1.COUNT-2 (2)
                PDA-1.ARRAY-2 (2, *)
                PDA-1.COUNT-2 (3)
                PDA-1.ARRAY-2 (3, *)
                PDA-1.COUNT-2 (4)
                PDA-1.ARRAY-2 (4, *)
                PDA-1.COUNT-2 (5)
                PDA-1.ARRAY-2 (5, *)

to get this

OCC  FIELD-1   UBOUND OCC OCC COUNT-2  ARRAY-2   COUNT-2  ARRAY-2   COUNT-2  ARRAY-2   COUNT-2  ARRAY-2   COUNT-2  ARRAY-2
--- ---------- ------ --- --- ------- ---------- ------- ---------- ------- ---------- ------- ---------- ------- ----------

  5 1            5      5   5      1  1               2  2               3  3               4  4               5  5
    2                                                    2                  3                  4                  5
    3                                                                       3                  4                  5
    4                                                                                          4                  5
    5                                                                                                             5

And lastly, Jerome’s code has a minor bug.

MOVE *OCC(PDA-1.ARRAY-2)   TO #I

should be

MOVE *OCC(PDA-1.ARRAY-2, 2)   TO #I  /* 2nd dimension

Thanks Ralph, that works well, and yes I had discovered Jerome’s point about the second dimension being set to the largest defined.

There is some discussion going on about how expensive these arrays are going to be when you have to expand them one occurrence at a time. When I get the time I must investigate some code to (say) expand them 10 entries at a time and then, if not all of the last 10 (or 100 or whatever) are used, reduce the array to the number actually used.

In our experience, the memory allocation is relatively expensive. You should rarely expand the array one occurrence at a time. If you know the size you need from the start, then expand to that size. Otherwise, my mantra is “know your data.” If most of your use cases would fit in 10 or fewer occurrences, then expand the array in increments of 10 when needed. If you know you will always need at least 50, then start with 50. Reducing the array size is also expensive, and the only benefit is the accuracy of *OCC as a count of the number of filled occurrences. Natural does not release the space that was allocated. Efficiency-wise, you are better off using a counter for the number of filled occurrences and forgetting about the REDUCE statement.

Additionally, if the x-array is a local field in a subprogram or external subroutine, then the space allocated for the array will be released at the end of the module so REDUCE is a waste of resources.
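A minimal sketch of the chunked approach described above, assuming a hypothetical one-dimensional X-array #LINES, a chunk size of 10, and a dummy loop standing in for the real data source. None of these names come from the original code. The filled count is kept in #USED, so no REDUCE is needed.

```natural
DEFINE DATA LOCAL
1 #LINES    (A80/1:*)        /* X-array, grown in chunks (hypothetical)
1 #USED     (I4)             /* number of filled occurrences
1 #CHUNK    (I4) CONST <10>  /* growth increment, pick to suit your data
1 #NEW-SIZE (I4)
1 #I        (I4)
END-DEFINE
*
FOR #I = 1 TO 25             /* stands in for reading real data
  IF #USED >= *OCC(#LINES)   /* allocation exhausted, grow by one chunk
    #NEW-SIZE := *OCC(#LINES) + #CHUNK
    EXPAND ARRAY #LINES TO (1:#NEW-SIZE)
  END-IF
  ADD 1 TO #USED
  COMPRESS 'line' #I INTO #LINES (#USED)
END-FOR
*
WRITE 'filled:' #USED 'allocated:' *OCC(#LINES)
END
```

With 25 values and a chunk of 10, this performs three EXPANDs instead of 25. The sketch tests *OCC rather than *UBOUND because *OCC is 0 while nothing has been allocated yet.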

I haven’t profiled the resource trade-off yet, but I normally use the REDUCE in subprograms called via RPC. One RESIZE at the start of the subprogram to set the return array to a relatively high size and one REDUCE at the end to minimize the size of the return area.

But I can definitely confirm Jerome’s experience: expanding the array is a relatively expensive operation - it appears that the new area is allocated, contents copied from the old and the old area presumably released - so the operation gets more expensive as you fill the array and expand it. Consequently, I go for the large array first and use large increments to avoid having to redo the expands as much as possible.

Thank you Jerome, that confirms my impression from initial testing. However, if you have data that can range in size from (say) 3 lines of text to 3000 or possibly more, you have a problem. What I intend to do is define a FIXED size local array of say 50 or 100 entries and then fill that up. When it is full, or at end of data, I only need to expand the PDA array by the number of entries in the local array, copy them across, and start filling the local array again. This will give me the minimum expansions and still provide a true value for *OCC. Local memory is cheap.
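The buffering idea could look roughly like this. It is only a sketch: #TARGET stands in for the PDA X-array, the buffer size of 50 is arbitrary, and the FOR loop is a placeholder for the real input. All names are made up for the illustration.

```natural
DEFINE DATA LOCAL
1 #TARGET   (A80/1:*)    /* stands in for the PDA X-array
1 #BUFFER   (A80/1:50)   /* fixed-size local buffer
1 #BUF-USED (I4)         /* filled entries in the buffer
1 #OLD-SIZE (I4)
1 #NEW-SIZE (I4)
1 #FROM     (I4)
1 #I        (I4)
END-DEFINE
*
FOR #I = 1 TO 120        /* placeholder for the real data source
  ADD 1 TO #BUF-USED
  COMPRESS 'line' #I INTO #BUFFER (#BUF-USED)
  IF #BUF-USED = 50      /* buffer full, flush it
    PERFORM FLUSH-BUFFER
  END-IF
END-FOR
IF #BUF-USED > 0         /* end of data, flush the remainder
  PERFORM FLUSH-BUFFER
END-IF
WRITE 'occurrences:' *OCC(#TARGET)
*
DEFINE SUBROUTINE FLUSH-BUFFER
  /* Expand the target by exactly the number of buffered entries,
  /* then copy them behind the existing ones.
  #OLD-SIZE := *OCC(#TARGET)
  #NEW-SIZE := #OLD-SIZE + #BUF-USED
  #FROM     := #OLD-SIZE + 1
  EXPAND ARRAY #TARGET TO (1:#NEW-SIZE)
  MOVE #BUFFER (1:#BUF-USED) TO #TARGET (#FROM:#NEW-SIZE)
  RESET #BUF-USED
END-SUBROUTINE
END
```

Because the target only ever grows by the number of entries actually buffered, *OCC of the target stays equal to the number of filled occurrences and no REDUCE is required.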

The problem with multidimensional dynamic arrays is that they are squared (that is what this type of array is called). What you want is a jagged array, i.e. an array for which the 2nd and following dimensions can vary in length. This requires an array of pointers that point to other dynamic arrays. This is not (directly) supported by Software AG.

Personally I do hope that Software AG implements both types of multidimensional dynamic arrays (the jagged array by means of a handle/pointer to a dynamic array).

Since you need an array of pointers you can emulate a jagged array by creating an array of object handles. Object handles are actually pointers.
When such an object handle wraps up a dynamic array you get your jagged array. Unluckily (some people might say luckily, since your data is encapsulated) you need methods to disclose your data. This solution saves you space at the cost of an extra level of indirection.

In the above example you could wrap up the group in an ODA. Most members could be disclosed using properties. The dynamic array would just be a member of the class. I do know that this works (yes, also on the mainframe), but I have no clue how efficient this implementation is.
You could also create a wrapped plain dynamic array. This would come close to a dynamic array containing handles to dynamic arrays that can vary in length (it can even be empty occupying only a few bytes).

Or you could do what I have done, which is to define 2 extensible arrays (of differing sizes) and keep pointers in the first array (dimension) to the appropriate start and end positions in the second array (dimension). Of course this requires that the second dimension is completely assembled before moving on to the next first-dimension entry, which is fortunately true in my example. This would also work for 3rd+ dimensions.
The reason for the different sizes is that the first dimension would (depending on the data naturally) normally be much smaller than the second dimension (1:n) and to minimise the expansion of the arrays you would do the ‘buffer flush’ only when the 1st dimension was full (flush both dimensions) or only flush the 2nd dimension when it was full.
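As an illustration of the start/end pointer layout only (not the actual code), here is a small sketch. The names #ROWS, #FIRST, #LAST and #ITEMS are hypothetical, the sizes are fixed up front to keep it short, and the buffering and flushing described above are left out.

```natural
DEFINE DATA LOCAL
1 #ROWS (1:*)             /* first dimension
  2 #FIELD (A10)
  2 #FIRST (I4)           /* first position of this row in #ITEMS
  2 #LAST  (I4)           /* last position of this row in #ITEMS
1 #ITEMS (A10/1:*)        /* all second-dimension values, end to end
1 #NEXT  (I4)             /* next free position in #ITEMS
1 #R     (I4)
1 #P     (I4)
END-DEFINE
*
EXPAND ARRAY #ROWS  TO (1:3)
EXPAND ARRAY #ITEMS TO (1:6)   /* 1 + 2 + 3 items in total
#NEXT := 1
FOR #R = 1 TO 3
  COMPRESS 'row' #R INTO #FIELD (#R)
  #FIRST (#R) := #NEXT
  #LAST  (#R) := #NEXT + #R - 1  /* row #R holds #R items in this demo
  FOR #P = #FIRST (#R) TO #LAST (#R)
    COMPRESS 'item' #P INTO #ITEMS (#P)
  END-FOR
  #NEXT := #LAST (#R) + 1        /* each row is complete before the next starts
END-FOR
*
* Read back the second dimension of row 2 only
#R := 2
FOR #P = #FIRST (#R) TO #LAST (#R)
  WRITE #FIELD (#R) #ITEMS (#P)
END-FOR
END
```

Each row occupies only as many positions in #ITEMS as it needs, which is what the squared X-array cannot do. An empty row can be represented by setting #LAST to #FIRST - 1, so that the FOR loop over its items does not execute.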