Finding Out the Numeric Position

Hi,

Can you please help me with this?

I will start with an example. Say the field contains:
Field#1 := ATTN: TOM GROSS 3601 S 48 ST

In this string I have to find the position of the first numeric character, then
move everything from that character to the end into field #p1 and the rest into #p2.

I thought EXAMINE would work for this, but it didn't.
What is the alternative to EXAMINE for finding the numeric in this scenario?

Thanks,
harsh :smiley:

This is a “cute one” and very similar to a problem posted on SAG-L a couple of years ago:

DEFINE DATA LOCAL
1 FIELD#1 (A40)
1 #ARRAY (A30/1:20)
1 #OUT (A50)
END-DEFINE
INCLUDE AASETCR

FIELD#1 := 'ATTN: TOM GROSS 3601 S 48 ST '
SEPARATE FIELD#1 INTO #ARRAY (*)
WITH RETAINED DELIMITERS '0123456789'
COMPRESS #ARRAY (2:20) INTO #OUT LEAVING
NO SPACE
WRITE #OUT
END

Page 1 07-06-11 06:50:16

3601 S48 ST

You could now EXAMINE FIELD#1 for the first character of #OUT GIVING POSITION, subtract 1 from that position to get #POS, and then use SUBSTRING (FIELD#1,1,#POS) to get the first part of the string.

steve

Here is a timing comparison of the technique above with a more “traditional” approach of applying MASK (N) to each position in turn.

Oh yes, ignore the INCLUDE AASETCR in the solution above; just a utility of mine.

DEFINE DATA LOCAL
1 FIELD#1 (A40)
1 REDEFINE FIELD#1
2 #CHAR (A1/1:40)
1 #ARRAY (A30/1:20)
1 #OUT (A50)
1 #CPU-TIME (P9)
1 #CPU-START (P9)
1 #LOOP (P9)
1 #LOOP2 (P9)
END-DEFINE

FIELD#1 := 'ATTN: TOM GROSS 3601 S 48 ST '
*
MOVE *CPU-TIME TO #CPU-START
SETA. SETTIME
FOR #LOOP = 1 TO 100000
SEPARATE FIELD#1 INTO #ARRAY (*)
WITH RETAINED DELIMITERS '0123456789'
COMPRESS #ARRAY (2:20) INTO #OUT LEAVING
NO SPACE
END-FOR
*
COMPUTE #CPU-TIME = *CPU-TIME - #CPU-START
WRITE 'separate:' *TIMD (SETA.) #CPU-TIME


MOVE *CPU-TIME TO #CPU-START
SETB. SETTIME
FOR #LOOP = 1 TO 100000
FOR #LOOP2 = 1 TO 40
IF #CHAR (#LOOP2) = MASK (N)
ESCAPE BOTTOM
END-IF
END-FOR
END-FOR
COMPUTE #CPU-TIME = *CPU-TIME - #CPU-START
WRITE 'for loop:' *TIMD (SETB.) #CPU-TIME


MOVE *CPU-TIME TO #CPU-START
SETC. SETTIME
FOR #LOOP = 1 TO 100000
IGNORE
END-FOR
*
COMPUTE #CPU-TIME = *CPU-TIME - #CPU-START
WRITE 'dummy for loop:' *TIMD (SETC.) #CPU-TIME

END

Page 1 07-06-11 07:15:45

separate: 68 668
for loop: 163 1610
dummy for loop: 5 56

A better than 2 1/2 to 1 ratio. For long names, the ratio would be even higher.

steve

I would write the following code:

define data local
01 field#1 (A30)
01 #i4 (I4)
01 #j4 (I4)
01 #p1 (A30)
01 #p2 (A30)
end-define
Field#1 := "ATTN: TOM GROSS 3601 S 48 ST"
for #i4 = 1 to 30
  if substr(field#1,#i4,1) IS (N1)
    #j4 := #i4 - 1
    #p1 := substr(field#1,1,#j4)
    #j4 := 31 - #i4  /* length from #i4 through position 30, inclusive
    #p2 := substr(field#1,#i4,#j4)
    escape bottom
  end-if
end-for
display field#1 #p1 (AL=23) #p2 (AL=23)
end

But to be honest: Steve’s solution is much more elegant.

Steve,

Thank you very much.
It's working fine now. :smiley:
I made the changes as you suggested:
MOVE SUBSTR(#OUT ,1,1) TO #F1

EXAMINE FIELD#1 FOR #F1 GIVING POSITION #POS
#POS := #POS - 1
WRITE '#POS : ' #POS
MOVE SUBSTR(FIELD#1,1,#POS) TO #P1
MOVE #OUT TO #P2


Output

Field#1 : ATTN: AMANDA BOYER-CONTRACT 1315 W ARLINGTON AVE LOT 113
#OUT : 1315 W ARLINGTON AVE LOT113
#P1 : AMANDA BOYER-CONTRACT

Thanks,
Harsh :smiley: :smiley:

Be aware: there’s a blank missing in the variable #OUT:

LOT113

The original text is:

LOT 113 

On a PC, loop processing is faster if I4 is used for the loop counter instead of P9.
The program from Matthias can also be changed a little for better performance:

FOR #LOOP2 = 1 TO 40
    IF SUBSTR(FIELD#1,#LOOP2,1) IS (N1)
      #OUT2 := SUBSTR(FIELD#1,#LOOP2)
      ADD -1 TO #LOOP2
      #OUT1 := SUBSTR(FIELD#1,1,#LOOP2)
      ESCAPE BOTTOM
    END-IF
  END-FOR

This solution could also work with dynamic variables, not only with fixed-length strings (if the loop end is changed to *LENGTH(FIELD#1)).
No additional variables are needed, and it is readable for non-Natural techies…

And best of all, no blanks will be removed, as Helmut Spichtinger mentioned.
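
A minimal sketch of the dynamic-variable variant mentioned above, under stated assumptions: the names #TEXT, #HEAD, #TAIL, #I and #LEN are hypothetical and not from the original programs, and it assumes a Natural version that accepts *LENGTH as the upper bound of a FOR loop and SUBSTR with the IS (N1) check on a dynamic variable, as the loop above does on a fixed-length field.

```natural
DEFINE DATA LOCAL
1 #TEXT (A) DYNAMIC          /* hypothetical names, for illustration only
1 #HEAD (A) DYNAMIC
1 #TAIL (A) DYNAMIC
1 #I    (I4)
1 #LEN  (I4)
END-DEFINE
*
#TEXT := 'ATTN: TOM GROSS 3601 S 48 ST'
#HEAD := #TEXT                       /* default when no digit is found
FOR #I = 1 TO *LENGTH(#TEXT)
  IF SUBSTR(#TEXT,#I,1) IS (N1)
    #TAIL := SUBSTR(#TEXT,#I)        /* first digit through the end
    #LEN := #I - 1
    IF #LEN > 0
      #HEAD := SUBSTR(#TEXT,1,#LEN)  /* everything before the digit
    ELSE
      RESET #HEAD                    /* string starts with a digit
    END-IF
    ESCAPE BOTTOM
  END-IF
END-FOR
PRINT 'head:' #HEAD
PRINT 'tail:' #TAIL
END
```

The separate #LEN variable avoids changing the loop counter inside the loop, and the #LEN > 0 check keeps SUBSTR from being called with a length of 0.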

Finally, on my PC I see only small performance differences between the solutions:

    310 ATTN: TOM GROSS / 3601 S48 ST

separate: 22 209

    490 ATTN: TOM GROSS / 3601 S 48 ST

for loop: 23 233

    680 ATTN: TOM GROSS / 3601 S 48 ST

for loop Matthias: 25 245

dummy for loop (I4): 0 5
dummy for loop (P9): 1 14

Hi when i tried running according to Weihnachtsb

Hi TCS;

Hi,

It's corrected.
It was my mistake: I didn't add the cntr in the transaction part. It worked fine after adding the cntr to the transaction section.

Thank you all for the help. :smiley: :smiley:

Well, it was a good catch from Helmut Spichtinger that the last part was getting squeezed, which I didn't notice. :slight_smile:

Steve,
How can I use this RETAINED option if it's ALPHABETIC or ALPHANUMERIC?

Thanks,
Harsh

Hi;

I love a “challenge” early in the morning; at my PC, birds singing outside, fresh coffee steaming by my side, and a programming job to do. Is there any better profession?

I realized that my solution had some “silliness”. For example, why SEPARATE the entire source field, just to COMPRESS it later? Not only was it silly, but it led to the lost blank that was discussed above. So I rewrote my code, using the SEPARATE basically just to find the first integer. Then I used an EXAMINE to find the location of that first integer.

This not only got rid of the lost blank, it turns out to be a LOT faster. My guess is that the SEPARATE with the IGNORE is a lot faster than the original SEPARATE.

DEFINE DATA LOCAL
1 FIELD#1 (A40)
1 REDEFINE FIELD#1
2 #CHAR (A1/1:40)
1 #ARRAY (A30/1:20)
1 #OUT (A50)
1 #OUT1 (A50)
1 #OUT2 (A50)
1 #CPU-TIME (P9)
1 #CPU-START (P9)
1 #LOOP (I4)
1 #LOOP2 (I4)
1 #POS (I4)
END-DEFINE
INCLUDE AASETC

FIELD#1 := 'ATTN: TOM GROSS 3601 S 48 ST '
*
MOVE *CPU-TIME TO #CPU-START
SETA. SETTIME
FOR #LOOP = 1 TO 100000
SEPARATE FIELD#1 INTO #ARRAY (1:2) IGNORE
WITH RETAINED DELIMITERS '0123456789'
EXAMINE FIELD#1 FOR #ARRAY (2) GIVING POSITION #POS
MOVE SUBSTRING (FIELD#1,#POS) TO #OUT2

SUBTRACT 1 FROM #POS
MOVE SUBSTRING (FIELD#1,1,#POS) TO #OUT1
END-FOR
*
COMPUTE #CPU-TIME = *CPU-TIME - #CPU-START
WRITE 'separate:' *TIMD (SETA.) #CPU-TIME


MOVE *CPU-TIME TO #CPU-START
SETB. SETTIME
FOR #LOOP = 1 TO 100000
FOR #LOOP2 = 1 TO 40
IF SUBSTR(FIELD#1,#LOOP2,1) IS (N1)
#OUT2 := SUBSTR(FIELD#1,#LOOP2)
ADD -1 TO #LOOP2
#OUT1 := SUBSTR(FIELD#1,1,#LOOP2)
ESCAPE BOTTOM
END-IF
END-FOR
END-FOR
COMPUTE #CPU-TIME = *CPU-TIME - #CPU-START
WRITE 'for loop:' *TIMD (SETB.) #CPU-TIME


MOVE *CPU-TIME TO #CPU-START
SETC. SETTIME
FOR #LOOP = 1 TO 100000
IGNORE
END-FOR
*
COMPUTE #CPU-TIME = *CPU-TIME - #CPU-START
WRITE 'dummy for loop:' *TIMD (SETC.) #CPU-TIME

END

Page 1 07-06-12 07:33:54

separate: 21 206
for loop: 95 939
dummy for loop: 2 20

Weihnachtsb

Here are the results (my PC is a Pentium 4 (HT) with 3GHz, and NAT63) with your new program:

separate: 7 69
for loop: 23 225
dummy for loop: 0 5

If you change the string to:
FIELD#1 := ' 1' /* blank 1

separate: 6 56
for loop: 5 51
dummy for loop: 0 5

If you change the string to:
FIELD#1 := '1' /* just a number

both algorithms fail, because

MOVE SUBSTRING (FIELD#1,1,#POS) TO #OUT1
or
#OUT1 := SUBSTR(FIELD#1,1,#LOOP2)

… the upper bound (#POS or #LOOP2) is 0

If you change the string to:
FIELD#1 := 'A'

your algorithm fails, because
no number is found: #POS is 0
and therefore out of range…
MOVE SUBSTRING (FIELD#1,#POS) TO #OUT2

If you change the string to:
FIELD#1 := '---------------------------------------9'
your algorithm fails, because
#ARRAY (A30/1:20)
has to be changed to (A40/1:2)

… and everything changes if you use dynamic variables and *LENGTH(FIELD#1) as the upper bound for the loop …

Here is the modified one …

DEFINE DATA LOCAL
1 FIELD#1 (A40)
*
1 #ARRAY (A40/1:2)
1 #POS (I4)
*
1 #OUT (A40)
1 #OUT1 (A40)
1 #OUT2 (A40)
1 #CPU-TIME (P9)
1 #CPU-START (P9)
1 #LOOP (I4)
1 #LOOP2 (I4)
END-DEFINE

* FIELD#1 := 'ATTN: TOM GROSS 3601 S 48 ST '
* FIELD#1 := 'A'
FIELD#1 := '9'
* FIELD#1 := '---------------------------------------9'
*
MOVE *CPU-TIME TO #CPU-START
SETA. SETTIME
FOR #LOOP = 1 TO 100000
  RESET INITIAL #OUT1 #OUT2
  #OUT1 := FIELD#1
  SEPARATE FIELD#1 INTO #ARRAY (1:2) IGNORE
    WITH RETAINED DELIMITERS '0123456789'
  EXAMINE FIELD#1 FOR #ARRAY (2) GIVING POSITION #POS
  IF #POS NE 0
    MOVE SUBSTRING (FIELD#1,#POS) TO #OUT2
  END-IF

  SUBTRACT 1 FROM #POS
  IF #POS > 0 /* changed after 1st post 
    MOVE SUBSTRING (FIELD#1,1,#POS) TO #OUT1
  END-IF
END-FOR
*
PRINT FIELD#1
PRINT #OUT1 "/" #OUT2
COMPUTE #CPU-TIME = *CPU-TIME - #CPU-START
WRITE 'separate:' *TIMD (SETA.) #CPU-TIME
************************************
MOVE *CPU-TIME TO #CPU-START
SETB. SETTIME
FOR #LOOP = 1 TO 100000
  RESET INITIAL #OUT1 #OUT2
  #OUT1 := FIELD#1
  FOR #LOOP2 = 1 TO 40
    IF SUBSTR(FIELD#1,#LOOP2,1) IS (N1)
      #OUT2 := SUBSTR(FIELD#1,#LOOP2)
      ADD -1 TO #LOOP2
      IF #LOOP2 > 0
        #OUT1 := SUBSTR(FIELD#1,1,#LOOP2)
      END-IF
      ESCAPE BOTTOM
    END-IF
  END-FOR
END-FOR
PRINT FIELD#1
PRINT #OUT1 "/" #OUT2
COMPUTE #CPU-TIME = *CPU-TIME - #CPU-START
WRITE 'for loop:' *TIMD (SETB.) #CPU-TIME
*************************************
*
END

Here are my mainframe times:

MORE
PAGE # 1 DATE: 07-06-13
PROGRAM: NUMFINDY LIBRARY: XSTRO

SEPARATE: 7 56
FOR LOOP: 44 338
DUMMY FOR LOOP: 0 6

Interesting; the only number really different from Weihnachtsb

Steve,
How can I use this RETAINED option if it's ALPHABETIC or ALPHANUMERIC?

Thanks,
Harsh

Not sure if I understand your problem. Is this program what you mean?

DEFINE DATA LOCAL
1 #STRING (A30) INIT <'abcdefg1234567'>
1 #ARRAY (A20/1:7)
END-DEFINE
*
SEPARATE #STRING INTO #ARRAY (*) WITH
RETAINED DELIMITERS 'cf3'
DISPLAY 10X #ARRAY (*)
END

Page 1 07-06-14 06:45:18

             #ARRAY
      --------------------

      ab
      c
      de
      f
      g12
      3
      4567

steve

p.s. you can also have delimiters expressed in Hex (useful for things like line feeds and carriage returns coming from PCs).
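
A minimal sketch of a hex delimiter, under stated assumptions: the field names are hypothetical, and H'0A' is the line feed in ASCII, so it fits data coming from a PC. On an EBCDIC mainframe the line-feed code is different and would have to be substituted.

```natural
DEFINE DATA LOCAL
1 #TEXT  (A60)               /* hypothetical names, for illustration only
1 #LINE  (A30/1:5)
1 #COUNT (I4)
END-DEFINE
*
* Build a test string with an embedded line feed (H'0A' assumes ASCII)
COMPRESS 'first line' H'0A' 'second line' INTO #TEXT LEAVING NO SPACE
*
* Split at the line feed; the delimiter itself is not retained here
SEPARATE #TEXT INTO #LINE (*) WITH DELIMITERS H'0A'
  GIVING NUMBER #COUNT
WRITE #COUNT / #LINE (1) / #LINE (2)
END
```

Adding RETAINED before DELIMITERS should keep the line feed as its own array occurrence, in the same way the digits were kept in the earlier programs.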

Steve,

I asked that in general.
If I come across a scenario like the one I asked about before, and I need to do an alphabetic or alphanumeric check in that kind of example, can I use this RETAINED option, or something else?

Field #1 :15976 CARRIES LN SOUTH BELOIT IL 610808050
Field #2 :426 VILLAGE WOOD LN UNIT 7 ROCKTON IL 610722984

Scenario 1: I need to delete the last numeric if found: 610808050 in this known case.
Scenario 2: I need to look for alphabetic text and, if found, move it to some other variable or delete it: ROCKTON in this known case, or anything that follows ROCKTON.

Thanks,
Harsh
:smiley:

Let's start simply. As I tried to indicate in the previous example, you can use any characters (strictly alpha, numeric, special, hex, etc.) as delimiters; Natural does not care. Delimiters can be either retained or not; Natural does not care.

Now, on to the rest of your last posting.

What started out as an interesting problem in manipulating an alpha variable has gotten way out of hand. What you are looking for now is information about a general purpose, handy dandy address parser. That in itself is difficult. The fact that we do not know the characteristics of the data makes it impossible.

For example, you want to delete a zipcode if found. You could separate field#1 using blanks as the delimiter, then “look” at the last “chunk” to see if it is either 5 or 9 digits; EXCEPT, what if it is 12345-6789? Or a foreign address with non-integers, or…
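
A rough sketch of that "last chunk" check, and only of that: the field names are hypothetical, it relies on SEPARATE's default delimiters (which include the blank), it assumes the words are separated by single blanks and that there are no more than 20 of them, and it does nothing about the 12345-6789 form or foreign addresses.

```natural
DEFINE DATA LOCAL
1 #ADDR   (A60) INIT <'15976 CARRIES LN SOUTH BELOIT IL 610808050'>
1 #WORD   (A20/1:20)         /* hypothetical names, for illustration only
1 #COUNT  (I4)
1 #RESULT (A60)
END-DEFINE
*
* Split the address into words; #COUNT is the number of words filled
SEPARATE #ADDR INTO #WORD (*) GIVING NUMBER #COUNT
*
IF #COUNT > 0
  /* last word is exactly 5 or exactly 9 digits followed by a blank
  IF #WORD (#COUNT) = MASK (NNNNN' ')
      OR #WORD (#COUNT) = MASK (NNNNNNNNN' ')
    RESET #WORD (#COUNT)
  END-IF
END-IF
*
* Rebuild the address without the removed word
COMPRESS #WORD (*) INTO #RESULT
WRITE #RESULT
END
```

The trailing ' ' in each MASK is what stops a 6, 7 or 8 digit number from passing the 5-digit test.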

Good luck finding the City (e.g. Rockton) unless it is always the field before a State; but is there always a state? what about provinces? if there is always a state, is it always abbreviated, etc.

In short, one would have to know the characteristics of the data before one could even begin to undertake such a project.

Then, you would need to understand the data manipulative statements in Natural, such as EXAMINE, SEPARATE, COMPRESS and IF options like MASK and SCAN.

steve