Tokenize Problems Regex

Guest · December 17, 2004, 12:24am

Hi,

I’m having a problem tokenizing a flat file using the webMethods services.

The file contains a number of fields (some may be empty):

field0| field1| field2||field4|||field7

I would expect pub.string:tokenize to give me a list as (when tokenizing by |):
0 field0
1 field1
2 field2
3 null or blank
4 field4
5 null or blank
6 null or blank
7 field7

However, it is skipping the null values to give me:
0 field0
1 field1
2 field2
3 field4
4 field7

I considered using pub.string:replace to replace all “||” with “| |”, but I also need to check for “|||”, “||||”, etc. I am having difficulty using a regex search to solve this problem.

Any help would be greatly appreciated.

Yemi_Bedu · December 17, 2004, 1:22am

Hello,
Your issue starts with the fact that the string:tokenize service you are using is probably based on the Java StringTokenizer class, which does the same annoying thing: it treats consecutive delimiters as one and never returns empty tokens. I have attached a simple tokenize flow service. It is 100% flow, using only WM flows in transformers.
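For anyone who wants to see the behaviour outside of flow, here is a minimal Java sketch (not the attached flow service). It assumes, as suggested above, that pub.string:tokenize behaves like java.util.StringTokenizer; the class and method names are made up for illustration. It compares StringTokenizer with String.split using a limit of -1, which keeps empty fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the names here are illustrative only.
class TokenizeComparison {

    // StringTokenizer collapses consecutive delimiters,
    // so empty fields never show up in the result.
    static List<String> withStringTokenizer(String line, String delim) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, delim);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken().trim());
        }
        return tokens;
    }

    // Split on a literal pipe (escaped, because split takes a regex).
    // The -1 limit keeps empty fields, including trailing ones.
    static List<String> withSplit(String line) {
        List<String> fields = new ArrayList<>();
        for (String field : line.split("\\|", -1)) {
            fields.add(field.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "field0| field1| field2||field4|||field7";
        System.out.println(withStringTokenizer(line, "|")); // 5 tokens
        System.out.println(withSplit(line));                // 8 fields
    }
}
```

If a Java service is an option in your environment, the split approach avoids any padding of the input string.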

Good day.
Yemi Bedu

makeTokens
makeTokens.zip (2.3 k)

aswick · December 17, 2004, 1:33am

Loop it.

Do a pub.string:replace, then do a REPEAT that checks pub.string:indexOf (substring = “||”). Branch on the result and exit the loop if %value% == -1. The $default branch of the loop is another pub.string:replace.

The structure looks like this:

replace (searchString = “||”, replaceString = “| |”)
REPEAT (repeat-on SUCCESS)
__indexOf (substring = “||”)
__BRANCH (switch = “/value”)
____-1: SEQUENCE
______EXIT ‘$loop’
____$default: SEQUENCE
______replace (searchString = “||”, replaceString = “| |”)
tokenize (delim = “|”)
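
The same loop can be sketched in plain Java to check the logic. This is only an illustration of the structure above, assuming pub.string:replace (without regex) acts like a literal String.replace and pub.string:tokenize acts like StringTokenizer; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class mirroring the REPEAT/BRANCH structure.
class PadEmptyFields {

    // Put a space between adjacent pipes until no "||" is left.
    // One pass turns "|||" into "| ||", so the loop runs again.
    static String padEmptyFields(String line) {
        String padded = line.replace("||", "| |"); // literal replace, no regex
        while (padded.indexOf("||") != -1) {       // exit once indexOf returns -1
            padded = padded.replace("||", "| |");
        }
        return padded;
    }

    // Tokenize the padded line and trim away the spaces that were added.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(padEmptyFields(line), "|");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken().trim());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "field0| field1| field2||field4|||field7";
        System.out.println(padEmptyFields(line));
        System.out.println(tokenize(line)); // 8 tokens, empty fields as ""
    }
}
```

Note that this only pads empty fields between two pipes. An empty first or last field (a line that starts or ends with a pipe) would still be dropped.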

Guest · December 17, 2004, 4:20am

Try pub.string:replace with the following inputs:

searchString: “\|” (escaped, because an unescaped | is the regex alternation operator)
replaceString: "| "
useRegex: true

Don’t include the quotes (" ") in replaceString; they are only there to show the trailing space.

It will convert:
field0| field1| field2||field4|||field7
into
field0| field1| field2| | field4| | | field7

Now pub.string:tokenize will give you 8 elements (but you’ll have to trim them to remove the added space).
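
A rough Java equivalent of this single-pass approach, for checking the idea outside of flow. It assumes the search pattern is an escaped pipe (`\|`), that pub.string:replace with useRegex = true behaves like String.replaceAll, and that pub.string:tokenize behaves like StringTokenizer; none of that is confirmed here, and the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class for the regex-based padding.
class RegexPad {

    // Append a space to every pipe in one pass.
    // The pipe is escaped because a bare | means alternation in a regex.
    static String padWithRegex(String line) {
        return line.replaceAll("\\|", "| ");
    }

    // Tokenize the padded line, trimming the added (and any original) spaces.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(padWithRegex(line), "|");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken().trim());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "field0| field1| field2||field4|||field7";
        // Pipes that already had a space after them end up with two,
        // which the trim step absorbs.
        System.out.println(padWithRegex(line));
        System.out.println(tokenize(line)); // 8 tokens, empty fields as ""
    }
}
```

Compared with the loop version, this needs no repeat step because every pipe is padded in a single pass, at the cost of touching pipes that did not need it.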
