Encoding problem with extended ASCII

Hi - I’m being sent a flat file that has a special character in it. Apparently it’s the extended ASCII value for 1/4" (¼), ASCII 172. I also found on the 'net that it’s a Windows extension. I would like to tell the vendor to send it to us as 3 bytes (i.e. 1/4), as they have in other cases, but they’re telling me that’s what the user is entering in their system.

It’s bombing out when I do convertToValues saying malformed input.

Is there an encoding setting when I do convertToValues that will handle it? Even if I get it past webMethods, it will have to get into SAP and SQL Server as well. Any assistance would be appreciated! I would prefer to tell the vendor just to change it to 1/4 before they send it, but I’m looking for other possible alternatives if it’s not too onerous.

Thanks
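
For reference, the "malformed input" message is consistent with a strict UTF-8 decoder meeting a lone high byte. The sketch below is plain Java, not a webMethods service, and it assumes the vendor file is in a single-byte DOS code page such as CP437, where byte 172 is ¼. The class name, method names and sample bytes are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecode {
    // Decode with the given charset and fail on invalid bytes
    // instead of silently substituting a replacement character.
    static String decodeStrict(byte[] data, Charset cs) throws CharacterCodingException {
        return cs.newDecoder()
                 .onMalformedInput(CodingErrorAction.REPORT)
                 .onUnmappableCharacter(CodingErrorAction.REPORT)
                 .decode(ByteBuffer.wrap(data))
                 .toString();
    }

    // True if the bytes are a valid sequence in the given charset.
    static boolean isValid(byte[] data, Charset cs) {
        try {
            decodeStrict(data, cs);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical field content: '1', byte 172, inch mark.
        byte[] sample = { '1', (byte) 172, '"' };
        // Byte 172 (0xAC) is a UTF-8 continuation byte with no lead byte.
        System.out.println("valid UTF-8:  " + isValid(sample, StandardCharsets.UTF_8));
        // Assumption: the file is really CP437 (charset name IBM437 in Java).
        System.out.println("valid IBM437: " + isValid(sample, Charset.forName("IBM437")));
    }
}
```

If the source encoding is something else, only the charset name passed to `Charset.forName` would change.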

Will,

One-quarter (¼), one-half (½) and three-quarters (¾) are all valid characters in both UTF-8 and ISO-8859-1 encoding.

I whipped up a tiny example and was able to successfully parse flat files containing these three fraction characters. I did notice that while the fraction characters displayed properly in Developer’s results window for the string created from the bytes read from my test file, they did not display properly in the record created by the convertToValues service.

For example: "comment containing One-quarter (¼) " became "comment containing One-quarter (�) " in the ffValues record produced by convertToValues.

Not sure what’s up with that. If I run the string created from the bytes read with getFile through my charsetEncoder encoder service (see http://www.wmusers.com/wmusers/messages/117/36026.shtml), I can convert it successfully from UTF-8 to ISO-8859-1 and see the characters.

If you remove these fractional characters, does the malformed exception go away?

I guess one workaround might be to turn your ffData into a string, process it with a sequence that calls pub.string:replace to swap out the fraction characters before processing the string with convertToValues.
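
As a rough illustration of that workaround in plain Java (a sketch only, not the pub.string:replace service itself), the replacement could look like the following. It assumes the data has already been decoded correctly into a string; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FractionReplacer {
    // Fraction characters and their plain-ASCII spellings.
    private static final Map<String, String> REPLACEMENTS = new LinkedHashMap<>();
    static {
        REPLACEMENTS.put("\u00BC", "1/4"); // one-quarter
        REPLACEMENTS.put("\u00BD", "1/2"); // one-half
        REPLACEMENTS.put("\u00BE", "3/4"); // three-quarters
    }

    // Replace every fraction character in the input with its ASCII spelling.
    static String replaceFractions(String input) {
        String out = input;
        for (Map.Entry<String, String> e : REPLACEMENTS.entrySet()) {
            out = out.replace(e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical record text containing a one-quarter character.
        System.out.println(replaceFractions("pipe \u00BC\" dia"));
    }
}
```

Each replacement turns one character into three, so with fixed-length fields the record positions would shift; this approach fits delimited files better.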

Mark

Thanks Mark,

What I discovered is that setting the encoding to UTF-8 on convertToValues doesn’t work (it’s the default anyway). I had to set the encoding when I load the file from the server, at the point where I get the stream. The character now shows up properly in the record I get after convertToValues. However, even though webMethods parses it now, it still has to go into SAP, SQL Server and another application, so I need to verify that it gets all the way through. Thanks for the info - at least webMethods was able to handle it OK.

Will,

I’m having trouble with convertToValues recognizing the â character (ascii 226). Whenever convertToValues hits an invoice that includes this character it seems to skip that invoice and the next 8!!

You said in your earlier post that “I had to set the encoding when I load the file from the server when I get the stream”, where did you set this? I’m using pub.file:getFile to read my stream, but that service doesn’t let you set the encoding…

Any help would be appreciated.

Darren

Darren,

To set the encoding, follow these steps:

Once you are done with the getFile service (loadAs=Stream), use pub.io:streamToBytes and then pub.string:bytesToString (set the encoding there) in the next steps, and map the output string to the convertToValues service.

If you are using the wm.b2b.edi:convertToValues service, it also has an encoding input parameter, but I am not sure whether that setting will help with this character.
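
In plain Java terms, the stream-to-bytes-to-string steps amount to roughly the sketch below. The method names only mirror the service names for readability and are not the webMethods API. The ISO-8859-1 encoding and the sample bytes are assumptions; use whatever encoding the file was really written in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

class StreamDecode {
    // Read the whole stream into a byte array without interpreting the bytes.
    static byte[] streamToBytes(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    // Turn the raw bytes into text using an explicitly named encoding.
    static String bytesToString(byte[] data, String encoding) {
        return new String(data, Charset.forName(encoding));
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file content: 'b', byte 226, 't' in a single-byte encoding.
        byte[] raw = { 'b', (byte) 226, 't' };
        InputStream in = new ByteArrayInputStream(raw);
        // Assumption: the file is ISO-8859-1, where byte 226 is the â character.
        String text = bytesToString(streamToBytes(in), "ISO-8859-1");
        System.out.println(text);
    }
}
```

In UTF-8, byte 226 (0xE2) is the lead byte of a three-byte sequence, so a UTF-8 decoder expects two continuation bytes after it. That may be why such a file misbehaves when it is read as UTF-8.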

HTH,

I’ve got a weird situation that is related to this thread:

Importing a flat file into wM 6.1 with this text:
“Lever Faberg‚”

After doing a convertToValues, with no encoding, the output is:
"Lever Faberg " <—A little square after the g that didn’t show up in this post

Just for a test, I ran it through the path of stringToBytes and bytesToString (trying several different encoding values).

ISO8859-1 / ISO8859-2 = “Lever Faberg?”
UTF-8 = “Lever Faberg‚”

To confuse me further, when I imported the file into Access to do some comparisons, Access treated the character as “é”, which is what I believe it should be. The byte is ASCII decimal 130.

Hopefully I’m missing something fairly obvious. Can someone enlighten me?

Thanks!
Cort
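
One reading consistent with what Access shows is that the file was written in a DOS code page such as CP437 or CP850, where byte 130 is é. The plain Java sketch below decodes that single byte under several encodings to show how the same byte could produce each of the results above. The charset names are assumptions about the source data, the class and method names are hypothetical, and the IBM437 and IBM850 charsets need a JVM that ships the extended charsets.

```java
import java.nio.charset.Charset;

class Byte130 {
    // Decode one byte (given as 0..255) under the named charset.
    static String decode(int unsignedByte, String charsetName) {
        return new String(new byte[] { (byte) unsignedByte }, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String[] names = { "IBM437", "IBM850", "windows-1252", "ISO-8859-1", "UTF-8" };
        for (String name : names) {
            String s = decode(130, name);
            // Print the resulting code point so invisible characters are still identifiable.
            System.out.printf("%-12s U+%04X%n", name, s.codePointAt(0));
        }
    }
}
```

Under these assumptions, the DOS code pages give é (U+00E9), windows-1252 gives the low quote ‚ (U+201A), ISO-8859-1 gives a non-printing control character (U+0082, which many tools draw as a square), and a lone byte 130 in UTF-8 is invalid and becomes the replacement character (U+FFFD).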