File Polling Problem - Encoding Issue

Hello Support Team,

I have set up file polling on Integration Server. Integration Server picks up the XML file, but the XML contains Chinese characters (Taiwanese and Chinese character sets).

Currently the text is displayed like this: 英穩é?”科技股份有é™?å…¬å?.

Is it possible to set the encoding in the file polling settings [“Content Type (optional) text/xml”]?

In the receiving service I have defined the following input:
node

By the way, I always see this failure message in IS:
[ISC.0076.0007W] XMLCoder decode invalid data type: com.wm.lang.xml.Document

Best Regards

Iftikhar Ahmad
Company_20140219124901.xml (163 Bytes)

Your file polling settings look right with text/xml. One question: does your XML start with the following declaration?

<?xml version="1.0" encoding="UTF-8"?>

HTH,
RMG

@Iftikhar,

Please let me know the JVM version.
What is the default encoding of your JVM/OS?
You can validate the result by calling the service pub.xml:xmlStringToXMLNode with different encoding settings.

Thanks,
Rankesh

First, make sure the tool you use to view the file can display Chinese characters. It may well be that the file is properly encoded and you just cannot view it properly (what you have shown here suggests that).
Second, make sure the actual file encoding is consistent with the encoding attribute in the XML declaration. If your files come from an old system, they may be encoded in GB2312 or Big5 rather than UTF-8 or UTF-16, in which case you need to decode them accordingly.
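The consistency check described above can be sketched in plain Java, outside of Integration Server. This is only an illustration: the class and method names (EncodingConsistencyCheck, declaredEncoding, decodesCleanly) and the sample XML are made up for this example, and it uses the JDK's StAX reader rather than any webMethods service.

```java
import java.io.ByteArrayInputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of webMethods.
class EncodingConsistencyCheck {

    // Returns the encoding named in the XML declaration, or null if none is declared.
    public static String declaredEncoding(byte[] xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(xml));
            try {
                return reader.getCharacterEncodingScheme();
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Input is not parseable as XML", e);
        }
    }

    // Returns true only if every byte sequence is legal in the given charset.
    public static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Sample document; \u82f1 is the first character of the company name in this thread.
        byte[] xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><Company>\u82f1</Company>"
                .getBytes(StandardCharsets.UTF_8);
        String declared = declaredEncoding(xml); // non-null here because the sample declares one
        System.out.println("declared encoding: " + declared);
        System.out.println("bytes decode cleanly as declared: "
                + decodesCleanly(xml, Charset.forName(declared)));
    }
}
```

A clean decode does not prove the declaration is right (single-byte charsets accept almost any input), but a failure when decoding as UTF-8 is a strong sign that the file was written in another encoding such as Big5 or GB2312.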

@RMG Yes, the XML contains “<?xml version="1.0" encoding="utf-8"?>”

@Tong Wang: I can see the Taiwanese characters in Notepad++. How can I decode the XML when using file polling?

If you see Chinese characters when Notepad++ is set to use Big5 encoding, that means those characters are not encoded as UTF-8. WM will process them as UTF-8, so they will be shown as something like: 英穩é?”科技股份有é™?å…¬å?.

Make sure all characters in your source file are encoded in the encoding that is declared on the XML tag.

Yes that should do the trick.

I have checked it: the XML file is encoded in UTF-8. I have tried converting it to Big5 and then back to UTF-8. I have even deleted the encoding declaration from the XML.

Is there any way to pass the encoding to the “xmlNodeToDocument” service?

Does this error cause the processing of your document to fail?
[ISC.0076.0007W] XMLCoder decode invalid data type: com.wm.lang.xml.Document

If not, you can just ignore it.

Iftikhar,
In the receiving service, is the “xmlNodeToDocument” step failing?

Did you set up savePipelineToFile/restorePipelineFromFile and step through the pipeline? And yes, as Tong Wang said, you can safely ignore the entry [ISC.0076.0007W] XMLCoder decode invalid data type: com.wm.lang.xml.Document shown in the server log.

HTH,
RMG

@RMG: I did not get any failure in the receiving service. I have created two receiving services for testing purposes only:

receiveService1 (Input node):
pub.flow:getTransportInfo
pub.xml:xmlNodeToDocument

In this case the Chinese characters are shown like this: 英穩é?”科技股份有é™?å…¬å?

receiveService2 (Input nothing)
pub.flow:getTransportInfo
pub.file:getFile (take Input from filePolling/filename and load as bytes and encoding = UTF-8)
pub.xml:xmlStringToXMLNode
pub.xml:xmlNodeToDocument

In this case the Chinese characters are shown correctly. I cannot understand why I need an extra operation like “pub.file:getFile” with the encoding set explicitly.
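The difference between the two receiving services is that the second one names the encoding when the bytes are turned into text. The same effect can be reproduced in plain Java. This is a sketch only: it does not show how pub.file:getFile or the file polling content handler work internally, and the class and method names (ExplicitDecodeDemo, writeTempUtf8, readAs) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of webMethods.
class ExplicitDecodeDemo {

    // Writes the text to a temporary file as UTF-8 bytes and returns its path.
    public static Path writeTempUtf8(String text) {
        try {
            Path file = Files.createTempFile("company", ".xml");
            file.toFile().deleteOnExit();
            Files.write(file, text.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the raw bytes and decodes them with the charset the caller names.
    public static String readAs(Path file, Charset charset) {
        try {
            return new String(Files.readAllBytes(file), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeTempUtf8("<Company>\u82f1</Company>");

        // Explicit charset: the result does not depend on the machine's settings.
        System.out.println("explicit UTF-8: " + readAs(file, StandardCharsets.UTF_8));

        // JVM default charset: correct only if the default happens to be UTF-8.
        Charset jvmDefault = Charset.defaultCharset();
        System.out.println("JVM default (" + jvmDefault + "): " + readAs(file, jvmDefault));
    }
}
```

On a JVM whose default charset is not UTF-8, the second line of output is garbled while the first stays correct, which matches the difference seen between receiveService1 and receiveService2.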

Hi,

We have incorporated the same logic in our project, but with flat files. We also used to receive Chinese characters in our flat files.

Those characters used to go missing when we received them. We did not change any encoding on our end; instead we asked the source system to change the encoding it sends, and that solved our issue.

Anyway, coming to your issue:

I tried with the XML file you provided and wrote a service containing the following steps:

Steps (input is node):

pub.xml:xmlNodeToDocument
pub.xml:documentToXMLString
pub.client:smtp

The last step is there to check whether I am getting the correct characters. I received the email with the Chinese characters intact. Please find the attached email for reference.

EmailCapturedforChineseCharacters.PNG

Hi,

We are using webMethods 8.22, and I have set the Content Type to text/xml in the file polling settings.

I also tried the same method as you and sent the XML to my mail account. See below:

<?xml version="1.0" encoding="utf-8"?> 德律科技股份有�公�

Very strange.

Hi,

I have not set anything in the file polling settings.

Anyway, it is optional.

Regards,
Deepthi

@Iftikhar,

What is the default encoding of the OS where IS is installed?

Thanks,
Rankesh

It is Windows 2008 R2.

Hi Iftikhar,

Please let me know the result of the following Java service:

// read the JVM default character encoding
String defaultCharacterEncoding = System.getProperty("file.encoding");

// pipeline
IDataCursor pipelineCursor = pipeline.getCursor();
IDataUtil.put( pipelineCursor, "defaultCharacterEncoding", defaultCharacterEncoding );
pipelineCursor.destroy();

Thanks,
Rankesh

Hello Rankesh,

It is Cp1252. Wow, I did not expect that. How is that possible?
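A Cp1252 default would be consistent with the garbled output seen earlier, although the thread does not confirm that this is the root cause. The sketch below shows what happens in plain Java when UTF-8 bytes are decoded as windows-1252 (the Java name for Cp1252). The class and method names (MojibakeDemo, misread) are made up for this illustration, and the code assumes the windows-1252 charset is available in the JDK in use.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of webMethods.
class MojibakeDemo {

    // Encodes the text with its real charset, then decodes the bytes with the wrong one.
    public static String misread(String text, Charset actual, Charset assumed) {
        return new String(text.getBytes(actual), assumed);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");

        // \u82f1 is the first character of the company name; its UTF-8 bytes are E8 8B B1.
        String garbled = misread("\u82f1", StandardCharsets.UTF_8, cp1252);
        System.out.println("UTF-8 bytes read as Cp1252: " + garbled);

        // For this character the damage is reversible, because all three bytes map to Cp1252 characters.
        String repaired = misread(garbled, cp1252, StandardCharsets.UTF_8);
        System.out.println("re-encoded and read as UTF-8: " + repaired);
    }
}
```

Each three-byte UTF-8 character turns into three Cp1252 characters, which is the pattern visible in the garbled company name. Bytes that Cp1252 does not define cannot be mapped, which may explain the question marks in that output.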

Thanks

Which version of WM IS are you running?
The guide “webMethods Integration Server: Upgrading from 6.0.1 to 6.1” mentions that the default encoding of the XML content handler changed from the JRE’s default to UTF-8.
I guess the behavior of any version newer than 6.1 should be the same, unless a different content handler is used.

We are running webMethods 8.22