Can't query an HTML page with charset=unicode using pub.web:queryDocument

marius1 · February 8, 2003, 3:28am

Hello!

I’m building a simple web flow in B2B 4.0 and I’ve run into the following issue:
I have a flow service with “loadDocument” and “queryDocument”. I can’t query the following HTML:
"

" It seems that the "unicode" charset is causing the problem. The error I get is: Could not obtain the Document View from the Server ... com.wm.util.LocalizedCharConversionException Incorrect character encoding (Missing byte-order mark)

I’ve tried specifying the encoding in the loadDocument parameters (for example UTF-16), but without success.

Any ideas?

Best regards,
Marius

fred.hartman.5916 · February 9, 2003, 3:23am

Two guesses:

1. Even if the HTTP headers are good and the content really is UTF-16, charset=unicode doesn’t say whether the data is big-endian or little-endian, so without a BOM (byte-order mark) the parser doesn’t know how to read the bytes. See "FAQ - UTF-8, UTF-16, UTF-32 & BOM": http://www.unicode.org/faq/utf_bom.html#22
2. Check the HTTP headers using pub.flow:getTransportInfo. The web server may be doing something funky with the encoding. Then look at the characters in the stream: fetch the page with pub.client:http and inspect the bytes to confirm that they really are Unicode (UTF-16).
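
To make the "look at the bytes" check concrete, here is a minimal sketch in Python (not flow or a webMethods API) of how one might inspect the raw response. The function name `guess_byte_order` and the 256-byte sample size are illustrative assumptions. It relies on the fact that mostly-ASCII HTML encoded as UTF-16 has a zero byte in every other position, so it is only a heuristic and will not work for pages that are mostly non-Latin text.

```python
import codecs


def guess_byte_order(data: bytes, sample_size: int = 256) -> str:
    """Guess how UTF-16-like bytes are ordered.

    Returns "bom-le" or "bom-be" when a byte-order mark is present,
    "le" or "be" when the order is inferred from zero-byte positions,
    and "unknown" when nothing can be inferred.
    """
    # A BOM settles the question: FF FE is little-endian, FE FF is big-endian.
    if data.startswith(codecs.BOM_UTF16_LE):
        return "bom-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "bom-be"

    # No BOM: count zero bytes at even and odd offsets in a small sample.
    # ASCII characters in UTF-16LE look like "3C 00", in UTF-16BE like "00 3C".
    sample = data[:sample_size]
    even_zeros = sample[0::2].count(0)
    odd_zeros = sample[1::2].count(0)
    if odd_zeros > even_zeros:
        return "le"
    if even_zeros > odd_zeros:
        return "be"
    return "unknown"
```

If this reports "le" or "be" rather than "bom-le" or "bom-be", the page is being served as UTF-16 without a byte-order mark, which would match the "Missing byte-order mark" error above. A result of "unknown" suggests the content may not be UTF-16 at all, despite the charset label.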

Guest · February 9, 2003, 6:33am

Thanks for the response!

I don’t have any control over how the HTML is built. I need to find a way to load it and parse it using flow services.
I’ve tried it on webMethods version 3.0 and it works fine. I assume that 3.0 simply ignores any unicode specifier.
IE is able to render it.
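
One possible workaround when the page itself cannot be changed is to transcode the raw bytes before handing them to the parser. The following is only a sketch in Python, not a flow service. The function name `to_utf8_html` and the default assumption of little-endian byte order (common for content produced on Windows, where "unicode" often means UTF-16LE) are mine and would need to be verified against the actual bytes. The same steps could in principle be done in a custom service that sits between fetching the page and querying it.

```python
import codecs
import re


def to_utf8_html(raw: bytes, assumed_order: str = "le") -> bytes:
    """Decode UTF-16 HTML (with or without a BOM) and re-encode it as UTF-8.

    assumed_order is used only when no BOM is present; "le" is an
    assumption, not something the charset=unicode label guarantees.
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The generic utf-16 codec reads the BOM and strips it.
        text = raw.decode("utf-16")
    else:
        codec = "utf-16-le" if assumed_order == "le" else "utf-16-be"
        text = raw.decode(codec)

    # Rewrite the in-document label so it matches the new encoding.
    text = re.sub(r"charset\s*=\s*unicode", "charset=utf-8", text,
                  flags=re.IGNORECASE)
    return text.encode("utf-8")
```

The resulting bytes are plain UTF-8 with a matching charset label, so a parser no longer needs a byte-order mark to read them.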
