XML encoding to iso88591

All,

I want to encode the XML I send as ISO-8859-1. This means that the first line of the XML must be:
<?xml version="1.0" encoding="iso-8859-1"?>

I can put this line in my XML, but if I set the input parameter “encode” to true, the XML is HTML-encoded instead. How can I encode it as ISO-8859-1?

Difference between both encodings:
' becomes &apos; in iso-8859-1
' becomes &#39; in HTML

Any help would be welcome.

Claire

Claire,

By the way, which service are you using for the encoding, and where are you setting the input parameter? Please describe your flow so that we can respond accurately.

Thanks,

Hi,

I use IS 6.0.1 and I use pub.xml:documentToXMLString to create the xmldata.

As the first element of the document, I’ve set @encoding and hardcoded it to iso-8859-1.

Hope this is clear.

I believe pub.xml:documentToXMLString provides no option to set the encoding type (UTF-8 or ISO-8859-1); the encode parameter is only a true/false select list. When you set it to true, it converts the special characters (&, >, <) to (&amp;, &gt;, &lt;) and processes the document. So don't add the encoding="iso-8859-1" line to the XML.

You can create a String attribute called @encoding in your document namespace and set addHeader in documentToXMLString to true.

That way you can add whatever encoding you need, either UTF-8 or ISO-8859-1.

This works fine for me!

regards,

Jefferson Pereira
jeffersoncp@terra.com.br

Jefferson,

I tested this and it is not working.
I created @encoding=ISO-8859-1 in nsDecls and set addHeader=true and also encode=true.
The output is still showing &#39; for ' (quotation mark).

If I set encode=false, the quotation mark appears as it is in the data.

Where do we have to set this encoding attribute: in nsDecls or somewhere else?

Thanks,

RMG,

Did you create the @encoding in the document namespace properly?

You must add it at the same level as the root element of your document. You can set @encoding in a map step of your service, before converting the document to an XML string, just as you would set any other element of your document namespace.

Then run the documentToXMLString service with encode set to true and nsDecls not set.

Att,

Jefferson

I followed exactly what you said, but the documentToXMLString output xmldata is still showing (Quotation) and the XML header has encoding=“ISO0-8859-1”

<?xml>
<employee>
<name>Car’s</name>
</employee>

Sorry, there was a typo in the encoding=“ISO-8859-1” in the post above.

Ok,

Define your document namespace, called EmployeeXML. A document namespace is the document definition of an XML document.

+-@encoding
+-Employee -- this is the root element
|--+-Name

When you add the document reference to your flow, reference EmployeeXML. Its root element is Employee; at the same level as the root you must define your @encoding element and set its value to ISO-8859-1.

After setting a value for Name, invoke documentToXMLString and map EmployeeXML to document in this service. Also set addHeader to true. This last setting generates the header:
<?xml version="1.0" encoding="ISO-8859-1"?>

I'm sorry that I don't have version 6 installed on my machine, so I can't send you a sample.

Regards,

Jefferson

I agree with RMG, I can correctly generate the header by defining attributes in the top level of my document and setting “generate header” to “true”.

However, setting encode to true results in HTML encoding, not ISO-8859-1 encoding. It appears to me that the pub.xml:documentToXMLString service does not support ISO-8859-1 encoding out of the box.

I also did not see any settings in server.cnf that would lead me to believe that you could change this on a server-wide basis.

If there is a relatively small number of differences (that you care about, anyway) between HTML encoding and ISO-8859-1 encoding, you could use the pub.string:replace built-in service to convert from one to the other, or create a Java service that leverages existing JDK or custom classes to do this.

I thought I might be able to create a java service that used something like the following to help, but it didn’t work for me.

I’m obviously missing something.
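
For the replace approach suggested above, a minimal Java sketch might look like the following. It assumes the HTML-encoded output uses the numeric reference &#39; for the apostrophe; the class and method names are hypothetical and not part of any webMethods API. In a flow, the same substitution could presumably be done with pub.string:replace.

```java
// Hypothetical helper, not a webMethods class.
class EntityFixup {
    // Rewrites the numeric apostrophe reference (&#39;) as the named
    // XML entity (&apos;). Assumption: the encoded string uses &#39;.
    static String numericAposToNamed(String xml) {
        return xml.replace("&#39;", "&apos;");
    }

    public static void main(String[] args) {
        String encoded = "<employee><name>Car&#39;s</name></employee>";
        System.out.println(numericAposToNamed(encoded));
    }
}
```

Note that an XML parser treats &#39; and &apos; as the same character, so this only matters if the receiver compares the raw text.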

Mark

Jefferson,

I am able to generate the header <?xml version="1.0" encoding="ISO-8859-1"?>, but the encoding still does not handle the quotation mark: the output contains the HTML-encoded &#39; where it should show &apos;.

Am I on the same page?

Mark is also getting the same behaviour.

Regards,

Hi all,

I don’t see another solution than to hardcode iso-8859-1 in the xml header and to replace all differences.

Does anybody have a complete list of all differences between UTF-8 and iso-8859-1?

Claire

Hello Claire,

No need for any replace logic, just follow these 4 steps

  1. Call the service ‘xmlStringToXMLNode’. Set encoding to ‘UTF-8’ and isXML to ‘true’

  2. Call the service ‘xmlNodeToDocument’.

  3. Add a map step. Set the ‘@encoding’ in the parsed ‘document’ to ‘ISO-8859-1’.

  4. Call the service ‘documentToXMLString’. Set ‘encode’ parameter to true and ‘addHeader’ parameter to true.

That will work, though I am using it in version 4.6 of IS.
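
The four steps above rely on webMethods built-in services. As a rough standalone analogue, here is a sketch that uses only the JDK's javax.xml.transform API (not the webMethods API) to parse an XML string and write it out again as ISO-8859-1. The class and method names are hypothetical. It shows the general idea that the serializer, once told the output encoding, writes both the matching header and the bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper built on the JDK only; not a webMethods service.
class ReencodeXml {
    // Parses the XML text and serializes it again as ISO-8859-1 bytes,
    // letting the serializer write the XML declaration itself.
    static byte[] toLatin1(String xml) {
        try {
            // A transformer created without a stylesheet copies input to output.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            identity.transform(new StreamSource(new StringReader(xml)),
                               new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not re-serialize XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<employee><name>Car's caf\u00e9</name></employee>";
        byte[] bytes = toLatin1(xml);
        // Decode with the same charset that was used to write the bytes.
        System.out.println(new String(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

The result is a byte array rather than a String because a character set only applies once characters are turned into bytes.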

-Rajesh Rao

Rajesh, your solution still results in apostrophes being encoded as “&#39;” (HTML-encoding) instead of “&apos;” (ISO-8859-1 encoding).

I think we all agree that we know how to add encoding="ISO-8859-1" to the header and that we know how to HTML-encode the XML string. The question on the table is how to ensure that the ISO-8859-1 charset is used in the XML string.

Claire, what other issues are you dealing with other than the apostrophe? I found this reference to &apos in the “HTML Compatibility Guidelines” section of the XHTML 1.0 specification:

ISO-8859-1 is the technical name for the “Latin-1” or “Western European” charset. Its valid characters are listed here: http://www.w3.org/TR/html4/sgml/entities.html#h-24.2

The “encoding” flag in the pub.xml:documentToXMLString built-in service controls whether the resulting string is HTML-encoded. It does not have any effect on the character set of the resulting XML document.
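
To illustrate the distinction in plain Java (a sketch with hypothetical names, independent of webMethods): escaping replaces markup characters with entity references and works on characters, while the character set only comes into play when the string is converted to bytes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of any webMethods API.
class CharsetVsEscaping {
    // Escaping: replaces markup-significant characters with entity
    // references. The result is still a String; no charset is involved.
    static String escapeMarkup(String text) {
        return text.replace("&", "&amp;")   // must run first
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String escaped = escapeMarkup("Caf\u00e9 & Co");
        System.out.println(escaped);

        // Character set: decides how each character becomes bytes.
        // The accented e is one byte in ISO-8859-1 and two in UTF-8.
        byte[] latin1 = escaped.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = escaped.getBytes(StandardCharsets.UTF_8);
        System.out.println(latin1.length + " bytes as ISO-8859-1");
        System.out.println(utf8.length + " bytes as UTF-8");
    }
}
```

The escaped text is identical in both cases; only the byte representation of the accented character differs.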

Mark

Hello Mark,

It is working properly for me. I also tested with version 6.10 of IS. That is, I am getting ‘&amp;’ in the output XML document, with the ISO-8859-1 encoding also displayed in the header. So I am surprised that it is not working for you.

Are you sure that you are providing a proper UTF-8 encoded XML document in the first place? You can test this by first trying to open the input XML document in Internet Explorer. Additionally, when you step over ‘xmlNodeToDocument’, are you getting ‘&’ and not ‘&pos;’? If so, the next step should definitely provide a properly encoded XML document in ISO-8859-1.

-Rajesh Rao

Rajesh,

Are you able to encode the quotation mark to &apos;?
Your message above shows that you are getting &amp;, which also worked for us for the & character.

Please make sure you are encoding a quotation mark.

HTH,

Hi all,

When I read:
The “encoding” flag in the pub.xml:documentToXMLString built-in service controls whether the resulting string is HTML-encoded. It does not have any effect on the character set of the resulting XML document.

Does this mean that the “encoding” flag doesn’t take care of special characters like é? If not what do I do to have them encoded?

Claire

Hello All,

For the quotation mark (') I get the output document as '. I am successfully able to open the XML file and view it within IE.

-Rajesh Rao

We are able to open it in IE as well, but I believe this is not the actual problem that Claire has.

Any other solutions?