Error inserting XML documents that contain "&" characters

Hi!

Technical data: Tamino V2.3.1, JDK 1.3.1, Apache Webserver

I have an XML document as a long String. I tried to insert this String into Tamino using the native Java HTTP interface. Here is the code, where sXML is the XML String:


String sEncodedXML = URLEncoder.encode(sXML);
URLConnection connection = url.openConnection();
connection.setDoOutput(true);
PrintWriter out = new PrintWriter(connection.getOutputStream());
out.println("_process=" + sEncodedXML);
out.close();

This method works only if there is no “&” character in the text elements of the XML String. If there is an “&” character, I get this message from Tamino: Invalid token found or document incomplete

After that I tried to use the TaminoClient API. I used the following code to create a DOM from my XML String:


DOM dom = new DOM();
Document doc = dom.readDocument(new StringReader(sXML));

After that, TaminoClient's insert method returned the same error whenever there was an “&” character in any text element of the XML document. I don’t know what else to try. Can you help me?

Wolfram Hu

I read the solution for this problem in another topic: replace all occurrences of “&” with “&amp;”. Okay, it works, but it looks like a workaround. Is this the only solution to this problem?


Bye

Wolfram

Hi,

this has nothing to do with Tamino. Please see this paragraph from the XML 1.0 specification: *)

quote:

The ampersand character (&) and the left angle bracket (<) may appear in their literal form only when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they must be escaped using either numeric character references or the strings “&amp;” and “&lt;” respectively. The right angle bracket (>) may be represented using the string “&gt;”, and must, for compatibility, be escaped using “&gt;” or a character reference when it appears in the string “]]>” in content, when that string is not marking the end of a CDATA section.


( from W3C XML 1.0 Second Edition, section 2.4 Character Data and Markup http://www.w3.org/TR/2000/REC-xml-20001006 )
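
To make the quoted rule concrete, here is a minimal sketch of a helper that escapes the characters with predefined XML entities before a value is placed into a text element or attribute. The class name XmlEscapeSketch and the method name escapeXml are made up for this illustration; they are not part of the JDK or of any Tamino API. It uses StringBuffer so it should also compile on older JDKs.

```java
final class XmlEscapeSketch {

    private XmlEscapeSketch() {
    }

    /**
     * Replaces &, <, >, " and ' with their predefined XML entities.
     * Hypothetical helper for illustration only.
     */
    static String escapeXml(String text) {
        StringBuffer out = new StringBuffer(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                case '\'':
                    out.append("&apos;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Escape only the text value, then wrap it in markup.
        String value = "Smith & Sons <Ltd>";
        System.out.println("<company>" + escapeXml(value) + "</company>");
        // prints <company>Smith &amp; Sons &lt;Ltd&gt;</company>
    }
}
```

Note that such a helper has to be applied to each text or attribute value before the document String is assembled. Running it over the complete document would also escape the markup itself, and running it over text that is already escaped would turn “&amp;” into “&amp;amp;”.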

Ampersand (and <, >, ", ') are special characters that have to be escaped. Ampersand is used (just like in HTML) to introduce a reference, and for that matter, special characters (like auml =

Thanks for this detailed explanation :-).

Is there a method to XML-encode a String in the JDK or the Tamino Java API? I could write one myself, but if one already exists, it would be less work.


Wolfram
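
One possible way to avoid hand-written escaping is to build the document through a DOM and let a serializer write it out, since the serializer escapes text nodes itself. The sketch below assumes JAXP (javax.xml.parsers and javax.xml.transform) is available; it ships with the JDK from 1.4 onward, so with JDK 1.3.1 it would have to be added as a separate library. The class name DomEscapeSketch and the method buildXml are made up for this illustration, and nothing here uses or implies a Tamino API.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class DomEscapeSketch {

    private DomEscapeSketch() {
    }

    /**
     * Builds a one-element document and serializes it to a String.
     * The raw text is passed in unescaped; the serializer escapes it.
     */
    static String buildXml(String elementName, String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element root = doc.createElement(elementName);
            root.appendChild(doc.createTextNode(text));
            doc.appendChild(root);

            // An identity transformer copies the DOM tree to the output.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            // Wrapped to keep the sketch short; real code would handle this properly.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // The ampersand in the text node is written out as &amp;
        System.out.println(buildXml("company", "Smith & Sons"));
    }
}
```

The resulting String could then be sent the same way as sXML in the first post. This only helps when the document is assembled in code; an existing String that already contains a raw “&” in its text still has to be corrected before any XML parser will accept it.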