&- characters in XML


Is there a way to accept XML data that contains characters like ‘&’? Or does the outside world have to send its data in CDATA blocks?



The ampersand is the XML character that signals an entity reference is coming. Characters like ampersand, less-than and greater-than should be ‘HTML encoded’ before being put into an XML attribute or element value.

If you are in IS and creating XML output using pub.web:recordToDocument, set the encode parameter to true. Other products that create XML files should have a way to HTML/XML encode the data.

You will have to encode the special characters.
& is &amp;
' is &apos;
" is &quot;
It goes like this.
All the best!
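
For anyone producing the XML outside IS, the encoding described above can be sketched in plain Java roughly as follows. This is only an illustration: the class name XmlEscaper and the method escape are made up for this example and are not part of any webMethods API.

```java
// Hypothetical helper for illustration; not part of any webMethods API.
final class XmlEscaper {
    private XmlEscaper() {}

    /** Replaces the five XML special characters with their predefined entities. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

The sender would call XmlEscaper.escape(value) on each element or attribute value before writing it into the document, never on the finished document as a whole, since that would also escape the tags.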

The solution provided by Fred Hartman DOESN’T help.

CASE: If you load an XML node and want an XML string, you can’t get it, because the service
fails at pub.flow:documentToRecord before reaching pub.flow:recordToDocument.

Please provide the right solution.

One way is writing a Java service, but the big problem is how to convert the XML node to an XML string in a Java service.

If we get an XML string in a Java service, we can replace “&” with “&amp;” and thereafter we will have no problem. Can somebody please reply with how to do that?
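
Assuming the raw XML text is available as a String before any parsing happens, the replacement itself could look something like the sketch below. The class and method names are hypothetical. The pattern only rewrites an ampersand that is not already the start of an entity or character reference, so text that is already encoded is left alone.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; operates on raw text, not on a parsed node.
final class AmpersandRepair {
    // An '&' that is NOT followed by a name, decimal or hex reference ending in ';'.
    private static final Pattern BARE_AMP = Pattern.compile(
            "&(?!(?:[A-Za-z_:][A-Za-z0-9_.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    private AmpersandRepair() {}

    /** Escapes bare ampersands and leaves existing references such as &amp; untouched. */
    static String escapeBareAmpersands(String xml) {
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }
}
```

Two caveats: ampersands inside CDATA sections would be rewritten too, which changes their content, and text such as "AT&T;" looks like an entity reference and is not touched, so this is a best-effort repair rather than a real fix.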

Actually, Fred’s suggestion is correct. He is saying that the client needs to encode the special characters before they reach IS, because it will fail on receipt of “&” or “<”. This is common practice in the XML world and is always a topic for discussion when ramping up new customers, discussing text fields for item descriptions, etc.

To understand why this is an important point, think of the “<” rather than the “&”. If a customer sends this character in an XML message, it will fail every time because wM thinks it’s the start of a new XML tag. If you tried to create a Java service to receive this character properly, it would be nearly impossible to search for every instance of an incomplete tag.

When you run the docToRecord and recordToDoc services you should make sure the encoding parameter is in sync (so you know that you are sending an encoded message out).

Regarding the CDATA tag that was mentioned originally, that will also work if the client sends XML with the “&” enclosed in a CDATA tag. IS will remove this tag on receipt automatically, and any & will not fail. Of course, this is just another flavor of encoding, so you might as well have the customer send HTML encoded messages rather than CDATA encoded messages.
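
To see the three cases side by side (CDATA, entity encoded, and a bare “&”), here is a small sketch. It uses the JDK’s built-in DOM parser as a stand-in, since the IS parser is not available outside the server; the assumption is that any conforming XML parser treats these inputs the same way. The class name CdataDemo is made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustration only: uses the JDK parser, not the Integration Server parser.
final class CdataDemo {
    private CdataDemo() {}

    /** Parses xml and returns the root element's text; throws if the XML is not well-formed. */
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Covers parser configuration, SAX parse and I/O failures.
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }
}
```

Under that assumption, rootText("<d><![CDATA[Smith & Sons]]></d>") and rootText("<d>Smith &amp; Sons</d>") both return the text Smith & Sons, while rootText("<d>Smith & Sons</d>") throws.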

I completely agree with what everyone is saying, but what I am looking for is
a way to convert an XML node into an XML string (FORGET THE CASE THAT THE XML IS NOT WELL-FORMED AND IS INVALID) using webMethods Java services.

I tried the toString() method but it was of no use… Please, someone tell me how to do that.

Given below is the code I tried. Please look into it and comment on where I am going wrong.
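IDataCursor pipelineCursor = pipeline.getCursor();
// Reads whatever value sits at the cursor's current position
Object ob = pipelineCursor.getValue();
IDataUtil.put(pipelineCursor, "objectNode", ob);
// Attempt to turn the node object into a string
String ob1 = ob.toString();
IDataUtil.put(pipelineCursor, "XMLString", ob1);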

I think that when the XML is malformed and invalid, the node that is formed is itself invalid.
If you query that node using pub.xml:queryXMLNode and try doc.src
to view the source, you won’t get anything… the service fails.

If there are parse errors, there is no way to get the original content. The stream will be partially read, with the remainder still wherever the stream is coming from, such as a network socket, which may have been closed when the error occurred. I’m not sure whether the parser layer or the network layer closes the socket; I think that when bad XML is HTTP POSTed to IS, the network layer flushes the remaining stream content and creates the error response message to send out on that socket.

The only workaround I can think of is to create a ContentHandler that reads and buffers the complete stream before sending a new stream on to the current text/xml ContentHandler, so you have the stream content if there was an error. This is incredibly costly.
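
The buffering part of that idea can be sketched with plain java.io. Note that this does not show the webMethods ContentHandler interface itself; the class BufferedPayload and its methods are hypothetical, and how it would be wired into a handler is left open.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Hypothetical helper: holds the full request body in memory.
final class BufferedPayload {
    private final byte[] bytes;

    private BufferedPayload(byte[] bytes) {
        this.bytes = bytes;
    }

    /** Drains the stream completely so the content survives a later parse error. */
    static BufferedPayload readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new BufferedPayload(buf.toByteArray());
    }

    /** A fresh stream over the buffered bytes, to hand on to the parser. */
    InputStream newStream() {
        return new ByteArrayInputStream(bytes);
    }

    /** The raw content as text, for logging or repair after a failure. */
    String asText(Charset charset) {
        return new String(bytes, charset);
    }
}
```

The cost is visible here: every message is copied into memory in full before parsing starts, whether or not it turns out to be bad.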

This is one reason the Integration Server parser does not do DTD/schema validation (that can be done optionally after a successful parse). Validating parsers will lose not only syntactically bad files, but also any files that don’t 100% match the schema.

I am having a similar issue. One customer is sending POs with an unencoded “&” in an address, causing a failure in document to record. Yes, the customer should be encoding it themselves, but they insist that none of their other vendors have a problem with this and that we correct for it on our end. If there is no way to successfully transform the node to a string, is there a way to stop the automatic parsing for this partner?

I’d like to know what parser is handling this syntactically incorrect XML-like file.

You can install ContentHandlers for any incoming data format. There is an example of one in WmSamples. I would recommend that you register a text/xmllike ContentHandler and have them send that as the Content-Type of the POST. They are not sending XML [1], so they should not say they are. A ContentHandler class can do anything with the stream. The default ContentHandler just puts the stream object in the pipeline for each receiving service to handle as it wishes, but if you want to do common processing, the ContentHandler can do the work to build a known-format pipeline in a central place. If the given Content-Type does not have a registered ContentHandler, the default ContentHandler is used.


[1] http://www.xml.com/axml/testaxml.htm Section 2.4