Hi, what data ‘type’ should I assign for Unicode in the schema? And what format can it accept? The example of Unicode I have looks like this:
(\u4eba\u53c2). Kindly comment on it … Thanks
Please explain your question.
Do you want to create an element whose content is delivered in a Unicode encoding? The encoding is declared in the XML declaration at the top of the instance document.
The type property declares the value space of the content, that is, whether it should contain a string, a number, etc. This information is independent of the encoding of the data.
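To illustrate the point that the encoding lives in the XML declaration and not in the type, here is a minimal Java sketch using only the standard JAXP parser, not the Tamino API. The element name ChineseName and the class and method names are made up for the example. The parser picks the encoding up from the declaration, and the element content comes back as an ordinary Java String.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; it is not part of any Tamino API.
final class DeclaredEncodingDemo {

    // Parses raw XML bytes and returns the text of the root element.
    // The parser decodes the bytes using the encoding named in the XML declaration.
    static String readRootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The declaration says UTF-8, and the bytes really are UTF-8.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<ChineseName>\u4eba\u53c2</ChineseName>";
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);

        String name = readRootText(bytes);
        System.out.println(name.length()); // prints 2: two Chinese characters
    }
}
```

The element would still be typed as a plain string in the schema; only the declaration and the actual byte encoding have to agree.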
Hi,
The XML document contains both English text (e.g. English Name) and Unicode characters (e.g. a Chinese Name with the Unicode value ‘\u4eba’). Should the encoding be a conventional one, or UTF-8?
We run into a problem when we store the Chinese name as a string in Tamino: reading the Chinese Name field from Java treats the Unicode value as a plain string instead of as double-byte characters.
Just wondering if there is a data type in Tamino that specifically stores the data as Unicode characters. Or is it possible to set the encoding for a specific part of the XML document? Otherwise, is there an API to convert a string containing Unicode values so that the system is forced to treat the data as Unicode?
Many thanks for your response.
Calvin;)
There is no special data type. It's a matter of encodings.
When you load data into Tamino, you have to declare the encoding your data is stored in. Tamino loads the data and stores it in a universal encoding. When you retrieve the data, you have to tell Tamino which encoding you want. Here again you have to specify an encoding that can represent the data, otherwise information will be lost. If you then get a byte stream in Java, you again have to tell Java which encoding to use for the conversion into a String (== Unicode characters).
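The last step, turning a byte stream into a Java String, can be sketched as follows. This uses only the standard library and says nothing about how the Tamino API hands over the bytes; the class and method names are made up for the example. It shows that decoding with the encoding the bytes were written in recovers the characters, while decoding with a different one does not.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of any Tamino API.
final class EncodingRoundTrip {

    // Converts raw bytes into a String using an explicitly named charset.
    static String decode(byte[] data, Charset charset) {
        return new String(data, charset);
    }

    public static void main(String[] args) {
        String name = "\u4eba\u53c2"; // the two Chinese characters from the question

        // Pretend these bytes came back from the server, encoded as UTF-8.
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length); // prints 6: three bytes per character

        // Decoding with the matching charset recovers the original text.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(name)); // prints true

        // Decoding with a mismatched charset yields six unrelated characters.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).equals(name)); // prints false
    }
}
```

Naming the charset explicitly avoids relying on the platform default, which may differ between machines.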
If you would like to know how to do this with the Tamino API for Java, please post this question to the Tamino API for Java forum.
If I understand your question correctly, then the data type for Unicode in Tamino or XML Schema is “string”. If characters need to be escaped, use XML numeric character references in your document, e.g. &#x4eba;&#x53c2; for 人参 (NOT the Java notation “\u4eba\u53c2”).
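If the data already contains the Java notation as literal text (a backslash, a "u" and four hex digits), it can be converted before loading. The sketch below is one possible way to do that with the standard library; the class and method names are made up, and it only handles the four-digit form. It produces either the real characters or the equivalent XML character references.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; it is not part of any Tamino API.
final class EscapeConverter {

    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern JAVA_ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces each Java-style escape with the character it stands for.
    static String toCharacters(String text) {
        Matcher m = JAVA_ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Replaces each Java-style escape with an XML hexadecimal character reference.
    static String toXmlReferences(String text) {
        return JAVA_ESCAPE.matcher(text).replaceAll("&#x$1;");
    }

    public static void main(String[] args) {
        // Six literal characters per escape, as they would appear in stored text.
        String stored = "\\u4eba\\u53c2";

        System.out.println(toXmlReferences(stored));       // prints &#x4eba;&#x53c2;
        System.out.println(toCharacters(stored).length()); // prints 2
    }
}
```

Either result is ordinary string content as far as the schema is concerned, so the element type stays “string”.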