Can we search with Chinese Characters?

I have written a simple application to do full-text search. The input tag looks like the following:



It works for searches with English words. However, no results are returned for Chinese characters (BIG5). Is there anything I need to do?

Dave Low

[This message was edited by Christian Freytag on 20 Mar 2003 at 15:39.]

Hello Dave,

Do you want to build an application which supports multiple languages (English and Chinese)?

It should be possible if you use UTF-8 to encode your JavaServer Pages and define UTF-8 as the default encoding for your application. Otherwise, if your encoding is set to ISO-8859-1 or BIG5 and the HTTP client (browser) does not send information about how it encoded the HTTP request parameters, the bytes may not be decoded correctly by the servlet container.

Could you check the search with this UTF-8 option?

(When using the generator with UTF-8 encoding, you receive UTF-8 encoded JSP pages by default.)
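
The decoding mismatch described above can be reproduced outside of any servlet container. The following is only a minimal sketch using the standard Java library (Java 10 or later); the class name `ParamDecodingDemo` and its helper methods are made up for illustration and are not part of X-Application or Tomcat. It percent-encodes a two-character Chinese term (U+4E2D U+6587) as a UTF-8 page would, then decodes the same bytes once as ISO-8859-1 and once as UTF-8.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what happens when request parameter bytes
// are decoded with a different charset than the one used to encode them.
final class ParamDecodingDemo {

    // Percent-encode the value the way a browser would for a UTF-8 page.
    static String encodeAsUtf8(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Decode the raw parameter with the charset the server assumes.
    static String decode(String raw, Charset assumed) {
        return URLDecoder.decode(raw, assumed);
    }

    public static void main(String[] args) {
        String term = "\u4e2d\u6587"; // two Han characters, written as escapes
        String onTheWire = encodeAsUtf8(term);

        String wrong = decode(onTheWire, StandardCharsets.ISO_8859_1);
        String right = decode(onTheWire, StandardCharsets.UTF_8);

        System.out.println(onTheWire);          // %E4%B8%AD%E6%96%87
        System.out.println(wrong.length());     // 6: one char per byte, not the original text
        System.out.println(term.equals(wrong)); // false
        System.out.println(term.equals(right)); // true
    }
}
```

If the container assumes ISO-8859-1 while the page was sent as UTF-8 (or BIG5), the search term that reaches the query is a different string, so a lookup that works for plain ASCII words can silently fail for Chinese.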

If the error still occurs, please send a further post. Perhaps it is a Tomcat problem, because we tested the I18N stuff mainly with Tomcat 3.3.

Bye,
Christian.

I find that the following URL returns nothing:

http://localhost/tamino/mydb/collection?_encoding=big5&_xql=/Object[Title~=‘??’]
while the following returns a result:

http://localhost/tamino/mydb/collection?_encoding=big5&_xql=/Object[Title~=‘??’]
Does this mean that Tamino uses symbols and spaces as separators for full-text search (because Chinese characters are not separated by spaces)?
Any suggestions for Chinese characters?

Dave Low

Hello,

This topic will be moved to the Assistance Forum in the next few days.
The assistance forum is intended to be used for technical questions and answers.

Thank you for your understanding.

Regards, Harald

[Dave Low] …
> I find that the following URL returns nothing:
>
> http://localhost/tamino/mydb/collection?
> _encoding=big5&_xql=/Object[Title~=‘??’]
> while the following returns a result:
>
> http://localhost/tamino/mydb/collection?
> _encoding=big5&_xql=/Object[Title~=‘??’]
> Does this mean that Tamino uses symbols and spaces
> as separators for full-text search (because
> Chinese characters are not separated by
> spaces)?

Yes.
This is obviously a bug (see SAGSIS 204645).
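
To illustrate why splitting on spaces and symbols breaks down for Chinese, here is a small sketch in plain Java. It is not Tamino's tokenizer; the class `DelimiterTokenizerDemo` and its methods are hypothetical and only mimic a delimiter-based word splitter. The Chinese sample is a six-character phrase written as Unicode escapes, and the search term is the two characters in the middle of it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a tokenizer that treats whitespace and
// ASCII punctuation as the only word separators.
final class DelimiterTokenizerDemo {

    // Split the text into "words" at runs of whitespace or punctuation.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().split("[\\s\\p{Punct}]+"));
    }

    // A word matches only if it is equal to one complete token.
    static boolean containsWord(String text, String word) {
        return tokenize(text).contains(word);
    }

    public static void main(String[] args) {
        String english = "full text search";
        // Six Han characters with no separators between them.
        String chinese = "\u5168\u6587\u6AA2\u7D22\u7CFB\u7D71";
        // The same characters with spaces inserted after every second one.
        String spaced = "\u5168\u6587 \u6AA2\u7D22 \u7CFB\u7D71";
        String term = "\u6AA2\u7D22";

        System.out.println(containsWord(english, "text")); // true
        System.out.println(tokenize(chinese).size());      // 1: the whole phrase is one token
        System.out.println(containsWord(chinese, term));   // false
        System.out.println(containsWord(spaced, term));    // true
    }
}
```

With this kind of splitting, an unbroken run of Han characters is indexed as a single word, so a shorter term inside it is only found when delimiters happen to surround it.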

> Any suggestion with Chinese characters?

Yes.
Tamino 312 will support text search for Chinese
(and the other languages that use Han characters:
Japanese and Korean).

First customer shipment of version 312 is planned
for July 2002. Beta test kits for Chinese will be
available at the end of April. If you would like to
participate in the beta test, please contact
product management (Andreas Grübel) or me.

All the best,
Paul

-------------------------------------------------
Paul Langer E-mail: paul.langer@softwareag.com
Software AG Phone: +49-6151-92-1912
Uhlandstraße 12 Fax: +49-6151-92-1613
64297 Darmstadt
Germany

Hello,

With X-Application 4.1.1, building applications for schemas based on Chinese / Japanese / … character sets should be possible.

Multi-language support with UTF-8 based applications is also supported by X-Application.

However, please have a look at the chapter Internationalization of X-Application’s documentation. It shows how to adapt the Tomcat container to support I18N requirements.

Bye,
Christian.