Error 1003 0069: Missing unicode character U+2019 in target ICU Converter ibm-37_P100-19

Our services used TRANSLATION=SAGTCHA until this weekend, when we switched to CONVERSION=SAGTRPC. Rolf had suggested we always use it, since our only use of EntireX is with either Natural RPC Servers or XML RPC Servers. Indeed, this change fixed a number of special character issues we were having with a Natural client invoking a web service.

However, today someone made a call from an Oracle Financials module through SOA (Fusion) to a web service exposing a Natural subprogram that failed with the error:

Error 1003 0069: Missing unicode character U+2019 in target ICU Converter ibm-37_P100-19

U+2019 appears to be a right single quotation mark. How can I handle such attempts avoiding the 1003 0069 error? Is there a different, better conversion method than SAGTRPC to use that will change right and left single quotation marks to regular single quotation marks, for example (without losing any gains we made in moving to this method)?
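
For anyone who wants to reproduce the underlying problem outside EntireX, here is a minimal sketch. It assumes that Python's built-in `cp037` codec is a reasonable stand-in for ICU's ibm-37 converter (they are separate implementations), and the helper name `unmappable` is made up for illustration. It lists the characters in a string that have no mapping in the target code page.

```python
def unmappable(text, codec="cp037"):
    """Return the code points in `text` that `codec` cannot encode.

    Assumption: Python's cp037 codec stands in for ICU's ibm-37 converter.
    """
    bad = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            bad.append(f"U+{ord(ch):04X}")
    return bad


# A payload with a typographic apostrophe (U+2019) and one with a plain one.
print(unmappable("customer\u2019s order"))
print(unmappable("customer's order"))
```

Running a suspicious payload through a check like this before it reaches the Broker shows which characters would trigger the error.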

Thanks in advance!

-Brian

You have some extra options for handling unknown characters (this is an extra parameter for the CONVERSION=… attribute).
Have a look here:
http://techcommunity.softwareag.com/ecosystem/documentation/webmethods/wmsuites/wmsuite9-12/EntireX/9-12_EntireX/adminGeneral/attributes.htm#attributes_service_OPTION-conversion

Finn

Thanks, Finn! From the link you provided, I see options SUBSTITUTE and SUBSTITUTE-NONCONV as both being equally useful in this case. I will just need to decide if I still want it to throw an error if the sender’s character is part of the sender’s code page (though with Unicode, I am not sure what that could be).

I see it mentions that with these options, it will replace the offending character with a “codepage-dependent replacement character”. Does that mean each character has a default replacement (example: normal straight single quotation mark in lieu of a right- or left-single quotation mark), or are all such offending characters replaced by the same default replacement character? If the latter, what is this default replacement character… a space (EBCDIC x’40’)?

Thanks,

Brian

Hi again Brian
I must admit I haven’t spent much time experimenting with this: The error I had went away with either - we knew that the offending characters were just “noise” anyway.

I found a hint regarding the replacement character…
If you go to this link, you will see the substitution character at the bottom - in this case “\x3F”
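
To illustrate what a single substitution character means in practice, here is a rough sketch in Python. It is not EntireX or ICU code: it assumes Python's `cp037` codec as a stand-in for the ibm-37 converter, and the handler name `ebcdic_sub` is invented for this example. Every character that cannot be converted is replaced by the same byte, x'3F', regardless of what the original character was.

```python
import codecs

# U+001A SUBSTITUTE; Python's cp037 codec encodes it as byte 0x3F.
SUB = "\x1a"


def substitute_sub(exc):
    """Codec error handler: swap each unencodable character for SUB."""
    if isinstance(exc, UnicodeEncodeError):
        return (SUB * (exc.end - exc.start), exc.end)
    raise exc


# "ebcdic_sub" is a hypothetical handler name used only in this sketch.
codecs.register_error("ebcdic_sub", substitute_sub)

# Three different typographic quotes all end up as the same byte.
encoded = "it\u2019s \u201cfine\u201d".encode("cp037", errors="ebcdic_sub")
print(encoded.hex(" "))
```

Note that the original characters cannot be recovered afterwards, since all three collapse into one value.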

Thanks again, Finn. You’re quick with excellent advice.

x’3F’ in ASCII is - whatever that is, LOL! I am sure it will appear as a funny non-printable (e.g., ?), but it seems preferable to giving an error back.

Thanks!

Hi Finn,

Since you mentioned that you couldn’t find the exact code page referenced in the error message, I wanted to let you know that in Broker startup, this is displayed in SYSOUT:

ETBD0286 Diagnostic Values:
ICU version 54.1
ETBD0286 Diagnostic Values:
Broker Codepage… ibm-1047_P100-1995,swaplfnl
ETBD0286 Diagnostic Values:
FD_SETSIZE: 16384 rlim_cur: 64000 rlim_max: 64000
ETBD0286 Diagnostic Values:
New rlim_cur: 16384

I am thinking the codepage name in the error message is truncated, and the codepage you pointed me to was correct as it matches what is shown here.

The situation here has nothing to do with the broker codepage. In conversion there are three different codepages (ICU converters) involved: a source, a target and a broker codepage. There are also three different error messages, one each for the source, target and broker codepage:

10030069 Missing Unicode character char in target ICU Converter converter
10030070 No character at codepoint codepoint in source ICU Converter converter
10030071 Missing Unicode character char in broker ICU Converter converter

Refer also to the doc for more info.

10030069, which happened here, refers to the target codepage. Depending on the direction (request or reply), the target could be the client or the server side. Because the target converter is ibm-37_P100-19…, which is used by Natural, it happened during the request.
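
If it helps when scanning logs, the three message numbers above can be mapped to the converter role with a small helper. This is only a sketch: the function name `failing_converter` is made up, and it assumes the number appears either as `10030069` or as `1003 0069`, the two forms seen in this thread.

```python
import re

# Roles for the three conversion messages of error class 1003 listed above.
CONVERTER_ROLE = {"0069": "target", "0070": "source", "0071": "broker"}

# Accepts "1003 0069" as well as "10030069".
_PATTERN = re.compile(r"1003\s?(00(?:69|70|71))\b")


def failing_converter(message):
    """Return 'target', 'source' or 'broker', or None for other messages."""
    match = _PATTERN.search(message)
    return CONVERTER_ROLE[match.group(1)] if match else None


print(failing_converter(
    "Error 1003 0069: Missing unicode character U+2019 "
    "in target ICU Converter ibm-37_P100-19"
))
```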

Hi Brian,

as far as I know you are running Natural RPC without a codepage specified. In this case the default EBCDIC codepage is used:
http://techcommunity.softwareag.com/ecosystem/documentation/webmethods/wmsuites/wmsuite9-12/EntireX/9-12_EntireX/internat/locmap.htm#locmap_defaults
So ibm-37_P100-1995 is used.

Hi Bernhard,

I understand your point that the Broker’s codepage may be different than either client or target and there is a different error if the issue was the Broker codepage and not the target codepage. I also understand that there is a lot of design flexibility that this supports in that Brokers do not have to run on the same environments or operating systems as client sources and target servers.

In our case, the Broker runs in the same environment as the target server and uses the same IBM codepage, and all service calls to be run under the Natural RPC Server will potentially face the same client codepage-to-IBM-37_P100-1995 conversion challenge where the client codepage is a superset of the target.

Also, we have deployed the suggestion by Finn yesterday, and have retested the same payload and it processed successfully.

Rolf,

Neither the Broker nor the Natural RPC Servers specify a codepage to use, so my assumption is that both will use the default codepage for the mainframe both execute on (since they run on the same LPAR of the same mainframe). Both should be defaulting to IBM-37_P100-1995, yes?

-Brian

Yes, the default for an EBCDIC participant (client or server) is IBM-37_P100-1995.
Your SYSOUT shows the broker codepage is ibm-1047_P100-1995,swaplfnl, but it is not relevant here.

Hi Brian,

you have to set the CPRPC parameter if you want to use a codepage which is different from ibm-37_P100-1995.

See http://techcommunity.softwareag.com/ecosystem/documentation/natural/nat826mf/parms/rpc_settings.htm#cprpc-mf

Perhaps there is something better than working with one substitution character.

The ICU converters allow for fallback character mappings. These can be used for characters not in the target code page but with a similar character.

Unfortunately, ICU’s code page IBM-37 does not define a fallback mapping for the right single quotation mark character to the apostrophe character.

However, ICU has one code page in stock that is based on IBM-37 with additional fallback mappings: macos-3074-10.2.ucm in

http://source.icu-project.org/repos/icu/data/trunk/charset/data/ucm/

There may be more useful fallback mappings in this code page, like the typographic left single quotation mark or the left and right double quotation marks.

The good thing about the fallback characters is that they remain similar (in contrast to a substitution character) when returned to the originating system.
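
To make the idea concrete, here is a minimal sketch of a fallback mapping applied before conversion. The table below is hand-written for illustration and is not the actual fallback table of macos-3074 or any ICU converter; Python's `cp037` codec is assumed as a stand-in for IBM-37, and `encode_with_fallback` is a made-up name.

```python
# Hypothetical fallback table: typographic quotes to their plain equivalents.
FALLBACKS = {
    0x2018: "'",   # left single quotation mark
    0x2019: "'",   # right single quotation mark
    0x201C: '"',   # left double quotation mark
    0x201D: '"',   # right double quotation mark
}


def encode_with_fallback(text, codec="cp037"):
    """Apply the fallback table, then encode to the target code page."""
    return text.translate(FALLBACKS).encode(codec)


ebcdic = encode_with_fallback("it\u2019s \u201cfine\u201d")
print(ebcdic.decode("cp037"))
```

Unlike a substitution character, the result is still readable when it is sent back to the originating system.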

It also depends on the code page of the originating data and what set of different characters is being used.

Take for example Windows 1252, which has the same character set as IBM-37 plus 27 additional characters in the range x80-x9f. Most of the typographic characters among these have a fallback defined in macos-3074. See the attached mapping of the characters from 1252 to macos-3074 (the target code point is the hex byte shown at the top of each character box; yellow = code point different in target code page, red = character does not exist in target code page).
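
The count of additional characters can be checked with a short script. This sketch assumes Python's `cp1252` and `cp037` codecs match the Windows 1252 and IBM-37 code pages discussed here; the function name is invented for the example.

```python
def cp1252_extras_missing_from(codec="cp037"):
    """List the cp1252 characters in x80-x9f that `codec` cannot encode."""
    missing = []
    for byte in range(0x80, 0xA0):
        try:
            ch = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            continue  # byte is unassigned in cp1252
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


extras = cp1252_extras_missing_from()
print(len(extras), [f"U+{ord(c):04X}" for c in extras])
```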

A different approach: If it is possible to configure the application where the data is originating, it may be possible to inhibit the use of these typographic characters. In Microsoft Word for example you could switch off the auto correction option to ‘correct’ normal quotes to typographic ones.

The ICU converter with the fallbacks that Marbod mentioned can be made available in the EntireX Broker as a custom ICU converter. For z/OS, EntireX version 9.12 is required. See EntireX 9.12 documentation: Administration and Monitoring > z/OS > Broker Configuration > Configuring Broker for Internationalization > Building and Installing ICU Custom Converters.

The ICU converter name then also has to be configured as the CPRPC parameter in the Natural RPC Server . . .

Note: Custom ICU converters are also supported for EntireX Broker in Windows and Unix, even for older EntireX versions . . .

This suggestion sounds preferable to the default replacement character. For now we are no longer rejecting service calls, but Marbod’s approach of replacing an unmappable character with a similar character that has the same visual meaning sounds more business friendly.

We’re just on v9.10 right now, so I will look into this set up and configuration for the Natural RPC Servers when we go to v9.12 or higher. Though you mention the Brokers support ICU customizations already, it’s our target codepage that has to apply conversion here (re: error for target codepage, not Broker codepage).

Thanks for the feedback!