Encoding issue - Converting special characters from Unix (LF) ANSI to Windows (CR LF) UTF-8

Hello everyone,

I am having an issue when I try to create a record from a bizdoc. When I debug the values in the document using the “wm.b2b.editn:bizdocToRecord” service, some of the special characters turn into symbols. It makes no difference whether or not I set the encoding parameter to UTF-8.

I also tried the “pub.string:bytesToString” and “pub.string:stringToBytes” services, with and without the encoding set, to get rid of those symbols, but that didn’t work either.

When I export the transaction from TN and open it in Notepad, it reports the line endings as Unix (LF) and the encoding as ANSI, and the special characters display correctly, e.g. Rønne instead of R�nne.
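To illustrate the symptom, here is a minimal standalone sketch (not the actual service code). It assumes that "ANSI" in Notepad means the windows-1252 code page, which has not been confirmed for this sender; the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // "Rønne" as single-byte ANSI text: ø is 0xF8 in windows-1252 (assumed code page)
    static final byte[] ANSI_BYTES = {0x52, (byte) 0xF8, 0x6E, 0x6E, 0x65};

    // 0xF8 is never valid in UTF-8, so the decoder substitutes U+FFFD (the � symbol)
    static String decodeAsUtf8() {
        return new String(ANSI_BYTES, StandardCharsets.UTF_8);
    }

    // Decoding with the code page the bytes were written in keeps the ø
    static String decodeAsCp1252() {
        return new String(ANSI_BYTES, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        System.out.println(decodeAsUtf8().equals("R\uFFFDnne"));   // replacement character
        System.out.println(decodeAsCp1252().equals("R\u00F8nne")); // ø preserved
    }
}
```

If this is what is happening, the symbol would come from reading the bytes as UTF-8 rather than from the line endings.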

I also tried two Java snippets, but the result was the same:

byte[] ansiBytes = (byte[]) ValuesEmulator.get(pipeline, "bytes");
StringBuilder hexString = new StringBuilder();
for (byte b : ansiBytes) {
    String hex = Integer.toHexString(Byte.toUnsignedInt(b));
    if (hex.length() == 1) {
        hex = "0" + hex; 
    }
    hexString.append(hex);
}
String hexResult = hexString.toString();
byte[] byteArray = new byte[hexResult.length() / 2];
for (int i = 0; i < byteArray.length; i++) {
    int index = i * 2;
    int j = Integer.parseInt(hexResult.substring(index, index + 2), 16);
    byteArray[i] = (byte) j;
}

String utf8String = new String(byteArray, StandardCharsets.UTF_8);

// Second snippet: convert LF to CR LF, then re-encode as UTF-8
byte[] unixBytes = (byte[]) ValuesEmulator.get(pipeline, "bytes");
	
ByteArrayOutputStream windowsOutputStream = new ByteArrayOutputStream();
for (byte b : unixBytes) {
    if (b == '\n') {
        windowsOutputStream.write('\r');
    }
    windowsOutputStream.write(b);
}
byte[] windowsBytes = windowsOutputStream.toByteArray();

byte[] utf8Bytes = null;
try {
    utf8Bytes = new String(windowsBytes, "UTF-8").getBytes("UTF-8");
} catch (UnsupportedEncodingException e) {
    e.printStackTrace();
}
		
String utf8String = new String(utf8Bytes, StandardCharsets.UTF_8);
ValuesEmulator.put(pipeline, "UTF8stringOutput", utf8String);
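
For comparison, a minimal sketch of the conversion described in the title, decoding with the source code page before encoding as UTF-8. This is an untested assumption rather than a confirmed fix: it presumes the raw bytes reach the Java code unchanged and that "ANSI" means windows-1252. `AnsiToUtf8` and `convert` are hypothetical names, and the pipeline calls are left out so the example runs on its own.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class AnsiToUtf8 {
    // Assumption: the sender's "ANSI" content is windows-1252
    static final Charset SOURCE = Charset.forName("windows-1252");

    static byte[] convert(byte[] ansiBytes) {
        // Decode with the charset the bytes were actually written in
        String text = new String(ansiBytes, SOURCE);
        // Collapse any existing CR LF first so it is not doubled, then expand LF to CR LF
        String crlf = text.replace("\r\n", "\n").replace("\n", "\r\n");
        // Encode the resulting text as UTF-8
        return crlf.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] in = {0x52, (byte) 0xF8, 0x6E, 0x6E, 0x65, 0x0A}; // "Rønne" + LF in windows-1252
        byte[] out = convert(in);
        // ø becomes the two UTF-8 bytes C3 B8, and LF becomes CR LF
        System.out.println(out.length); // 8 bytes
    }
}
```

Both snippets above pass the bytes through unchanged or decode them as UTF-8, so a single byte such as 0xF8 would still be read as invalid UTF-8; the sketch differs only in naming the source charset explicitly.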

These transactions come from a generic AS2 service, so I would rather not make changes there, and asking the sender to change the content is a last resort.

Do you have any suggestions? I can provide extra info if it is needed.

Thanks in advance.
