We have a scenario where many customers can upload files via HTTP POST to our service. The files can be in many different encodings (ISO-8859-1, -2, -3, ISO-8859-9, …).
How can I convert these different encodings to UTF-8? Is it only possible with stringToBytes or bytesToString?
The HTTP header and/or the XML declaration needs to indicate the appropriate encoding, and you need to apply that encoding while the content is still bytes. Trying to manage the encoding after the content has already been converted to a String (at which point some encoding, typically the default, has already been assumed) will be an exercise in frustration.
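To illustrate the point, here is a minimal plain-Java sketch, not IS service code. It assumes you still have the raw upload bytes and that the sender declared the encoding somewhere you can read it (for example the charset parameter of the Content-Type header). The class and method names are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class EncodingSketch {

    // Decode the raw bytes with the charset the sender declared,
    // then re-encode the resulting text as UTF-8.
    static byte[] toUtf8(byte[] raw, String declaredCharset) {
        // Throws an unchecked exception if the name is illegal or unsupported.
        Charset source = Charset.forName(declaredCharset);
        String text = new String(raw, source);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xE9 };            // "é" in ISO-8859-1
        byte[] utf8 = toUtf8(latin1, "ISO-8859-1"); // becomes 0xC3 0xA9
        System.out.println(utf8.length);
    }
}
```

The important part is that the declared charset is used at the moment the bytes become a String. If the bytes were decoded with the platform default first, the original characters may already be lost and no later conversion will bring them back.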
Since this has come up a lot lately, someone needs to do an article about character encodings and how IS handles them. Any takers?