Encoding issue with Chinese characters
I was having an encoding issue with text generated by PHP (server side) and printed out via JavaScript (client side).
The problem
The Chinese characters were fine when output directly from PHP, but became garbled after being encoded in PHP and decoded in JavaScript:
printf('document.write(unescape("%s"));', rawurlencode($data));
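A likely explanation, assuming $data is UTF-8: rawurlencode() percent-encodes each UTF-8 byte, while JavaScript's unescape() treats every %XX as one Latin-1 character, so a three-byte Chinese character comes back as three unrelated characters. The sketch below reproduces this without PHP. The hard-coded string stands in for what rawurlencode('中') would emit, and decodeURIComponent() is shown only for comparison, not as the fix this post adopts.

```javascript
// Stand-in for the output of PHP's rawurlencode('中') on a UTF-8 string:
// U+4E2D is the three bytes E4 B8 AD in UTF-8.
const encoded = "%E4%B8%AD";

// unescape() maps each %XX to a single character with that code (Latin-1),
// so the three bytes become three separate characters.
const garbled = unescape(encoded);

// decodeURIComponent() interprets the %XX sequence as UTF-8 instead.
const decoded = decodeURIComponent(encoded);

console.log(garbled.length); // 3
console.log(decoded.length); // 1
console.log(decoded === "\u4e2d"); // true
```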
Solution
After Googling for a while, I concluded that the best solution is to convert the non-ASCII characters to Unicode numeric entities. Example:
Numeric Code | HTML Entity Code | Result |
256 | &#256; | Ā |
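To make the example concrete, here is a small sketch of the mapping in both directions. The helper name toEntity is made up for illustration.

```javascript
// Hypothetical helper: build the decimal numeric entity for one character.
function toEntity(ch) {
  return "&#" + ch.codePointAt(0) + ";";
}

// Code point 256 (U+0100) is the character from the example row.
console.log(toEntity("\u0100")); // "&#256;"

// Going the other way: the character a browser renders for that entity.
console.log(String.fromCodePoint(256) === "\u0100"); // true
```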
Solution for PHP
For PHP there's a simple solution: mb_encode_numericentity(). Luckily, a convmap that converts everything outside the ASCII range (so HTML markup characters are left alone) is given in a comment on the manual page, as pointed out in http://stackoverflow.com/a/3116893.
function convertToNumericEntities($string) {
    // convmap: start code, end code, offset, mask
    $convmap = array(0x80, 0x10ffff, 0, 0xffffff);
    return mb_encode_numericentity($string, $convmap, "UTF-8");
}
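For comparison, the same conversion can be sketched on the client side. This is not part of the original solution: toNumericEntities is a hypothetical name, and the 0x80 threshold simply mirrors the lower bound of the convmap in the PHP function.

```javascript
// Hypothetical client-side counterpart of the PHP function.
// Code points below 0x80 (ASCII, including HTML markup) pass through unchanged;
// everything else becomes a decimal numeric entity.
function toNumericEntities(str) {
  let out = "";
  // for...of iterates by code point, so surrogate pairs stay together.
  for (const ch of str) {
    const cp = ch.codePointAt(0);
    out += cp >= 0x80 ? "&#" + cp + ";" : ch;
  }
  return out;
}

console.log(toNumericEntities("<b>\u4e2d\u6587</b>")); // "<b>&#20013;&#25991;</b>"
```

Because the result is plain ASCII, it should survive a rawurlencode()/unescape() round trip unchanged, which is why the entity approach sidesteps the garbling.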
Solution for TinyMCE
If you are using TinyMCE, this can be solved by setting entity_encoding to "numeric".
tinyMCE.init({
    ...
    entity_encoding : "numeric"
});
More info http://www.tinymce.com/wiki.php/Configuration:entity_encoding
References:
http://stackoverflow.com/a/3116893
http://www.php.net/manual/en/function.mb-encode-numericentity.php#29839
http://www.tinymce.com/forum/viewtopic.php?pid=63319
http://stackoverflow.com/questions/4691477/php-converting-unicode-strings-to-ansi-strings