strftime(): Encoding is wrong for Chinese, Russian and Hungarian
What I want to do is simple: I want to print a date (timestamp) in Chinese (or Russian).
For all the languages I use, I call
setlocale(LC_TIME, 'hu_HU.utf8', 'hu_HU.UTF-8', 'hu_HU', 'hr');
$date = strftime('%a %e %b %Y, %H:%M');
$date = utf8_encode($date);
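Even without utf8_encode(), this returns a UTF-8 string. All good. Now, when I do exactly the same with the 'zh_CN.utf8' locale (or 'zh_CN.UTF-8', 'zh_CN' or 'zh'), it does not return the correct date. With or without utf8_encode(), it returns
'2018å¹?mæ?#dæ?'
I don't speak Chinese, but this is obviously wrong. I found out that it should contain something like '年'. This character has the UTF-8 hex encoding E5 B9 B4, but when I look at the returned string I find the wrong hex values. After 2018 there is C3 A5 C2 B9 3F 6D C3 A6 ...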
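The first bytes quoted above (C3 A5 C2 B9) are what appears when a string that is already UTF-8 is treated as ISO-8859-1 and encoded to UTF-8 a second time. The following is a minimal sketch of that effect, not a reconstruction of what happened on the affected system. It uses mb_convert_encoding() in place of utf8_encode() (deprecated since PHP 8.2), the variable names are illustrative, and it does not account for the 3F (?) bytes in the real output, which point to an additional lossy step.

```php
<?php
// "年" as raw UTF-8 bytes: E5 B9 B4
$year = "\xE5\xB9\xB4";

// Treat those bytes as ISO-8859-1 and encode them to UTF-8 again.
// This is the same transformation utf8_encode() applies to its input.
$double = mb_convert_encoding($year, 'UTF-8', 'ISO-8859-1');

echo bin2hex($year), "\n";   // e5b9b4
echo bin2hex($double), "\n"; // c3a5c2b9c2b4
```

Each original byte becomes a two-byte sequence, which is why the garbled string starts with å (C3 A5) and ¹ (C2 B9).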
When I check the encoding of the returned string with mb_detect_encoding(), it always returns UTF-8. I expected that, because I am using the 'zh_CN.utf8' locale, which sets the encoding to UTF-8.
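A UTF-8 answer from mb_detect_encoding() does not prove that the text is correct: the function only judges the byte pattern, and a wrongly re-encoded string is still well-formed UTF-8. A small sketch using mb_check_encoding(), with sample bytes taken from values quoted elsewhere in this post (the variable names are illustrative):

```php
<?php
// First bytes of the garbled result from the question:
// well-formed UTF-8, but the wrong characters.
$garbled = "\xC3\xA5\xC2\xB9";

// Raw bytes of the Chinese month name as listed in the test output
// further down (hex caaeb6fed4c2): not well-formed UTF-8.
$raw = "\xCA\xAE\xB6\xFE\xD4\xC2";

var_dump(mb_check_encoding($garbled, 'UTF-8')); // bool(true)
var_dump(mb_check_encoding($raw, 'UTF-8'));     // bool(false)
```

A validity check like this can show that a string is definitely not UTF-8, but it cannot confirm that a valid string carries the intended characters.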
After looking around, I came across this answer by Peter. He suggests using the format '%Y年%m月%e日' in the strftime() function. When I use it, I get the same result as before.
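This made me think that the encoding is wrong. But is that true? Is the encoding wrong? How can I convert the result to the correct encoding?
I have the same problem with Russian.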
Solution
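It took me several hours to find the correct encodings. strftime() does not deliver UTF-8 strings. For details, see the bottom of this answer. I ended up with a formatTime() function that gives me the correct time in the correct encoding (UTF-8 in my case).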
function formatTime($format, $language = null, $timestamp = null){
    // setlocale() returns the newly selected locale, so remember the
    // previously active one here to be able to restore it before returning
    $previous_locale = setlocale(LC_TIME, 0);
    switch($language){
        case 'chinese':
            setlocale(LC_TIME, 'zh_CN.utf8', 'zh_CN.UTF-8', 'zh_CN', 'zh');
            break;
        case 'hungarian':
            // 'hu' is the Hungarian fallback; 'hr' would select Croatian
            setlocale(LC_TIME, 'hu_HU.utf8', 'hu_HU.UTF-8', 'hu_HU', 'hu');
            break;
        case 'russian':
            setlocale(LC_TIME, 'ru_RU.utf8', 'ru_RU.UTF-8', 'ru_RU', 'ru');
            break;
        case 'german':
            setlocale(LC_TIME, 'de_DE.utf8', 'de_DE.UTF-8', 'de_DE', 'de');
            break;
        case 'french':
            setlocale(LC_TIME, 'fr_FR.utf8', 'fr_FR.UTF-8', 'fr_FR', 'fr');
            break;
        case 'polish':
            setlocale(LC_TIME, 'pl_PL.utf8', 'pl_PL.UTF-8', 'pl_PL', 'pl');
            break;
        case 'turkish':
            setlocale(LC_TIME, 'tr_TR.utf8', 'tr_TR.UTF-8', 'tr_TR', 'tr');
            break;
        case 'english':
            setlocale(LC_TIME, 'en_GB.utf8', 'en_GB.UTF-8', 'en_GB', 'en');
            break;
        // ...
        default: break; // keep the current locale
    }
    if(!is_numeric($timestamp)){
        $datetime = strftime($format);
    }
    else{
        $datetime = strftime($format, $timestamp);
    }
    $current_locale = strtolower(setlocale(LC_TIME, 0));
    // strpos() takes the haystack first and the needle second
    if(($pos = strpos($current_locale, "utf")) === false || strpos($current_locale, "8", $pos) === false){
        // UTF-8 locale is not used, the encodings are found out with the code shown below
        $locale_default_encodings = array(
            "german" => "ISO-8859-1",
            "french" => "ISO-8859-1",
            "polish" => "ISO-8859-2",
            "turkish" => "ISO-8859-9",
            // Testing hungarian results in "Windows-1252", but php.net recommends
            // ISO-8859-2 (*). Windows-1252 extends ISO-8859-1, not ISO-8859-2, and the
            // two disagree on bytes such as 0x9E, so verify this on your system.
            "hungarian" => "ISO-8859-2",
            "chinese" => "CP936",
            // The test below matched KOI8-R only because its expected value was itself
            // mis-decoded: the raw bytes c4 e5 ea e0 e1 f0 fc spell "Декабрь" in Windows-1251.
            "russian" => "Windows-1251"
        );
        $target_encoding = mb_internal_encoding(); // or "UTF-8" or whatever
        if(isset($locale_default_encodings[$language])){
            $datetime = mb_convert_encoding(
                $datetime,
                $target_encoding,
                $locale_default_encodings[$language]
            );
        }
        else{
            // try to avoid this case
            $datetime = mb_convert_encoding($datetime, $target_encoding);
        }
    }
    // restore the locale that was active when the function was called
    setlocale(LC_TIME, $previous_locale);
    return $datetime;
}
(*): http://php.net/manual/de/function.strftime.php#94399
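The UTF-8 check inside formatTime() looks for "utf" followed by "8" in the name of the current locale. The same idea can be sketched as a small standalone helper. localeIsUtf8() is a hypothetical name, not a PHP function, and it only inspects the locale name, so it assumes the system reports the codeset as part of that name.

```php
<?php
// Hypothetical helper: true if a locale name carries a UTF-8 codeset
// suffix such as "zh_CN.utf8" or "ru_RU.UTF-8".
function localeIsUtf8(string $localeName): bool
{
    // Matches ".utf8" or ".utf-8" in any case, optionally followed by "@modifier".
    return preg_match('/\.utf-?8(@.*)?$/i', $localeName) === 1;
}

var_dump(localeIsUtf8('zh_CN.utf8'));  // bool(true)
var_dump(localeIsUtf8('ru_RU.UTF-8')); // bool(true)
var_dump(localeIsUtf8('hu_HU'));       // bool(false)
```

Anchoring the pattern to the codeset suffix avoids false positives from a stray "8" elsewhere in the name.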
The long way
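I checked the result of strftime("%B") for a specific language. This is the full month name. I looked up the translation of the month in each of my languages and then looked up the UTF-8 hex values of the letters in that translation.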
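Then I loop over all encodings supported by PHP. In each iteration I convert the result of strftime() from the current encoding to UTF-8. Now I can compare the hex values of the converted strftime() result with the hex values of the manual translation, which are UTF-8 hex values as well. If they match, the strftime() result is encoded in the encoding of the current iteration.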
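The core of that loop can be sketched compactly as follows. findMatchingEncodings() is a hypothetical helper written for this illustration, and the short candidate list is an arbitrary sample rather than the full list used in the actual test.

```php
<?php
// Hypothetical helper: return every candidate encoding under which $raw
// converts to exactly the expected UTF-8 bytes (given as a hex string).
function findMatchingEncodings(string $raw, string $expectedHex, array $candidates): array
{
    $matches = [];
    foreach ($candidates as $encoding) {
        // Interpret the raw bytes as $encoding and convert them to UTF-8.
        $converted = mb_convert_encoding($raw, 'UTF-8', $encoding);
        if (bin2hex($converted) === strtolower($expectedHex)) {
            $matches[] = $encoding;
        }
    }
    return $matches;
}

// Raw bytes and expected hex for the Chinese "December", as reported in the output below.
$raw = hex2bin('caaeb6fed4c2');
$matches = findMatchingEncodings($raw, 'e58d81e4ba8ce69c88', ['ISO-8859-1', 'KOI8-R', 'CP936']);
print_r($matches);
```

A match only shows that the conversion reproduces the expected value, so the expected value has to be verified independently.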
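I chose hex values because they are identical by definition and do not depend on the internal encoding, since they are ASCII strings (or even numbers in PHP).
This gives me the following output; the code is posted below: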
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
<h1>Detecting the font encoding of <code>strftime()</code>
</h1>
<h2>hungarian</h2>
<p>
<code>strftime()</code> for March for language hungarian. Expected hex: <code>6fc5be756a616b</code>, converted expected hex to string: <code>ožujak</code>
</p>
<table>
<tr>
<td>initial return value</td>
<td>oߵjak</td>
<td>6f9e756a616b</td>
</tr>
<tr>
<td colspan='3'>Encodings that deliver the correct result:</td>
</tr>
<tr style='background: green;'>
<td>Windows-1252</td>
<td>ožujak</td>
<td>6fc5be756a616b</td>
</tr>
</table>
<h2>chinese</h2>
<p>
<code>strftime()</code> for December for language chinese. Expected hex: <code>e58d81e4ba8ce69c88</code>, converted expected hex to string: <code>十二月</code>
</p>
<table>
<tr>
<td>initial return value</td>
<td>ʮՂ</td>
<td>caaeb6fed4c2</td>
</tr>
<tr>
<td colspan='3'>Encodings that deliver the correct result:</td>
</tr>
<tr style='background: green;'>
<td>EUC-CN</td>
<td>十二月</td>
<td>e58d81e4ba8ce69c88</td>
</tr>
<tr style='background: green;'>
<td>CP936</td>
<td>十二月</td>
<td>e58d81e4ba8ce69c88</td>
</tr>
<tr style='background: green;'>
<td>GB18030</td>
<td>十二月</td>
<td>e58d81e4ba8ce69c88</td>
</tr>
</table>
<h2>russian</h2>
<p>
<code>strftime()</code> for December for language russian. Expected hex: <code>d0b4d095d099d0aed090d09fd0ad</code>, converted expected hex to string: <code>дЕЙЮАПЭ</code>
</p>
<table>
<tr>
<td>initial return value</td>
<td>ť롡</td>
<td>c4e5eae0e1f0fc</td>
</tr>
<tr>
<td colspan='3'>Encodings that deliver the correct result:</td>
</tr>
<tr style='background: green;'>
<td>KOI8-R</td>
<td>дЕЙЮАПЭ</td>
<td>d0b4d095d099d0aed090d09fd0ad</td>
</tr>
<tr style='background: green;'>
<td>KOI8-U</td>
<td>дЕЙЮАПЭ</td>
<td>d0b4d095d099d0aed090d09fd0ad</td>
</tr>
</table>
</body>
</html>
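Note that this HTML is encoded in UTF-8. The result given by the strftime() function is still wrong! This has nothing to do with the browser or editor encoding, as was suggested in the comments.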
$encodings = array(
"UCS-4",
"UCS-4BE",
"UCS-4LE",
"UCS-2",
"UCS-2BE",
"UCS-2LE",
"UTF-32",
"UTF-32BE",
"UTF-32LE",
"UTF-16",
"UTF-16BE",
"UTF-16LE",
"UTF-7",
"UTF7-IMAP",
"UTF-8",
"ASCII",
"EUC-JP",
"SJIS",
"eucJP-win",
"SJIS-win",
"ISO-2022-JP",
"ISO-2022-JP-MS",
"CP932",
"CP51932",
"SJIS-mac",
"SJIS-Mobile#DOCOMO",
"SJIS-Mobile#KDDI",
"SJIS-Mobile#SOFTBANK",
"UTF-8-Mobile#DOCOMO",
"UTF-8-Mobile#KDDI-A",
"UTF-8-Mobile#KDDI-B",
"UTF-8-Mobile#SOFTBANK",
"ISO-2022-JP-MOBILE#KDDI",
"JIS",
"JIS-ms",
"CP50220",
"CP50220raw",
"CP50221",
"CP50222",
"ISO-8859-1",
"ISO-8859-2",
"ISO-8859-3",
"ISO-8859-4",
"ISO-8859-5",
"ISO-8859-6",
"ISO-8859-7",
"ISO-8859-8",
"ISO-8859-9",
"ISO-8859-10",
"ISO-8859-13",
"ISO-8859-14",
"ISO-8859-15",
"ISO-8859-16",
"byte2be",
"byte2le",
"byte4be",
"byte4le",
"BASE64",
"HTML-ENTITIES",
"7bit",
"8bit",
"EUC-CN",
"CP936",
"GB18030",
"HZ",
"EUC-TW",
"CP950",
"BIG-5",
"EUC-KR",
"UHC",
"ISO-2022-KR",
"Windows-1251",
"Windows-1252",
"CP866",
"KOI8-R",
"KOI8-U",
"ArmSCII-8"
);
$show_wrong_encodings = false;
$internal_encoding = "UTF-8";
mb_internal_encoding($internal_encoding);
$languages = array(
// name of the language => hex in UTF-8 and timestamp to check
"german" => array("4dc3a4727a", 1520343439), // march
"french" => array("64c3a963656d627265", 1544103703), // december
"polish" => array("677275647a6965c584", 1544103703), // december
"turkish" => array("4172616cc4b16b", 1544103703), // december
"hungarian" => array("6fc5be756a616b", 1520343439), // march
"chinese" => array("e58d81e4ba8ce69c88", 1544103703), // december
"russian" => array("d0b4d095d099d0aed090d09fd0ad", 1544103703) // december
);
$format = "%B"; // print full month name
print("<h1>Detecting the font encoding of <code>strftime()</code></h1>\n");
foreach($languages as $language => $data){
// the hex value in UTF-8, this is the target value
$hex = $data[0];
// the timestamp to check
$timestamp = $data[1];
print(
"<h2>".$language."</h2>\n".
"<p>".
"<code>strftime()</code> for ".formatTime("%B", "english", $timestamp)." ".
"for language ".$language.". Expected hex: <code>".$hex."</code>, converted expected ".
"hex to string: <code>".tostring($hex)."</code>".
"</p>\n"
);
// this is a different formatTime() function than mentioned above, it is defined after this
// foreach
$string = formatTime("%B", $language, $timestamp);
print("<table>\n");
print("<tr>\n".
"\t<td>initial return value</td>\n".
"\t<td>".$string."</td>\n".
"\t<td>".tohex($string)."</td>\n".
"</tr>\n\n".
"<tr><td colspan='3'>Encodings that deliver the correct result:</td></tr>"
);
foreach($encodings as $source_encoding){
$converted = mb_convert_encoding($string, $internal_encoding, $source_encoding);
$converted_hex = tohex($converted);
$style = "";
if($converted_hex === $hex){ // strict comparison: == may compare numeric-looking hex strings as numbers
$style = "background: green";
}
elseif(!$show_wrong_encodings){
$style = "display: none";
}
print("<tr style='".$style.";'>\n".
"\t<td>".$source_encoding."</td>\n".
"\t<td>".$converted."</td>\n".
"\t<td>".$converted_hex."</td>\n".
"</tr>\n"
);
}
print("</table>");
}
function tohex($string){
return implode(unpack("H*", $string));
}
function tostring($hex){
return pack("H*", $hex);
}
function formatTime($format, $language, $timestamp){
switch($language){
case 'chinese':
$locale = setlocale(LC_TIME, 'zh_CN.utf8', 'zh_CN.UTF-8', 'zh_CN', 'zh');
break;
case 'hungarian':
$locale = setlocale(LC_TIME, 'hu_HU.utf8', 'hu_HU.UTF-8', 'hu_HU', 'hr');
break;
case 'russian':
$locale = setlocale(LC_TIME, 'ru_RU.utf8', 'ru_RU.UTF-8', 'ru_RU', 'ru');
break;
case 'german':
$locale = setlocale(LC_TIME, 'de_DE.utf8', 'de_DE.UTF-8', 'de_DE', 'de');
break;
case 'french':
$locale = setlocale(LC_TIME, 'fr_FR.utf8', 'fr_FR.UTF-8', 'fr_FR', 'fr');
break;
case 'polish':
$locale = setlocale(LC_TIME, 'pl_PL.utf8', 'pl_PL.UTF-8', 'pl_PL', 'pl');
break;
case 'turkish':
$locale = setlocale(LC_TIME, 'tr_TR.utf8', 'tr_TR.UTF-8', 'tr_TR', 'tr');
break;
// ...
default:
$locale = setlocale(LC_TIME, 'en_GB.utf8', 'en_GB.UTF-8', 'en_GB', 'en');
break;
}
$datetime = strftime($format, $timestamp);
setlocale(LC_TIME, $locale);
return $datetime;
}
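One caveat about the Russian row: the expected value there has to be right for a match to mean anything. The sketch below decodes the reported raw bytes (hex c4e5eae0e1f0fc) under two encodings. It suggests that the expected string дЕЙЮАПЭ is itself a mis-decoded form of the month name and that the locale on the test system delivered Windows-1251. This reading is an inference from the bytes, not a result of the original test run.

```php
<?php
// Raw bytes reported for the Russian month name in the test output.
$raw = hex2bin('c4e5eae0e1f0fc');

// Decode the same bytes under the two candidate encodings.
$asKoi8r  = mb_convert_encoding($raw, 'UTF-8', 'KOI8-R');
$asCp1251 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1251');

echo $asKoi8r, "\n";  // дЕЙЮАПЭ
echo $asCp1251, "\n"; // Декабрь
```

Only the Windows-1251 reading produces a real Russian word, so it is worth checking any expected value against a trusted source before using it as the comparison target.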