How to create an htmlentity of utf-8 `ł` (Polish L) in PHP?
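What is the reason that "ł" cannot be converted to an entity?
And how can this be done in general for characters that are not in get_html_translation_table?
What exactly does get_html_translation_table specify/define?
(Only UTF-8 is considered here.)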
// control with 'ö'
php > echo htmlentities("ö",ENT_COMPAT,'utf-8');
&ouml;
// test with 'ł'
php > echo htmlentities("ł",ENT_COMPAT,'utf-8');
ł
Checking get_html_translation_table:
php > var_dump(implode(',',array_keys(
get_html_translation_table(HTML_ENTITIES))));
// produces
// (why is ł not there?):
string(843) "",&,<,>, ,¡,¢,£,¤,¥,¦,§,¨,©,ª,«,¬,,®,¯,°,±,²,³,´,µ,¶,·,¸,¹,º,»,¼,½,¾
,¿,À,Á,Â,Ã,Ä,Å,Æ,Ç,È,É,Ê,Ë,Ì,Í,Î,Ï,Ð,Ñ,Ò,Ó,Ô,Õ,Ö,×,Ø,Ù,Ú,Û,Ü,Ý,Þ,ß,à,á,â,ã,ä,å,æ
,ç,è,é,ê,ë,ì,í,î,ï,ð,ñ,ò,ó,ô,õ,ö,÷,ø,ù,ú,û,ü,ý,þ,ÿ,Œ,œ,Š,š,Ÿ,ƒ,ˆ,˜,Α,Β,Γ,Δ,Ε,Ζ,Η
,Θ,Ι,Κ,Λ,Μ,Ν,Ξ,Ο,Π,Ρ,Σ,Τ,Υ,Φ,Χ,Ψ,Ω,α,β,γ,δ,ε,ζ,η,θ,ι,κ,λ,μ,ν,ξ,ο,π,ρ,ς,σ,τ,υ,φ,χ
,ψ,ω,ϑ,ϒ,ϖ, , , ,,,,,–,—,‘,’,‚,“,”,„,†,‡,•,…,‰,′,″,‹,›,‾,⁄,€,ℑ,℘,ℜ,™,ℵ,←,↑,→,↓,↔
,↵,⇐,⇑,⇒,⇓,⇔,∀,∂,∃,∅,∇,∈,∉,∋,∏,∑,−,∗,√,∝,∞,∠,∧,∨,∩,∪,∫,∴,∼,≅,≈,≠,≡,≤,≥,⊂,⊃,⊄,⊆,⊇
,⊕,⊗,⊥,⋅,⌈,⌉,⌊,⌋,〈,〉,◊,♠,♣,♥,♦"
PHP 5.6.12
You need to use the ENT_HTML5 flag (available since PHP 5.4) so that the string is processed as HTML 5.
echo htmlentities("ł", ENT_COMPAT | ENT_HTML5, 'utf-8');
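For characters that have no named entity in the active translation table, one possible fallback is a numeric character reference. The following is only a minimal sketch: it assumes the mbstring extension is loaded and that the input is valid UTF-8, and `encode_with_numeric_fallback()` is a hypothetical helper name, not part of PHP.

```php
<?php
// Sketch: named entities where the table has them, numeric references otherwise.
// Assumptions: mbstring is available, $string is valid UTF-8.
// encode_with_numeric_fallback() is a hypothetical helper, not a PHP built-in.
function encode_with_numeric_fallback($string, $flags = ENT_COMPAT)
{
    // Step 1: named entities from the active translation table (e.g. ö -> &ouml;).
    $named = htmlentities($string, $flags, 'UTF-8');

    // Step 2: whatever is still outside ASCII becomes a decimal reference.
    // Map layout: first code point, last code point, offset, mask.
    $map = array(0x80, 0x10FFFF, 0, 0x1FFFFF);

    return mb_encode_numericentity($named, $map, 'UTF-8');
}

echo encode_with_numeric_fallback("ö ł"), "\n"; // &ouml; &#322;
```

This does not need ENT_HTML5, so it should also be usable on PHP versions older than 5.4; a browser renders `&#322;` as ł.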
For PHP 5.3, you can adapt the function below (written for ISO-8859-2 and WIN-1250) to UTF-8:
<?php
if (!function_exists('htmlentities_polish')) { function htmlentities_polish($string) {
if (empty($GLOBALS['msFunc']['htmlentities_polish']['entities'])) { // empty() avoids an "undefined index" notice on the first call
$ignore = str_split('ACELNOSXZacelnosxz'); // Right Alt + that letter, on 'polish programmers keyboard' (or tilde (~) + that letter)
// polish characters: Ą Ć Ę Ł Ń Ó Ś Ź Ż ą ć ę ł ń ó ś ź ż
foreach ($ignore as &$value) {
if (stripos('ae', $value)!==FALSE) $value .= 'ogon';
else if (strtolower($value)==='l') $value .= 'strok';
else if (strtolower($value)==='z') $value .= 'dot';
else $value .= 'acute';
$value = '&'.strtr($value, 'Xx' /* Z, z acute */, 'Zz').';';
// See also: https://www.w3schools.com/charsets/ref_utf_latin_extended_a.asp , https://www.w3schools.com/charsets/ref_utf_latin1_supplement.asp
}
unset($value);
$GLOBALS['msFunc']['htmlentities_polish']['entities'] = $ignore;
$iso = array(161,198,202,163,209,211,166,172,175,177,230,234,179,241,243,182,188,191); // ISO-8859-2
foreach ($iso as &$value) { $value = chr($value);}
$GLOBALS['msFunc']['htmlentities_polish']['iso'] = $iso;
unset($value);
$win = array(165,198,202,163,209,211,140,143,175,185,230,234,179,241,243,156,159,191); // WINDOWS-1250 (WINDOWS-EE)
foreach ($win as &$value) { $value = chr($value);}
$GLOBALS['msFunc']['htmlentities_polish']['win'] = $win;
unset($value);
}
/* Convert "everything" (win and iso) ... */
if (empty($GLOBALS['msFunc']['htmlentities_polish']['isowin'])) { // Note: in assumption, it is within first call of htmlentities_polish() and isset($win)===TRUE etc. !
$GLOBALS['msFunc']['htmlentities_polish']['isowin']=$diff=array_merge($iso, array_diff($win, $iso));
$flip_win = array_flip($win);
for ($i=count($iso); $i<count($diff); $i++) {
$GLOBALS['msFunc']['htmlentities_polish']['entities'][$i] = $GLOBALS['msFunc']['htmlentities_polish']['entities'][$flip_win[$diff[$i]]];
}
}
return str_replace($GLOBALS['msFunc']['htmlentities_polish']['isowin'], $GLOBALS['msFunc']['htmlentities_polish']['entities'], $string);
/* ...or charset checking - in 2 ways:
$diff = array_diff($GLOBALS['msFunc']['htmlentities_polish']['iso'], $GLOBALS['msFunc']['htmlentities_polish']['win']); // characters different between ISO-8859-2 and WINDOWS-1250
// (1) fast but stupid way
foreach ($diff as $value) {
if (strpos($string, $value)!==FALSE) return str_replace($GLOBALS['msFunc']['htmlentities_polish']['iso'], $GLOBALS['msFunc']['htmlentities_polish']['entities'], $string);
} // entities from ISO-8859-2 and return !
// otherwise // entities from WINDOWS-1250 and return :
return str_replace($GLOBALS['msFunc']['htmlentities_polish']['win'], $GLOBALS['msFunc']['htmlentities_polish']['entities'], $string);
// (2) slow but exact way
foreach (str_split($string) as $value) {
if (in_array($value, $diff)) $iso_c++;
}
$diff = array_diff($GLOBALS['msFunc']['htmlentities_polish']['win'], $GLOBALS['msFunc']['htmlentities_polish']['iso']);
foreach (str_split($string) as $value) {
if (in_array($value, $diff)) $win_c++;
}
if ($win_c>$iso_c) return str_replace($GLOBALS['msFunc']['htmlentities_polish']['win'], $GLOBALS['msFunc']['htmlentities_polish']['entities'], $string);
else return str_replace($GLOBALS['msFunc']['htmlentities_polish']['iso'], $GLOBALS['msFunc']['htmlentities_polish']['entities'], $string);
*/
// polish characters: Ą Ć Ę Ł Ń Ó Ś Ź Ż ą ć ę ł ń ó ś ź ż
}
}
?>
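A UTF-8 adaptation of the function above could look roughly like the following. This is a sketch under stated assumptions rather than the answer's own code: `htmlentities_polish_utf8()` is a hypothetical name, and both this source file and the input string are assumed to be UTF-8 encoded. The entity names are the same ones the function above generates; they are defined in HTML5.

```php
<?php
// Sketch of a UTF-8 variant; htmlentities_polish_utf8() is a hypothetical name.
// Assumes this file and the input string are both UTF-8 encoded.
function htmlentities_polish_utf8($string)
{
    // One entry per Polish letter, same entity names as htmlentities_polish().
    static $map = array(
        'Ą' => '&Aogon;',  'Ć' => '&Cacute;', 'Ę' => '&Eogon;',
        'Ł' => '&Lstrok;', 'Ń' => '&Nacute;', 'Ó' => '&Oacute;',
        'Ś' => '&Sacute;', 'Ź' => '&Zacute;', 'Ż' => '&Zdot;',
        'ą' => '&aogon;',  'ć' => '&cacute;', 'ę' => '&eogon;',
        'ł' => '&lstrok;', 'ń' => '&nacute;', 'ó' => '&oacute;',
        'ś' => '&sacute;', 'ź' => '&zacute;', 'ż' => '&zdot;',
    );

    // strtr() with an array replaces whole multi-byte sequences in one pass.
    return strtr($string, $map);
}

echo htmlentities_polish_utf8('Zażółć'), "\n"; // Za&zdot;&oacute;&lstrok;&cacute;
```

Because the lookup is a plain array, no charset detection between ISO-8859-2 and WINDOWS-1250 is needed when the input is known to be UTF-8.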