htmlentities not working for emoji
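I am trying to display the HTML entities of characters:
echo htmlentities(htmlentities("&"));
//outputs &amp;
echo htmlentities(htmlentities("<"));
//outputs &lt;
But it doesn't seem to work for emoji:
echo htmlentities(htmlentities("😎"));
//outputs 😎
How can I get it to output &#128526;?
Edit:
I am trying to display a string of user input with all HTML entities encoded.
echo htmlentities(htmlentities($input));
Example:
"this & that 😎" -> "this &amp; that &#128526;"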
$emoji = "\xF0\x9F\x98\x8E";
// this is your emoji
I got this callback from "convert unicode to html entities hex":
$hex = preg_replace_callback('/[\x{80}-\x{10FFFF}]/u', function ($m) {
$char = current($m);
$utf = iconv('UTF-8', 'UCS-4', $char);
return sprintf("&#x%s;", ltrim(strtoupper(bin2hex($utf)), "0"));
}, $emoji);
echo $hex;
echo json_encode(("\xF0\x9F\x98\x8E"));
// Decoded. htmlentities does not work on it.
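To see where the number 128526 comes from, the four UTF-8 bytes of the emoji can be decoded by hand. This is only a minimal sketch: utf8_four_bytes_to_codepoint is a hypothetical helper name, and it assumes a well-formed 4-byte sequence with no validation.

```php
<?php
// Hypothetical helper (not from the thread): decode one well-formed
// 4-byte UTF-8 sequence into its Unicode code point.
function utf8_four_bytes_to_codepoint(string $bytes): int
{
    // The lead byte 11110xxx carries 3 payload bits;
    // each continuation byte 10xxxxxx carries 6 payload bits.
    return ((ord($bytes[0]) & 0x07) << 18)
         | ((ord($bytes[1]) & 0x3F) << 12)
         | ((ord($bytes[2]) & 0x3F) << 6)
         |  (ord($bytes[3]) & 0x3F);
}

$codepoint = utf8_four_bytes_to_codepoint("\xF0\x9F\x98\x8E");
echo sprintf("U+%X = &#%d;", $codepoint, $codepoint), "\n"; // U+1F60E = &#128526;
```

The hex form U+1F60E and the decimal form 128526 are the same code point, so &#x1F60E; and &#128526; render identically.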
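Does this work?
It handles regular HTML entities, UTF-8 emoji (and other UTF-8 content), and of course plain strings.
I only ran into a problem with empty string values, so I had to add that check to the function.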
function entities( $string ) {
$stringBuilder = "";
$offset = 0;
if ( (string) $string === "" ) { // empty() would also reject the string "0"
return "";
}
while ( $offset >= 0 ) {
$decValue = ordutf8( $string, $offset );
$char = unichr($decValue);
$htmlEntited = htmlentities( $char );
if( $char != $htmlEntited ){
$stringBuilder .= $htmlEntited;
} elseif( $decValue >= 128 ){
$stringBuilder .= "&#" . $decValue . ";";
} else {
$stringBuilder .= $char;
}
}
return $stringBuilder;
}
// source - http://php.net/manual/en/function.ord.php#109812
function ordutf8($string, &$offset) {
$code = ord(substr($string, $offset,1));
if ($code >= 128) { //otherwise 0xxxxxxx
if ($code < 224) $bytesnumber = 2; //110xxxxx
else if ($code < 240) $bytesnumber = 3; //1110xxxx
else if ($code < 248) $bytesnumber = 4; //11110xxx
$codetemp = $code - 192 - ($bytesnumber > 2 ? 32 : 0) - ($bytesnumber > 3 ? 16 : 0);
for ($i = 2; $i <= $bytesnumber; $i++) {
$offset ++;
$code2 = ord(substr($string, $offset, 1)) - 128; //10xxxxxx
$codetemp = $codetemp*64 + $code2;
}
$code = $codetemp;
}
$offset += 1;
if ($offset >= strlen($string)) $offset = -1;
return $code;
}
// source - http://php.net/manual/en/function.chr.php#88611
function unichr($u) {
return mb_convert_encoding('&#' . intval($u) . ';', 'UTF-8', 'HTML-ENTITIES');
}
/* ---- */
var_dump( entities( "&" ) ) . "\n";
var_dump( entities( "<" ) ) . "\n";
var_dump( entities( "" ) ) . "\n";
var_dump( entities( "☚" ) ) . "\n";
var_dump( entities( "" ) ) . "\n";
var_dump( entities( "A" ) ) . "\n";
var_dump( entities( "Hello world" ) ) . "\n";
var_dump( entities( "this & that 😎" ) ) . "\n";
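For comparison, the same per-character idea might be sketched with mbstring functions instead of the hand-written UTF-8 decoder. This assumes PHP 7.4 or later (for mb_str_split), the mbstring extension, and valid UTF-8 input; entities_mb is a hypothetical name, not part of the answer above.

```php
<?php
// Sketch of the per-character approach using mbstring.
// Assumptions: PHP 7.4+, mbstring loaded, input is valid UTF-8.
function entities_mb(string $string): string
{
    $out = '';
    foreach (mb_str_split($string, 1, 'UTF-8') as $char) {
        $encoded = htmlentities($char, ENT_QUOTES, 'UTF-8');
        if ($encoded !== $char) {
            // A named entity exists for this character (e.g. &amp;, &lt;).
            $out .= $encoded;
        } elseif (strlen($char) > 1) {
            // Multi-byte character without a named entity:
            // fall back to a decimal numeric reference.
            $out .= '&#' . mb_ord($char, 'UTF-8') . ';';
        } else {
            // Plain ASCII character, copied through unchanged.
            $out .= $char;
        }
    }
    return $out;
}

echo entities_mb("this & that \u{1F60E}"), "\n"; // this &amp; that &#128526;
```

Because the loop simply runs zero times for an empty string, no special empty-input check is needed in this variant.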
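The htmlentities documentation states:
"all characters which have HTML character entity equivalents are translated into these entities."
Your emoji has no named equivalent the way < has &lt;, so it is not converted. &#128526; is just an HTML code (a numeric character reference), not an HTML entity.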
function htmlEntitiesOrCode($string) {
//try htmlentities first
$result = htmlentities($string, ENT_COMPAT, "UTF-8");
//if the output is different from input, an entity was returned
if ($result != $string) {
return $result;
}
//get the html code
$offset = 0;
$code = ord(substr($string, $offset,1));
if ($code >= 128) {
if ($code < 224) {
$bytesnumber = 2;
} else if ($code < 240) {
$bytesnumber = 3;
} else if ($code < 248) {
$bytesnumber = 4;
}
$codetemp = $code - 192 - ($bytesnumber > 2 ? 32 : 0) - ($bytesnumber > 3 ? 16 : 0);
for ($i = 2; $i <= $bytesnumber; $i++) {
$offset ++;
$code2 = ord(substr($string, $offset, 1)) - 128;
$codetemp = $codetemp*64 + $code2;
}
$code = $codetemp;
}
$offset += 1;
if ($offset >= strlen($string)) {
$offset = -1;
}
$result = "&#" . $code . ";"; // a numeric reference must end with a semicolon
return $result;
}
The HTML code function was taken from here: http://php.net/manual/en/function.ord.php#109812
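Since named entities and numeric codes are two separate steps, one possible shortcut is to let htmlentities handle the named ones and then pass the result through mb_encode_numericentity for everything else. This is a different technique from the function above, shown only as a sketch: encode_all is a hypothetical name, and the code assumes UTF-8 input and the mbstring extension.

```php
<?php
// Sketch: named entities first, then decimal references for the rest.
// Assumptions: UTF-8 input, mbstring extension available.
function encode_all(string $input): string
{
    // Step 1: characters that have named entities (&, <, >, quotes, ...).
    $named = htmlentities($input, ENT_QUOTES, 'UTF-8');

    // Step 2: every remaining code point from U+0080 to U+10FFFF
    // becomes a decimal reference such as &#128526;.
    // Conversion map format: [first, last, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($named, $map, 'UTF-8');
}

echo encode_all("this & that \u{1F60E}"), "\n"; // this &amp; that &#128526;
```

The second step only touches characters inside the map range, so the ASCII text of entities produced in step 1 is left alone. To show the entity text itself on a page, as the question does, the result could be passed through htmlentities once more.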