PHP implementation of JavaScript escape() and unescape()
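First of all, I understand that JS escape() and unescape() are deprecated. Basically, we have a legacy system that runs data through JS escape() before storing it in the DB, and every time we need to unescape() the data on the client side to display the actual data (I know this is stupid, but it was done years ago to support Unicode characters on a database that was not Unicode-compatible).
Is there an existing PHP implementation that simulates the JavaScript escape() and unescape() functions?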
You are looking for urlencode(). If you can't accept the output of that encoding, you can try rawurlencode() instead.
More information here:
http://php.net/manual/en/function.urldecode.php
http://php.net/manual/en/function.urlencode.php
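For comparison, here is a small sketch of how the output of these functions differs from what JavaScript escape() produces. The sample string and variable names are made up for illustration. As far as I can tell, neither function emits the %uXXXX form that escape() uses for characters above U+00FF, and urldecode() leaves that form untouched, so they are probably not drop-in replacements for data that was stored this way.

```php
<?php
// Sample input, chosen only for illustration: a space, a plus sign and a CJK character.
$text = 'a b+中';

// urlencode() turns the space into "+" and percent-encodes the UTF-8 bytes.
$viaUrlencode = urlencode($text);       // a+b%2B%E4%B8%AD

// rawurlencode() uses %20 for the space, but still encodes UTF-8 bytes.
$viaRawurlencode = rawurlencode($text); // a%20b%2B%E4%B8%AD

// JavaScript escape('a b+中') returns "a%20b+%u4E2D":
// "+" is left alone and the CJK character becomes a %uXXXX UTF-16 code unit.
$fromJsEscape = 'a%20b+%u4E2D';

// urldecode() does not understand %uXXXX and passes it through unchanged.
$decoded = urldecode('%u4E2D');         // %u4E2D

echo $viaUrlencode, "\n", $viaRawurlencode, "\n", $decoded, "\n";
```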
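But if you just want to decode the data in order to store it in a MySQL database, you can use the built-in MySQL escape-string functions to convert the input into a format that can safely be inserted into the database.
See: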
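After some searching, I was able to put together two PHP functions that do what I need. The code isn't pretty, but it works 100% with the data we currently have, so I thought I'd share them here.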
/**
* Simulate javascript escape() function
*/
function escapejs($source) {
$map = array(
'~' => '%7E'
,'!' => '%21'
,'\'' => '%27' // single quote
,'(' => '%28'
,')' => '%29'
,'#' => '%23'
,'$' => '%24'
,'&' => '%26'
,',' => '%2C'
,':' => '%3A'
,';' => '%3B'
,'=' => '%3D'
,'?' => '%3F'
,' ' => '%20' // space
,'"' => '%22' // double quote
,'%' => '%25'
,'<' => '%3C'
,'>' => '%3E'
,'[' => '%5B'
,'\\' => '%5C' // backslash
,']' => '%5D'
,'^' => '%5E'
,'{' => '%7B'
,'|' => '%7C'
,'}' => '%7D'
,'`' => '%60'
,chr(9) => '%09'
,chr(10) => '%0A'
,chr(13) => '%0D'
,'¡' => '%A1'
,'¢' => '%A2'
,'£' => '%A3'
,'¤' => '%A4'
,'¥' => '%A5'
,'¦' => '%A6'
,'§' => '%A7'
,'¨' => '%A8'
,'©' => '%A9'
,'ª' => '%AA'
,'«' => '%AB'
,'¬' => '%AC'
,"\xC2\xAD" => '%AD' // soft hyphen (U+00AD)
,'®' => '%AE'
,'¯' => '%AF'
,'°' => '%B0'
,'±' => '%B1'
,'²' => '%B2'
,'³' => '%B3'
,'´' => '%B4'
,'µ' => '%B5'
,'¶' => '%B6'
,'·' => '%B7'
,'¸' => '%B8'
,'¹' => '%B9'
,'º' => '%BA'
,'»' => '%BB'
,'¼' => '%BC'
,'½' => '%BD'
,'¾' => '%BE'
,'¿' => '%BF'
,'À' => '%C0'
,'Á' => '%C1'
,'Â' => '%C2'
,'Ã' => '%C3'
,'Ä' => '%C4'
,'Å' => '%C5'
,'Æ' => '%C6'
,'Ç' => '%C7'
,'È' => '%C8'
,'É' => '%C9'
,'Ê' => '%CA'
,'Ë' => '%CB'
,'Ì' => '%CC'
,'Í' => '%CD'
,'Î' => '%CE'
,'Ï' => '%CF'
,'Ð' => '%D0'
,'Ñ' => '%D1'
,'Ò' => '%D2'
,'Ó' => '%D3'
,'Ô' => '%D4'
,'Õ' => '%D5'
,'Ö' => '%D6'
,'×' => '%D7'
,'Ø' => '%D8'
,'Ù' => '%D9'
,'Ú' => '%DA'
,'Û' => '%DB'
,'Ü' => '%DC'
,'Ý' => '%DD'
,'Þ' => '%DE'
,'ß' => '%DF'
,'à' => '%E0'
,'á' => '%E1'
,'â' => '%E2'
,'ã' => '%E3'
,'ä' => '%E4'
,'å' => '%E5'
,'æ' => '%E6'
,'ç' => '%E7'
,'è' => '%E8'
,'é' => '%E9'
,'ê' => '%EA'
,'ë' => '%EB'
,'ì' => '%EC'
,'í' => '%ED'
,'î' => '%EE'
,'ï' => '%EF'
,'ð' => '%F0'
,'ñ' => '%F1'
,'ò' => '%F2'
,'ó' => '%F3'
,'ô' => '%F4'
,'õ' => '%F5'
,'ö' => '%F6'
,'÷' => '%F7'
,'ø' => '%F8'
,'ù' => '%F9'
,'ú' => '%FA'
,'û' => '%FB'
,'ü' => '%FC'
,'ý' => '%FD'
,'þ' => '%FE'
,'ÿ' => '%FF'
);
$convmap = array(0x80, 0x10ffff, 0, 0xffffff);
$org = $source;
// make sure string is UTF8
if (false === mb_check_encoding($source, 'UTF-8')) {
if (false === ($source = iconv(mb_detect_encoding($source, mb_detect_order(), true), "UTF-8", $source))) {
$source = $org;
}
}
$chrArray = preg_split('//u', $source, -1, PREG_SPLIT_NO_EMPTY); // split up the UTF8 string into chars
$oChrArray = array();
foreach ($chrArray as $index => $chr) {
if (isset($map[$chr])) {
$chr = $map[$chr];
}
// if char doesn't fall within ASCII then assume unicode, get the hex html entities
//elseif (mb_detect_encoding($chr, 'ASCII', true) !== 'ASCII') {
else {
$chr = mb_encode_numericentity($chr, $convmap, "UTF-8", true);
// since we will be converting the &#x notation to the non-standard %u for backward compatibility, make sure the code is 4 digits long by prepending 0
if (substr($chr, 0, 3) == '&#x' && substr($chr, -1) == ';' && strlen($chr) == 7)
$chr = '&#x0'.substr($chr, 3);
}
$oChrArray[] = $chr;
}
$decodedStr = implode('', $oChrArray);
$decodedStr = preg_replace('/&#x([0-9A-F]{4});/', '%u$1', $decodedStr); // we need to use the %uXXXX format to simulate results generated with js escape()
return $decodedStr;
}
/**
* Simulate javascript unescape() function
*/
function unescapejs($source) {
$source = str_replace(array('%0B'), array(''), $source); // strip out vertical tab
$s= preg_replace('/%u(....)/', '&#x$1;', $source); // %uXXXX -> &#xXXXX;
$s= preg_replace('/%(..)/', '&#x$1;', $s); // %XX -> &#xXX;
return html_entity_decode($s, ENT_QUOTES, 'UTF-8');
}
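For anyone who wants something shorter than the lookup table, here is a rough alternative sketch, not the code from the answer above. It assumes the mbstring extension is available and that the input is valid UTF-8. The function names js_escape_sketch() and js_unescape_sketch() are made up for this example. The idea is to work on UTF-16 code units, which is how JavaScript sees strings, so characters outside the BMP come out as two %uXXXX surrogates the way escape() would produce them.

```php
<?php
// Hypothetical helper: approximates JavaScript escape() for a UTF-8 string.
function js_escape_sketch(string $utf8): string
{
    if ($utf8 === '') {
        return '';
    }
    // Characters that escape() leaves untouched.
    $safe = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789@*_+-./';
    // JavaScript strings are sequences of UTF-16 code units, so convert first.
    $utf16 = mb_convert_encoding($utf8, 'UTF-16BE', 'UTF-8');
    $out = '';
    foreach (unpack('n*', $utf16) as $unit) {
        if ($unit < 0x80 && strpos($safe, chr($unit)) !== false) {
            $out .= chr($unit);               // unreserved ASCII stays as-is
        } elseif ($unit < 0x100) {
            $out .= sprintf('%%%02X', $unit); // %XX for code units below 256
        } else {
            $out .= sprintf('%%u%04X', $unit); // %uXXXX for everything else
        }
    }
    return $out;
}

// Hypothetical helper: approximates JavaScript unescape().
function js_unescape_sketch(string $escaped): string
{
    // Decode each run of consecutive escapes in one go, so that surrogate
    // pairs (two adjacent %uXXXX units) are joined into one character.
    return preg_replace_callback(
        '/(?:%u[0-9A-Fa-f]{4}|%[0-9A-Fa-f]{2})+/',
        function (array $run): string {
            preg_match_all('/%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})/', $run[0], $units, PREG_SET_ORDER);
            $utf16 = '';
            foreach ($units as $u) {
                $hex = $u[1] !== '' ? $u[1] : $u[2];
                $utf16 .= pack('n', hexdec($hex)); // one big-endian UTF-16 code unit
            }
            return mb_convert_encoding($utf16, 'UTF-8', 'UTF-16BE');
        },
        $escaped
    );
}

echo js_escape_sketch('a b+中é'), "\n";          // a%20b+%u4E2D%E9
echo js_unescape_sketch('a%20b+%u4E2D%E9'), "\n"; // a b+中é
```

Unlike the function above, this sketch does not try to detect or convert non-UTF-8 input and does not strip %0B, so test it against your own stored data before relying on it.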