Find all occurrences of a list of strings from an array inside a sentence, and replace everything except the first letter with dashes
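I need to find every occurrence of the strings in an array (the original $list has more than 780 items) inside a sentence, and replace everything except the first letter with HTML dashes.
This is my current code: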
function sanitize($string) {
$list = array(
"dumb",
"stupid",
"brainless"
);
# replace bad words
$string = str_replace($list, '–', $string);
return $string;
}
echo sanitize('hello, i think you are not intelligent, you are actually dumb and stupid.');
This is the current result:
hello, i think you are not intelligent, you are actually – and –
The result should be:
hello, i think you are not intelligent, you are actually d––– and s–––––
Any ideas on how to approach this? Thanks!
You can use array_map to generate an array of replacements, each containing only the first letter followed by one dash for every replaced character:
function sanitize($string) {
$list = array(
"dumb",
"stupid",
"brainless"
);
$repl = array_map("dashReplace", $list);
# replace bad words
$string = str_replace($list, $repl, $string);
return $string;
}
function dashReplace($str) {
    // string offsets need square brackets; the $str{0} form was removed in PHP 8
    return $str[0].str_repeat("-", strlen($str)-1);
}
echo sanitize('hello, i think you are not intelligent, you are actually dumb and stupid.');
The result for your example is: hello, i think you are not intelligent, you are actually d--- and s-----.
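Two details of the str_replace approach may matter for a real word list: it matches substrings as well as whole words, and the question asked for HTML dashes rather than plain hyphens. The sketch below illustrates both. The helper name `dashReplaceHtml` and the choice of the `&ndash;` entity are assumptions for illustration, not part of the answer above, and the helper assumes non-empty, single-byte (ASCII) words.

```php
<?php
// Hypothetical helper: keeps the first character and emits one HTML
// en dash entity for every remaining character of the word.
function dashReplaceHtml(string $word): string {
    return $word[0] . str_repeat('&ndash;', strlen($word) - 1);
}

$list = ['dumb', 'stupid', 'brainless'];
$repl = array_map('dashReplaceHtml', $list);

echo str_replace($list, $repl, 'you are actually dumb and stupid.'), "\n";
// you are actually d&ndash;&ndash;&ndash; and s&ndash;&ndash;&ndash;&ndash;&ndash;.

// str_replace has no notion of word boundaries, so a listed word
// is also rewritten when it appears inside a longer word:
echo str_replace($list, $repl, 'a dumbbell'), "\n";
// a d&ndash;&ndash;&ndash;bell
```

If partial matches like the second one are unwanted, a regex with `\b` word boundaries is probably the better fit.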
You can use this regex-based approach with \G:
$str = 'hello, i think you are not intelligent, you are actually dumb and stupid.';
$list = array("dumb", "stupid", "brainless");
// use array_map to build one regex per word
$relist = array_map(function($s) {
return '/(?:\b(' . $s[0] . ')(?=' . substr($s, 1) . '\b)|(?!\A)\G)\pL/';
}, $list);
// call preg_replace with the list of regexes; $1 puts the captured first letter back
echo preg_replace($relist, '$1-', $str) . "\n";
Output:
hello, i think you are not intelligent, you are actually d--- and s-----.
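\G asserts the position at the end of the previous match, or at the start of the string for the first match.
(?!\A) is a negative lookahead that ensures \G does not match at the start of the string.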
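To make the chaining behaviour of \G more concrete, here is a small sketch that hard-codes the pattern for the single word "dumb" and lists what each match consumes. The sample sentence is an assumption chosen to show that "dumber" is left alone, and the replacement `'$1-'` is used so that the captured first letter is written back.

```php
<?php
// Pattern for one word only ("dumb"), written out by hand.
$pattern = '/(?:\b(d)(?=umb\b)|(?!\A)\G)\pL/';
$subject = 'so dumb, dumber';

// The first match takes "d" plus the next letter; every later match
// starts exactly where the previous one ended (\G) and takes one letter.
preg_match_all($pattern, $subject, $m);
echo implode('|', $m[0]), "\n";
// du|m|b

// $1 holds "d" only for the first match and is empty for the rest.
echo preg_replace($pattern, '$1-', $subject), "\n";
// so d---, dumber
```

The chain stops at the comma because `\pL` only matches letters, and "dumber" is skipped because the lookahead requires a word boundary right after "umb".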
Update:
Based on your comment below, you can use this different approach:
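// sample input; $list is assumed to contain "word", as the output below implies
$str = 'first word';
$list = array('word');
$relist = array_map(function($s) { return '/\b' . preg_quote($s, '/') . '\b/'; }, $list);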
echo preg_replace_callback($relist, function($m) {
return '<span class="bad">' . $m[0][0] . str_repeat('-', strlen($m[0])-1) . '</span>';
}, $str);
Output:
first <span class="bad">w---</span>
You can use preg_replace_callback, but you need to wrap each item in the $list array in slashes so that it becomes a regex pattern.
function sanitize($string) {
$list = array(
"/dumb/",
"/stupid/",
"/brainless/"
);
# replace bad words
$string = preg_replace_callback($list,
function ($matches) {
return preg_replace('/\B./', '-', $matches[0]);
},
$string);
return $string;
}
echo sanitize('hello, i think you are not intelligent, you are actually dumb and stupid.');
Output:
hello, i think you are not intelligent, you are actually d--- and s-----.
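With a list of more than 780 words, it may be simpler to build one alternation pattern than to pass hundreds of separate patterns. The following is a minimal sketch of that idea, not a drop-in replacement for any answer above: the function name `sanitizeWords`, the case-insensitive flag, the `\b` word boundaries and the `&ndash;` entity are all assumptions, and the words are assumed to be plain ASCII.

```php
<?php
// Hypothetical variant: one case-insensitive alternation with word
// boundaries instead of one pattern per list entry.
function sanitizeWords(string $string, array $list): string {
    // Escape regex metacharacters and the "/" delimiter in every word.
    $quoted = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $list);
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/i';

    return preg_replace_callback($pattern, function ($m) {
        // Keep the first character, replace each remaining one with an entity.
        return $m[0][0] . str_repeat('&ndash;', strlen($m[0]) - 1);
    }, $string);
}

$list = ['dumb', 'stupid', 'brainless'];
echo sanitizeWords('You are DUMB and stupid, not a dumbbell.', $list), "\n";
// You are D&ndash;&ndash;&ndash; and s&ndash;&ndash;&ndash;&ndash;&ndash;, not a dumbbell.
```

Because the callback works on the matched text rather than on the list entry, the original capitalisation of the first letter is preserved.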