Splitting a string into a phone number and extension using preg_match
I'm trying to split a string that contains a phone number and an extension, since an extension is only sometimes present. Here is my attempt:
$tests[] = "941-751-6550 ext 2204";
$tests[] = "(941) 751-6550 ext 2204";
$tests[] = "(941)751-6550 ext 2204";
$tests[] = "9417516550 ext 2204";
$tests[] = "941-751-6550 e 2204";
$tests[] = "941-751-6550 ext 2204 ";
$tests[] = "941-751-6550 extension 2204";
$tests[] = "941-751-6550 x2204";
$tests[] = "(941) 751-6550";
$tests[] = "(941)7516550";
$tests[] = "941-751-6550 ";
$tests[] = "941-751-6550";
foreach ($tests as $test) {
    preg_match('#([\(\)\s0-9\-]+)(.+$)#',$test,$matches);
    $phone = preg_replace('#[\-\(\)\s]#','',$matches[1]);
    $extension = preg_replace('#[^0-9]#','',$matches[2]);
    if ($phone == '9417516550'
        && ($extension == '2204'
            || $extension == '0')) {
        echo "PASS: phone: $phone ext: $extension<br />";
    } else {
        echo "FAIL: phone: $phone ext: $extension<br />";
    }
}
However, when I run these tests to see whether the phone number and extension are split correctly, I get the following output:
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
FAIL: phone: 941751655 ext: 0
FAIL: phone: 941751655 ext: 0
FAIL: phone: 9417516550 ext:
FAIL: phone: 941751655 ext: 0
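As you can see, it breaks when I leave the extension out entirely (the last four tests). How can I correct the preg_match() regular expression so that the FAIL: ... lines look like PASS: phone: 9417516550 ext: 0?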
I would do it all in preg_match. Assuming the numbers are non-international, I think this will work.
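foreach ($tests as $test) {
    // Groups 1-3 are the three parts of the number; group 4 is the optional extension.
    // \h* before (\d+) also accepts an extension with no space, e.g. "x2204".
    preg_match('#\(?(\d{3})\)?[-\h]?(\d{3})[-\h]?(\d{4})\h*(?:e?x?t?(?:ension)?\h*(\d+))?#',$test,$matches);
    $phone = $matches[1] . $matches[2] . $matches[3];
    // Group 4 is absent when there is no extension; default to 0.
    $extension = !empty($matches[4]) ? $matches[4] : 0;
    if ($phone == '9417516550'
        && ($extension == '2204' || $extension == '0')) {
        echo "PASS: phone: $phone ext: $extension<br />";
    } else {
        echo "FAIL: phone: $phone ext: $extension<br />";
    }
}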
Demo: https://eval.in/561720
Regex101 demo: https://regex101.com/r/mG9iD1/1
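If the single-pattern approach is going to be reused, it may help to wrap it in a function that rejects input that is not a phone number at all. The following is only a sketch under assumptions: the name parsePhone, the anchored pattern, and the accepted extension markers (e, ext, extension, x) are illustrative choices based on the test strings in the question, not part of the answer above.

```php
<?php
// Hypothetical helper: returns [phone, extension] or null when the input
// does not look like a 10-digit, non-international number.
function parsePhone(string $input): ?array
{
    // ^...$ anchors reject trailing junk; the /i flag allows "EXT" or "X".
    // Longer markers come first so "extension" is not cut short at "e".
    $pattern = '#^\h*\(?(\d{3})\)?[-\h]?(\d{3})[-\h]?(\d{4})\h*'
             . '(?:(?:extension|ext|e|x)\.?\h*(\d+))?\h*$#i';

    if (!preg_match($pattern, $input, $m)) {
        return null;
    }

    // Group 4 is missing from $m when no extension was matched.
    return [$m[1] . $m[2] . $m[3], $m[4] ?? '0'];
}

foreach (['(941) 751-6550 ext 2204', '941-751-6550 x2204', '941-751-6550 ', 'not a number'] as $test) {
    $result = parsePhone($test);
    echo $result === null ? "no match: $test\n" : "phone: {$result[0]} ext: {$result[1]}\n";
}
```

Returning null instead of an empty array makes a failed match easy to tell apart from a number without an extension.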
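From your example, it seems to fail when no extension is found.
One solution is to cast $extension to an int like this:
$extension = intval($extension); // Will be 0 if nothing was found
After this we can be sure we have an integer, and we can change the if statement to:
|| $extension === 0)) {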
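A small sketch of what the cast does, in case it helps. The helper name normalizeExtension is hypothetical. Note that this only normalises the extension value; with the pattern from the question, the phone number still comes out one digit short when there is no extension, as the FAIL lines show.

```php
<?php
// Hypothetical helper: normalise whatever the digit-stripping step produced.
function normalizeExtension(string $raw): int
{
    // intval('') is 0, so "no extension" and "extension 0" both become int 0.
    return intval($raw);
}

foreach (['2204', '', '0'] as $raw) {
    $extension = normalizeExtension($raw);
    // A strict comparison is safe here because the type is always int.
    echo ($extension === 0 ? 'none' : $extension), "\n";
}
```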
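(.+$) means the end of the line must consist of one or more characters. So if there is nothing after the phone number, the first group has to give up its last character to satisfy the second group, and your phone number comes out one character short.
I suggest using (.*$), which means zero or more characters.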
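To make the difference concrete, here is a minimal sketch of the pattern from the question with (.+$) swapped for (.*$), plus a fallback to '0' when no extension digits remain. The function name splitPhone is hypothetical, and the sketch assumes the input contains a phone number.

```php
<?php
// Hypothetical helper: the question's pattern with (.+$) changed to (.*$).
function splitPhone(string $input): array
{
    // Group 1: digits, parentheses, hyphens and whitespace (greedy).
    // Group 2: whatever is left, which may now be nothing at all.
    preg_match('#([\(\)\s0-9\-]+)(.*$)#', $input, $m);

    $phone  = preg_replace('#[\-\(\)\s]#', '', $m[1]);
    $digits = preg_replace('#[^0-9]#', '', $m[2]);

    // No extension digits found: fall back to '0'.
    $extension = ($digits === '') ? '0' : $digits;

    return [$phone, $extension];
}

foreach (['941-751-6550 ext 2204', '(941)7516550', '941-751-6550 '] as $test) {
    [$phone, $extension] = splitPhone($test);
    echo "phone: $phone ext: $extension\n";
}
```

Because group 2 may be empty, group 1 no longer has to surrender the final digit of the phone number.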
This works as expected; I just tested it.
foreach ($tests as $test) {
    preg_match('#([\(\)0-9\-]+\s*[\(\)0-9\-]+)\s*(.*$)#',$test,$matches);
    $phone = preg_replace('#[\-\(\)\s]#','',$matches[1]);
    $extension = ($matches[2] == "") ? '0' : preg_replace('#[^0-9]#','',$matches[2]);
    if ($phone == '9417516550'
        && ($extension == '2204'
            || $extension == '0')) {
        echo "PASS: phone: $phone ext: $extension<br />\n";
    } else {
        echo "FAIL: phone: $phone ext: $extension<br />\n";
    }
}
Minimal changes to your code.
$pns = <<< LOL
941-751-6550 ext 2204
(941) 751-6550 ext 2204
(941)751-6550 ext 2204
9417516550 ext 2204
941-751-6550 e 2204
941-751-6550 ext 2204
941-751-6550 extension 2204
941-751-6550 x2204
(941) 751-6550
(941)7516550
941-751-6550
941-751-6550
LOL;
preg_match_all('/^([(\d )\-]+)\s?(?:e.*?|x.*?)?(\d+)?$/sim', $pns, $matches, PREG_PATTERN_ORDER);
for ($i = 0; $i < count($matches[1]); $i++) {
    $phone = preg_replace('#[\-\(\)\s]#','', $matches[1][$i]);
    $extension = preg_replace('#[^0-9]#','', $matches[2][$i]);
    if ($phone == '9417516550' && $extension == '2204') {
        echo "PASS: phone: $phone ext: $extension\n";
    } else {
        echo "FAIL: phone: $phone ext: 0\n";
    }
}
Output:
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
FAIL: phone: 9417516550 ext: 0
FAIL: phone: 9417516550 ext: 0
FAIL: phone: 9417516550 ext: 0
FAIL: phone: 9417516550 ext: 0
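Honestly, you'd be better off stripping out the non-digit characters and then splitting off everything after the first 10 characters as the extension. It is conceptually equivalent, but more direct, simpler, and more efficient than running the string through multiple regular expressions, which are slow to begin with.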
foreach ($tests as $test) {
    $phone = preg_replace("/[^0-9]/", "", $test);
    $extension = substr($phone,10);
    $phone = substr($phone,0,10);
    if (empty($extension)) {
        $extension = '0';
    }
    if ($phone == '9417516550'
        && ($extension == '2204'
            || $extension == '0')) {
        echo "PASS: phone: $phone ext: $extension<br />\n";
    } else {
        echo "FAIL: phone: $phone ext: $extension<br />\n";
    }
}
Output:
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 0
PASS: phone: 9417516550 ext: 0
PASS: phone: 9417516550 ext: 0
PASS: phone: 9417516550 ext: 0
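The digit-stripping idea could also be packaged as a function with a guard for input that has fewer than 10 digits. This is a sketch only: the name splitByDigits and the choice to return null for short input are assumptions, not part of the answer above.

```php
<?php
// Hypothetical helper: keep only the digits, then cut after the first 10.
function splitByDigits(string $input): ?array
{
    $digits = preg_replace('/\D/', '', $input);

    // Fewer than 10 digits cannot be a full number under this assumption.
    if (strlen($digits) < 10) {
        return null;
    }

    $phone = substr($digits, 0, 10);
    // The cast keeps older PHP versions (where substr may return false) consistent.
    $extension = (string) substr($digits, 10);

    return [$phone, $extension === '' ? '0' : $extension];
}

foreach (['(941) 751-6550 ext 2204', '941-751-6550', '751-6550'] as $test) {
    $result = splitByDigits($test);
    echo $result === null ? "too short: $test\n" : "phone: {$result[0]} ext: {$result[1]}\n";
}
```

One limitation to keep in mind: any digits anywhere in the string end up in the result, so this only suits input already known to be a phone number with an optional extension.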