Splitting a string into a phone number and extension using preg_match

I'm trying to split a string that contains a phone number and an extension, because the extension is only sometimes present. Here is my attempt:

$tests[] = "941-751-6550 ext 2204";
$tests[] = "(941) 751-6550 ext 2204";
$tests[] = "(941)751-6550 ext 2204";
$tests[] = "9417516550 ext 2204";
$tests[] = "941-751-6550 e 2204";
$tests[] = "941-751-6550 ext 2204 ";
$tests[] = "941-751-6550 extension 2204";
$tests[] = "941-751-6550 x2204";
$tests[] = "(941) 751-6550";
$tests[] = "(941)7516550";
$tests[] = "941-751-6550 ";
$tests[] = "941-751-6550";

foreach ($tests as $test) {
    preg_match('#([\(\)\s0-9\-]+)(.+$)#',$test,$matches);
    $phone = preg_replace('#[\-\(\)\s]#','',$matches[1]);
    $extension = preg_replace('#[^0-9]#','',$matches[2]);
    if ($phone == '9417516550' 
        && ($extension == '2204' 
            || $extension == '0')) {
                echo "PASS: phone: $phone ext: $extension<br />";
    } else {
        echo "FAIL: phone: $phone ext: $extension<br />";
    }
}

However, when I run these tests to check whether the phone number and extension are split correctly, I get the following output:

PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
FAIL: phone: 941751655 ext: 0
FAIL: phone: 941751655 ext: 0
FAIL: phone: 9417516550 ext: 
FAIL: phone: 941751655 ext: 0

As you can see, it breaks when I leave the extension out entirely (the last four tests). How can I correct the preg_match() regex so that the FAIL: ... lines look like PASS: phone: 9417516550 ext: 0?

I would do all of this in the preg_match itself. Assuming the numbers are non-international, I think this will work:

foreach ($tests as $test) {
    // \h* before (\d+) so that "x2204" (no space before the digits) is captured too.
    preg_match('#\(?(\d{3})\)?[-\h]?(\d{3})[-\h]?(\d{4})\h*(?:e?x?t?(?:ension)?\h*(\d+))?#',$test,$matches);
    $phone = $matches[1] . $matches[2] . $matches[3];
    $extension = !empty($matches[4]) ? $matches[4] : 0;
    if ($phone == '9417516550' 
        && ($extension == '2204' || $extension == '0')) {
            echo "PASS: phone: $phone ext: $extension<br />";
    } else {
         echo "FAIL: phone: $phone ext: $extension<br />";
    }
}

Demo: https://eval.in/561720
Regex101 demo: https://regex101.com/r/mG9iD1/1
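
For readers who find a one-line pattern hard to scan, here is a sketch of the same idea written with the x (extended) flag so that each part can carry a comment. It is a variant rather than the answer's exact pattern: the group names (area, prefix, line, ext) and the helper name parsePhone are made up for illustration, and the extension keyword is spelled out as an explicit alternation.

```php
<?php
// Hypothetical helper: splits a US-style number into 10 digits plus an extension.
// Returns null when no 10-digit number is found.
function parsePhone(string $input): ?array
{
    $pattern = '~
        \(?(?<area>\d{3})\)?      # area code, parentheses optional
        [-\h]?                    # optional hyphen or space
        (?<prefix>\d{3})          # next three digits
        [-\h]?
        (?<line>\d{4})            # last four digits
        \h*
        (?:                       # optional extension part
            (?:e(?:xt(?:ension)?)?|x)   # e, ext, extension or x
            \h*
            (?<ext>\d+)
        )?
    ~xi';

    if (!preg_match($pattern, $input, $m)) {
        return null;
    }

    // An optional group that did not take part in the match is missing or empty.
    $ext = $m['ext'] ?? '';

    return [
        'phone'     => $m['area'] . $m['prefix'] . $m['line'],
        'extension' => $ext !== '' ? $ext : '0',
    ];
}

print_r(parsePhone('(941) 751-6550 x2204'));
print_r(parsePhone('941-751-6550'));
```

Spelling the keyword as an alternation avoids accepting odd mixes such as "et" or "xension", at the cost of a slightly longer pattern.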

Judging from your example, it fails when no extension is found.

One solution is to cast $extension to an int, like this:

$extension = intval($extension); //If nothing found will be 0

After this we can be sure we have an integer, so the if statement can be changed to:

|| $extension === 0)) {
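
To make the cast concrete, here is a small sketch of what intval() does to the strings the question's code can produce for the extension. The helper name normalizeExtension is hypothetical.

```php
<?php
// Hypothetical helper: turn the digits captured for the extension into an int.
function normalizeExtension(string $digits): int
{
    // intval('') is 0, so a missing extension becomes 0.
    return intval($digits);
}

var_dump(normalizeExtension('2204')); // int(2204)
var_dump(normalizeExtension(''));     // int(0)

$extension = normalizeExtension('');
// Strict comparison is safe here because $extension is always an int.
var_dump($extension === 2204 || $extension === 0); // bool(true)
```

Note that the cast only tidies up the extension. In the question's output the phone number also loses its last digit, and that part still has to be fixed in the regex.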

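(.+$) means the line must end with one or more characters. So if nothing follows the phone number, the regex engine backtracks and the first group gives up its last character to satisfy that group, which is why your phone number comes out one character short.

I suggest using (.*$), which means zero or more characters.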
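
The difference is easy to see on a single input. This is a minimal sketch using the question's own pattern and only swapping the quantifier; the variable names $plus and $star are just for illustration.

```php
<?php
// A test string with no extension, matched against both variants of the pattern.
$input = '941-751-6550';

preg_match('#([\(\)\s0-9\-]+)(.+$)#', $input, $plus);
preg_match('#([\(\)\s0-9\-]+)(.*$)#', $input, $star);

// With .+ the second group must take at least one character, so the first
// group backtracks and hands over its final digit.
var_dump($plus[1], $plus[2]); // string(11) "941-751-655", string(1) "0"

// With .* the second group may be empty, so the first group keeps every digit.
var_dump($star[1], $star[2]); // string(12) "941-751-6550", string(0) ""
```

With (.*$) the extension comes back as an empty string when it is absent, so it still has to be mapped to '0' before the comparison.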

This works as expected; I just tested it.

foreach ($tests as $test) {
    preg_match('#([\(\)0-9\-]+\s*[\(\)0-9\-]+)\s*(.*$)#',$test,$matches);
    $phone = preg_replace('#[\-\(\)\s]#','',$matches[1]);
    $extension = ($matches[2] == "") ? '0' : preg_replace('#[^0-9]#','',$matches[2]);
    if ($phone == '9417516550'
        && ($extension == '2204'
            || $extension == '0')) {
                echo "PASS: phone: $phone ext: $extension<br />\n";
    } else {
        echo "FAIL: phone: $phone ext: $extension<br />\n";
    }
}

It makes minimal changes to your code.

$pns = <<< LOL
941-751-6550 ext 2204
(941) 751-6550 ext 2204
(941)751-6550 ext 2204
9417516550 ext 2204
941-751-6550 e 2204
941-751-6550 ext 2204 
941-751-6550 extension 2204
941-751-6550 x2204
(941) 751-6550
(941)7516550
941-751-6550
941-751-6550
LOL;

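// \h* before $ tolerates trailing spaces, so a line like "... ext 2204 " still captures the extension.
preg_match_all('/^([(\d )\-]+)\s?(?:e.*?|x.*?)?(\d+)?\h*$/sim', $pns, $matches, PREG_PATTERN_ORDER);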
for ($i = 0; $i < count($matches[1]); $i++) {
    $phone = preg_replace('#[\-\(\)\s]#','', $matches[1][$i]);
    $extension = preg_replace('#[^0-9]#','', $matches[2][$i]);
    if ($phone == '9417516550' && $extension == '2204') {
             echo "PASS: phone: $phone ext: $extension\n";
    } else {
             echo "FAIL: phone: $phone ext: 0\n";
    }
}

Output:

PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
FAIL: phone: 9417516550 ext: 0
FAIL: phone: 9417516550 ext: 0
FAIL: phone: 9417516550 ext: 0
FAIL: phone: 9417516550 ext: 0

Ideone Demo

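Honestly, you would be better off stripping the non-digit characters and then treating everything after the first 10 characters as the extension. It is conceptually equivalent, but more direct, simpler, and more efficient than running the string through multiple regular expressions, which are comparatively slow.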

foreach($tests as $test){
    $phone = preg_replace("/[^0-9]/", "", $test);
    $extension = substr($phone,10);
    $phone = substr($phone,0,10);
    if(empty($extension)){
         $extension = '0';
    }
    if ($phone == '9417516550'
        && ($extension == '2204'
            || $extension == '0')) {
                echo "PASS: phone: $phone ext: $extension<br />\n";
    } else {
        echo "FAIL: phone: $phone ext: $extension<br />\n";
    }
}

Output:

PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 0
PASS: phone: 9417516550 ext: 0
PASS: phone: 9417516550 ext: 0
PASS: phone: 9417516550 ext: 0
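
The strip-and-slice approach above can also be wrapped in a small function. This is only a sketch: the name splitDigits and the null return for inputs with fewer than 10 digits are assumptions added for illustration, not part of the answer.

```php
<?php
// Hypothetical helper wrapping the strip-and-slice idea.
// Assumes a 10-digit, non-international number comes first in the string.
function splitDigits(string $input): ?array
{
    // Keep digits only.
    $digits = preg_replace('/[^0-9]/', '', $input);

    if (strlen($digits) < 10) {
        return null; // not enough digits for a phone number
    }

    // The cast covers older PHP versions where substr() can return false.
    $extension = (string) substr($digits, 10);

    return [
        'phone'     => substr($digits, 0, 10),
        'extension' => $extension === '' ? '0' : $extension,
    ];
}

print_r(splitDigits('(941) 751-6550 ext 2204'));
print_r(splitDigits('941-751-6550 '));
```

One caveat with this approach: it relies purely on position, so a leading country code such as "1-941-751-6550" would shift the split by one digit.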