PHP Email extractor with spaces

I have a function that extracts email addresses from a string:

$extractor = function ($str) {
    $str = str_replace('[at]', '@', $str);
    $str = str_replace('(dot)', '.', $str);

    $regexp = '/([a-z0-9_\.\-])+\@(([a-z0-9\-])+\.)+([a-z0-9]{2,4})+/i';
    preg_match_all($regexp, $str, $m);

    return isset($m[0]) ? $m[0] : [];
};

$test_string = 'This is a test string...

        test1@example.org

        Test different formats:
        test2@example.org;
        <a href="test3@example.org">foobar</a>
        <test4@example.org>

        strange formats:
        test5@example.org
        test6[at]example.org
        test7@example.net.org.com
        test8@ example.org
        test9@!foo!.org
        test10.abc [at] hello (dot) com

        foobar
';

dd($extractor($test_string));

It fails to extract the following emails because of the spaces before/after @ or [at]:

test8@ example.org
test10.abc [at] hello (dot) com

How can I ignore those spaces in the regex? Thanks.

I suggest pre-processing the input a bit more and removing any whitespace around [at], (dot), and @, i.e. replace

$str = str_replace('[at]', '@', $str);
$str = str_replace('(dot)', '.', $str);

with
$str = preg_replace('/\s*(?:\[at]|@)\s*/', '@', $str); // replace [at] or @ with any amount of spaces before and after with @
$str = preg_replace('/\s*\(dot\)\s*/', '.', $str); // replace (dot) with any amount of spaces before and after with .
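
To see what these two replacements do on their own, here is a minimal sketch that applies them to the two problem inputs from the question. The helper name `normalizeObfuscatedEmail` is made up for illustration and is not part of the original code.

```php
<?php
// Hypothetical helper (name chosen for this sketch only): collapses
// "[at]" / "@" and "(dot)" together with any surrounding whitespace.
function normalizeObfuscatedEmail(string $str): string
{
    // "[at]" or "@", with optional whitespace on both sides, becomes "@"
    $str = preg_replace('/\s*(?:\[at]|@)\s*/', '@', $str);
    // "(dot)", with optional whitespace on both sides, becomes "."
    $str = preg_replace('/\s*\(dot\)\s*/', '.', $str);

    return $str;
}

// The two inputs the original extractor missed.
echo normalizeObfuscatedEmail('test8@ example.org'), PHP_EOL;              // test8@example.org
echo normalizeObfuscatedEmail('test10.abc [at] hello (dot) com'), PHP_EOL; // test10.abc@hello.com
```

Note that `\s` also matches line breaks, so an `@` at the very end of one line would be joined to the first word of the next line. That does not affect the test string here, but it may matter for other inputs.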

See the PHP demo:

$extractor = function ($str) {
    $str = preg_replace('/\s*(?:\[at]|@)\s*/', '@', $str);
    $str = preg_replace('/\s*\(dot\)\s*/', '.', $str);
    $regexp = '/\b[a-z0-9_.-]+@(?:[a-z0-9-]+\.)+[a-z0-9]{2,4}\b/i';
    preg_match_all($regexp, $str, $m);
    return isset($m[0]) ? $m[0] : [];
};

Output:

Array
(
    [0] => test1@example.org
    [1] => test2@example.org
    [2] => test3@example.org
    [3] => test4@example.org
    [4] => test5@example.org
    [5] => test6@example.org
    [6] => test7@example.net.org.com
    [7] => test8@example.org
    [8] => test10.abc@hello.com
)