How to extract an ID number from a string?

How can I retrieve a value from the middle of a string using a regular expression or preg_match?

$str = 'fxs_124024574287414=base_domain=.example.com; datr=KWHazxXEIkldzBaVq_of--syv5; csrftoken=szcwad; ds_user_id=219132; mid=XN4bpAAEAAHOyBRR4V17xfbaosyN; sessionid=14811313756%12fasda%3A27; rur=VLL;';

How can I get only the value of ds_user_id using a regular expression or preg_match?

If we want to use explode:

$str = 'fxs_124024574287414=base_domain=.example.com; datr=KWHazxXEIkldzBaVq_of--syv5; csrftoken=szcwad; ds_user_id=219132; mid=XN4bpAAEAAHOyBRR4V17xfbaosyN; sessionid=14811313756%12fasda%3A27; rur=VLL;';

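// Split the cookie string into "key=value" pairs.
$arr = explode(';', $str);

foreach ($arr as $key => $value) {
    // Find the pair that contains the ds_user_id key.
    if (preg_match('/ds_user_id/s', $value)) {
        // Split that pair on '=' and print the value part.
        $ds_user_id = explode('=', $value);
        echo $ds_user_id[1];
    }
}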

Output

219132
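
As a variation on the explode approach, here is a minimal sketch that parses every pair into an associative array and then looks the key up directly. The $cookies variable name is my own choice. Limiting the inner explode to 2 parts is an addition that keeps values containing '=' (such as the first pair in this string) intact.

```php
<?php
$str = 'fxs_124024574287414=base_domain=.example.com; datr=KWHazxXEIkldzBaVq_of--syv5; csrftoken=szcwad; ds_user_id=219132; mid=XN4bpAAEAAHOyBRR4V17xfbaosyN; sessionid=14811313756%12fasda%3A27; rur=VLL;';

$cookies = []; // hypothetical name: maps each key to its value
foreach (explode(';', $str) as $pair) {
    $pair = trim($pair);          // drop the space that follows each ';'
    if ($pair === '') {
        continue;                 // skip the empty piece after the trailing ';'
    }
    // Split on the first '=' only; default the value to '' if there is no '='.
    [$name, $value] = explode('=', $pair, 2) + [1 => ''];
    $cookies[$name] = $value;
}

echo $cookies['ds_user_id'] ?? 'no match';
```

This does more work than a single regex, but it may be handy if several values are needed from the same string.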

Here we can also use two non-capturing groups and one capturing group:

(?:ds_user_id=)(.+?)(?:;)

We have a left boundary:

(?:ds_user_id=)

and a right boundary:

(?:;)

and between them we collect the digits we want, or anything else we wish to capture:

(.+?)

If we want to validate the ID number, we can restrict the capture to digits:

(?:ds_user_id=)([0-9]+?)(?:;)

The value we want is then available as $matches[0][1], for example via var_dump($matches[0][1]);

Test

$re = '/(?:ds_user_id=)(.+?)(?:;)/m';
$str = 'fxs_124024574287414=base_domain=.example.com; datr=KWHazxXEIkldzBaVq_of--syv5; csrftoken=szcwad; ds_user_id=219132; mid=XN4bpAAEAAHOyBRR4V17xfbaosyN; sessionid=14811313756%12fasda%3A27; rur=VLL;';

preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);

// Print the entire match result
var_dump($matches);

Output

array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(18) "ds_user_id=219132;"
    [1]=>
    string(6) "219132"
  }
}
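
If only one occurrence is expected, preg_match with the digits-only pattern from above is probably enough. This is a minimal sketch: the 'no match' fallback text is my own choice, and capture group 1 holds the ID.

```php
<?php
$str = 'fxs_124024574287414=base_domain=.example.com; datr=KWHazxXEIkldzBaVq_of--syv5; csrftoken=szcwad; ds_user_id=219132; mid=XN4bpAAEAAHOyBRR4V17xfbaosyN; sessionid=14811313756%12fasda%3A27; rur=VLL;';

// Digits-only variant: left boundary, captured digits, right boundary.
$re = '/(?:ds_user_id=)([0-9]+?)(?:;)/';

if (preg_match($re, $str, $m)) {
    echo $m[1];        // group 1 is the ID without the boundaries
} else {
    echo 'no match';
}
```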

OK, nothing beats mickmackusa's \K construct.
However, for engines that lack \K support, this is the next best option:

(\d(?<=ds_user_id=\d)\d*)(?=;)

Explained

 (                          # (1 start), Consume many ID digits
      \d                         # First digit of ID
      (?<= ds_user_id= \d )      # Look behind, assert ID key exists before digit
      \d*                        # Optionally, the rest of the digits
 )                          # (1 end)
 (?= ; )                    # Look ahead, assert a semicolon exists
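
The annotated form can be run more or less as written in PHP by adding the x modifier, which ignores whitespace and # comments inside the pattern. The following is a sketch under that assumption; the ~ delimiter and the $id variable are my own choices.

```php
<?php
$str = 'fxs_124024574287414=base_domain=.example.com; datr=KWHazxXEIkldzBaVq_of--syv5; csrftoken=szcwad; ds_user_id=219132; mid=XN4bpAAEAAHOyBRR4V17xfbaosyN; sessionid=14811313756%12fasda%3A27; rur=VLL;';

// Free-spacing form of the lookbehind pattern (x modifier).
$re = '~
    (                          # group 1: the ID digits
        \d                     # first digit of the ID
        (?<= ds_user_id= \d )  # look behind: the key must precede that digit
        \d*                    # the rest of the digits, if any
    )
    (?= ; )                    # look ahead: a semicolon must follow
~x';

$id = preg_match($re, $str, $m) ? $m[1] : null;
var_dump($id);
```

Digits elsewhere in the string (for example in the fxs_ key or the sessionid value) are rejected because the lookbehind fails at each of them.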

Here is a backtracking-verb solution (without \K) that is roughly 30% faster than the lookbehind version above:

 (                             # (1 start), Consume many ID digits
      \d                            # First digit of ID
      (?:
           (?<! ds_user_id= \d )         # Look behind, if not ID,
           \d*                           # get rest of digits
           (*SKIP)                       # Fail, then start after this
           (?!)
        |  
           \d*                           # Rest of ID digits
      )
 )                             # (1 end)
 (?= ; )                       # Look ahead, assert a semicolon exists
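
To check that the verb version skips every other run of digits, here is a small sketch using preg_match_all with the compact form of the same pattern. It assumes a PCRE build that supports (*SKIP), which PHP's preg functions normally provide.

```php
<?php
$str = 'fxs_124024574287414=base_domain=.example.com; datr=KWHazxXEIkldzBaVq_of--syv5; csrftoken=szcwad; ds_user_id=219132; mid=XN4bpAAEAAHOyBRR4V17xfbaosyN; sessionid=14811313756%12fasda%3A27; rur=VLL;';

// Compact form of the verb pattern: digit runs that do not follow
// "ds_user_id=" are consumed and then discarded via (*SKIP)(?!).
$re = '/(\d(?:(?<!ds_user_id=\d)\d*(*SKIP)(?!)|\d*))(?=;)/';

$count = preg_match_all($re, $str, $m);
echo $count, ' match: ', $m[1][0] ?? 'none';
```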

Some benchmarks for comparison

Regex1:   (\d(?:(?<!ds_user_id=\d)\d*(*SKIP)(?!)|\d*))(?=;)
Options:  < none >
Completed iterations:   50  /  50     ( x 1000 )
Matches found per iteration:   1
Elapsed Time:    0.53 s,   534.47 ms,   534473 µs
Matches per sec:   93,550


Regex2:   (\d(?<=ds_user_id=\d)\d*)(?=;)
Options:  < none >
Completed iterations:   50  /  50     ( x 1000 )
Matches found per iteration:   1
Elapsed Time:    0.80 s,   796.97 ms,   796971 µs
Matches per sec:   62,737


Regex3:   ds_user_id=\K\d+(?=;)
Options:  < none >
Completed iterations:   50  /  50     ( x 1000 )
Matches found per iteration:   1
Elapsed Time:    0.21 s,   214.55 ms,   214549 µs
Matches per sec:   233,046


Regex4:   ds_user_id=(\d+)(?=;)
Options:  < none >
Completed iterations:   50  /  50     ( x 1000 )
Matches found per iteration:   1
Elapsed Time:    0.23 s,   231.23 ms,   231233 µs
Matches per sec:   216,232
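
For anyone who wants to repeat this kind of comparison in PHP, here is a rough sketch. The loop, the labels and the use of hrtime() (PHP 7.3+) are my own assumptions and not how the figures above were produced, so the absolute timings will differ from machine to machine.

```php
<?php
$str = 'fxs_124024574287414=base_domain=.example.com; datr=KWHazxXEIkldzBaVq_of--syv5; csrftoken=szcwad; ds_user_id=219132; mid=XN4bpAAEAAHOyBRR4V17xfbaosyN; sessionid=14811313756%12fasda%3A27; rur=VLL;';

// The four patterns from the benchmark listing, with hypothetical labels.
$patterns = [
    'verb (*SKIP)' => '/(\d(?:(?<!ds_user_id=\d)\d*(*SKIP)(?!)|\d*))(?=;)/',
    'lookbehind'   => '/(\d(?<=ds_user_id=\d)\d*)(?=;)/',
    '\K'           => '/ds_user_id=\K\d+(?=;)/',
    'capture'      => '/ds_user_id=(\d+)(?=;)/',
];

$iterations = 50000; // assumption: mirrors the 50 x 1000 runs listed above
$results = [];

foreach ($patterns as $label => $re) {
    $start = hrtime(true);                        // monotonic clock, nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        preg_match($re, $str, $m);
    }
    $elapsedMs = (hrtime(true) - $start) / 1e6;
    $results[$label] = end($m);                   // group 1 if present, else the whole match
    printf("%-14s %9.2f ms  => %s\n", $label, $elapsedMs, $results[$label]);
}
```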

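Use preg_match to match ds_user_id=, then use \K to forget those matched characters, then match one or more digits. No capture groups, no lookarounds, no parsing of all the key-value pairs, no exploding.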

Code:

$str = 'fxs_124024574287414=base_domain=.example.com; datr=KWHazxXEIkldzBaVq_of--syv5; csrftoken=szcwad; ds_user_id=219132; mid=XN4bpAAEAAHOyBRR4V17xfbaosyN; sessionid=14811313756%12fasda%3A27; rur=VLL;';
echo preg_match('~ds_user_id=\K\d+~', $str, $out) ? $out[0] : 'no match';

Output:

219132
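
If the same lookup is needed for other keys, the \K idea can be wrapped in a small helper. This is only a sketch: cookie_value is a hypothetical function name, and matching up to the next semicolon instead of digits only is my own generalization.

```php
<?php
// Hypothetical helper: returns the value stored under $key, or null if absent.
function cookie_value(string $str, string $key): ?string
{
    // Anchor the key at the start of the string or right after a ';',
    // then use \K to drop everything matched so far from the result.
    $re = '~(?:^|;\s*)' . preg_quote($key, '~') . '=\K[^;]*~';
    return preg_match($re, $str, $out) ? $out[0] : null;
}

$str = 'fxs_124024574287414=base_domain=.example.com; datr=KWHazxXEIkldzBaVq_of--syv5; csrftoken=szcwad; ds_user_id=219132; mid=XN4bpAAEAAHOyBRR4V17xfbaosyN; sessionid=14811313756%12fasda%3A27; rur=VLL;';

echo cookie_value($str, 'ds_user_id') ?? 'no match';
```

Anchoring on the start of the string or a preceding semicolon avoids matching a key that merely ends with the requested name.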