Extract a number from a string that occurs between a letter and a tag

Question

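My online diary has a MySQL text field that sometimes contains text of the form D<num> <tag>, for example D109 MU.

These references can appear anywhere in the field, so the content might be: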

D109 MU, worked from home today
Walked the dog, later took the kids to swimming. D110 MU. Went to the gym in the evening for the 9th time this month.

I have written an SQL query that pulls out the entries containing a D<num> <tag> reference, for example by going to the URL:

example.com/tidy.php?v1=7346&v2=90000&tag=MU

The query string values are used to pull the data out of the field:

$config = HTMLPurifier_Config::createDefault();
$purifier = new HTMLPurifier($config);

if (!empty($_GET['v1'])) {
    $v1 = $purifier->purify($_GET['v1']);
}

if (!empty($_GET['v2'])) {
    $v2 = $purifier->purify($_GET['v2']);
}

if (!empty($_GET['tag'])) {
    $tag = $purifier->purify($_GET['tag']);
}

$sql = "select id, post_date, post_content from tbl_log_days where id between :v1 and :v2 and post_content REGEXP :exp ";
$stmt = $pdo->prepare($sql);
$stmt->bindParam(':v1', $v1);
$stmt->bindParam(':v2', $v2);
$stmt->bindValue(":exp" , "D[0-9]+ $tag", PDO::PARAM_STR); 
$stmt->execute();

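This all works fine, so I get the relevant post_content entries.

However, I'm struggling with the syntax for extracting just the number from the D part of the content.

I've got this far: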

while ($row = $stmt->fetch()){

    $id = $row['id'];
    $dt = $row['post_date'];
    $pc = $row['post_content'];

    preg_match_all('/\d+/', $pc, $matches);
    $number = implode(' ', $matches[0]);

    echo "$number <hr>";

}

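The problem is that the content often contains several numbers, but I only want the number that appears between D and the tag value. So for D109 MU I want to extract 109, and in the second example I want to extract 110 from D110 MU while ignoring the 9 that appears later in the same field.

How can I do this?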

Answer 1

Assuming the tag is always MU:

$re = '/D(\d+) MU/'; // \d+ requires at least one digit; \d* would also match "D MU" with an empty capture
// If the tag is not always MU but is always two uppercase letters, use this pattern instead:
// $re = '/D(\d+) [A-Z]{2}/';

$str = 'D109 MU, worked from home today
Walked the dog, later took the kids to swimming. D110 MU. Went to the gym in the evening for the 9th time this month.';

preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);

// Print the entire match result
var_dump($matches);

$matches will contain the numbers you need. The output is as follows:

array(2) {
    [0]=>
        array(2) {
            [0]=>
                string(7) "D109 MU"
            [1]=>
                string(3) "109"
        }
    [1]=>
        array(2) {
            [0]=>
                string(7) "D110 MU"
            [1]=>
                string(3) "110"
        }
}
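If only the numbers are wanted rather than the nested match sets, the capture group can be pulled out of the result. The following is a minimal sketch, assuming the tag is always MU as above; the `$numbers` variable is introduced here purely for illustration.

```php
<?php
$re = '/D(\d+) MU/';

$str = 'D109 MU, worked from home today
Walked the dog, later took the kids to swimming. D110 MU. Went to the gym in the evening for the 9th time this month.';

preg_match_all($re, $str, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER each entry is [full match, group 1];
// array_column() keeps only index 1, i.e. the captured digits.
$numbers = array_column($matches, 1);

echo implode(' ', $numbers), "\n"; // prints "109 110"
```

Calling preg_match_all() without PREG_SET_ORDER would give the same list directly in $matches[1].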

Answer 2

You don't say whether MU is a reliable string to match, so I'm leaving it out. Match the D, restart the full-string match with \K, then match one or more digits.

Code:

$string = 'D109 MU, worked from home today
Walked the dog, later took the kids to swimming. D110 MU. Went to the gym in the evening for the 9th time this month.';

var_export(preg_match_all('~D\K\d+~', $string, $out) ? $out[0] : 'fail');

Output:

array (
  0 => '109',
  1 => '110',
)

Extension: If you need to increase the pattern accuracy by adding the known tag value, you can add the $tag variable to the pattern as a lookahead.

Code:

$tag = "MU";
$string = 'D109 MU, worked from home today
Walked the dog, later took the kids to swimming. D110 MU. Went to the gym in the evening for the 9th time this month.';

var_export(preg_match_all("~D\K\d+(?= $tag)~", $string, $out) ? $out[0] : 'fail');
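Because $tag comes from the query string in the question, it may be worth escaping it before it goes into the pattern, and anchoring both ends with word boundaries. This is a hedged sketch rather than part of the original answer: the extra test phrases (AD111 MU, D112 MUX) and the `$pattern` variable are made up for illustration.

```php
<?php
$tag = 'MU';
$string = 'D109 MU, worked from home today. AD111 MU is not a reference, and neither is D112 MUX. D110 MU.';

// preg_quote() escapes regex metacharacters in the tag; its second argument
// also escapes the "~" delimiter. The leading \b rejects "AD111" and the
// trailing \b rejects longer tags such as "MUX".
$pattern = '~\bD\K\d+(?= ' . preg_quote($tag, '~') . '\b)~';

preg_match_all($pattern, $string, $out);

var_export($out[0]);
```

With this input, $out[0] holds '109' and '110' only. The word boundaries are an assumption about how the references are written; drop them if a tag can be followed directly by letters or digits.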

Furthermore, if your strings only contain one qualifying <num>, then preg_match() will suffice.

Code:

$tag = "MU";
$strings = [
    'D109 MU, worked from home today',
    'Walked the dog, later took the kids to swimming. D110 MU. Went to the gym in the evening for the 9th time this month.'
];

foreach ($strings as $string) {
    echo "\n---\n" , preg_match("~D\K\d+(?= $tag)~", $string, $out) ? $out[0] : 'fail';
}

Output:

---
109
---
110
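To tie this back to the loop in the question, the same preg_match() call can run once per fetched row. The sketch below assumes at most one qualifying reference per row; the `$rows` array and its id values are hypothetical stand-ins for what `$stmt->fetch()` would return, and `$numbersById` is a name invented for this example.

```php
<?php
// Hypothetical rows standing in for the results of $stmt->fetch().
$rows = [
    ['id' => 7346, 'post_content' => 'D109 MU, worked from home today'],
    ['id' => 7347, 'post_content' => 'Walked the dog, later took the kids to swimming. D110 MU. Went to the gym in the evening for the 9th time this month.'],
    ['id' => 7348, 'post_content' => 'Nothing tagged today, just 3 cups of coffee.'],
];

$tag = 'MU';
$pattern = '~D\K\d+(?= ' . preg_quote($tag, '~') . ')~';

$numbersById = [];
foreach ($rows as $row) {
    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    if (preg_match($pattern, $row['post_content'], $out) === 1) {
        $numbersById[$row['id']] = (int) $out[0];
    }
}

var_export($numbersById);
```

Rows without a D<num> <tag> reference are simply skipped, so the stray 9 and 3 in the sample text never reach the result.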

php

regex

preg-match-all