Match a numeric substring that is preceded or followed by one of two specific substrings
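I have a program that picks an amount out of a string when the amount is followed by Kč or CZK. How can I edit the expression (pattern) so that it also checks whether Kč or CZK comes before the number? See $string1 and $string2: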
$string='Rohlík 4,99 Kč 51235';
//$string1='Rohlík CZK 4,99 51235';
//$string2='Rohlík Kč4,99 51235';
$replace = [' ', '.'];
$string = str_replace($replace,"",$string);
$string = str_replace(',',".",$string);
/*Change?*/
$pattern = '/[0-9]*[.]?[0-9]*[Kč,CZK]/';
preg_match($pattern, $string, $matches); // => 4.99 Kč
$string = $matches;
$pattern = '/[0-9]*[.]?[0-9]*/';
preg_match($pattern, $string[0], $matches);
$price = $matches[0];
print_r($price); // => 4.99
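Use logical grouping in your pattern to match a label that may come either before or after the targeted number. (You can replace the comma with a dot after this step.)
Code: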
$strings = [
    'Rohlík 4,99 Kč 51235',
    'Rohlík CZK 4,99 51235',
    'Rohlík Kč4,99 51235',
    'Rohlík foo4,99 51235'
];
foreach ($strings as $string) {
    var_export(
        preg_match('/\b(?:(?:Kč|CZK) ?\K\d+(?:,\d+)?|\d+(?:,\d+)?(?= ?(?:Kč|CZK)))\b/u', $string, $m)
            ? $m[0]
            : 'not found'
    );
    echo "\n";
}
Output:
'4,99'
'4,99'
'4,99'
'not found'
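The note about swapping the comma for a dot after matching can be sketched as a small helper. This is only an illustration: the function name `extractPrice` and the choice to return `null` when nothing matches are assumptions, not part of the original answer.

```php
<?php
// Hypothetical helper: returns the labelled amount as a float,
// or null when no amount next to "Kč" or "CZK" is found.
function extractPrice(string $text): ?float
{
    // Same pattern as in the answer above.
    $pattern = '/\b(?:(?:Kč|CZK) ?\K\d+(?:,\d+)?|\d+(?:,\d+)?(?= ?(?:Kč|CZK)))\b/u';

    if (!preg_match($pattern, $text, $m)) {
        return null;
    }

    // Replace the decimal comma only after the match, so the
    // input string itself never has to be modified.
    return (float) str_replace(',', '.', $m[0]);
}

var_export(extractPrice('Rohlík CZK 4,99 51235'));
echo "\n";
var_export(extractPrice('Rohlík foo4,99 51235'));
echo "\n";
```

Converting after the match avoids the up-front `str_replace()` calls from the question, which strip every space and dot and so glue the amount to the trailing number.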
Pattern breakdown:
/                   #starting pattern delimiter
\b                  #word boundary so the label or number is not matched mid-word
(?:                 #start non-capturing group 1
(?:Kč|CZK) ?        #non-capturing group 2 requiring one of the two labels, optionally followed by a space
\K                  #forget all previously matched characters
\d+(?:,\d+)?        #match the targeted integer/float value with comma as decimal separator
|                   #OR
\d+(?:,\d+)?        #match the targeted integer/float value with comma as decimal separator
(?= ?(?:Kč|CZK))    #lookahead for an optional space followed by one of the two labels
)                   #close non-capturing group 1
\b                  #word boundary to guarantee matching the whole number
/                   #ending pattern delimiter
u                   #unicode/multi-byte flag
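The same breakdown can also be kept inside the code by adding the `x` (extended) flag, which lets the pattern span several lines with inline comments. This is a sketch rather than part of the original answer. Under `x`, unescaped whitespace in the pattern is ignored, so the optional space is written as `[ ]?`, and the comments must not contain the `/` delimiter.

```php
<?php
$pattern = '/
    \b                      # word boundary before the label or the number
    (?:
        (?:Kč|CZK)[ ]?      # one of the two labels, optionally followed by a space
        \K                  # drop the label from the reported match
        \d+(?:,\d+)?        # amount with an optional decimal comma
      |
        \d+(?:,\d+)?        # amount with an optional decimal comma
        (?=[ ]?(?:Kč|CZK))  # must be followed by an optional space and a label
    )
    \b                      # word boundary after the number
/xu';

$strings = [
    'Rohlík 4,99 Kč 51235',
    'Rohlík CZK 4,99 51235',
    'Rohlík Kč4,99 51235',
    'Rohlík foo4,99 51235',
];

$results = [];
foreach ($strings as $string) {
    // preg_match() returns 1 on a match and 0 when nothing is found.
    $results[] = preg_match($pattern, $string, $m) ? $m[0] : 'not found';
}

print_r($results);
```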