Regex riddle: want results 'onetwothree', 'onetwo', 'twothree' but NOT 'two'. Positive lookahead maybe? For currency extraction
I'm confused about some basic regex logic. Using a simple example:
(one)(two)(three)
I want the regex to capture:
onetwothree
onetwo
twothree
but NOT
two
The captures should be grouped as (one)(two)(three).
I know I can use a positive lookbehind on 'two' so that it only matches when preceded by 'one':
(one)?((?<=one)two)(three)?
But then I can't get the 'twothree' result.
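To make the failure concrete, here is a minimal sketch that runs the attempt above against the four sample strings. The ^ and $ anchors and the $results array are additions for this illustration, not part of the original attempt.

```php
<?php
// The attempt from the question: 'two' only matches when 'one' precedes it.
// The ^ and $ anchors are an assumption added to test whole strings.
$pattern = '/^(one)?((?<=one)two)(three)?$/';

$results = [];
foreach (['onetwothree', 'onetwo', 'twothree', 'two'] as $subject) {
    // preg_match returns 1 on a match and 0 otherwise
    $results[$subject] = preg_match($pattern, $subject);
}
print_r($results); // onetwothree => 1, onetwo => 1, twothree => 0, two => 0
```

The lookbehind rejects 'two' as intended, but it rejects 'twothree' for the same reason: nothing precedes 'two' in either string.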
The real-world need is for currency:
group one: [$¥£€₹]
group two: ((?:\d{1,10}[,. ])*\d{1,10})
group three: ( ?(?:[$¥£€₹]|AUD|USD|GBP|EURO?S?\b))
So I want to get these results:
$20,000 AUD
$20,000
20,000 AUD
but NOT
20,000
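Thanks for the help!
PS: the groups need to match as (one)(two)(three), (one)(two), or (two)(three).
Honestly, I would drop the lookaheads/lookbehinds, define each case separately, and then combine them. It's more reliable, easier to reason about and understand, and also more efficient.
So something like: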
(^(groupone)(grouptwo)$)|(^(groupone)(grouptwo)(groupthree)$)|(^(grouptwo)(groupthree)$)
For example:
$groupone = '[$¥£€₹]';
$grouptwo = '(?:\d{1,10}[,. ])*\d{1,10}';
$groupthree = ' ?([$¥£€₹]|AUD|USD|GBP|EURO)';
$caseone = "^($groupone)($grouptwo)$";
$casetwo = "^($groupone)($grouptwo)($groupthree)$";
$casethree = "^($grouptwo)($groupthree)$";
// The u modifier (added here) makes PCRE read ¥, £, € and ₹ as single UTF-8 characters.
$allcases = "/($caseone)|($casetwo)|($casethree)/u";
preg_match($allcases, '20,000 AUD', $matches);
print_r($matches); // matches, preg_match returns 1
preg_match($allcases, '$20,000', $matches);
print_r($matches); // matches, preg_match returns 1
preg_match($allcases, '$20,000 AUD', $matches);
print_r($matches); // matches, preg_match returns 1
preg_match($allcases, '20,000', $matches);
print_r($matches); // empty, preg_match returns 0
To make the results look nicer (skipping empty results, duplicates, extra whitespace, etc.), I also use a cleanup function:
<?php
$groupone = '[$¥£€₹]';
$grouptwo = '(?:\d{1,10}[,. ])*\d{1,10}';
$groupthree = ' ?([$¥£€₹]|AUD|USD|GBP|EURO)';
$caseone = "^($groupone)($grouptwo)$";
$casetwo = "^($grouptwo)($groupthree)$";
$casethree = "^($groupone)($grouptwo)($groupthree)$";
// The u modifier (added here) makes PCRE read ¥, £, € and ₹ as single UTF-8 characters.
$allcases = "/($caseone)|($casetwo)|($casethree)/u";
function cleanup($arr) {
    # trim leading and trailing whitespace
    $newarr = array_map('trim', $arr);
    # remove empty values
    $newarr = array_filter($newarr, function($value) { return $value !== ''; });
    # remove duplicates
    $newarr = array_unique($newarr);
    # we're only interested in the values
    return array_values($newarr);
}
preg_match($allcases, '20,000 AUD', $matches);
print_r(cleanup($matches));
preg_match($allcases, '$20,000', $matches);
print_r(cleanup($matches));
preg_match($allcases, '$20,000 AUD', $matches);
print_r(cleanup($matches));
preg_match($allcases, '20,000', $matches);
print_r(cleanup($matches));
This gives you results like:
Array
(
[0] => 20,000 AUD
[1] => 20,000
[2] => AUD
)
Array
(
[0] => $20,000
[1] => $
[2] => 20,000
)
Array
(
[0] => $20,000 AUD
[1] => $
[2] => 20,000
[3] => AUD
)
Array
(
)
Edit: if you want the groups to be the same in every case, you can use named groups like this:
$groupone = '(?<currencyprefix>[$¥£€₹])';
$grouptwo = '((?:\d{1,10}[,. ])*\d{1,10})';
$groupthree = ' ?(?<currencypostfix>([$¥£€₹]|AUD|USD|GBP|EURO))';
$caseone = "^$groupone$grouptwo$";
$casetwo = "^$grouptwo$groupthree$";
$casethree = "^$groupone$grouptwo$groupthree$";
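// (?J) allows duplicate group names; the u modifier (added here) makes PCRE read the currency symbols as UTF-8 characters.
$allcases = "/(?J)($caseone)|($casetwo)|($casethree)/u";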
function cleanup($arr) {
    $currencyprefix = isset($arr['currencyprefix']) ? $arr['currencyprefix'] : null;
    $currencypostfix = isset($arr['currencypostfix']) ? $arr['currencypostfix'] : null;
    return array($currencyprefix, $currencypostfix);
}
if (preg_match($allcases, '20,000 AUD', $matches))
    print_r(cleanup($matches));
if (preg_match($allcases, '$20,000', $matches))
    print_r(cleanup($matches));
if (preg_match($allcases, '$20,000 AUD', $matches))
    print_r(cleanup($matches));
if (preg_match($allcases, '20,000', $matches))
    print_r(cleanup($matches));
Which gives you:
Array
(
[0] =>
[1] => AUD
)
Array
(
[0] => $
[1] =>
)
Array
(
[0] => $
[1] => AUD
)
Alternatively, you can keep the named keys in the final result:
$groupone = '(?<currencyprefix>[$¥£€₹])';
$grouptwo = '((?:\d{1,10}[,. ])*\d{1,10})';
$groupthree = ' ?(?<currencypostfix>([$¥£€₹]|AUD|USD|GBP|EURO))';
$caseone = "^$groupone$grouptwo$";
$casetwo = "^$grouptwo$groupthree$";
$casethree = "^$groupone$grouptwo$groupthree$";
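// (?J) allows duplicate group names; the u modifier (added here) makes PCRE read the currency symbols as UTF-8 characters.
$allcases = "/(?J)($caseone)|($casetwo)|($casethree)/u";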
function cleanup($arr) {
    # drop empty values
    $newarr = array_filter($arr, function($var){ return !empty($var); });
    # keep only the named (string) keys
    return array_filter($newarr, "is_string", ARRAY_FILTER_USE_KEY);
}
if (preg_match($allcases, '20,000 AUD', $matches))
    print_r(cleanup($matches));
if (preg_match($allcases, '$20,000', $matches))
    print_r(cleanup($matches));
if (preg_match($allcases, '$20,000 AUD', $matches))
    print_r(cleanup($matches));
if (preg_match($allcases, '20,000', $matches))
    print_r(cleanup($matches));
Result:
Array
(
[currencypostfix] => AUD
)
Array
(
[currencyprefix] => $
)
Array
(
[currencyprefix] => $
[currencypostfix] => AUD
)
Here is a pure regex answer:
(ONE)?TWO(?(1)(THREE)?|THREE)
Using a conditional, you can check whether the first group matched and, if it did not, make the last group mandatory.
(ONE)?TWO(?(1)(THREE)?|THREE)
^     ^  ^^^^^ ^       ^
1     2  3     4       5
1: Try to match ONE. If you can't find it, no big deal.
2: You absolutely must have TWO.
3: If the first group (ONE) DID match, then...
4: ... use the first branch, where THREE is optional.
5: Otherwise use the second branch, where THREE is required.
With this, ONE stays optional: if we matched ONE, then THREE is optional; if we missed ONE, then THREE is mandatory.
ONE
TWO
THREE
ONETWO // Matches! (e.g. $20,000)
ONETHREE
TWOTHREE // Matches! (e.g. 20,000 AUD)
ONETWOTHREE // Matches! (e.g. $20,000 AUD)
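The list above can be checked with a short script. This is only a sketch: the matching() helper, the ^ and $ anchors, and the currency pattern (assembled from the three groups in the question, with the u modifier added) are assumptions for illustration rather than part of the original answer.

```php
<?php
// Hypothetical helper: returns the subjects that a pattern matches.
function matching(string $pattern, array $subjects): array
{
    return array_values(array_filter(
        $subjects,
        function ($subject) use ($pattern) {
            return preg_match($pattern, $subject) === 1;
        }
    ));
}

// Toy version. The ^ and $ anchors are added here to test whole strings.
$toy = '/^(ONE)?TWO(?(1)(THREE)?|THREE)$/';
$toyMatches = matching(
    $toy,
    ['ONE', 'TWO', 'THREE', 'ONETWO', 'ONETHREE', 'TWOTHREE', 'ONETWOTHREE']
);
print_r($toyMatches); // ONETWO, TWOTHREE, ONETWOTHREE

// Currency version, assembled from the three groups in the question.
$one   = '[$¥£€₹]';
$two   = '(?:\d{1,10}[,. ])*\d{1,10}';
$three = ' ?(?:[$¥£€₹]|AUD|USD|GBP|EURO?S?\b)';
// If group 1 (the prefix symbol) matched, the suffix is optional;
// otherwise it is required. The u modifier reads the symbols as UTF-8.
$currency = "/^($one)?($two)(?(1)($three)?|($three))\$/u";
$currencyMatches = matching(
    $currency,
    ['$20,000 AUD', '$20,000', '20,000 AUD', '20,000']
);
print_r($currencyMatches); // $20,000 AUD, $20,000, 20,000 AUD
```

Note that the suffix lands in a different numbered group depending on which branch of the conditional was taken, so named groups may be more convenient if you need the captured values.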
You can use an alternation with an optional group.
\bonetwo(?:three)?|twothree\b
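As a quick check of that pattern, here is a minimal sketch run against the four sample strings from the question. The $found array is a name introduced only for this illustration.

```php
<?php
// Alternation: either 'onetwo' with an optional 'three', or 'twothree'.
$pattern = '/\bonetwo(?:three)?|twothree\b/';

$found = [];
foreach (['onetwothree', 'onetwo', 'twothree', 'two'] as $subject) {
    // Store the matched text, or null when there is no match
    $found[$subject] = preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}
print_r($found);
```

Each of the first three strings matches in full, while 'two' on its own fits neither alternative and yields null.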
An example with named capture groups and the J flag to allow duplicate subpattern names:
(?P<symbol>[$¥£€₹])(?P<amount>\d{1,10}(?:[.,]\d{1,10})?)(?:\h+(?P<currency>AUD|USD|GBP|EURO?S?)\b)?\b|(?P<amount>\d{1,10}(?:[.,]\d{1,10})?)\h+(?P<currency>AUD|USD|GBP|EURO?S?)\b
$strings = [
"$20,000 AUD",
"$20,000",
"20,000 AUD",
"$",
"20,000",
"AUD"
];
// The u modifier (added here) makes PCRE read ¥, £, € and ₹ as single UTF-8 characters.
$re = '/(?P<symbol>[$¥£€₹])(?P<amount>\d{1,10}(?:[.,]\d{1,10})?)(?:\h+(?P<currency>AUD|USD|GBP|EURO?S?)\b)?\b|(?P<amount>\d{1,10}(?:[.,]\d{1,10})?)\h+(?P<currency>AUD|USD|GBP|EURO?S?)\b/Ju';
foreach ($strings as $s) {
$m = preg_match($re, $s, $matches);
if ($m) {
print_r($matches);
}
}
Output:
Array
(
[0] => $20,000 AUD
[symbol] => $
[1] => $
[amount] => 20,000
[2] => 20,000
[currency] => AUD
[3] => AUD
)
Array
(
[0] => $20,000
[symbol] => $
[1] => $
[amount] => 20,000
[2] => 20,000
)
Array
(
[0] => 20,000 AUD
[symbol] =>
[1] =>
[amount] => 20,000
[2] =>
[currency] => AUD
[3] =>
[4] => 20,000
[5] => AUD
)
Or see an example without the numerical keys, which gives:
Array
(
[symbol] => $
[amount] => 20,000
[currency] => AUD
)
Array
(
[symbol] => $
[amount] => 20,000
)
Array
(
[symbol] =>
[amount] => 20,000
[currency] => AUD
)
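One way to drop the numerical keys, sketched here as an assumption since the linked example is not reproduced in this thread, is to filter the match array by key type. The namedOnly() helper and the $named array are hypothetical names, and the u modifier is added on the assumption that the input is UTF-8.

```php
<?php
// The pattern from the example above; J allows the duplicate group names,
// u (an addition here) reads the currency symbols as UTF-8 characters.
$re = '/(?P<symbol>[$¥£€₹])(?P<amount>\d{1,10}(?:[.,]\d{1,10})?)(?:\h+(?P<currency>AUD|USD|GBP|EURO?S?)\b)?\b|(?P<amount>\d{1,10}(?:[.,]\d{1,10})?)\h+(?P<currency>AUD|USD|GBP|EURO?S?)\b/Ju';

// Hypothetical helper: keep only the named (string) keys of a match array.
function namedOnly(array $matches): array
{
    return array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
}

$named = [];
foreach (['$20,000 AUD', '$20,000', '20,000 AUD', '20,000'] as $s) {
    if (preg_match($re, $s, $matches)) {
        $named[$s] = namedOnly($matches);
        print_r($named[$s]);
    }
}
```

Filtering on the key rather than the value keeps an empty symbol entry in place, which matches the third array shown above.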