Deep (infinite) splitting of words using regex
Suppose I have:
$line = "{This is my {sentence|words} I wrote.}";
Output:
This is my sentence I wrote.
This is my words I wrote.
However, the regex should also match deep and nested values and split them, for example:
$line = "{This is my {sentence|words} I wrote on a {sunny|cold} day.}";
Output:
This is my sentence I wrote on a sunny day.
This is my sentence I wrote on a cold day.
This is my words I wrote on a sunny day.
This is my words I wrote on a cold day.
My first idea was to use explode, as in the code below, but the result is not what I need:
$res = explode("|", $line);
Any suggestions? Thanks.
Edit: something along these lines:
$line = "{This is my {sentence|words} I wrote on a {sunny|cold} day.}";
$regex = "{[^{}]*}";
$match = [];
preg_match($regex, $line, $match);
var_dump($match);
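As already mentioned, the number of groups can be unlimited, so there is no fixed bound; something that works in a for loop would be appropriate.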
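Take a look at this. I did it by replacing your patterns with %s, using vsprintf, and then looping through the matches recursively.
I have added a lot of comments to the code, since wrapping your head around recursion usually takes some mental effort.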
$line = "{This is my {sentence|statement} I {wrote|typed} on a {hot|cold} {day|night}.}";
$matches = getMatches($line);
printWords([], $matches, $line);

// function to find patterns in the line. Takes $line by reference to replace pattern matches with a vsprintf placeholder
function getMatches(&$line) {
    // remove beginning and trailing brackets on the main sentence
    $line = trim($line, '{}');
    // initialize variable that will hold the list of pattern matches
    $matches = null;
    // look for an opening curly brace and skip everything until the ending curly brace
    $pattern = '/\{[^}]+\}/';
    // find all matches and put them in $matches
    preg_match_all($pattern, $line, $matches);
    // preg_match_all nests one level deeper than we need
    $matches = $matches[0];
    // replace all matches with a %s placeholder
    $line = preg_replace($pattern, '%s', $line);
    // split each of the matches by vertical pipe
    foreach ($matches as $index => $match) {
        $matches[$index] = explode('|', trim($match, '{}'));
    }
    return $matches;
}

// recursive function. $args will be used as the second argument to vsprintf
function printWords(array $args, array $matches, $line) {
    // get the first element in the array of $matches, remove it from the array
    $current = array_shift($matches);
    // keep track of the current $args index for this recursive iteration
    $currentArgIndex = count($args);
    // loop through each of the words in the current set of matches
    foreach ($current as $word) {
        // update $args and set the vsprintf argument at this iteration's position to the next word in the set of words
        $args[$currentArgIndex] = $word;
        if (!empty($matches)) {
            // repeat this process (recursively) until we are at the end of the list of matches
            printWords($args, $matches, $line);
        } else {
            // if this is the last match in the line, echo the sentence with all args from previous recursive iterations added
            echo vsprintf($line, $args) . '<br />';
        }
    }
}
Output:
This is my sentence I wrote on a hot day.
This is my sentence I wrote on a hot night.
This is my sentence I wrote on a cold day.
This is my sentence I wrote on a cold night.
This is my sentence I typed on a hot day.
This is my sentence I typed on a hot night.
This is my sentence I typed on a cold day.
This is my sentence I typed on a cold night.
This is my statement I wrote on a hot day.
This is my statement I wrote on a hot night.
This is my statement I wrote on a cold day.
This is my statement I wrote on a cold night.
This is my statement I typed on a hot day.
This is my statement I typed on a hot night.
This is my statement I typed on a cold day.
This is my statement I typed on a cold night.
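Since the question asks for something that works in a loop and copes with nested groups, here is a minimal sketch of an iterative alternative. It is not the approach used above: instead of vsprintf placeholders, it repeatedly expands the innermost {a|b} group of every line until no braces are left. The function name expandAlternatives is hypothetical, and the sketch assumes that a nested group simply contributes its options to the enclosing group, so duplicate results are dropped.

```php
<?php
// Hypothetical helper, not part of the answer above: returns every
// variant of $line as an array instead of echoing it.
function expandAlternatives(string $line): array
{
    // Matches a brace group that contains no other braces, i.e. an innermost one.
    $innermost = '/\{([^{}]*)\}/';
    $lines = [$line];

    do {
        $expandedAny = false;
        $next = [];
        foreach ($lines as $current) {
            if (!preg_match($innermost, $current, $m, PREG_OFFSET_CAPTURE)) {
                // No group left in this line, keep it unchanged.
                $next[] = $current;
                continue;
            }
            $expandedAny = true;
            [$group, $offset] = $m[0];
            // Emit one copy of the line per option in the matched group.
            foreach (explode('|', $m[1][0]) as $option) {
                $next[] = substr_replace($current, $option, $offset, strlen($group));
            }
        }
        $lines = $next;
    } while ($expandedAny);

    // Assumption: nested groups can yield the same sentence twice, so deduplicate.
    return array_values(array_unique($lines));
}

$line = "{This is my {sentence|words} I wrote on a {sunny|cold} day.}";
echo implode(PHP_EOL, expandAlternatives($line)), PHP_EOL;
```

Each pass removes one pair of braces per line, so the loop terminates however many groups there are. The outer wrapper without a pipe is treated as a group with a single option, which is why no separate trim step is needed. Because no format string is involved, a literal % in the sentence is left alone.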