PHP: How to split a string by dash and everything between brackets. (preg_split or preg_match)

I've been puzzling over this for a few days now, but I can't seem to get the result I want.

Example:

$var = "Some Words - Other Words (More Words) Dash-Binded-Word";

Desired result:

array(
[0] => Some Words
[1] => Other Words
[2] => More Words
[3] => Dash-Binded-Word
)

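I can get most of this working with preg_match_all, but then "Dash-Binded-Word" gets broken apart as well. Trying to match the dash together with its surrounding spaces doesn't work, because that breaks up all the words except the dash-bound ones.

The preg_match_all statement I'm using (which also breaks up the dash-bound words) is this: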

preg_match_all('#\(.*?\)|\[.*?\]|[^?!\-|\(|\[]+#', $var, $array);

I'm certainly no expert on preg_match or preg_split, so any help here would be greatly appreciated.

You can split on:

/\s*(?<!\w(?=.\w))[\-[\]()]\s*/

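Explanation:

  1. It tries to match the character class [\-[\]()] (any one of these characters). You can add other separator characters to this class.
  2. It uses a negative lookbehind, (?<!\w ...), for the condition "not preceded by a word character".
  3. Nested inside it is a lookahead, (?=.\w), which narrows that condition: the split is only refused when the separator is preceded by a word character and is also followed by a word character (the . stands for the separator itself). That is what keeps Dash-Binded-Word in one piece.
  4. The \s* at the beginning and end trims the surrounding spaces.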

Code:

$input_line = "Some Words - Other Words (More Words) Dash-Binded-Word";
$result = preg_split("/\s*(?<!\w(?=.\w))[\-[\]()]\s*/", $input_line);
var_dump($result);

Output:

array(4) {
  [0]=>
  string(10) "Some Words"
  [1]=>
  string(11) "Other Words"
  [2]=>
  string(10) "More Words"
  [3]=>
  string(16) "Dash-Binded-Word"
}
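
To see what the two lookarounds actually decide, the same pattern can be run against a few short strings. This is only a sketch: the sample inputs below are made up for illustration and do not come from the question.

```php
<?php
// Same pattern as above, written in single quotes so PHP passes the
// backslashes through to the regex engine unchanged.
$pattern = '/\s*(?<!\w(?=.\w))[\-[\]()]\s*/';

// Hypothetical sample inputs, one per rule.
$spaced  = preg_split($pattern, 'left - right');        // dash surrounded by spaces
$joined  = preg_split($pattern, 'well-known');          // dash between two word characters
$bracket = preg_split($pattern, 'alpha [beta] gamma');  // square brackets as separators

var_dump($spaced, $joined, $bracket);
```

The spaced dash and the brackets act as separators, while the dash in well-known is left alone because it has a word character on both sides.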

Capturing parens

As mentioned in another comment, if you also want to capture the parentheses:

$result = preg_split("/\s*(?:(?<!\w)-(?!\w)|(\(.*?\)|\[.*?]))\s*/", $input_line, -1, PREG_SPLIT_DELIM_CAPTURE);
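
For reference, here is that call as a complete script. It is a sketch rather than the answer's exact code: combining PREG_SPLIT_NO_EMPTY with PREG_SPLIT_DELIM_CAPTURE is my own assumption, added so that a bracketed group at the very end of the string would not leave an empty trailing element.

```php
<?php
$input_line = 'Some Words - Other Words (More Words) Dash-Binded-Word';

// A free-standing dash is a plain separator. A (...) or [...] group is a
// separator too, but it sits in a capture group, so
// PREG_SPLIT_DELIM_CAPTURE returns it as an element of the result.
$pattern = '/\s*(?:(?<!\w)-(?!\w)|(\(.*?\)|\[.*?]))\s*/';

// Assumption: PREG_SPLIT_NO_EMPTY is added to drop empty pieces.
$flags = PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY;

$result = preg_split($pattern, $input_line, -1, $flags);
var_dump($result);
```

Unlike the first pattern, this one returns the bracketed group with its parentheses still attached, i.e. "(More Words)" rather than "More Words".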

Try this (a combination of str_replace and explode). It isn't optimal, but it may work for this case:

$var = "Some Words - Other Words (More Words) Dash-Binded-Word";
$arr = Array(" - ", " (", ") ");
$var2 = str_replace($arr, "|", $var);
$final = explode('|', $var2);
var_dump($final);

Output:

array(4) { [0]=> string(10) "Some Words" [1]=> string(11) "Other Words" [2]=> string(10) "More Words" [3]=> string(16) "Dash-Binded-Word" }

$var = "Some Words - Other Words (More Words) Dash-Binded-Word";

$var=preg_replace('/[^A-Za-z\-]/', ' ', $var);
$var=str_replace('-', ' ', $var); // Replaces all hyphens with spaces.
print_r (explode(" ",preg_replace('!\s+!', ' ', $var)));  //replaces all multiple spaces with one and explode creates array split where there is space

Output:

Array ( [0] => Some [1] => Words [2] => Other [3] => Words [4] => More [5] => Words [6] => Dash [7] => Binded [8] => Word ) 

You can use a simple preg_match_all:

\w+(?:[- ]\w+)*

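  • \w+ - 1 or more alphanumeric or underscore characters
  • (?:[- ]\w+)* - 0 or more sequences of...
    • [- ] - a hyphen or a space (you can change the space to \s to match any whitespace)
    • \w+ - 1 or more alphanumeric or underscore characters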

Code:

$re = '/\w+(?:[- ]\w+)*/'; 
$str = "Some Words - Other Words (More Words) Dash-Binded-Word"; 
preg_match_all($re, $str, $matches);
print_r($matches[0]);

Result:

Array
(
    [0] => Some Words
    [1] => Other Words
    [2] => More Words
    [3] => Dash-Binded-Word
)
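
The \s variant mentioned in the breakdown could look like the sketch below. The tab in the input is a made-up example, used only to show whitespace other than a plain space being accepted inside a group of words.

```php
<?php
// Variant of the pattern above: \s in place of the literal space, so tabs
// and newlines between words are accepted too. The hyphen comes first in
// the class so that it is read literally.
$re = '/\w+(?:[-\s]\w+)*/';

// Hypothetical input with a tab inside the first group of words.
$str = "Some\tWords - Other Words (More Words) Dash-Binded-Word";

preg_match_all($re, $str, $matches);
print_r($matches[0]);
```

This still yields four matches; the first one now contains the tab character instead of a space.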

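Modifying the input string to suit a particular exploding technique is indirect, and it indicates that a suboptimal exploding technique is being used.

The truth is, your required logic boils down to: "explode on every sequence of two or more non-word characters". This is what that pattern looks like with preg_split().

Code: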

$var = "Some Words - Other Words (More Words) Dash-Binded-Word";

var_export(preg_split('~\W{2,}~', $var));

Output:

array (
  0 => 'Some Words',
  1 => 'Other Words',
  2 => 'More Words',
  3 => 'Dash-Binded-Word',
)

It doesn't get any simpler than this.
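
One caveat worth checking: a single non-word character does not satisfy \W{2,}, so a bracket at the very start or end of the string stays attached to its piece. Below is a minimal sketch of one way to handle that, assuming such input can occur. The helper name splitOnSeparators and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper: split on runs of 2+ non-word characters, then strip
// any space, bracket or dash left at the edges of each piece.
function splitOnSeparators($text)
{
    $pieces = preg_split('~\W{2,}~', $text, -1, PREG_SPLIT_NO_EMPTY);

    return array_map(function ($piece) {
        return trim($piece, ' ()[]-');
    }, $pieces);
}

// Made-up input with a bracket at the very start and at the very end.
$parts = splitOnSeparators('(More Words) Dash-Binded-Word - Tail (End)');
var_export($parts);
```

Without the trim() step, the first and last pieces here would keep their stray "(" and ")".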