Parsing parameter from string | regex | php

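I'm having trouble parsing parameters from a string.

The parameters are defined as follows:

All of these parameters come in one long string, e.g.: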

-a val1 ! -b val2 --other "string with crazy -a --test stuff inside" --param-with-dash val1 val2 -test value-with-dash ! -c -d ! --test

-- Edit ----

There is also --param value-with-dash

-- End edit ---

This is the closest I've got:

https://regex101.com/r/3aPHzp/1

/(?:(?P<inverted>\!) )?(?P<names>\-{1,2}\S+)($| (?P<values>.+(?=(?: [\!|\-])|$)))/U

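Unfortunately, it breaks on free-text values inside quotes, and also when a parameter without a value is followed directly by the next parameter.

(I'm trying to parse the output of iptables-save, in case you're interested. Also, maybe there is some other fancy way to split the string beforehand and avoid a huge regex, but I don't see it.)

Thanks a lot for your help!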

-- Final solution --

For PHP >= 5.6

(?<inverted>!)?\s*(?<name>--?\w[\w-]*)\s*(?<values>(?:\s*(?:\w\S*|["'](?:[^"'\\]*(?:\\.[^"'\\]*)*)['"]))*)\K

Demo: https://regex101.com/r/xSfgxP/1

For PHP < 5.6

(?<inverted>\!)?\s*(?<=(?:\s)|^)(?<name>\-{1,2}\w[\w\-]*)\s+(?<value>(?:\s*(?:\w\S*|["'](?:[^"'\\]*(?:\\.[^"'\\]*)*)['"]))*)

Regex:

(?<inverted>!)?\s*(?<name>--?\w[\w-]*)\s*(?<values>(?:\s*(?:\w\S+|["'](?:[^"'\\]*(?:\\.[^"'\\]*)*)['"]))*)\K

Live demo (updated)

Breakdown

 (?<inverted> ! )?             # (1) Named-capturing group for inverted result
 \s*                           # Match any spaces
 (?<name> --? \w [\w-]* )      # (2) Named-capturing group for parameter name
 \s*                           # Match any spaces
 (?<values>                    # (3 start) Named capturing group for values
      (?:                           # Beginning of a non-capturing group (a)
           \s*                      # Match any spaces
           (?:                      # Beginning of a non-capturing group (b)
                \w\S+                   # Match a word character [a-zA-Z0-9_], then one or more non-whitespace characters
             |                          # Or
                ["']                    # Match an opening quotation mark
                (?:                     # Beginning of a non-capturing group (c)
                     [^"'\\]*              # Match anything except `"`, `'` or `\`
                     (?: \\ . [^"'\\]* )*  # Match an escaped character, then anything except `"`, `'` or `\`, as many times as possible
                )                       # End of non-capturing group (c)
                ['"]                    # Match the closing quotation mark
           )                        # End of non-capturing group (b)
      )*                            # Greedy (a), end of non-capturing group (a)
 )                             # (3 end)
 \K                            # Reset the match start: everything matched so far is dropped from the overall match (index 0), the groups are kept
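
To see the effect of the trailing `\K` in isolation, here is a minimal sketch. The pattern and the input `foo=bar` are made up for illustration and are not part of the answer; only the behaviour of `\K` is the point.

```php
<?php
// Hypothetical input and patterns, chosen only to illustrate \K.
$input = 'foo=bar';

// Without \K, the overall match (index 0) spans everything that was matched.
preg_match('~(?<key>\w+)=(?<val>\w+)~', $input, $plain);

// With a trailing \K, the reported match start moves to the current position,
// so index 0 is empty while the capture groups keep their contents.
preg_match('~(?<key>\w+)=(?<val>\w+)\K~', $input, $reset);

var_dump($plain[0]);                    // overall match without \K
var_dump($reset[0]);                    // overall match with \K (empty string)
var_dump($reset['key'], $reset['val']); // groups are unaffected
```

This is why the full match does not show up in the output of the answer's code: it is an empty string, and `array_filter` removes it.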

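PHP code:

<?php

$str = '-a val1 ! -b val2 --custom "string :)(#with crazy -a --test stuff inside" --param-with-dash val1 val2 -c ! -d ! --test';

// A nowdoc keeps backslashes literal, so the pattern needs no extra PHP escaping.
$re = <<< 'RE'
~(?<inverted>!)?\s*(?<name>--?\w[\w-]*)\s*(?<values>(?:\s*(?:\w\S+|["'](?:[^"'\\]*(?:\\.[^"'\\]*)*)['"]))*)\K~
RE;

// PREG_SET_ORDER returns one array per match instead of one array per group.
preg_match_all($re, $str, $matches, PREG_SET_ORDER);
// array_filter drops empty entries: the overall match emptied by \K and groups that did not match.
print_r(array_map('array_filter', $matches));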

Output:

Array
(
    [0] => Array
        (
            [name] => -a
            [2] => -a
            [values] => val1
            [3] => val1
        )

    [1] => Array
        (
            [inverted] => !
            [1] => !
            [name] => -b
            [2] => -b
            [values] => val2
            [3] => val2
        )

    [2] => Array
        (
            [name] => --custom
            [2] => --custom
            [values] => "string :)(#with crazy -a --test stuff inside"
            [3] => "string :)(#with crazy -a --test stuff inside"
        )

    [3] => Array
        (
            [name] => --param-with-dash
            [2] => --param-with-dash
            [values] => val1 val2
            [3] => val1 val2
        )

    [4] => Array
        (
            [name] => -c
            [2] => -c
        )

    [5] => Array
        (
            [inverted] => !
            [1] => !
            [name] => -d
            [2] => -d
        )

    [6] => Array
        (
            [inverted] => !
            [1] => !
            [name] => --test
            [2] => --test
        )

)
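
The match sets above still carry duplicate numeric keys, and quoted values keep their quotes. One possible post-processing step is sketched below. The helper `parseParams`, the constants `PARAM_RE` and `VALUE_RE`, and the shape of the returned array are assumptions made for this illustration, not part of the answer. It uses the `\w\S*` variant so that one-character values also match, and it assumes PHP 7 or later.

```php
<?php
// Pattern from the answer, \w\S* variant (hypothetical constant name).
const PARAM_RE = <<<'RE'
~(?<inverted>!)?\s*(?<name>--?\w[\w-]*)\s*(?<values>(?:\s*(?:\w\S*|["'](?:[^"'\\]*(?:\\.[^"'\\]*)*)['"]))*)\K~
RE;

// Splits a captured value list into quoted strings or bare words.
const VALUE_RE = <<<'RE'
~["'](?:[^"'\\]*(?:\\.[^"'\\]*)*)['"]|\S+~
RE;

// Hypothetical helper: returns a list of
// ['name' => string, 'inverted' => bool, 'values' => string[]].
function parseParams(string $str): array
{
    preg_match_all(PARAM_RE, $str, $matches, PREG_SET_ORDER);

    // Assumption: surrounding quotes are not wanted in the result.
    $unquote = function ($s) {
        $quoted = $s !== '' && ($s[0] === '"' || $s[0] === "'");
        return $quoted ? substr($s, 1, -1) : $s;
    };

    $params = [];
    foreach ($matches as $m) {
        preg_match_all(VALUE_RE, $m['values'] ?? '', $v);
        $params[] = [
            'name'     => $m['name'],
            'inverted' => ($m['inverted'] ?? '') === '!',
            'values'   => array_map($unquote, $v[0]),
        ];
    }
    return $params;
}

$params = parseParams('-a val1 ! -b val2 --custom "quoted -x text" -c');
print_r($params);
```

Escape sequences inside quoted values are left untouched here; whether they should be decoded depends on how iptables-save writes them, which this sketch does not try to guess.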