Regular expression to extract any word not delimited by single or double quotes

I have a string that describes a [variable operator value] structure, like this:

type == 'prova' && padposition == "stefano" or 10>var_name

I need to build a regular expression that extracts the list of variable names:

type
padposition
var_name

I then want to post-process them (basically, turn each one into a key of a PHP array):

$arr_name['type']
$arr_name['padposition']
$arr_name['var_name']

I have found a way to match strings delimited by single or double quotes:

('|")(\w*\w)('|")

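But I can't work out (I'm so ignorant!) how to negate it, or how to simply extract any word that is not delimited by single or double quotes.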

One way to do it (readable and easy to maintain):

$str = 'type == \'prova\' && padposition == "stefano" or 10>var_name';

$pattern = <<<'EOD'
~
# First define the basic elements (as in a lexer) using named groups
(?(DEFINE)
    (?<var> [a-z_]\w* ) # variable name

    (?<dqstr> (?<=") [^\\"]*+ (?s:\\.[^\\"]*)*+ (?=") ) # double quoted string (handles backslash escapes)
    (?<sqstr> (?<=') [^\\']*+ (?s:\\.[^\\']*)*+ (?=') ) # single quoted string (handles backslash escapes)
    (?<string> \g<dqstr> | \g<sqstr> ) # any string

    (?<num> [0-9]+(?:\.[0-9]+)? ) # a number

    (?<value> \g<string> | \g<num> ) # any value

    (?<comp> [!><=]= | =?[><] ) # comparison operator
)

# Then write the main pattern using these named groups

(?J) # allow duplicate named groups

# variable op value
(?<key> \g<var> ) \h* \g<comp> \h* ["']? (?<val> \g<value> ) ['"]? 
| # OR
# value op variable
["']? (?<val> \g<value> ) ['"]? \h* \g<comp> \h* (?<key> \g<var> ) 
~xi
EOD;

if (preg_match_all($pattern, $str, $matches, PREG_SET_ORDER)) {
    $arr_name = [];
    foreach($matches as $m) {
        $arr_name[$m['key']] = $m['val'];
    }
    print_r($arr_name);
}
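
For the narrower task stated in the question (pull out every bare word and skip anything inside quotes), a shorter alternative may be enough. The following is a minimal sketch, separate from the approach above, that relies on the PCRE backtracking verbs (*SKIP)(*F) to consume and discard quoted strings. The $keywords list and the names $bare, $keys and $refs are assumptions made for illustration; a bare word such as "or" is matched like any identifier, so it has to be filtered out afterwards.

```php
<?php
$str = 'type == \'prova\' && padposition == "stefano" or 10>var_name';

$pattern = <<<'EOD'
~
  " [^"\\]*+ (?: \\. [^"\\]*+ )*+ " (*SKIP)(*F)   # double quoted string: consume it, then discard
| ' [^'\\]*+ (?: \\. [^'\\]*+ )*+ ' (*SKIP)(*F)   # single quoted string: consume it, then discard
| \b [a-z_]\w*                                    # a bare identifier
~xis
EOD;

// Assumption: the logical operators that may appear as bare words.
$keywords = ['or', 'and'];

preg_match_all($pattern, $str, $m);
$bare = $m[0];

// Drop the operator keywords and reindex the array.
$keys = array_values(array_filter($bare, function ($w) use ($keywords) {
    return !in_array(strtolower($w), $keywords, true);
}));

// Post-processing from the question: turn each name into an array access.
$refs = array_map(function ($k) {
    return "\$arr_name['" . $k . "']";
}, $keys);

echo implode("\n", $refs), "\n";
```

Unlike the named-group pattern, this sketch does not check that each name sits next to a comparison operator, so it is only suitable when the input is known to be well formed.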
