How to match generic string = generic string

Suppose I have this file:

AND 1=1
AND fejo = fejo
AND 4=5
AND 423=523

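I want to match the lines where the text to the left of the = sign is the same as the text to its right, so it must match the following lines: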

AND 1=1
AND fejo = fejo

Using awk:

$ awk 'split($0,a," *= *") && a[1]==($1 " " a[2])' file
AND 1=1
AND fejo = fejo

It splits the record on = into AND 1 and 1, prepends $1 and a space (i.e. AND plus a space) to the latter (i.e. 1), and compares the two. If there is more than one space after $1, it fails. To avoid that, this seems to work as well:

$ awk 'split($0,a,"( *= *| *)") && a[2]==a[3]' file
AND 1=1
AND fejo = fejo

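The downside is that the compared elements cannot contain spaces. The following version removes the first word and the spaces around it, splits at = (including the surrounding spaces), and compares the two halves: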

$ awk ' {
    r=$0                     # working copy of record
    sub(/^ *[^ ]* */,"",r)   # remove the first word (AND) and surrounding spaces
    n=split(r,a," *= *")     # split at = (including surrounding spaces)
    if((n>1)&&a[1]==a[n])    # if r was really split and the halves match
        print
}' file
AND 1=1
AND fejo = fejo
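
To see why the script checks n>1 and why it tolerates spaces inside the values, here is a small sketch that runs the same logic on a few edge cases. The input lines and the names edge_file and match_halves are made up for illustration; they are not part of the original question.

```shell
#!/bin/sh
# Hypothetical edge cases (not from the original file):
# extra spaces after AND, values containing spaces, and a line without '='.
edge_file=$(mktemp)
trap 'rm -f "$edge_file"' EXIT
cat > "$edge_file" <<'EOF'
AND   1=1
AND foo bar = foo bar
AND no equals sign
AND 4=5
EOF

# Same logic as the script above, wrapped in a function (name is illustrative).
match_halves() {
    awk '{
        r = $0                        # working copy of the record
        sub(/^ *[^ ]* */, "", r)      # drop the first word and surrounding spaces
        n = split(r, a, " *= *")      # split at = (including surrounding spaces)
        if (n > 1 && a[1] == a[n])    # n > 1 rejects lines that contain no =
            print
    }' "$1"
}

match_halves "$edge_file"
```

Without the n>1 guard, the line with no = would be printed too, because a[1] would be compared with itself.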

$ grep -E '^AND\s+([^=[:space:]]*)\s*=\s*\1\b' file

This works well for your input.

The regex:

^               # beginning of line (grep tries to match the regex against each line)
AND             # match literal 'AND'
\s+             # match one or more whitespace characters
(               # beginning of a capturing group
    [           # beginning of a bracket expression that...
        ^       #    ... matches any character that is not listed here:
        =       #    literal '='
        [:space:]  # whitespace (inside brackets, grep -E reads \s as a literal backslash and 's')
    ]           # end of the bracket expression...
                # ... which matches one character that is not '=' or whitespace
    *           # zero or more occurrences of the previous expression (the class)
)               # end of the capturing group
\s*             # match zero or more whitespace characters...
=               # the '=' character
\s*             # ... around the equal sign
\1              # match the text captured by the first (and only) group above
\b              # match a word boundary, to make sure \1 is not just a prefix of a longer word

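The regex above only matches lines that start with uppercase AND. If you also need to match lines that start with and (lowercase) or any other uppercase/lowercase combination of these characters, you can replace AND in the regex with [aA][nN][dD].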

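Adding -i to the grep command line makes it ignore case in both the regex and the input. The regex will then match and 1 = 1, but it will also match and fejo = FEJO, which is probably not what you need.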
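
A minimal sketch of the [aA][nN][dD] variant, assuming GNU grep (\s and \b are GNU extensions). The sample lines and the function name match_any_case_and are invented for the demonstration.

```shell
#!/bin/sh
# Hypothetical input mixing upper- and lowercase forms of AND.
sample='AND 1=1
and 1 = 1
and fejo = FEJO
And 4=5'

# Only the keyword is matched case-insensitively; the two sides of '='
# must still be identical, character for character.
# [[:space:]] is used inside the brackets because POSIX bracket
# expressions do not interpret \s as whitespace.
match_any_case_and() {
    grep -E '^[aA][nN][dD]\s+([^=[:space:]]*)\s*=\s*\1\b'
}

printf '%s\n' "$sample" | match_any_case_and
```

With this pattern the line and fejo = FEJO is rejected, which is the difference from running the original regex with -i.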

I found another, very simple solution that avoids all the mess:

AND (\w+)\s*=\s*\1
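
One caveat worth checking: written with a trailing \1 backreference and nothing after it, a pattern of this shape also accepts lines where the right-hand side merely starts with the left-hand side. The sketch below assumes GNU grep -E (for \w, \s and \b); the sample lines and the names loose and strict are illustrative only.

```shell
#!/bin/sh
# Hypothetical input: in the second line the right side only starts with "4".
sample='AND 1=1
AND 4=42
AND 423=523'

# The short pattern, with the backreference and no boundary check.
loose()  { grep -E 'AND (\w+)\s*=\s*\1'; }

# Same pattern plus \b, so \1 must end at a word boundary.
strict() { grep -E 'AND (\w+)\s*=\s*\1\b'; }

echo "loose:";  printf '%s\n' "$sample" | loose
echo "strict:"; printf '%s\n' "$sample" | strict
```

The loose form prints AND 4=42 in addition to AND 1=1, while the strict form prints only AND 1=1. Whether that matters depends on whether such values can occur in the real input.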