How to match generic string = generic string
Suppose I have this file:
AND 1=1
AND fejo = fejo
AND 4=5
AND 423=523
I want to match the lines where the left side and the right side of the `=` sign are the same, so it must match the following lines:
AND 1=1
AND fejo = fejo
Using awk:
$ awk 'split($0,a," *= *") && a[1]==("AND " a[2])' file
AND 1=1
AND fejo = fejo
`split` splits the record at `=` into `AND 1` and `1`, prepends `AND ` to `a[2]` (i.e. `1`) and compares the result with `a[1]`. It fails if there is more than one space after `AND`. To avoid that, this seems to work as well:
$ awk 'split($0,a,"( *= *| *)") && a[2]==a[3]' file
AND 1=1
AND fejo = fejo
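The downside is that the compared elements cannot contain spaces. This one removes the first word and the spaces around it, `split`s at `=` (including the surrounding spaces) and compares the halves.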
$ awk '{
    r=$0                        # working copy of the record
    sub(/^ *[^ ]* */,"",r)      # remove the first word (AND) and the spaces around it
    n=split(r,a," *= *")        # split at = (including surrounding spaces)
    if((n>1)&&a[1]==a[n])       # if r was really split and the halves match
        print
}' file
AND 1=1
AND fejo = fejo
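To make the difference from the shorter one-liners concrete, here is a small sketch of the same approach run on input that is not in the question: values that contain spaces, and a line with two spaces after `AND`. The function name `same_sides` and the sample lines are made up for illustration.

```shell
# same_sides is a hypothetical helper name; it reads lines on stdin.
same_sides() {
    awk '{
        r = $0                          # working copy of the record
        sub(/^ *[^ ]* */, "", r)        # drop the first word and the spaces around it
        n = split(r, a, " *= *")        # split at = including surrounding spaces
        if (n > 1 && a[1] == a[n])      # really split, and first and last parts match
            print
    }'
}

printf '%s\n' 'AND foo bar = foo bar' 'AND foo bar = foo baz' 'AND  1=1' | same_sides
# prints:
# AND foo bar = foo bar
# AND  1=1
```

Trailing spaces after the right-hand value are not stripped here, so a line ending in a space would not be printed.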
grep -E '^AND\s+([^=[:space:]]*)\s*=\s*\1\b' file
works fine with your input.
The regex, piece by piece:
^            # beginning of line (grep tries to match the regex against each line)
AND          # match literal 'AND'
\s+          # match one or more whitespace characters
(            # beginning of a group
  [          # beginning of a character class that...
    ^        # ... matches any character that is not listed here:
    =        # literal '='
    [:space:] # whitespace (inside brackets \s is not a class, so the POSIX name is used)
  ]          # end of the character class...
             # ... that matches one character that is not '=' or whitespace
  *          # zero or more occurrences of the previous expression (the class)
)            # end of the capturing group
\s*          # match zero or more spaces...
=            # the '=' character
\s*          # ... around the equal sign
\1           # match the text captured by the first (and only) group above
\b           # match a word boundary, to make sure \1 is not just a prefix of a longer word
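The regex above only matches lines that start with uppercase `AND`. If you also need to match lines that start with `and` (lowercase) or any other uppercase/lowercase combination of these characters, you can replace `AND` in the regex with `[aA][nN][dD]`.
Adding `-i` to the grep command line makes it ignore case in both the regex and the input. The regex then matches `and 1 = 1`, but it also matches `and fejo = FEJO`, which may not be what you need.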
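As a rough sketch of the `[aA][nN][dD]` route, the variant below keeps the keyword case-insensitive while the two sides are still compared exactly. It is written as a basic regular expression with POSIX character classes, on the assumption that a grep without the GNU extensions `\s` and `\b` may be in use. Instead of a word boundary it requires the right-hand value to run to the end of the line, which is a slightly stricter condition. The function name `match_same` and the sample lines are made up.

```shell
# match_same is a hypothetical helper name; it reads lines on stdin.
match_same() {
    # \( ... \) captures the left side, \1 must repeat it exactly,
    # and [[:space:]]*$ allows only trailing whitespace after the right side.
    grep '^[aA][nN][dD][[:space:]]\{1,\}\([^=[:space:]]*\)[[:space:]]*=[[:space:]]*\1[[:space:]]*$'
}

printf '%s\n' 'AND 1=1' 'and fejo = fejo' 'And fejo = FEJO' 'AND 4=45' | match_same
# prints:
# AND 1=1
# and fejo = fejo
```

The line `And fejo = FEJO` is rejected because the backreference compares case-sensitively, which is the behaviour that `-i` would give up.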
I found another, very simple solution that avoids the mess:
AND (\w+)\s*=\s*\1
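One thing worth checking with a pattern of this shape, `AND (\w+)\s*=\s*\1`, is that nothing anchors it on the right, so the backreference can match just the beginning of a longer right-hand value. A quick sketch, assuming GNU grep (for `\w`, `\s`, `\b` and backreferences with `-E`); the helper `sample` and its third line are made up:

```shell
# sample is a hypothetical helper that prints test lines; in the last one
# the right side merely starts with the left side.
sample() { printf '%s\n' 'AND 1=1' 'AND fejo = fejo' 'AND 4=45'; }

# Without a right-hand anchor: also prints 'AND 4=45',
# because \1 matches the leading '4' of '45'.
sample | grep -E 'AND (\w+)\s*=\s*\1'

# With a trailing word boundary only the first two lines are printed.
sample | grep -E 'AND (\w+)\s*=\s*\1\b'
```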