Regex should match only one of two types of quoted strings
I need a regex that matches strings enclosed in double quotes. If a double-quoted string is itself enclosed in single quotes, it should not match:
"string"
" 'xyz' "
" `" "
" `" `" "
" `" `" `" "
' ' "should match" ' '
' "should not match" '
So far I have (https://regex101.com/r/z5PayV/1):
(?:"(([^"]*`")*[^"]*|[^"]*)")
This matches every line, but the last line should not match. Is there a way to fix this?
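A minimal C# sketch that reproduces the problem, assuming the pattern is applied to one sample line at a time. The class name `QuoteRepro` and the method `Matches` are illustrative and not part of the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteRepro
{
    // The pattern from the question as a C# verbatim string
    // (inside @"..." a double quote is written as two double quotes).
    public const string Pattern = @"(?:""(([^""]*`"")*[^""]*|[^""]*)"")";

    // True if the pattern finds a double-quoted string anywhere in the line.
    public static bool Matches(string line)
    {
        return Regex.IsMatch(line, Pattern);
    }

    public static void Main()
    {
        string[] samples =
        {
            "' ' \"should match\" ' '",
            "' \"should not match\" '"
        };
        foreach (string line in samples)
            Console.WriteLine("{0} -> {1}", line, Matches(line));
    }
}
```

Both sample lines report a match, because the pattern has no notion of the surrounding single quotes.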
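You have to consume the single-quoted strings in order to exclude them from the match.
Update
For C#, it has to be done like this.
Just use a simple CaptureCollection to get all of the quoted matches.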
(?:'[^']*'|(?:"(([^"]*`")*[^"]*|[^"]*)")|[\S\s])+
Expanded
(?:
' [^']* '
|
(?:
"
( # (1 start)
( [^"]* `" )* # (2)
[^"]*
| [^"]*
) # (1 end)
"
)
|
[\S\s]
)+
C# code
using System;
using System.Text.RegularExpressions;

var str =
"The two sentences are 'He said \"Hello there\"' and \"She said 'goodbye' and 'another sentence'\"\n" +
"\" `\" \"\n" +
"\" `\" \"\n" +
"\" `\" `\" \"\n" +
"\" `\" `\" `\" \"\n" +
"' \" \" '\n" +
"\"string\"\n" +
"\" 'xyz' \"\n" +
"\" `\" \"\n" +
"\" `\" `\" \"\n" +
"\" `\" `\" `\" \"\n" +
"' ' \"should match\" ' '\n" +
"' \"should not match\" '\n";
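// In a regular C# string literal the backslashes in \S and \s must be doubled.
var rx = new Regex( "(?:'[^']*'|(?:\"(([^\"]*`\")*[^\"]*|[^\"]*)\")|[\\S\\s])+" );
Match M = rx.Match( str );
if (M.Success)
{
    // Group 1 is captured once per double-quoted string; Captures keeps every capture.
    CaptureCollection cc = M.Groups[1].Captures;
    for (int i = 0; i < cc.Count; i++)
        Console.WriteLine("{0}", cc[i].Value);
}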
Output
She said 'goodbye' and 'another sentence'
`"
`"
`" `"
`" `" `"
string
'xyz'
`"
`" `"
`" `" `"
should match
Sorry, this is how it would be done with the PCRE engine:
'[^']*'(*SKIP)(*FAIL)|(?:"(([^"]*`")*[^"]*|[^"]*)")
https://regex101.com/r/gMiVDU/1
' [^']* '
(*SKIP) (*FAIL)
|
(?:
"
( # (1 start)
( [^"]* `" )* # (2)
[^"]*
| [^"]*
) # (1 end)
"
)
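The .NET regex engine does not support the (*SKIP)(*FAIL) verbs, so the PCRE pattern above cannot be used from C# as written. A comparable effect can be sketched with plain alternation: let the first alternative consume single-quoted spans without capturing, and keep only the matches in which group 1 took part. This is one possible approach, not the answer's own code; `QuotedStrings` and `Extract` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedStrings
{
    // Alternative 1 consumes a single-quoted span and captures nothing.
    // Alternative 2 is the double-quoted pattern; group 1 holds its content.
    private static readonly Regex Rx = new Regex(
        @"'[^']*'|""(([^""]*`"")*[^""]*|[^""]*)""");

    // Returns the content of every double-quoted string that is not
    // inside a single-quoted span.
    public static List<string> Extract(string input)
    {
        var result = new List<string>();
        foreach (Match m in Rx.Matches(input))
        {
            // Group 1 only succeeds when the double-quoted alternative matched.
            if (m.Groups[1].Success)
                result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static void Main()
    {
        foreach (string s in Extract("' ' \"should match\" ' '\n' \"should not match\" '\n"))
            Console.WriteLine(s);
    }
}
```

For the two-line input in `Main`, only `should match` is printed, since the whole of `' "should not match" '` is consumed by the single-quoted alternative.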
___________________________-
The answers look complicated. How about this:
^"(\d+|\D+)"$
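Is that too simple?
The idea here is to check that the string starts and ends with a double quote ("); anything inside the double quotes, including single quotes, is allowed. Note that, as written, \d+|\D+ requires the content to be either all digits or all non-digits.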
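To see how this simpler pattern behaves on the question's samples, here is a small C# check, assuming each sample is tested as a separate line. The class `SimplePatternCheck`, the method `IsWholeLineQuoted`, and the extra `"abc123"` sample are illustrative additions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SimplePatternCheck
{
    // The whole input must be a single double-quoted string whose content
    // is either all digits (\d+) or all non-digits (\D+).
    private static readonly Regex Rx = new Regex(@"^""(\d+|\D+)""$");

    public static bool IsWholeLineQuoted(string line)
    {
        return Rx.IsMatch(line);
    }

    public static void Main()
    {
        string[] samples =
        {
            "\"string\"",
            "\" 'xyz' \"",
            "' ' \"should match\" ' '",
            "' \"should not match\" '",
            "\"abc123\""
        };
        foreach (string line in samples)
            Console.WriteLine("{0} -> {1}", line, IsWholeLineQuoted(line));
    }
}
```

The first two samples match and the last three do not. Because of the ^ anchor, the line `' ' "should match" ' '` is rejected even though the question wants its double-quoted part matched, and mixed content such as `"abc123"` is rejected as well.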