How to find commas not preceded or followed by another comma?

Question

I have a text file with thousands of lines like these:

8/15/2016,,Amazon,,15.93 ;most are like this
8/24/2016,,Google,18.73  ;a few are like this - one comma only
8/26/2016,,Ebay,,60.2    ;

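Most lines have two commas, followed by some text, followed by two more commas, followed by a numeric value. A few dozen lines are like the second line: there is only one comma before the numeric value.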

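I'm trying to find those few dozen lines with a regex. I'm not using a programming language, just Notepad++. My problem is that every regex I've come up with so far matches both kinds of line. I've been experimenting on regex101.com.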

I came up with something like this: 2016,,.+?,[0-9]

I think this means "Find 2016,, followed by any number of characters until you find a comma followed by a numeric digit," but it matches every line, whether it has one comma or two (or more, as I found when I added some to see what would happen).

I read that regex searches are "greedy," but I thought the question mark after .+ made the search stop at the first occurrence.

I even tried 2016,,.+?,{1}[0-9], thinking {1} would mean "just one," but no, that didn't work either.

Answer 1

You can try matching a comma that has no comma before it and no comma after it:

[^,],[^,]

To capture the whole line, add .* at the beginning and at the end. (Demo on Regex101.)
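The question only concerns Notepad++, but the pattern can be checked quickly outside the editor. The following is a minimal sketch in Python, assuming the three sample lines from the question with their trailing comments removed; the variable names are illustrative only.

```python
import re

# Sample lines from the question (trailing ";..." comments omitted).
lines = [
    "8/15/2016,,Amazon,,15.93",
    "8/24/2016,,Google,18.73",
    "8/26/2016,,Ebay,,60.2",
]

# A comma with a non-comma character on each side.
single_comma = re.compile(r"[^,],[^,]")

# Keep only the lines that contain at least one isolated comma.
matching_lines = [line for line in lines if single_comma.search(line)]
print(matching_lines)  # ['8/24/2016,,Google,18.73']
```

Note that [^,] must consume a character on each side, so this pattern cannot match a comma that is the very first or very last character of the text.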

Answer 2

If you want to select only the lines that contain a single comma, you can use this:

.*[^,],[^,].*

But if you want to replace a single comma with a double comma, or a double comma with a single comma, you can do:

Ctrl+F > , > Find All > Supr > , (or ,)

Answer 3

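That's because the .+? in 2016,,.+?,[0-9] only leaves out the last comma. Being lazy does not stop it from matching earlier commas when that is what the rest of the pattern needs: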

           ▼▼▼▼▼▼▼▼▼
8/15/2016,,Amazon,,,,15.93

After all, . means any character, doesn't it?
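To see what the lazy quantifier actually consumes, here is a small sketch in Python. The capture group around .+? is added purely for inspection and is not part of the original pattern; the sample lines are taken from the question.

```python
import re

# Same pattern as in the question, with .+? wrapped in a group so we can
# inspect what it matched.
pattern = re.compile(r"2016,,(.+?),[0-9]")

samples = ["8/15/2016,,Amazon,,15.93", "8/24/2016,,Google,18.73"]

captured = []
for line in samples:
    match = pattern.search(line)
    captured.append(match.group(1))

print(captured)  # ['Amazon,', 'Google']
```

On the two-comma line the group swallows the first of the two commas: the lazy quantifier grows one character at a time until a comma followed by a digit can match, and nothing forbids those characters from being commas.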

Code Different's answer is good, but here are some alternatives:

• Use a negated character class:

2016,,[^,]+,[0-9]

• Use negative lookahead/lookbehind (note that some regex engines don't support them; Notepad++ does, but the lookbehind must be fixed-length):

(?<!,),(?!,)
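Both alternatives can be tried against the sample data. This is a rough sketch in Python rather than Notepad++, assuming the question's three lines without their trailing comments; the names are illustrative.

```python
import re

lines = [
    "8/15/2016,,Amazon,,15.93",
    "8/24/2016,,Google,18.73",
    "8/26/2016,,Ebay,,60.2",
]

# Alternative 1: the text between the commas may not contain a comma.
negated_class = re.compile(r"2016,,[^,]+,[0-9]")

# Alternative 2: a comma neither preceded nor followed by another comma.
lookarounds = re.compile(r"(?<!,),(?!,)")

by_negated_class = [line for line in lines if negated_class.search(line)]
by_lookarounds = [line for line in lines if lookarounds.search(line)]

print(by_negated_class)  # ['8/24/2016,,Google,18.73']
print(by_lookarounds)    # ['8/24/2016,,Google,18.73']
```

Because lookarounds consume no characters, the second pattern matches only the comma itself, which is convenient if the goal is to replace it.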

Answer 4

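If you use negative assertions to find the single comma,
it is much faster to put the assertions after the literal comma.

Putting a negative assertion first in the regex adds roughly 6 times the overhead (in this case)
compared with finding the literal first and then checking the assertions.

This is because the engine has to run the assertion at every
character position, instead of finding the literal first.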

Good: ,(?!,)(?<!,,)
Bad:  (?<!,),(?!,)

Comparison

The target string is repeated 29 times.

8/15/2016,,Amazon,,15.93 ;most are like this
8/24/2016,,Google,18.73 ;a few are like this - one comma only
8/26/2016,,Ebay,,60.2 ;
...
8/15/2016,,Amazon,,15.93 ;most are like this
8/24/2016,,Google,18.73 ;a few are like this - one comma only
8/26/2016,,Ebay,,60.2 ;
...
(29 times total)

Benchmark

Regex1: ,(?!,)(?<!,,)
Options: < none >
Completed iterations: 50 / 50 ( x 1000 )
Matches found per iteration: 29
Elapsed Time: 5.92 s, 5919.16 ms, 5919161 µs

Regex2: (?<!,),(?!,)
Options: < none >
Completed iterations: 50 / 50 ( x 1000 )
Matches found per iteration: 29
Elapsed Time: 36.81 s, 36806.16 ms, 36806159 µs
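A comparison of this kind can be sketched with Python's timeit. This is not the tool that produced the numbers above, and Python's re module is a different engine from the one Notepad++ uses, so the ratio it reports may differ considerably from the 6x shown here. The newline separators in the target string and the run count are assumptions.

```python
import re
import timeit

# The three sample lines, repeated 29 times (newline-separated by assumption).
sample = (
    "8/15/2016,,Amazon,,15.93 ;most are like this\n"
    "8/24/2016,,Google,18.73 ;a few are like this - one comma only\n"
    "8/26/2016,,Ebay,,60.2 ;\n"
)
target = sample * 29

literal_first = re.compile(r",(?!,)(?<!,,)")    # literal comma, then assertions
assertion_first = re.compile(r"(?<!,),(?!,)")   # assertion before the literal

# Both patterns should find the same isolated commas.
matches_literal_first = len(literal_first.findall(target))
matches_assertion_first = len(assertion_first.findall(target))
print(matches_literal_first, matches_assertion_first)

# Time 1000 full scans of the target with each pattern.
for name, regex in (("literal first", literal_first),
                    ("assertion first", assertion_first)):
    seconds = timeit.timeit(lambda: regex.findall(target), number=1000)
    print(f"{name}: {seconds:.3f} s for 1000 runs")
```

Each repetition of the sample contains exactly one isolated comma, which is why both patterns report 29 matches, the same count as in the benchmark output.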

regex

notepad++

comma

duplicates

regex-lookarounds