Regex to replace regular quotes with curly quotes

Question

I have a block of text in which the opening and closing quotes are the same character:

"Hey", how are you? "Hey there"... "More text" and some more "here".

Note that the quote characters are " and not “ or ”.

(["'])(?:(?=(\?)).)*?

I want to replace each opening " character with “

The text would then look like “Hey", how are you? “Hey there"... “More text" and some more “here".

Then, in a second pass, I can simply find the remaining " characters and replace them with ”

That would give the expected output, which should look like:

“Hey”, how are you? “Hey there”... “More text” and some more “here”.

Answer 1

I prefer the solution that @WiktorStribiżew gave in a comment on the question, but I would like to offer an alternative that some readers may find of interest.

The second replacement, of the remaining (trailing) double-quotes (ASCII 34), is straightforward, so I will not discuss it.

You can match the leading double-quotes with the following regular expression and then replace each match with “:

"(?=(?:(?:[^"]*"){2})*[^"]*"[^"]*$)
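
As a minimal sketch of how the two passes might look in JavaScript (the function name `curlQuotes` and the sample string are illustrative assumptions; only the lookahead regex comes from this answer):

```javascript
// Hypothetical helper; assumes the input contains an even number of double-quotes.
function curlQuotes(text) {
  // Pass 1: a " followed by an odd number of " before the end of the string is an opening quote.
  const opening = /"(?=(?:(?:[^"]*"){2})*[^"]*"[^"]*$)/g;
  // Pass 2: every " still left in the text is a closing quote.
  return text.replace(opening, '\u201C').replace(/"/g, '\u201D');
}

const sampleText = '"Hey", how are you? "Hey there"... "More text" and some more "here".';
console.log(curlQuotes(sampleText));
```

The curly quotes are written as `\u201C` and `\u201D` escapes so the source file does not depend on its encoding.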

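This regular expression is based on the observation that we want to identify every double-quote that is followed by an odd number of double-quotes later in the string (assuming the string contains an even number of double-quotes).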
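
The parity observation can be checked directly. The sketch below, with hypothetical helper names, compares the positions matched by the regex against the positions obtained by simply counting how many double-quotes follow each one; the two arrays should agree.

```javascript
const openingQuote = /"(?=(?:(?:[^"]*"){2})*[^"]*"[^"]*$)/g;

// Positions that the regex reports as opening quotes.
function regexPositions(text) {
  return Array.from(text.matchAll(openingQuote), (m) => m.index);
}

// Positions found by counting: a quote qualifies when an odd number of quotes follow it.
function parityPositions(text) {
  const quotes = [];
  for (let i = 0; i < text.length; i++) {
    if (text[i] === '"') quotes.push(i);
  }
  return quotes.filter((_, k) => (quotes.length - 1 - k) % 2 === 1);
}

const paritySample = '"Hey", how are you? "Hey there"...';
console.log(regexPositions(paritySample));
console.log(parityPositions(paritySample));
```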

The regular expression can be broken down as follows:

"              # match a double-quote (dq)
(?=            # begin a positive lookahead
  (?:          # begin a non-capture group
    (?:        # begin a non-capture group
      [^"]*"   # match 0+ chars other than dq then match dq
    ){2}       # end non-capture group and execute it twice
  )*           # end non-capture group and execute it 0+ times
  [^"]*"[^"]*  # match dq preceded and followed by 0+ non-dq chars
  $            # match end of string
)              # end positive lookahead

If the data set is large, it would be advisable to run some benchmarks to see whether the execution speed is satisfactory.
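
One reason to benchmark is that the lookahead scans to the end of the string for every double-quote it tests, so the work grows with both the number of quotes and the length of the text. A rough timing sketch follows. The `singlePass` function is a swapped-in alternative that is not part of this answer, the input is made up, and the sketch assumes an environment where `performance.now()` is available as a global (browsers and recent Node.js versions).

```javascript
// This answer's approach: lookahead regex for opening quotes, then a plain replace.
function twoPass(text) {
  return text
    .replace(/"(?=(?:(?:[^"]*"){2})*[^"]*"[^"]*$)/g, '\u201C')
    .replace(/"/g, '\u201D');
}

// Alternative for comparison only: alternate opening and closing quotes in one pass.
function singlePass(text) {
  let isOpening = false;
  return text.replace(/"/g, () => {
    isOpening = !isOpening;
    return isOpening ? '\u201C' : '\u201D';
  });
}

// Hypothetical input: 1,000 quoted words, so 2,000 double-quotes.
const bigInput = '"word" '.repeat(1000);

for (const [name, fn] of [['twoPass', twoPass], ['singlePass', singlePass]]) {
  const start = performance.now();
  fn(bigInput);
  console.log(name, (performance.now() - start).toFixed(1), 'ms');
}
```

The two functions produce the same result only when the text contains an even number of double-quotes, which is the same assumption the regex relies on.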

javascript

regex

regex-group