Replacing characters in a regex

Question

Using Python, I have the following strings:

['taxes.............................       .7        21.4    (6.2)','regulatory and other matters..................$   39.9        61.5        41.1','Producer contract reformation cost recoveries............................   DASH        26.3        28.3']

I need to replace each dot with a space, except for the periods that belong to numbers. So the result should look like this:

['taxes                                    .7        21.4    (6.2)','regulatory and other matters                  $   39.9        61.5        41.1','Producer contract reformation cost recoveries                               DASH        26.3        28.3']

I tried the following:

import re
dots = re.compile(r'(\.{2,})(\s*?[\d\($]|\s*?DASH|\s*.)')
newlist = []
for each in list:
    newline = dots.sub(r''.replace('.', ' '), each)
    newlist.append(newline)

However, this code does not preserve the whitespace. Thanks!

Answer 1

Use negative lookarounds in re.sub. The pattern below matches a dot only when it is neither preceded nor followed by a digit:

>>> import re
>>> s = ['taxes.............................       .7        21.4    (6.2)','regulatory and other matters..................$   39.9        61.5        41.1','Producer contract reformation cost recoveries............................   DASH        26.3        28.3']
>>> [re.sub(r'(?<!\d)\.(?!\d)', ' ', i) for i in s]
['taxes                                    .7        21.4    (6.2)', 'regulatory and other matters                  $   39.9        61.5        41.1', 'Producer contract reformation cost recoveries                               DASH        26.3        28.3']
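To see what each lookaround contributes, here is a small sketch on made-up inputs. The helper name blank_filler_dots and the test strings are hypothetical and not part of the answer above. One consequence worth noting: a dot that directly touches a digit on either side is kept, so a leader dot immediately after a number would survive.

```python
import re

# Hypothetical helper wrapping the pattern from the answer above.
FILLER_DOT = re.compile(r'(?<!\d)\.(?!\d)')

def blank_filler_dots(line):
    """Replace each dot that has no digit directly before or after it."""
    return FILLER_DOT.sub(' ', line)

# Leader dots are blanked; the dot of the leading decimal ".7" is kept.
print(repr(blank_filler_dots('taxes....  .7')))
# A dot between two digits is kept.
print(repr(blank_filler_dots('(6.2)')))
# The first dot sits right after a digit, so the lookbehind keeps it.
print(repr(blank_filler_dots('item 5....')))
```

If leader dots can follow a number in the real data, the pattern would need adjusting; the sample strings in the question do not contain that case.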

Answer 2

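If the input always looks like your example, you could also use a non-word boundary.

Replace \.\B with a space.

This only checks that the period is not followed by a word character, so it matches the period in "0." but not the one in "0.0".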

See demo at regex101
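In Python, this suggestion could be sketched roughly as follows. The sample strings are shortened, made-up stand-ins for the ones in the question, and the variable names are only illustrative.

```python
import re

# \. matches a literal dot; \B requires that the position right after the dot
# is NOT a word boundary, i.e. the next character is not a letter, digit or
# underscore (or the string ends there).
pattern = re.compile(r'\.\B')

samples = ['taxes....  .7', 'matters..$ 39.9', 'total 0.', 'a.b']
result = [pattern.sub(' ', line) for line in samples]
print(result)
```

In this sketch the dots in ".7" and "39.9" are kept because a digit follows them, and the trailing dot in "total 0." is replaced. The dot in "a.b" is also kept, since a letter counts as a word character; this is why the approach only fits input shaped like the example.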

python

regex

replace

substitution