Replacing characters in a regex
Using Python, I have the following strings:
['taxes............................. .7 21.4 (6.2)','regulatory and other matters..................$ 39.9 61.5 41.1','Producer contract reformation cost recoveries............................ DASH 26.3 28.3']
I need to replace each dot with a space, but not the periods inside numbers. So the result should look like this:
['taxes .7 21.4 (6.2)','regulatory and other matters $ 39.9 61.5 41.1','Producer contract reformation cost recoveries DASH 26.3 28.3']
I tried the following:
dots=re.compile('(\.{2,})(\s*?[\d\($]|\s*?DASH|\s*.)')
newlist=[]
for each in list:
    newline=dots.sub(r''.replace('.',' '),each)
    newlist.append(newline)
However, this code does not preserve the whitespace. Thanks!
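Use negative lookarounds in re.sub. The pattern below matches a dot only when there is no digit immediately before it and no digit immediately after it: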
>>> import re
>>> s = ['taxes............................. .7 21.4 (6.2)','regulatory and other matters..................$ 39.9 61.5 41.1','Producer contract reformation cost recoveries............................ DASH 26.3 28.3']
>>> [re.sub(r'(?<!\d)\.(?!\d)', ' ', i) for i in s]
['taxes .7 21.4 (6.2)', 'regulatory and other matters $ 39.9 61.5 41.1', 'Producer contract reformation cost recoveries DASH 26.3 28.3']
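To make the effect on spacing easier to see, here is a minimal sketch of the same lookaround pattern applied to a shortened, made-up string. The names LEADER_DOT and blank_leader_dots are illustrative and not part of the answer above.

```python
import re

# A dot with no digit immediately before it and no digit immediately after it.
LEADER_DOT = re.compile(r'(?<!\d)\.(?!\d)')

def blank_leader_dots(line):
    """Replace each dot that is not adjacent to a digit with one space."""
    return LEADER_DOT.sub(' ', line)

# Shortened, made-up sample so the run of spaces is easy to count.
sample = 'taxes.... .7 21.4 (6.2)'
result = blank_leader_dots(sample)
print(repr(result))

# Each dot becomes exactly one space, so the line keeps its length.
print(len(result) == len(sample))
```

One caveat with this pattern: a run of dots that starts right after a digit (for example '2019....') keeps its first dot, because that dot has a digit immediately before it.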
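If the input always looks like your example, you could also use a non-word boundary.
Replace \.\B
with a single space.
This only checks that the period is not followed by a word character. So it will match the period in 0.
but not the one in 0.0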
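A small sketch of the non-word-boundary variant, using made-up inputs to show which dots it touches. The helper name blank_dots_before_nonword is hypothetical.

```python
import re

# A dot followed by a non-word character or by the end of the string.
DOT_BEFORE_NONWORD = re.compile(r'\.\B')

def blank_dots_before_nonword(line):
    """Replace each dot that is not followed by a word character with one space."""
    return DOT_BEFORE_NONWORD.sub(' ', line)

print(repr(blank_dots_before_nonword('0.')))            # the dot is replaced
print(repr(blank_dots_before_nonword('0.0')))           # the dot is kept
print(repr(blank_dots_before_nonword('taxes.... .7')))  # only the leader dots are replaced
print(repr(blank_dots_before_nonword('a.b')))           # kept: 'b' is a word character
```

Because only the character after the dot is examined, this version also replaces a dot that directly follows a digit, and it leaves any dot followed by a letter or underscore untouched. It is therefore a fit only when the input really has the shape shown in the question.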