Should I continue to use str.replace over re.sub if the string manipulation becomes "complicated"?

Question

For example, I have a few thousand strings like this one:

zz='/cars-for-sale/vehicledetails.xhtml?dealerId=54222147&zip=90621&endYear=2015&location=Buena%2BPark%2BCA-90621&startYear=1981&dealerName=CarMax%2BBuena%2BPark&numRecords=100&searchRadius=10&listingId=389520333&Log=0'

I want to truncate it to

zz='/cars-for-sale/vehicledetails.xhtml?&listingId=389520333&Log=0'

I have two ways of doing this:

zz.replace(zz[36:zz.strip('&Log=0').rfind('&')],'')

or

re.sub('dealer.+Radius=10','',zz)

From the standpoint of "good engineering practices", which one is preferable? I am weighing readability vs. maintainability vs. speed.

I am using Python 2.7.

Answer 1

This question is hard to answer because it is opinion-based. str.replace is definitely faster than re.sub here. Using timeit in ipython with Python 3.4.2:

In []: %timeit zz.replace(zz[36:zz.strip('&Log=0').rfind('&')],'')
100000 loops, best of 3: 2.04 µs per loop

In []: %timeit re.sub('dealer.+Radius=10','',zz)
100000 loops, best of 3: 2.83 µs per loop

As Padraic Cunningham pointed out, the difference is larger in Python 2:

In []: %timeit zz.replace(zz[36:zz.strip('&Log=0').rfind('&')],'')
100000 loops, best of 3: 2 µs per loop

In []: %timeit re.sub('dealer.+Radius=10','',zz)
100000 loops, best of 3: 3.11 µs per loop

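Which one is better depends on the program. In general, for Python, readability matters more than speed (the standard PEP 8 style is based on the notion that code is read more often than it is written). If speed is critical to the program, the faster option, str.replace, is better. Otherwise, the more readable option, re.sub, is better.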
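To make the readability trade-off concrete, here is a minimal sketch that wraps both one-liners from the question in functions and checks that they agree on the sample URL. The function names and the `expected` variable are hypothetical, chosen only for illustration. It also shows one detail that affects maintainability: str.strip removes a set of characters from both ends, not the literal suffix '&Log=0'.

```python
import re

zz = ('/cars-for-sale/vehicledetails.xhtml?dealerId=54222147&zip=90621'
      '&endYear=2015&location=Buena%2BPark%2BCA-90621&startYear=1981'
      '&dealerName=CarMax%2BBuena%2BPark&numRecords=100&searchRadius=10'
      '&listingId=389520333&Log=0')

def truncate_with_replace(url):
    # Slice-based version from the question. Index 36 is the position
    # right after the '?'. strip('&Log=0') trims any of the characters
    # '&', 'L', 'o', 'g', '=', '0' from both ends of the string.
    return url.replace(url[36:url.strip('&Log=0').rfind('&')], '')

def truncate_with_sub(url):
    # Regex version from the question: greedy match from the first
    # 'dealer' up to the last 'Radius=10'.
    return re.sub('dealer.+Radius=10', '', url)

expected = '/cars-for-sale/vehicledetails.xhtml?&listingId=389520333&Log=0'
assert truncate_with_replace(zz) == expected
assert truncate_with_sub(zz) == expected

# strip() works on a character set, so it can remove more than the suffix:
assert 'id=100&Log=0'.strip('&Log=0') == 'id=1'
```

Both versions give the same result on this input, but the slice-based one depends on a hard-coded offset and on how strip treats its argument, which a later reader has to work out.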

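Edit

As Anony-Mousse pointed out, using re.compile instead is both faster and more readable than either of the two options above. (You added that you are using Python 2, but I will show the Python 3 test first to mirror the order of my other tests above.)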

With Python 3:

In []: z_match = re.compile('dealer.+Radius=10')
In []: %timeit z_match.sub('', zz)
1000000 loops, best of 3: 1.36 µs per loop

With Python 2:

In []: z_match = re.compile('dealer.+Radius=10')
In []: %timeit z_match.sub('', zz)
100000 loops, best of 3: 1.68 µs per loop
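For anyone who wants to repeat these measurements outside IPython, the following is a rough sketch using the standard timeit module. The wrapper functions and the `best_time` helper are hypothetical names introduced here. Wrapping each statement in a function adds call overhead, so the absolute numbers will not match the %timeit output above and will vary by machine and Python version.

```python
import re
import timeit

zz = ('/cars-for-sale/vehicledetails.xhtml?dealerId=54222147&zip=90621'
      '&endYear=2015&location=Buena%2BPark%2BCA-90621&startYear=1981'
      '&dealerName=CarMax%2BBuena%2BPark&numRecords=100&searchRadius=10'
      '&listingId=389520333&Log=0')

# Compile once, outside the timed code, as in the answer.
z_match = re.compile('dealer.+Radius=10')

def with_replace():
    return zz.replace(zz[36:zz.strip('&Log=0').rfind('&')], '')

def with_sub():
    return re.sub('dealer.+Radius=10', '', zz)

def with_compiled():
    return z_match.sub('', zz)

def best_time(func, number=10000, repeat=3):
    # Best of `repeat` runs, reported as average seconds per call.
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number

for func in (with_replace, with_sub, with_compiled):
    print('%-14s %.2f us per call' % (func.__name__, best_time(func) * 1e6))
```

Only the relative ordering of the three timings is worth comparing; for a few thousand strings, all three finish in well under a second.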
