Use translate to replace punctuation: what's the difference between these 3 ways?

I am trying to replace the punctuation in a string with spaces. I searched for answers and tried them in Python 2.7, and they give different results.

import string

s1=" merry's home, see a sign 'the-shop $on sale$ **go go!'"   #sample string

print s1.translate(string.maketrans("",""), string.punctuation) #way1

print s1.translate(None,string.punctuation)                     #way2

table=string.maketrans(string.punctuation,' '*len(string.punctuation))
print s1.translate(table)                                       #way3

It prints the following:

merrys home see a sign theshop on sale go go
merrys home see a sign theshop on sale go go
merry s home  see a sign  the shop  on sale    go go  

So what is the difference between these ways?

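There is no real functional difference between the first two. You either pass a translation table that changes nothing (string.maketrans("","")), or you tell Python to skip the translation step (None). Either way, all punctuation is removed, because you pass string.punctuation as the characters to delete. If I were a betting man, I'd bet the None version performs slightly better, but you can use timeit to find out...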

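The last example builds a translation table that maps every punctuation character to a space and deletes nothing. That is why the last example's output contains a bunch of extra spaces.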
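To make the delete-versus-map distinction concrete, here is a minimal sketch for Python 3, where str.translate takes only a table. It assumes the three-argument form of str.maketrans as the nearest equivalent of the Python 2 calls in the question; the names delete_table and space_table are illustrative only.

```python
import string

s1 = " merry's home, see a sign 'the-shop $on sale$ **go go!'"

# Deleting: the third argument of str.maketrans lists characters to drop.
delete_table = str.maketrans("", "", string.punctuation)
deleted = s1.translate(delete_table)

# Mapping: each punctuation character becomes one space, nothing is dropped,
# so the result has the same length as the input.
space_table = str.maketrans(string.punctuation, " " * len(string.punctuation))
spaced = s1.translate(space_table)

print(repr(deleted))
print(repr(spaced))
print(len(s1), len(deleted), len(spaced))
```

If the runs of spaces left by the mapping version are unwanted, " ".join(spaced.split()) is one common way to collapse them.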

The documentation for translate specifies str.translate(table[, deletechars]):

Return a copy of the string where all characters occurring in the optional argument deletechars are removed, and the remaining characters have been mapped through the given translation table.

For string objects, set the table argument to None for translations that only delete characters.

print s1.translate(string.maketrans("",""), string.punctuation)

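In this case, you delete all punctuation and translate with a table built from two empty strings, which replaces nothing, so every remaining character maps to itself.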
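A quick way to check that such a table really changes nothing is to inspect it. The sketch below uses Python 3 bytes, on the assumption that bytes.maketrans and bytes.translate(table, delete) are the closest analogue of the Python 2 str API used in the question; the variable names are illustrative.

```python
import string

# With two empty arguments, maketrans returns a 256-byte table in which
# every byte maps to itself.
identity = bytes.maketrans(b"", b"")

s1 = b" merry's home, see a sign 'the-shop $on sale$ **go go!'"
punct = string.punctuation.encode("ascii")

via_identity = s1.translate(identity, punct)  # analogue of way 1
via_none = s1.translate(None, punct)          # analogue of way 2

print(len(identity))
print(identity == bytes(range(256)))
print(via_identity == via_none)
```

Both calls return the same bytes; the only difference is whether a table lookup happens for the characters that are kept.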

print s1.translate(None,string.punctuation)

In this case, you simply delete all punctuation.

table=string.maketrans(string.punctuation,' '*len(string.punctuation))
print s1.translate(table)

In this case, you build a translation table that replaces each punctuation character with a space, and then translate.

The difference between the first and the second is, as mgilson said, performance: the None case is indeed a bit faster:

%timeit s1.translate(string.maketrans("",""), string.punctuation) #way1
The slowest run took 4.70 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.27 µs per loop

%timeit s1.translate(None, string.punctuation) #way2
The slowest run took 11.41 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 627 ns per loop
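Outside IPython, a comparable measurement can be sketched with the standard timeit module. This version assumes Python 3 and uses bytes, whose translate(table, delete) signature mirrors the Python 2 str method; the helper names way1 and way2 are hypothetical, and absolute timings will vary by machine and interpreter.

```python
import string
import timeit

s1 = b" merry's home, see a sign 'the-shop $on sale$ **go go!'"
punct = string.punctuation.encode("ascii")

def way1():
    # Rebuilds the identity table on every call, as the timed expression does.
    return s1.translate(bytes.maketrans(b"", b""), punct)

def way2():
    # Skips the table entirely and only deletes.
    return s1.translate(None, punct)

calls = 10000
for fn in (way1, way2):
    # Take the best of three runs to reduce noise from other processes.
    best = min(timeit.repeat(fn, number=calls, repeat=3))
    print(fn.__name__, best / calls, "seconds per call")
```

Note that way1 also pays for building the table on each call; hoisting that table out of the function would isolate the cost of the lookup itself.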

The third is a completely different application of translate.