Using translate to replace punctuation: what's the difference between these 3 ways?
I'm trying to replace the punctuation in a string with spaces. I searched for answers and tried them in my Python 2.7, and they show different results.
import string
s1 = " merry's home, see a sign 'the-shop $on sale$ **go go!'"  # sample string
print s1.translate(string.maketrans("", ""), string.punctuation)  # way1
print s1.translate(None, string.punctuation)  # way2
table = string.maketrans(string.punctuation, ' ' * len(string.punctuation))
print s1.translate(table)  # way3
It prints the following:
merrys home see a sign theshop on sale go go
merrys home see a sign theshop on sale go go
merry s home see a sign the shop on sale go go
So what is the difference between these ways?
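There's no real functional difference between the first two: you either pass an identity translation table (string.maketrans("","")) or you tell Python to skip the translation step altogether (None). In both cases all punctuation is removed, because you pass string.punctuation as the characters to delete. If I were a betting man, I'd bet the None version performs slightly better, but you can use timeit to find out...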
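The last example creates a translation table that maps every punctuation character to a space and deletes nothing. That's why the output of the last example contains a bunch of extra spaces.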
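For readers on Python 3, where string.maketrans and the two-argument form of str.translate are gone, the same contrast between deleting and mapping can be sketched with str.maketrans. This is only an illustration, not part of the original question; the variable names delete_table, space_table, deleted, spaced and collapsed are made up here, and the final whitespace-collapsing step is an extra assumption about what the asker ultimately wants.

```python
import string

s1 = " merry's home, see a sign 'the-shop $on sale$ **go go!'"

# Python 3: the third argument of str.maketrans lists characters to delete.
delete_table = str.maketrans("", "", string.punctuation)

# Map every punctuation character to a single space; nothing is deleted.
space_table = str.maketrans(string.punctuation, " " * len(string.punctuation))

deleted = s1.translate(delete_table)  # same effect as way1 / way2
spaced = s1.translate(space_table)    # same effect as way3

# repr() makes the extra spaces left behind by the mapping visible.
print(repr(deleted))
print(repr(spaced))

# Optional: collapse the runs of spaces that the mapping leaves behind.
collapsed = " ".join(spaced.split())
print(collapsed)
```

Because the mapping replaces each punctuation character one for one, spaced has exactly the same length as s1, while deleted is shorter.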
The documentation for translate specifies str.translate(table[, deletechars]):
Return a copy of the string where all characters occurring in the optional argument deletechars are removed, and the remaining characters have been mapped through the given translation table ...
and continues:
set the table argument to None for translations that only delete characters
print s1.translate(string.maketrans("",""), string.punctuation)
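In this case you delete all punctuation and pass the remaining characters through a table that maps every character to itself (an empty string translated to an empty string).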
print s1.translate(None,string.punctuation)
In this case you simply delete all punctuation.
table=string.maketrans(string.punctuation,' '*len(string.punctuation))
print s1.translate(table)
In this case you create a translation table that replaces punctuation with spaces, and then translate.
As for the difference between the first and the second: as mgilson said, it comes down to performance, and the None case is indeed a bit faster:
%timeit s1.translate(string.maketrans("",""), string.punctuation) #way1
The slowest run took 4.70 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.27 µs per loop
%timeit s1.translate(None, string.punctuation) #way2
The slowest run took 11.41 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 627 ns per loop
The third is a completely different application of translate.
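If you don't have IPython's %timeit at hand, the comparison can be roughly reproduced with the standard timeit module. The sketch below is an assumption-laden Python 3 analogue rather than the original measurement: it uses bytes.translate, which still accepts a table (or None) plus a set of bytes to delete, and the names punct, way1, way2, t1 and t2 are invented for illustration. As in the measurement above, way1 rebuilds the identity table on every call, so part of its extra cost is the maketrans call itself. Absolute numbers will vary by machine and Python version.

```python
import string
import timeit

s1 = b" merry's home, see a sign 'the-shop $on sale$ **go go!'"
punct = string.punctuation.encode("ascii")  # bytes to delete


def way1():
    # Identity table built on every call, then punctuation deleted.
    return s1.translate(bytes.maketrans(b"", b""), punct)


def way2():
    # No table at all: only the deletion step runs.
    return s1.translate(None, punct)


# Both ways must produce the same result; only the speed may differ.
assert way1() == way2()

t1 = timeit.timeit(way1, number=100000)
t2 = timeit.timeit(way2, number=100000)
print("way1: %.4f s, way2: %.4f s" % (t1, t2))
```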