Pythonic way of writing a "belongs" function for two sets
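I have two sets that contain only strings, and I am trying to write a function like this:
def belongs(setA, setB):
    ...  # should return True or False
Definition: if a set, say setB, has an item that contains (string containment) one of the items in setA, then I say that setB belongs to setA. Some examples: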
setA = set(['apple', 'banana', 'strawberry'])
set1 = set(['abcc', 'xyz', 'klm']) # does not belong to setA
set2 = set(['app', 'banaba', 'baba']) # does not belong to setA
set3 = set(['apples', 'xyz']) # belongs to setA
set4 = set(['bananaaa', 'hello', 'world', 'stack']) # belongs to setA
My current code:
def belongs(set1, set2):
    for i in set1:
        for j in set2:
            if i in j:
                return True
    return False
Is there a better/more Pythonic way to do the same thing?
Write the function:
def belongs(set1, set2):
    return any(s1 in s2 for s1 in set1 for s2 in set2)
And test it:
assert not belongs(setA, set1)
assert not belongs(setA, set2)
assert belongs(setA, set3)
assert belongs(setA, set4)
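The problem of checking whether any string in setA is a substring of any item in setB, i.e., whether setB "belongs to" setA, can be solved using grep -F, assuming each set is stored in a file with one string per line: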
grep -Flf setA set1 set2 set3 set4
This prints the sets that "belong to" setA, i.e., set3 and set4 in this case. The Aho–Corasick string matching algorithm formed the basis of the original Unix command fgrep. It can be much more efficient for a large input than a naive solution with nested loops, e.g., from 20 hours for a brute-force approach down to a couple of minutes using fgrep.
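To make the idea behind fgrep more concrete, here is a minimal pure-Python sketch of an Aho–Corasick style automaton. It is only an illustration under my own assumptions, not the code grep uses, and the names build_automaton, contains_any and belongs_ac are hypothetical. For real workloads a tested library is likely a better choice.

```python
from collections import deque

def build_automaton(patterns):
    """Build a trie with failure links over the given patterns."""
    goto = [{}]      # goto[state]: char -> next state
    fail = [0]       # fail[state]: fallback state on mismatch
    out = [False]    # out[state]: some pattern ends at this state
    for pat in patterns:
        state = 0
        for ch in pat:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto.append({})
                fail.append(0)
                out.append(False)
                goto[state][ch] = nxt
            state = nxt
        out[state] = True
    # Breadth-first pass: depth-1 states keep fail == 0,
    # deeper states inherit from their parent's failure chain.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # A match ending at the fallback state also ends here.
            out[nxt] = out[nxt] or out[fail[nxt]]
    return goto, fail, out

def contains_any(text, automaton):
    """Return True if any pattern occurs as a substring of text."""
    goto, fail, out = automaton
    state = 0
    if out[state]:   # an empty pattern matches every text
        return True
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        if out[state]:
            return True
    return False

def belongs_ac(set_a, set_b):
    """True if some item of set_b contains some item of set_a."""
    automaton = build_automaton(set_a)
    return any(contains_any(s, automaton) for s in set_b)
```

The automaton is built once from setA, and each string in the other set is then scanned in a single left-to-right pass, which is where the advantage over the nested loops comes from on large inputs.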
If you cannot install third-party libraries, you can try the re module to improve performance if needed:
import re

substrings = sorted(setA, key=len, reverse=True)  # longest first
found = re.compile("|".join(map(re.escape, substrings))).search
# On Python 3 the built-in map is lazy, so itertools.imap is not needed
print([any(map(found, S)) for S in [set1, set2, set3, set4]])
# -> [False, False, True, True]
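The regex approach can also be wrapped into a function with the same shape as belongs. The following is a sketch rather than a definitive version; the name belongs_re is hypothetical, and the guard for an empty first set is my own addition, since an empty alternation would compile to a pattern that matches every string.

```python
import re

def belongs_re(set_a, set_b):
    """True if some item of set_b contains some item of set_a."""
    if not set_a:
        # "|".join of nothing is "", which would match everything
        return False
    # Escape each string so it is matched literally, longest first
    alternatives = sorted(set_a, key=len, reverse=True)
    pattern = re.compile("|".join(map(re.escape, alternatives)))
    return any(pattern.search(s) for s in set_b)
```

Note that if setA is reused across many calls, compiling the pattern once outside the function, as in the snippet above, avoids repeated compilation.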