How to check if there are more than 3 identical strings in a list, Python
I have a list that looks like this:
a = ['www.hughes-family.org', 'www.bondedsender.com', 'thinkgeek.com', 'www.hughes-family.org', 'www.hughes-family.org', 'lists.sourceforge.net', 'www.hughes-family.org']
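How can I check whether any URL appears more than three times in this list?
I tried the set() function, but it reports a match as soon as any URL is duplicated at all.
This is what I tried: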
if len(set(a)) < len(a):
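Use Counter.most_common:
>>> from collections import Counter
>>> Counter(a).most_common(1)[0][1]
4
This returns the number of times the most common element occurs, so comparing it with 3 answers the question.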
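As a minimal sketch, the Counter.most_common check can be wrapped in a small function. The name has_more_than and its threshold parameter are made up for illustration and are not part of the answer above; the guard for an empty list is there because most_common(1) returns an empty list in that case.

```python
from collections import Counter


def has_more_than(items, threshold=3):
    """Return True if any item occurs more than `threshold` times."""
    if not items:
        # most_common(1) is empty for an empty list, so guard before indexing.
        return False
    _, top_count = Counter(items).most_common(1)[0]
    return top_count > threshold


a = ['www.hughes-family.org', 'www.bondedsender.com', 'thinkgeek.com',
     'www.hughes-family.org', 'www.hughes-family.org', 'lists.sourceforge.net',
     'www.hughes-family.org']
print(has_more_than(a))
```

With the list from the question this prints True, since 'www.hughes-family.org' occurs four times.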
You can use list.count to collect the URLs that occur more than three times (note that each count() call scans the whole list):
urls = ['www.hughes-family.org', 'www.bondedsender.com', 'thinkgeek.com', 'www.hughes-family.org', 'www.hughes-family.org', 'lists.sourceforge.net', 'www.hughes-family.org']
# keep only the entries whose URL occurs more than three times
new_urls = [url for url in urls if urls.count(url) > 3]
if new_urls:
    pass  # condition met
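If you also want to know which URLs cross the limit, one possible sketch counts everything once with collections.Counter instead of calling list.count per element. The function name frequent_items and the threshold parameter are assumptions chosen for this example.

```python
from collections import Counter


def frequent_items(items, threshold=3):
    """Return the distinct items that occur more than `threshold` times."""
    counts = Counter(items)  # one pass over the list
    return [item for item, n in counts.items() if n > threshold]


urls = ['www.hughes-family.org', 'www.bondedsender.com', 'thinkgeek.com',
        'www.hughes-family.org', 'www.hughes-family.org', 'lists.sourceforge.net',
        'www.hughes-family.org']
print(frequent_items(urls))
```

An empty result means no URL occurs more than three times, so the returned list can be used directly in an if statement.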
You can use a dict to catch the duplicates, mapping each URL to the positions where it occurs:
a = ['www.hughes-family.org', 'www.bondedsender.com', 'thinkgeek.com', 'www.hughes-family.org', 'www.hughes-family.org', 'lists.sourceforge.net', 'www.hughes-family.org']
count = {}
for i, j in enumerate(a):
    if j not in count:
        count[j] = [i]
    else:
        count[j].append(i)
for i, j in count.items():
    if len(j) > 1:  # use > 3 for "more than three times"
        pass  # do your stuff
print(count)
Output:
{'www.hughes-family.org': [0, 3, 4, 6], 'thinkgeek.com': [2], 'www.bondedsender.com': [1], 'lists.sourceforge.net': [5]}
A second approach is to use a defaultdict:
import collections
d = collections.defaultdict(list)
for i, j in enumerate(a):
    d[j].append(i)
print(d)
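To connect the position map back to the original question, a possible sketch keeps only the entries with more than three positions. The function name positions_of_frequent and the threshold parameter are hypothetical and only serve this example.

```python
from collections import defaultdict


def positions_of_frequent(items, threshold=3):
    """Map each item occurring more than `threshold` times to its indices."""
    positions = defaultdict(list)
    for index, item in enumerate(items):
        positions[item].append(index)
    # Keep only the items whose index list is longer than the threshold.
    return {item: idxs for item, idxs in positions.items() if len(idxs) > threshold}


a = ['www.hughes-family.org', 'www.bondedsender.com', 'thinkgeek.com',
     'www.hughes-family.org', 'www.hughes-family.org', 'lists.sourceforge.net',
     'www.hughes-family.org']
print(positions_of_frequent(a))
```

For the list from the question this prints {'www.hughes-family.org': [0, 3, 4, 6]}.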
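I assume you want to check whether any URL occurs more than 3 times in the list. You can iterate over the list and build a dictionary with the strings as keys and their respective counts as values (similar to the output of collections.Counter).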
In [1]: a = ['www.hughes-family.org', 'www.bondedsender.com', 'thinkgeek.com',
   ...:      'www.hughes-family.org', 'www.hughes-family.org', 'lists.sourceforge.net',
   ...:      'www.hughes-family.org']
In [2]: is_present = False
In [3]: url_counts = dict()
In [4]: for url in a:
   ...:     if url not in url_counts:  # If the URL is not present as a key, insert the URL with value 0
   ...:         url_counts[url] = 0
   ...:     url_counts[url] += 1  # Increment count
   ...:     if url_counts[url] > 3:  # Check if the URL occurs more than three times
   ...:         print("The URL", url, "occurs more than three times!")
   ...:         is_present = True
   ...:         break  # Leave the loop as soon as any URL occurs more than three times
# output - The URL www.hughes-family.org occurs more than three times!
In [5]: is_present  # To check if there is a URL which occurs more than three times
Out[5]: True