如何在不重复的情况下生成 5 个字符的字符串组合(2 个不同的数字、两个相等的字母和 1 个字母)
How to generate 5-character strings combinations (2 different digits, two equal letters and 1 letter) without duplication
我已经发布了 但这是一个 不同 的问题。
我正在尝试生成由 三个 字母(恰好两个相等,另一个不同的字母)和两个 不同的 数字组成的 5 个字符的字符串的组合但是当我尝试这样做时我得到了重复。
正确 组合示例:
82ccb
b8cc7
7c6dc
不正确 组合示例:
22ddc -> incorrect because the digits are equal and should be different
46ddd -> incorrect because there are more than 2 equal letters
2t4cd -> No 2 equal letters + 2 equal different letters
这是我使用的代码:
LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'
def aab12(letters=LETTERS, digits=DIGITS):
"""Generate the distinct 5-character strings consisting of three
letters (two are equal and a repeated letter) and two digits (each one is different from the other).
"""
letterdxs = set(range(5))
combs = []
for (d1, d2), (i, j), (l1, l2) in product(
permutations(digits, 2), # 2 digits (a repeated).
combinations(range(5), 2), # Positions for the 1st and 2nd digits.
permutations(letters, 2)): # 2 letters (a repeated).
x, y, z = letterdxs.difference((i, j))
s = set((x, y, z))
# Choosing 2 positions for the repeated letters
c1 = combinations((x, y, z), 2)
for c in c1:
result = []
result[i:i] = d1,
result[j:j] = d2,
result[c[0]:c[0]] = l1,
result[c[1]:c[1]] = l1,
# Choosing position for the last letter. This is position that was left
letter_indx = (s.difference(c)).pop()
result[letter_indx:letter_indx] = l2,
combs.append(''.join(result))
# Should be 478,800
print(len(combs))
return combs
def is_contain_dup(combos):
s = set(combos)
if len(s) != len(combos):
print('found duplicates !')
is_contain_dup(aab12())
虽然长度还可以,但我有重复。
这个函数是基于这个数学:
- 为不同的数字选择 2 个位置
- 为重复的字母选择 2 个位置
- 选择与最后一个字母不同的字母
我不确定是什么导致了重复,但这可能是因为选择了两个相同的字母 + 不同的字母。
您可以创建一个递归函数:
from collections import Counter
LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'
def validate(val, queue, counter):
if not queue:
return True
if val.isdigit():
return sum(i.isdigit() for i in queue) < 2 and val not in queue
_sum = sum(i.isalpha() for i in counter)
return _sum < 3 and counter.get(val, 0) < 2
def is_valid(_input):
d = Counter(_input)
return sum(i.isdigit() for i in d) == 2 and sum(i.isalpha() for i in d) == 2
def combinations(d, current = []):
if len(current) == 5:
yield ''.join(current)
else:
for i in d:
if validate(i, current, Counter(current)):
yield from combinations(d, current+[i])
_r = [i for i in combinations(DIGITS+LETTERS) if is_valid(i)]
print(len(_r))
输出:
478800
这里是纯粹的蛮力,带有 4 个嵌套循环的幼稚方法:
LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'
from itertools import permutations
def aab12_1(letters=LETTERS, digits=DIGITS):
st=[]
for fc in letters:
for sc in letters:
if sc==fc: continue
for n1 in digits:
for n2 in digits:
if n1==n2: continue
st.append(''.join((fc,fc,sc,n1,n2)))
di={e:[''.join(t) for t in permutations(e)] for e in st}
return {s for sl in di.values() for s in sl}
>>> r=aab12_1()
>>> len(r)
478800
这有 O(n**4)
复杂性;即,对于较长的字符串,真的很糟糕。但是,示例字符串并没有那么长,对于较短的字符串,这是一种可行的方法。
您可以通过对生成的基本字符串进行排序来减少对 permutations
:
的重复调用,从而稍微降低复杂性
def aab12_2(letters=LETTERS, digits=DIGITS):
st=set()
for fc in letters:
for sc in letters:
if sc==fc: continue
for n1 in digits:
for n2 in digits:
if n1==n2: continue
st.add(''.join(sorted((fc,fc,sc,n1,n2))))
di={e:[''.join(t) for t in permutations(e)] for e in st}
return {s for sl in di.values() for s in sl}
可以进一步简化为:
from itertools import permutations, product, combinations
def aab12_3(letters=LETTERS, digits=DIGITS):
let_combo=[x+y for x,y in product([e+e for e in letters],letters) if x[0]!=y]
n_combos={a+b for a,b in combinations(digits,2)}
di={e:[''.join(t) for t in permutations(e)] for e in (x+y for x,y in product(let_combo, n_combos))}
return {s for sl in di.values() for s in sl}
仍然有一个隐含的 O(n**3)
和 3 products()
这相当于每个的嵌套循环。然而,每个 O
都更快,这里的总时间现在约为 350 毫秒。
所以,让我们进行基准测试。这是上面的 3 个函数,Ajax1234 的递归函数和 Rory Daulton 的 itertools 函数:
from itertools import combinations, permutations, product
def aab12_1(letters=LETTERS, digits=DIGITS):
st=[]
for fc in letters:
for sc in letters:
if sc==fc: continue
for n1 in digits:
for n2 in digits:
if n1==n2: continue
st.append(''.join((fc,fc,sc,n1,n2)))
di={e:[''.join(t) for t in permutations(e)] for e in st}
return {s for sl in di.values() for s in sl}
def aab12_2(letters=LETTERS, digits=DIGITS):
st=set()
for fc in letters:
for sc in letters:
if sc==fc: continue
for n1 in digits:
for n2 in digits:
if n1==n2: continue
st.add(''.join(sorted((fc,fc,sc,n1,n2))))
di={e:[''.join(t) for t in permutations(e)] for e in st}
return {s for sl in di.values() for s in sl}
def aab12_3(letters=LETTERS, digits=DIGITS):
let_combo=[x+y for x,y in product([e+e for e in letters],letters) if x[0]!=y]
n_combos={a+b for a,b in combinations(digits,2)}
di={e:[''.join(t) for t in permutations(e)] for e in (x+y for x,y in product(let_combo, n_combos))}
return {s for sl in di.values() for s in sl}
def aab12_4():
# Ajax1234 recursive approach
def validate(val, queue, counter):
if not queue:
return True
if val.isdigit():
return sum(i.isdigit() for i in queue) < 2 and val not in queue
_sum = sum(i.isalpha() for i in counter)
return _sum < 3 and counter.get(val, 0) < 2
def is_valid(_input):
d = Counter(_input)
return sum(i.isdigit() for i in d) == 2 and sum(i.isalpha() for i in d) == 2
def combinations(d, current = []):
if len(current) == 5:
yield ''.join(current)
else:
for i in d:
if validate(i, current, Counter(current)):
yield from combinations(d, current+[i])
return [i for i in combinations(DIGITS+LETTERS) if is_valid(i)]
def aab12_5(letters=LETTERS, digits=DIGITS):
""" Rory Daulton
Generate the distinct 5-character strings consisting of three
letters (two are equal and a repeated letter) and two digits (each
one is different from the other).
"""
indices = range(5) # indices for the generated 5-char strings
combs = []
for (letterdbl, lettersngl), (digit1, digit2), (indx1, indx2, indx3) in (
product(permutations(letters, 2),
combinations(digits, 2),
permutations(indices, 3))):
charlist = [letterdbl] * 5
charlist[indx1] = lettersngl
charlist[indx2] = digit1
charlist[indx3] = digit2
combs.append(''.join(charlist))
return combs
if __name__=='__main__':
import timeit
funcs=(aab12_1,aab12_2,aab12_3,aab12_4,aab12_5)
di={f.__name__:len(set(f())) for f in funcs}
print(di)
for f in funcs:
print(" {:^10s}{:.4f} secs".format(f.__name__, timeit.timeit("f()", setup="from __main__ import f", number=1)))
打印:
{'aab12_1': 478800, 'aab12_2': 478800, 'aab12_3': 478800, 'aab12_4': 478800, 'aab12_5': 478800}
aab12_1 0.6230 secs
aab12_2 0.3433 secs
aab12_3 0.3292 secs
aab12_4 50.4786 secs
aab12_5 0.2094 secs
这里最快的是 Rory Daulton 的 itertools 函数。干得漂亮!
这是一个在 itertools
中使用多个函数的答案。我的策略在评论中说明。在 itertools
函数中完成尽可能多的循环,以最大限度地提高速度。 returns 所需的 478,800 个字符串,它们都是不同的。
运行 %timeit
on aab12()
在我的系统上给出时间结果
391 ms ± 2.34 ms per loop
.
"""
Strategy: Generate a permutation of 2 distinct characters, a
combination of 2 distinct digits, and a permutation of 3 distinct
indices in range(5). Then build a 5-char string of the first character
(which will be the repeated one), use the first two indices to place
the digits and the last index to place the non-repeated character.
This yields a total of (20*19) * (7/1*6/2) * (5*4*3) = 478,800
items.
"""
from itertools import combinations, permutations, product
LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'
def aab12(letters=LETTERS, digits=DIGITS):
"""Generate the distinct 5-character strings consisting of three
letters (two are equal and a repeated letter) and two digits (each
one is different from the other).
"""
indices = range(5) # indices for the generated 5-char strings
combs = []
for (letterdbl, lettersngl), (digit1, digit2), (indx1, indx2, indx3) in (
product(permutations(letters, 2),
combinations(digits, 2),
permutations(indices, 3))):
charlist = [letterdbl] * 5
charlist[indx1] = digit1
charlist[indx2] = digit2
charlist[indx3] = lettersngl
combs.append(''.join(charlist))
return combs
列表中的前十五个字符串是
['24cbb',
'24bcb',
'24bbc',
'2c4bb',
'2b4cb',
'2b4bc',
'2cb4b',
'2bc4b',
'2bb4c',
'2cbb4',
'2bcb4',
'2bbc4',
'42cbb',
'42bcb',
'42bbc']
我已经发布了
我正在尝试生成由 三个 字母(恰好两个相等,另一个不同的字母)和两个 不同的 数字组成的 5 个字符的字符串的组合但是当我尝试这样做时我得到了重复。
正确 组合示例:
82ccb
b8cc7
7c6dc
不正确 组合示例:
22ddc -> incorrect because the digits are equal and should be different
46ddd -> incorrect because there are more than 2 equal letters
2t4cd -> No 2 equal letters + 2 equal different letters
这是我使用的代码:
LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'
def aab12(letters=LETTERS, digits=DIGITS):
"""Generate the distinct 5-character strings consisting of three
letters (two are equal and a repeated letter) and two digits (each one is different from the other).
"""
letterdxs = set(range(5))
combs = []
for (d1, d2), (i, j), (l1, l2) in product(
permutations(digits, 2), # 2 digits (a repeated).
combinations(range(5), 2), # Positions for the 1st and 2nd digits.
permutations(letters, 2)): # 2 letters (a repeated).
x, y, z = letterdxs.difference((i, j))
s = set((x, y, z))
# Choosing 2 positions for the repeated letters
c1 = combinations((x, y, z), 2)
for c in c1:
result = []
result[i:i] = d1,
result[j:j] = d2,
result[c[0]:c[0]] = l1,
result[c[1]:c[1]] = l1,
# Choosing position for the last letter. This is position that was left
letter_indx = (s.difference(c)).pop()
result[letter_indx:letter_indx] = l2,
combs.append(''.join(result))
# Should be 478,800
print(len(combs))
return combs
def is_contain_dup(combos):
s = set(combos)
if len(s) != len(combos):
print('found duplicates !')
is_contain_dup(aab12())
虽然长度还可以,但我有重复。
这个函数是基于这个数学:
- 为不同的数字选择 2 个位置
- 为重复的字母选择 2 个位置
- 选择与最后一个字母不同的字母
我不确定是什么导致了重复,但这可能是因为选择了两个相同的字母 + 不同的字母。
您可以创建一个递归函数:
from collections import Counter
LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'
def validate(val, queue, counter):
if not queue:
return True
if val.isdigit():
return sum(i.isdigit() for i in queue) < 2 and val not in queue
_sum = sum(i.isalpha() for i in counter)
return _sum < 3 and counter.get(val, 0) < 2
def is_valid(_input):
d = Counter(_input)
return sum(i.isdigit() for i in d) == 2 and sum(i.isalpha() for i in d) == 2
def combinations(d, current = []):
if len(current) == 5:
yield ''.join(current)
else:
for i in d:
if validate(i, current, Counter(current)):
yield from combinations(d, current+[i])
_r = [i for i in combinations(DIGITS+LETTERS) if is_valid(i)]
print(len(_r))
输出:
478800
这里是纯粹的蛮力,带有 4 个嵌套循环的幼稚方法:
LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'
from itertools import permutations
def aab12_1(letters=LETTERS, digits=DIGITS):
st=[]
for fc in letters:
for sc in letters:
if sc==fc: continue
for n1 in digits:
for n2 in digits:
if n1==n2: continue
st.append(''.join((fc,fc,sc,n1,n2)))
di={e:[''.join(t) for t in permutations(e)] for e in st}
return {s for sl in di.values() for s in sl}
>>> r=aab12_1()
>>> len(r)
478800
这有 O(n**4)
复杂性;即,对于较长的字符串,真的很糟糕。但是,示例字符串并没有那么长,对于较短的字符串,这是一种可行的方法。
您可以通过对生成的基本字符串进行排序来减少对 permutations
:
def aab12_2(letters=LETTERS, digits=DIGITS):
st=set()
for fc in letters:
for sc in letters:
if sc==fc: continue
for n1 in digits:
for n2 in digits:
if n1==n2: continue
st.add(''.join(sorted((fc,fc,sc,n1,n2))))
di={e:[''.join(t) for t in permutations(e)] for e in st}
return {s for sl in di.values() for s in sl}
可以进一步简化为:
from itertools import permutations, product, combinations
def aab12_3(letters=LETTERS, digits=DIGITS):
let_combo=[x+y for x,y in product([e+e for e in letters],letters) if x[0]!=y]
n_combos={a+b for a,b in combinations(digits,2)}
di={e:[''.join(t) for t in permutations(e)] for e in (x+y for x,y in product(let_combo, n_combos))}
return {s for sl in di.values() for s in sl}
仍然有一个隐含的 O(n**3)
和 3 products()
这相当于每个的嵌套循环。然而,每个 O
都更快,这里的总时间现在约为 350 毫秒。
所以,让我们进行基准测试。这是上面的 3 个函数,Ajax1234 的递归函数和 Rory Daulton 的 itertools 函数:
from itertools import combinations, permutations, product
def aab12_1(letters=LETTERS, digits=DIGITS):
st=[]
for fc in letters:
for sc in letters:
if sc==fc: continue
for n1 in digits:
for n2 in digits:
if n1==n2: continue
st.append(''.join((fc,fc,sc,n1,n2)))
di={e:[''.join(t) for t in permutations(e)] for e in st}
return {s for sl in di.values() for s in sl}
def aab12_2(letters=LETTERS, digits=DIGITS):
st=set()
for fc in letters:
for sc in letters:
if sc==fc: continue
for n1 in digits:
for n2 in digits:
if n1==n2: continue
st.add(''.join(sorted((fc,fc,sc,n1,n2))))
di={e:[''.join(t) for t in permutations(e)] for e in st}
return {s for sl in di.values() for s in sl}
def aab12_3(letters=LETTERS, digits=DIGITS):
let_combo=[x+y for x,y in product([e+e for e in letters],letters) if x[0]!=y]
n_combos={a+b for a,b in combinations(digits,2)}
di={e:[''.join(t) for t in permutations(e)] for e in (x+y for x,y in product(let_combo, n_combos))}
return {s for sl in di.values() for s in sl}
def aab12_4():
# Ajax1234 recursive approach
def validate(val, queue, counter):
if not queue:
return True
if val.isdigit():
return sum(i.isdigit() for i in queue) < 2 and val not in queue
_sum = sum(i.isalpha() for i in counter)
return _sum < 3 and counter.get(val, 0) < 2
def is_valid(_input):
d = Counter(_input)
return sum(i.isdigit() for i in d) == 2 and sum(i.isalpha() for i in d) == 2
def combinations(d, current = []):
if len(current) == 5:
yield ''.join(current)
else:
for i in d:
if validate(i, current, Counter(current)):
yield from combinations(d, current+[i])
return [i for i in combinations(DIGITS+LETTERS) if is_valid(i)]
def aab12_5(letters=LETTERS, digits=DIGITS):
""" Rory Daulton
Generate the distinct 5-character strings consisting of three
letters (two are equal and a repeated letter) and two digits (each
one is different from the other).
"""
indices = range(5) # indices for the generated 5-char strings
combs = []
for (letterdbl, lettersngl), (digit1, digit2), (indx1, indx2, indx3) in (
product(permutations(letters, 2),
combinations(digits, 2),
permutations(indices, 3))):
charlist = [letterdbl] * 5
charlist[indx1] = lettersngl
charlist[indx2] = digit1
charlist[indx3] = digit2
combs.append(''.join(charlist))
return combs
if __name__=='__main__':
import timeit
funcs=(aab12_1,aab12_2,aab12_3,aab12_4,aab12_5)
di={f.__name__:len(set(f())) for f in funcs}
print(di)
for f in funcs:
print(" {:^10s}{:.4f} secs".format(f.__name__, timeit.timeit("f()", setup="from __main__ import f", number=1)))
打印:
{'aab12_1': 478800, 'aab12_2': 478800, 'aab12_3': 478800, 'aab12_4': 478800, 'aab12_5': 478800}
aab12_1 0.6230 secs
aab12_2 0.3433 secs
aab12_3 0.3292 secs
aab12_4 50.4786 secs
aab12_5 0.2094 secs
这里最快的是 Rory Daulton 的 itertools 函数。干得漂亮!
这是一个在 itertools
中使用多个函数的答案。我的策略在评论中说明。在 itertools
函数中完成尽可能多的循环,以最大限度地提高速度。 returns 所需的 478,800 个字符串,它们都是不同的。
运行 %timeit
on aab12()
在我的系统上给出时间结果
391 ms ± 2.34 ms per loop
.
"""
Strategy: Generate a permutation of 2 distinct characters, a
combination of 2 distinct digits, and a permutation of 3 distinct
indices in range(5). Then build a 5-char string of the first character
(which will be the repeated one), use the first two indices to place
the digits and the last index to place the non-repeated character.
This yields a total of (20*19) * (7/1*6/2) * (5*4*3) = 478,800
items.
"""
from itertools import combinations, permutations, product
LETTERS = 'bcdfghjklmnpqrstvwxz'
DIGITS = '2456789'
def aab12(letters=LETTERS, digits=DIGITS):
"""Generate the distinct 5-character strings consisting of three
letters (two are equal and a repeated letter) and two digits (each
one is different from the other).
"""
indices = range(5) # indices for the generated 5-char strings
combs = []
for (letterdbl, lettersngl), (digit1, digit2), (indx1, indx2, indx3) in (
product(permutations(letters, 2),
combinations(digits, 2),
permutations(indices, 3))):
charlist = [letterdbl] * 5
charlist[indx1] = digit1
charlist[indx2] = digit2
charlist[indx3] = lettersngl
combs.append(''.join(charlist))
return combs
列表中的前十五个字符串是
['24cbb',
'24bcb',
'24bbc',
'2c4bb',
'2b4cb',
'2b4bc',
'2cb4b',
'2bc4b',
'2bb4c',
'2cbb4',
'2bcb4',
'2bbc4',
'42cbb',
'42bcb',
'42bbc']