Find all possible combinations with replacement in a string
Here is my code so far:
from itertools import combinations, product
string = "abcd012345"
char = "01268abc"
for i, j in combinations(range(len(string)), 2):
    for char1, char2 in product(char, char):
        print(string[:i] + char1 + string[i+1:j] + char2 + string[j+1:])
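So the string is abcd012345, and I change two characters at a time to find all possible combinations. The replacement characters are 01268abc. In this example we get 2880 combinations.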
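As a quick sanity check on that figure, the count can be derived without printing anything: there are C(10, 2) = 45 ways to pick the two positions and 8 × 8 = 64 ordered character pairs for each pick. A minimal sketch, assuming Python 3.8+ for math.comb (the names expected and generated are only illustrative):

```python
from itertools import combinations, product
from math import comb

string = "abcd012345"
char = "01268abc"

# number of lines the double loop prints: position pairs times character pairs
expected = comb(len(string), 2) * len(char) ** 2

# count again by actually enumerating, to confirm the formula
generated = sum(
    1
    for i, j in combinations(range(len(string)), 2)
    for _ in product(char, repeat=2)
)

print(expected, generated)  # 2880 2880
```

Note that this counts printed lines. Some of them are identical strings, because a replacement character can equal the character already at that position.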
The goal is to specify which characters may appear at each position of the string. For example:
from itertools import combinations, product
string = "abcd012345"
# place   0123456789
char_to_change_for_place0 = "ab02"
char_to_change_for_place1 = "14ah"
char_to_change_for_place2 = "94nf"
char_to_change_for_place3 = "a"
char_to_change_for_place4 = "9347592"
char_to_change_for_place5 = "93478nvg"
char_to_change_for_place6 = "b"
char_to_change_for_place7 = ""
char_to_change_for_place8 = ""
char_to_change_for_place9 = "84n"
# the loop below still draws from one shared set of characters;
# the goal is to use the per-position strings above instead
char = "01268abc"
for i, j in combinations(range(len(string)), 2):
    for char1, char2 in product(char, char):
        print(string[:i] + char1 + string[i+1:j] + char2 + string[j+1:])
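Notes:
- Some positions may be left empty, like positions 7 and 8.
- The number of positions will be 64.
- The number of characters to change will be 4, not 2 as in the example.
I would be glad to learn from your solutions and ideas, thanks.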
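This boils down to adding the current letter at each position of string to that position's replacements, and then creating all possible combinations of those options: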
from itertools import product
string = "abcd012345"
# must be the same length as string; each entry corresponds to the same index in string
p = ["ab02", "14ah", "94nf", "a", "9347592", "93478nvg", "b", "", "", "84n"]
errmsg = f"Keep them equal lengths: '{string}' ({len(string)}) vs {p} ({len(p)})"
assert len(p) == len(string), errmsg
# frozenset() removes duplicates among the replacements plus the original letter
d = {idx: frozenset(v + string[idx]) for idx, v in enumerate(p)}
# building this list takes memory
all_of_em = [''.join(whatever) for whatever in product(*d.values())]
# if you hit a MemoryError creating the list, write to a file instead;
# this uses a generator, which limits memory usage, but the file is going
# to get BIG
# with open("words.txt", "w") as f:
#     for w in (''.join(whatever) for whatever in product(*d.values())):
#         f.write(w + "\n")
print(*all_of_em, f"\n{len(all_of_em)}", sep="\t")
Output (the order may differ between runs, since a frozenset has no defined iteration order):
2and2g234n 2and2g2348 2and2g2344 2and2g2345 2and27b34n
[...snipp...]
249d99234n 249d992348 249d992344 249d992345
100800
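The total can be checked without building any strings: it is simply the product of the number of distinct options at each position. A small sketch, assuming Python 3.8+ for math.prod (the names sizes and total are only illustrative):

```python
from math import prod

string = "abcd012345"
p = ["ab02", "14ah", "94nf", "a", "9347592", "93478nvg", "b", "", "", "84n"]

# distinct options per position: the replacements plus the original character
sizes = [len(set(v + c)) for v, c in zip(p, string)]

total = prod(sizes)
print(sizes)  # [4, 5, 5, 2, 7, 9, 2, 1, 1, 4]
print(total)  # 100800
```

Running the same calculation on the real 64-position input first is a cheap way to see whether enumerating everything is feasible at all.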
If the order of the letters in the replacements matters to you, use
d = {idx: (v if string[idx] in v else string[idx]+v) for idx, v in enumerate(p)}
instead, which gives:
abcd012345 abcd012348 [...] 2hfa2gb344 2hfa2gb34n 115200
The difference in count comes from the duplicate 9 in "9347592", which the frozenset version removes.
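If both a stable order and duplicate removal are wanted, one possible variation (not part of the code above) is to deduplicate with dict.fromkeys, which keeps the first occurrence of each character; dicts preserve insertion order in Python 3.7+. The sketch below always puts the original character first, which is a slight change from the expression above; options and words are illustrative names:

```python
from itertools import product

string = "abcd012345"
p = ["ab02", "14ah", "94nf", "a", "9347592", "93478nvg", "b", "", "", "84n"]

# original character first, then the replacements; repeats are dropped
# while the first-seen order is kept
options = ["".join(dict.fromkeys(c + v)) for c, v in zip(string, p)]

words = ["".join(t) for t in product(*options)]
print(words[0], words[-1], len(words))  # abcd012345 2hfa2gb34n 100800
```

The output order is now the same on every run, and the count matches the frozenset version.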
To get only the results with fewer than 5 changes:
# use a generator expression to reduce memory usage
all_of_em = (''.join(whatever) for whatever in product(*d.values()))
# build the list of words with fewer than 5 changes from the generator above
fewer = [w for w in all_of_em if sum(a != b for a, b in zip(w, string)) < 5]
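Filtering the full product works for 10 positions, but with the 64 positions mentioned in the question the full product is likely far too large to enumerate. An alternative sketch, under the assumption that "fewer than 5 changes" means at most 4 positions differ from the original: pick the positions to change first, then fill only those positions with real replacements. The function name variants and its max_changes parameter are hypothetical, introduced here for illustration only:

```python
from itertools import combinations, product

def variants(string, p, max_changes=4):
    """Yield every string that differs from `string` in at most `max_changes` positions."""
    # real replacements per position: deduplicated, original character excluded
    repl = ["".join(dict.fromkeys(v.replace(c, ""))) for c, v in zip(string, p)]
    # positions that have at least one real replacement
    usable = [i for i, r in enumerate(repl) if r]
    for k in range(max_changes + 1):
        # choose which k positions change ...
        for positions in combinations(usable, k):
            # ... and every way of filling exactly those positions
            for chars in product(*(repl[i] for i in positions)):
                word = list(string)
                for i, ch in zip(positions, chars):
                    word[i] = ch
                yield "".join(word)

string = "abcd012345"
p = ["ab02", "14ah", "94nf", "a", "9347592", "93478nvg", "b", "", "", "84n"]
fewer = list(variants(string, p))
print(len(fewer))
```

Because every chosen position really changes, each result is produced exactly once and nothing has to be generated and thrown away. For exactly 4 changes, iterate only over k = 4 instead of the whole range.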