Generate all possible 2-character combinations of a potential 8-character string?
I need to generate all possible combinations of a tuple of tuples
( (base1 , position1) , (base2 , position2) )
bases = ["U", "C", "A", "G"]
and positions = [0,1,2,3,4,5,6,7].
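Requirements:
- No duplicates
- The bases may be the same, but the positions must be different
- Order must be preserved.
For example:
( (A,1), (B,2) ) == ( (B,2) , (A,1) )
and
( (A,1), (B,1) )
should be discarded.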
Example output:
[ ( (U,0) , (U,1) ), ( (U,0) , (U,2) ), ( (U,0) , (U,3) ) ...]
The length should be 448.
Example:
For a string of length 2:
((U,0),(U,1))
((U,0),(C,1))
((U,0),(A,1))
((U,0),(G,1))
((C,0),(U,1))
((C,0),(C,1))
((C,0),(A,1))
((C,0),(G,1))
((A,0),(U,1))
((A,0),(C,1))
((A,0),(A,1))
((A,0),(G,1))
((G,0),(U,1))
((G,0),(C,1))
((G,0),(A,1))
((G,0),(G,1))
would be all the combinations... I think.
So far I have this:
all_possible = []
nucleotides = ["U","C","A","G"]
for i in range(len(nucleotides)):
    for j in range(8):
        all_possible.append(((nucleotides[i],j),(nucleotides[i],j)))
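It sounds like you want the Cartesian product of (every possible 2-base word) x (every 2-combination drawn from range(8)).
In general, you can get this with: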
from itertools import product, combinations

def build(num_chars, length):
    bases = ["U", "C", "A", "G"]
    for letters in product(bases, repeat=num_chars):
        for positions in combinations(range(length), num_chars):
            yield list(zip(letters, positions))
This gives us
In [4]: output = list(build(2, 8))
In [5]: len(output)
Out[5]: 448
In [6]: output[:4]
Out[6]:
[[('U', 0), ('U', 1)],
[('U', 0), ('U', 2)],
[('U', 0), ('U', 3)],
[('U', 0), ('U', 4)]]
In [7]: output[-4:]
Out[7]:
[[('G', 4), ('G', 7)],
[('G', 5), ('G', 6)],
[('G', 5), ('G', 7)],
[('G', 6), ('G', 7)]]
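If you want tuples of tuples, as in the question, rather than lists, a small variant along these lines should work. This is only a sketch: the name build_tuples and the BASES constant are my own for illustration, and math.comb assumes Python 3.8 or newer. The assertions check the count (4**2 base words times C(8, 2) position pairs) and the requirements listed in the question.

```python
from itertools import combinations, product
from math import comb

# Hypothetical constant name; the values are the bases from the question.
BASES = ("U", "C", "A", "G")

def build_tuples(num_chars, length, bases=BASES):
    """Yield tuples of (base, position) pairs with strictly increasing positions."""
    for letters in product(bases, repeat=num_chars):
        # combinations() yields each position set once, in ascending order,
        # so mirrored pairs and equal positions never appear.
        for positions in combinations(range(length), num_chars):
            yield tuple(zip(letters, positions))

pairs = list(build_tuples(2, 8))

# 16 two-base words times 28 position pairs.
assert len(pairs) == len(BASES) ** 2 * comb(8, 2) == 448
# No duplicates (tuples are hashable, so a set works here).
assert len(set(pairs)) == len(pairs)
# Positions within each pair differ and are in ascending order.
assert all(p1 < p2 for (_, p1), (_, p2) in pairs)
# The length-2 case from the question has 16 entries.
assert len(list(build_tuples(2, 2))) == 16
```

Because the result is hashable, you can also drop it straight into a set or use the pairs as dict keys if that is useful later.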