Why is my program not following the conditionals that I set? Function issues?
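I am trying to generate a random barcode_list containing 6 UNIQUE barcodes that have a Hamming distance of 3 from one another. The problem is that the program produces a barcode list that contains duplicates and does not have the correct Hamming distance. The code is below.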
import random

nucl_list = ['A', 'C', 'G', 'T']
length = 6
number = 6
attempts = 1000
barcode_list = []
tested = []

def make_barcode():
    """Generates a random barcode from nucl_list"""
    barcode = ''
    for i in range(length):
        barcode += random.choice(nucl_list)
    return barcode

def distance(s1, s2):
    """Calculates the hamming distance between s1 and s2"""
    length1 = len(s1)
    length2 = len(s2)
    # Initiate 2-D array
    distances = [[0 for i in range(length2 + 1)] for j in range(length1 + 1)]
    # Add in null values for the x rows and y columns
    for i in range(0, length1 + 1):
        distances[i][0] = i
    for j in range(0, length2 + 1):
        distances[0][j] = j
    for i in range(1, length1 + 1):
        for j in range(1, length2 + 1):
            cost = 0
            if s1[i - 1] != s2[j - 1]:
                cost = 1
            distances[i][j] = min(distances[i - 1][j - 1] + cost, distances[i][j - 1] + 1, distances[i - 1][j] + 1)
    min_distance = distances[length1][length2]
    for i in range(0, length1 + 1):
        min_distance = min(min_distance, distances[i][length2])
    for j in range(0, length2 + 1):
        min_distance = min(min_distance, distances[length1][j])
    return min_distance

def compare_barcodes():
    """Generates a new barcode and compares with barcodes in barcode_list"""
    new_barcode = make_barcode()
    # keep track of # of barcodes tested
    tested.append(new_barcode)
    if new_barcode not in barcode_list:
        for barcode in barcode_list:
            dist = distance(barcode, new_barcode)
            if dist >= 3:
                barcode_list.append(new_barcode)
            else:
                pass
    else:
        pass

# make first barcode
first_barc = ''
for i in xrange(length):
    first_barc += random.choice(nucl_list)
barcode_list.append(first_barc)

while len(tested) < attempts:
    if len(barcode_list) < number:
        compare_barcodes()
    else:
        break

barcode_list.sort()
print barcode_list
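I think my problem is with the final while loop: I want compare_barcodes to keep generating barcodes that fit the criteria (not a duplicate, and not within the Hamming distance of any barcode already generated).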
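Try something like this in your compare_barcodes().
Essentially, we keep track of whether dist >= 3 with too_far. Once we have finished looping over barcode_list, we go back and check too_far. If it is not too_far, then we can append to the list.
The old logic appended to barcode_list every time it found dist >= 3, which of course could happen more than once, depending on how many barcodes had already been added to the list.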
def compare_barcodes():
    """Generates a new barcode and compares with barcodes in barcode_list"""
    too_far = False
    new_barcode = make_barcode()
    # keep track of # of barcodes tested
    tested.append(new_barcode)
    if new_barcode not in barcode_list:
        for barcode in barcode_list:
            dist = distance(barcode, new_barcode)
            if dist >= 3:
                too_far = True
        if not too_far:
            barcode_list.append(new_barcode)
Edit: I just realized you want a Hamming distance of 3 or greater... in that case just change if not too_far to if too_far.
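One caveat with a flag that is set when any single comparison passes: a candidate that is far from one barcode but close to another would still get through. Here is a minimal sketch with the test inverted, so the flag records whether any existing barcode is too close. The names `hamming`, `is_far_enough` and `min_dist` are made up for illustration, `hamming` assumes equal-length strings, and the code is written for Python 3.

```python
def hamming(s1, s2):
    """Count positions where two equal-length strings differ."""
    return sum(a != b for a, b in zip(s1, s2))


def is_far_enough(new_barcode, barcode_list, min_dist=3):
    """Return True only if new_barcode is at least min_dist from every barcode."""
    too_close = False
    for barcode in barcode_list:
        if hamming(barcode, new_barcode) < min_dist:
            too_close = True
            break  # one close neighbour is enough to reject the candidate
    return not too_close


existing = ["AAAAAA", "CCCCCC"]
print(is_far_enough("AAAGGG", existing))  # 3 away from AAAAAA, 6 from CCCCCC
print(is_far_enough("AAAAAG", existing))  # only 1 away from AAAAAA
```

A duplicate has distance 0 from its twin, so the same check rejects it without a separate membership test.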
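The problem is in your compare_barcodes() function. In the old version, as soon as it saw a barcode that was 3 steps away from any one of the strings it was compared against, it added the new string to the list. The code can be modified as follows: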
def compare_barcodes():
    """Generates a new barcode and compares with barcodes in barcode_list"""
    minDist = length
    new_barcode = make_barcode()
    # keep track of # of barcodes tested
    tested.append(new_barcode)
    if new_barcode not in barcode_list:
        for barcode in barcode_list:
            dist = distance(barcode, new_barcode)
            #if dist >= 3:
            #    barcode_list.append(new_barcode)
            #else:
            #    pass
            if dist < minDist:
                minDist = dist
            else:
                pass
        if minDist >= 3:
            barcode_list.append(new_barcode)
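To see the difference on fixed inputs, here is a small reproduction that runs the original append-inside-the-loop logic next to a minimum-distance version. It is only a sketch for Python 3: `mismatches`, `old_compare`, `min_dist_compare` and `threshold` are illustrative names, the candidate is passed in rather than generated randomly, and a plain position-by-position count stands in for the question's distance() function.

```python
def mismatches(s1, s2):
    """Count differing positions (assumes equal-length strings)."""
    return sum(a != b for a, b in zip(s1, s2))


def old_compare(new_barcode, barcode_list):
    """Original logic: append each time a single comparison passes."""
    if new_barcode not in barcode_list:
        for barcode in barcode_list:
            if mismatches(barcode, new_barcode) >= 3:
                barcode_list.append(new_barcode)


def min_dist_compare(new_barcode, barcode_list, threshold=3):
    """Append once, and only if the closest existing barcode is far enough."""
    if new_barcode not in barcode_list:
        closest = min(
            (mismatches(b, new_barcode) for b in barcode_list),
            default=len(new_barcode),  # empty list: nothing to conflict with
        )
        if closest >= threshold:
            barcode_list.append(new_barcode)


old = ["AAAAAA", "CCCCCC"]
old_compare("GGGGGG", old)  # far from both entries, so it is appended twice
old_compare("AAAAAC", old)  # 1 away from AAAAAA, yet still appended
print(old)

fixed = ["AAAAAA", "CCCCCC"]
min_dist_compare("GGGGGG", fixed)  # accepted once
min_dist_compare("AAAAAC", fixed)  # rejected: too close to AAAAAA
print(fixed)
```

The first list ends up with repeated entries and a barcode that violates the distance rule, which matches the symptoms described in the question.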
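@Jkdc's answer is correct, +1 to him. You were almost there in your original code. My suggestion is to move your if new_barcode not in barcode_list: condition into your for loop, making it if new_barcode not in barcode_list and distance(barcode, new_barcode), so that you will not add any duplicates to the list, and the distance is only computed when new_barcode is not in your barcode_list: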
def compare_barcodes():
    """Generates a new barcode and compares with barcodes in barcode_list"""
    new_barcode = make_barcode()
    # keep track of # of barcodes tested
    tested.append(new_barcode)
    for barcode in barcode_list:
        if new_barcode not in barcode_list and distance(barcode, new_barcode):
            barcode_list.append(new_barcode)
Another suggestion: if you want to avoid duplicates, you can store your barcodes in a set. A set is an unordered collection of unique elements.
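A quick illustration of the set idea, using made-up barcode strings: adding a value that is already present leaves the set unchanged, and sorted() gives a stable order when you need to print the result.

```python
barcodes = set()
for candidate in ["ACGTAC", "TTGACA", "ACGTAC"]:
    barcodes.add(candidate)  # adding an existing element is a no-op

print(len(barcodes))     # the repeated "ACGTAC" is stored only once
print(sorted(barcodes))  # sets are unordered, so sort before displaying
```

Note that a set only rules out exact duplicates. The minimum-distance check between different barcodes still has to be done separately.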
I ended up creating a new function to calculate the Hamming distance...
def compare_distances(new_barcode):
    """Compares the hamming_dist between new barcode and old barcodes"""
    # Count number of distances < 3
    count = 0
    global barcode_list
    for barcode in barcode_list:
        if distance(new_barcode, barcode) < 3:
            count += 1
    return count

def compare_barcodes():
    new_barcode = make_barcode()
    if new_barcode not in barcode_list:
        count = compare_distances(new_barcode)
        if count > 0:
            pass
        else:
            barcode_list.append(new_barcode)
    else:
        pass

# Initiate the functions to generate barcodes
while len(barcode_list) < number:
    compare_barcodes()
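One thing worth checking: the distance() function in the question fills an edit-distance style table (insertions and deletions are allowed), so for shifted pairs it can return a smaller value than the position-by-position Hamming distance. If Hamming distance is what you need, the whole generator could be sketched roughly as below. This is only an illustration for Python 3, not the code I actually ran: `hamming`, `generate_barcodes`, `NUCLEOTIDES` and the `rng` parameter are names I made up here, and the `attempts` cap is kept from the question so the loop cannot run forever.

```python
import random

NUCLEOTIDES = "ACGT"


def hamming(s1, s2):
    """Number of positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(s1, s2))


def generate_barcodes(number=6, length=6, min_dist=3, attempts=1000, rng=random):
    """Collect up to `number` barcodes that are pairwise >= min_dist apart."""
    barcodes = []
    for _ in range(attempts):
        if len(barcodes) == number:
            break
        candidate = "".join(rng.choice(NUCLEOTIDES) for _ in range(length))
        # A duplicate has distance 0, so this test also rejects repeats.
        if all(hamming(candidate, b) >= min_dist for b in barcodes):
            barcodes.append(candidate)
    return sorted(barcodes)


print(hamming("ACGTAC", "CGTACA"))  # 6: every position differs
print(generate_barcodes())
```

If the attempts run out first, the function returns fewer than `number` barcodes, so check the length of the result before using it.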