Why does this string not change to uppercase?
So I have a file of amino acids that I'm trying to read in:
mdvfmkglskakegvvaaaektkqgvaeaagktkegvlyvgsktkegvvhgvatvaektk
eqvtnvggavvtgvtavaqktvegagsiaaatgfvkkdqlgkneegapqegiledmpvdp
dneayempseegyqdyepea
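I have a list of uppercase letters called aminoacids. The problem is that I can't read the sequence because its letters are lowercase. I've been trying to convert it to uppercase. Reading the file isn't a problem, and I think I've managed to turn its contents into a string (but maybe I haven't?).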
aminoacids = ['A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y']
content1 = fh.readline() #first line, which is not the sequence
#print content1
charline1 = len(content1)-1 #number of characters in the first line
#print charline1
contentall = fh.readlines() #each line is converted into a string and put into a list
#print contentall
numlines = len(contentall) #number of elements in list = number of lines, not the first one
#print numlines
contentjoined = ''.join(contentall) #list elements are combined, but this includes new lines as characters
contentjoined = contentjoined.translate(None, "\n")
contentjoined = contentjoined.translate(None,''.join([i for i in contentjoined if i not in aminoacids]))
contentjoined = contentjoined.upper()
print contentjoined
numaa = len(contentjoined)
print numaa #this shouldn't be zero but it is
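Why is this not working, and what can I do to fix it? I'm stuck: this wasn't a problem before, but now it is. numaa is 0 when it shouldn't be. I realize I could add lowercase letters to my list, but there should be a more "pythonic" way to approach this.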
Is it because you uppercase the string after checking it against aminoacids? Try moving contentjoined = contentjoined.upper() up a line or two.
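When you check against aminoacids, the string is still entirely lowercase, so none of its characters match the uppercase list. Every character therefore ends up in the delete set you pass to str.translate, and the result is empty. It ends up looking like this: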
>>> c = contentjoined.translate(None,''.join([i for i in contentjoined if i not in aminoacids]))
>>> c
''
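To see why the result is empty, it can help to look at the delete set the list comprehension builds. The following is a minimal sketch in Python 3 syntax (the question uses Python 2); `seq` is a shortened, hypothetical stand-in for `contentjoined`, and the deletion is emulated with a comprehension rather than `str.translate`.

```python
aminoacids = ['A', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'K', 'L',
              'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'V', 'W', 'Y']

# Shortened stand-in for contentjoined (hypothetical sample, all lowercase).
seq = "mdvfmkglskakegvv"

# The characters the original code asks translate() to delete:
# every character that is not in the uppercase list.
to_delete = ''.join(ch for ch in seq if ch not in aminoacids)

# No lowercase letter appears in aminoacids, so the delete set is the
# whole sequence and nothing survives the deletion.
remaining = ''.join(ch for ch in seq if ch not in to_delete)

print(to_delete == seq)  # True
print(repr(remaining))   # ''
```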
If you call upper first, you're comparing an uppercase string against a list of uppercase strings, so you actually get matches. It looks like this:
>>> contentjoined = contentjoined.upper()
>>> c = contentjoined.translate(None,''.join([i for i in contentjoined if i not in aminoacids]))
>>> c
'MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEA'
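If you want to keep the string lowercase, uppercase each character only for the comparison and leave the string itself untouched. It looks like this:
>>> c = contentjoined.translate(None,''.join([i for i in contentjoined if i.upper() not in aminoacids]))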
>>> c
'mdvfmkglskakegvvaaaektkqgvaeaagktkegvlyvgsktkegvvhgvatvaektkeqvtnvggavvtgvtavaqktvegagsiaaatgfvkkdqlgkneegapqegiledmpvdpdneayempseegyqdyepea'
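As a possibly more "pythonic" variant of the same idea, the filter can keep the wanted characters directly instead of building a delete set. This is only a sketch in Python 3 syntax; the names `AMINO_ACIDS` and `clean_sequence` and the short sample string are assumptions made for illustration, not part of the original code.

```python
# The same 20 one-letter codes as the aminoacids list, stored as a set
# so that membership tests are cheap.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw):
    """Uppercase raw, then keep only the one-letter amino acid codes."""
    return ''.join(ch for ch in raw.upper() if ch in AMINO_ACIDS)


# Hypothetical sample: two lowercase lines, newlines included.
sample = "mdvfmkglsk\nakegvv\n"
cleaned = clean_sequence(sample)

print(cleaned)       # MDVFMKGLSKAKEGVV
print(len(cleaned))  # 16
```

Because newlines are not in the set, they are dropped by the same filter, so a separate step to strip "\n" is not needed in this version.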
The problem is in your translate() calls:
contentjoined = contentjoined.translate(None, "\n")
contentjoined = contentjoined.translate(None,''.join([i for i in contentjoined if i not in aminoacids]))
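Here, passing None as the first argument means no translation table is applied, and every character listed in the second argument is deleted from the string (though I'm not sure what data contentjoined or aminoacids contain).
For example, if you try: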
>>> temp = "this is a test string"
>>> temp.translate(None, "aeiou")
'ths s  tst strng'
So I guess your whole string is being deleted, which leaves you with an empty string.
See the translate() docs.
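The two-argument form translate(None, chars) exists only on Python 2 byte strings. For readers on Python 3, a rough equivalent uses str.maketrans, whose third argument lists the characters to delete. The sketch below also shows that deleting every character yields an empty string rather than None; the variable names are illustrative.

```python
temp = "this is a test string"

# Python 3: the third argument of str.maketrans lists characters to delete.
delete_vowels = str.maketrans('', '', 'aeiou')
result = temp.translate(delete_vowels)
print(repr(result))   # 'ths s  tst strng'

# Deleting every character that occurs in the string leaves an empty
# string, not None.
delete_all = str.maketrans('', '', temp)
emptied = temp.translate(delete_all)
print(repr(emptied))  # ''
```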
You could convert everything to uppercase as you pull in the file. Maybe something like this?
with open('myfile.txt', 'r') as f:
    data = f.read().upper()
print(data)
'MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTK\nEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDP\nDNEAYEMPSEEGYQDYEPEA\n'
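Putting this together with the question's requirement that the first line is not part of the sequence, one way to sketch the whole read step in Python 3 is shown below. The function name `read_sequence`, the temporary demo file, and its header text are all hypothetical; the real file's first line is not shown in the question.

```python
import os
import tempfile

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def read_sequence(path):
    """Skip the first line, then return the uppercased amino acid sequence."""
    with open(path) as fh:
        fh.readline()             # first line, which is not the sequence
        body = fh.read().upper()  # uppercase before any filtering
    # Keep only amino acid letters; this also drops the newlines.
    return ''.join(ch for ch in body if ch in AMINO_ACIDS)


# Hypothetical demo file: one header line followed by a short sequence.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(">header line\nmdvfmkglsk\nakegvv\n")
    demo_path = tmp.name

sequence = read_sequence(demo_path)
os.remove(demo_path)

print(sequence)       # MDVFMKGLSKAKEGVV
print(len(sequence))  # 16
```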