Converting str to int in Python, only numbers w/o characters

Question

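I'm a beginner in Python and I can't find an answer to my problem. I have a file with some data, and I want to get the numbers out of it. My program looks like this: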

class Mojaklasa:
def przenumeruj_pdb(self):
    nazwa=raw_input('Podaj nazwe pliku: ')
    plik=open(nazwa).readlines()
    write=open('out.txt','w')
    for i in plik:
        j=i.split()
        if len(j)>5:
            if j[0] == "ATOM":
                    write.write(j[5])
                    write.write("\n")
    zapis.close()

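The 5th field in the file (j[5] in the code) holds numbers from -19 to 100, and that works fine. But sometimes the number in that field comes with a letter, e.g. 28A, and I want only 28. Converting to int does not work. How can I do this?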

Answer 1

str.translate will remove any letters (shown here in its Python 2 form):

s = "10A"
from string import ascii_letters
print(int(s.translate(None,ascii_letters)))
10
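
The two-argument call above only works on Python 2. A minimal sketch of the same idea for Python 3, where str.translate takes a single table built with str.maketrans (the helper name strip_letters is made up for illustration):

```python
from string import ascii_letters

# Python 3: the third argument of str.maketrans lists the characters to delete.
_DELETE_LETTERS = str.maketrans("", "", ascii_letters)

def strip_letters(text):
    """Return text with all ASCII letters removed (hypothetical helper)."""
    return text.translate(_DELETE_LETTERS)

print(int(strip_letters("10A")))    # 10
print(int(strip_letters("-100A")))  # -100
```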

Or use re:

import re
print(int(re.findall(r"-?\d+", s)[0]))  # index the list first, then convert
10

In [22]: s = "-100A"
In [23]: int(re.findall("\-?\d+",s)[0])
Out[23]: -100

In [24]: int(s.translate(None,ascii_letters))
Out[24]: -100
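
The two approaches differ when letters are not just a suffix: translate deletes every letter and joins the digits that remain, while the regex picks the first run of digits. A small sketch, in Python 3 syntax, that wraps the regex variant in a helper and returns None instead of raising when the field contains no number (first_int is a hypothetical name, not part of re):

```python
import re

_INT_RE = re.compile(r"-?\d+")  # optional minus sign, then one or more digits

def first_int(text):
    """Return the first integer found in text, or None if there is none."""
    match = _INT_RE.search(text)
    return int(match.group()) if match else None

print(first_int("28A"))    # 28
print(first_int("-100A"))  # -100
print(first_int("A1B2"))   # 1 (deleting the letters would leave "12" instead)
print(first_int("ABC"))    # None
```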

I would also change your code slightly:

from string import ascii_letters

class Mojaklasa:  # unless there are more methods I would not use a class
    def przenumeruj_pdb(self):
        nazwa = raw_input('Podaj nazwe pliku: ')
        with open(nazwa) as plik, open("out.txt", "w") as write:  # with will close your files
            for line in plik:  # iterate over the file object
                j = line.split()
                if len(j) > 5 and j[0] == "ATOM":  # same as the nested ifs
                    write.write("{}\n".format(j[5].translate(None, ascii_letters)))
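
Following the comment about not needing a class, the same loop can be sketched as a plain function. This is a rewrite under assumptions rather than the answerer's code: it targets Python 3 (hence str.maketrans), and write_atom_numbers, src and dst are illustrative names. Taking file-like objects instead of file names makes it easy to try without touching the disk:

```python
import io
from string import ascii_letters

_DELETE_LETTERS = str.maketrans("", "", ascii_letters)

def write_atom_numbers(src, dst):
    """Write field 5 of every ATOM line in src to dst, with letters removed."""
    for line in src:
        fields = line.split()
        if len(fields) > 5 and fields[0] == "ATOM":
            dst.write(fields[5].translate(_DELETE_LETTERS) + "\n")

# Made-up sample lines, only to exercise the function.
sample = io.StringIO(
    "ATOM 1 N MET A 28A 1.0 2.0 3.0\n"
    "HETATM 2 O HOH A 30 1.0 2.0 3.0\n"
    "ATOM 3 C GLY A -19 1.0 2.0 3.0\n"
)
out = io.StringIO()
write_atom_numbers(sample, out)
print(out.getvalue())  # prints 28 and -19 on separate lines
```

With real files, the call would sit inside a with block that opens both the input and the output.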

Answer 2

Here is an overall improved approach:

import re

class Mojaklasa:
    def przenumeruj_pdb(self):
        nazwa=raw_input('Podaj nazwe pliku: ')
        with open(nazwa) as plik, open('out.txt','w') as zapis:
            for i in plik:
                j = i.split()
                if len(j) <= 5: continue
                if j[0] == "ATOM":
                    mo = re.match(r'-?\d+', j[5])  # allow a minus sign: the field goes down to -19
                    if mo is None: continue
                    zapis.write(mo.group() + '\n')

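I have not improved your choice of identifiers (apart from the confusion between write and zapis), but the improvements include (a) useful indentation, (b) using with to open the files (so they are closed automatically), (c) extracting just the leading digits via re (the one most relevant to your Q), (d) using if/continue lines rather than indentation (as in "flat is better than nested"). And a few more I may have forgotten :-)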

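I would also suggest renaming i to line and j to fields (or the equivalent words in your chosen language), as i and j "sound" a lot like integer loop counters and the like, which is very confusing :-).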
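
A sketch of how the loop might read with the suggested renaming and the if/continue guards, written for Python 3 as a generator so it can be tried on any iterable of lines. The function name iter_atom_numbers and the sample data are invented for illustration; the pattern here accepts a leading minus sign, since the question mentions values down to -19:

```python
import re

def iter_atom_numbers(lines):
    """Yield the leading integer of field 5 for each ATOM line."""
    for line in lines:
        fields = line.split()
        if len(fields) <= 5:
            continue  # too short to have a field 5
        if fields[0] != "ATOM":
            continue  # not an ATOM record
        found = re.match(r"-?\d+", fields[5])
        if found is None:
            continue  # field 5 does not start with a number
        yield int(found.group())

sample = [
    "ATOM 1 N MET A 28A 1.0 2.0 3.0",
    "ATOM 2 C MET A -19 1.0 2.0 3.0",
    "ATOM 3 C MET A XX 1.0 2.0 3.0",
    "REMARK short line",
]
print(list(iter_atom_numbers(sample)))  # [28, -19]
```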

Answer 3

You can replace a lot of the logic with a well-formed regular expression.

import re

# plik and write are the open input and output files from the question
for i in plik:
    m = re.match(r'ATOM\s+.*?\s+.*?\s+.*?\s+.*?\s+(-?\d+)', i)
    if m:
        write.write(m.group(1) + '\n')
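
One caveat with the pattern above: `.*?` can also match whitespace, so when field 5 does not start with a number the engine may stretch across a field boundary and capture a number from a later field. A possible tightening, assuming whitespace-separated fields as in the question, is to match each skipped field with \S+ and compile the pattern once. ATOM_FIELD_RE and atom_number are illustrative names:

```python
import re

# "ATOM", four skipped fields, then an integer at the start of field 5.
ATOM_FIELD_RE = re.compile(r"ATOM(?:\s+\S+){4}\s+(-?\d+)")

def atom_number(line):
    """Return the integer at the start of field 5 of an ATOM line, else None."""
    m = ATOM_FIELD_RE.match(line)
    return int(m.group(1)) if m else None

print(atom_number("ATOM 1 N MET A 28A 1.0"))  # 28
print(atom_number("ATOM 1 N MET A -19 1.0"))  # -19
print(atom_number("ATOM 1 N MET A XX 5"))     # None (the looser pattern would capture 5)
```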

python

type-conversion