Extract positions from a list (Python)

Question

I have an .xyz file for H2S. If I read the file like this:

with open('H2S.xyz','r') as stream:
    for line in stream:
        print(line)

I get:

3

XYZ file of the hydrogen sulphide molecule

S                  0.00000000    0.00000000    0.10224900

H                  0.00000000    0.96805900   -0.81799200

H                  0.00000000   -0.96805900   -0.81799200

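The first line gives the number of atoms, and the last 3 lines give the coordinates of those atoms.

I am supposed to write some code that extracts the position of each atom in the molecule as a list, where each element is another list holding the coordinates of one atom.

If I do this:

with open('H2S.xyz','r') as stream:
    new=list(stream)
new

I get each line as one element of the list, and if I do this:

with open('H2S.xyz','r') as stream:
    new_list=[]
    for line in stream:
        new_list=new_list+line.split()
new_list

I get every token as a separate element: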

['3','XYZ','file','of','the','hydrogen','sulphide','molecule','S',
'0.00000000','0.00000000','0.10224900','H','0.00000000','0.96805900',
'-0.81799200','H','0.00000000','-0.96805900','-0.81799200']

That is not what I want. The list I want looks like this:

[['0.00000000','0.00000000','0.10224900'],
['0.00000000','0.96805900','-0.81799200'],
['0.00000000','-0.96805900','-0.81799200']]

But I am not sure how to write the code for this.

Answer 1

This function should give you the correct output.

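def parse_xyz(file_name):

    output = []
    with open(file_name) as infile:
        data = infile.readlines()
        for row in data[2:]:  # Skip the atom-count line and the comment line
            fields = row.split()  # Split on whitespace: symbol, x, y, z
            if fields[1:]:  # Ignore blank lines
                output.append(fields[1:])  # Throw away the first column
    return output


result = parse_xyz('H2S.xyz')
print(result)

A few notes about it:

First, I wrapped the code in a function. This is usually preferable, because it means you can repeat the process with a different file, e.g. result = parse_xyz('h2o.xyz')
for row in data[2:]: uses list slicing, so the two header lines (the atom count and the comment) never contribute to the result.
The slice notation is used again inside the for loop, on the split row, to throw away the first column (the element symbol) of each line that is recorded. Slicing the split fields rather than the raw string also works for rows with leading whitespace or two-letter element symbols.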
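The question mentions that the first line of the file holds the atom count. As a rough sketch, that count can be used to pick out exactly the atom lines instead of taking everything after the header. The function name parse_xyz_text and the choice to work on a string rather than a file are assumptions made here so the example runs without a file on disk.

```python
def parse_xyz_text(text):
    """Return the coordinates (as strings) of each atom in XYZ-formatted text."""
    lines = text.splitlines()
    n_atoms = int(lines[0].split()[0])   # line 1: number of atoms
    atom_lines = lines[2:2 + n_atoms]    # line 2 is the comment; atoms follow
    # split() copes with any amount of whitespace; [1:4] keeps x, y, z only
    return [line.split()[1:4] for line in atom_lines]


sample = """3
XYZ file of the hydrogen sulphide molecule
S                  0.00000000    0.00000000    0.10224900
H                  0.00000000    0.96805900   -0.81799200
H                  0.00000000   -0.96805900   -0.81799200
"""

positions = parse_xyz_text(sample)
print(positions)
```

To use it with a real file, read the file first, e.g. parse_xyz_text(open('H2S.xyz').read()).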

Answer 2

I would do it like this:

import re
with open("file.txt", "r") as f:
    # Split every line on whitespace; in this file only the atom lines have exactly 4 fields
    rows = [re.split(r"\s+", x.strip()) for x in f]
    print([row for row in rows if len(row) == 4])

[['S', '0.00000000', '0.00000000', '0.10224900'], ['H', '0.00000000', '0.96805900', '-0.81799200'], ['H', '0.00000000', '-0.96805900', '-0.81799200']]
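Counting fields would also accept a comment line that happens to contain four words, and the result above still includes the element symbol. A stricter variant, sketched here as one possibility, matches only lines made of a symbol followed by three numbers and captures just the numbers. The names NUM, ATOM_LINE and atom_positions are hypothetical, and the number pattern is an assumption about how coordinates are written.

```python
import re

# Assumed number format: optional sign, digits with optional decimal part,
# optional exponent (e.g. -0.81799200 or 1.0e-3).
NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"

# An element symbol followed by three numeric fields; only the numbers are captured.
ATOM_LINE = re.compile(rf"^\s*[A-Za-z]+\s+({NUM})\s+({NUM})\s+({NUM})\s*$")


def atom_positions(lines):
    """Return [x, y, z] string lists for every line that looks like an atom record."""
    positions = []
    for line in lines:
        match = ATOM_LINE.match(line)
        if match:
            positions.append(list(match.groups()))
    return positions


sample_lines = [
    "3\n",
    "XYZ file of the hydrogen sulphide molecule\n",
    "S                  0.00000000    0.00000000    0.10224900\n",
    "H                  0.00000000    0.96805900   -0.81799200\n",
    "H                  0.00000000   -0.96805900   -0.81799200\n",
]

print(atom_positions(sample_lines))
```

Because an open file object yields lines when iterated, atom_positions(f) should work the same way inside a with open(...) as f: block.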

Answer 3

Read all lines of the .xyz file, split each line into the element and its position, and append the position to a list.

H2S.xyz

    3
XYZ file of the hydrogen sulphide molecule
    S       0.00000000      0.00000000      0.10224900
    H       0.00000000      0.96805900     -0.81799200
    H       0.00000000     -0.96805900     -0.81799200

Code

with open('H2S.xyz') as data:
    lines=data.readlines()                  # read all lines
    new_list = []
    for atom in lines[2:]:                  # start from third line
        position = atom.split()             # get the values
        new_list.append(position[1:])       # append only the positions

print(new_list)

Your list

[['0.00000000', '0.00000000', '0.10224900'],
['0.00000000', '0.96805900', '-0.81799200'],
['0.00000000', '-0.96805900', '-0.81799200']]
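The list above holds the coordinates as strings, which is what the question asked for. If numbers are needed for later calculations, a small conversion step like the following could be added. This is an optional extra rather than part of the answer, and the name to_floats is hypothetical.

```python
def to_floats(positions):
    """Convert a list of [x, y, z] string lists into lists of floats."""
    return [[float(value) for value in atom] for atom in positions]


string_positions = [
    ['0.00000000', '0.00000000', '0.10224900'],
    ['0.00000000', '0.96805900', '-0.81799200'],
    ['0.00000000', '-0.96805900', '-0.81799200'],
]

numeric_positions = to_floats(string_positions)
print(numeric_positions)
```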
