Why am I getting an empty list when I use split()?

Question

I have a text file:

-- Generated ]
FILEUNIT
  METRIC /

Hello
-- timestep: Jan 01,2017 00:00:00
  3*2 344.0392 343.4564 343.7741
  343.9302 343.3884 343.7685 0.0000 341.0843
  342.2441 342.5899 343.0728 343.4850 342.8882
  342.0056 342.0564 341.9619 341.8840 342.0447 /

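I wrote some code that reads the file, drops words, characters, and blank lines, does some further processing, and finally returns the numbers in the last four lines. I can't figure out how to get all the numbers from the text file into a list properly. Right now new_line produces a string that contains the numbers: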

import string

def expand(chunk):
    l = chunk.split("*")
    chunk = [str(float(l[1]))] * int(l[0])

    return chunk

with open('old_textfile.txt', 'r') as infile1:
    for line in infile1:
        if set(string.ascii_letters.replace("e","")) & set(line):
            continue

        chunks = line.split(" ")
        #Get rid of newlines
        chunks = list(map(lambda chunk: chunk.strip(), chunks))
        if "/" in chunks:
            chunks.remove("/")

        new_chunks = []
        for i in range(len(chunks)):
            if '*' in chunks[i]:
                new_chunks += expand(chunks[i])
            else:
                new_chunks.append(chunks[i])
        new_chunks[len(new_chunks)-1] = new_chunks[len(new_chunks)-1]+"\n"
        new_line = " ".join(new_chunks)

When I use

A = new_line.split()
B = list(map(float, A))

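it returns an empty list. Do you know how I can put all of these numbers into a single list? At the moment I write new_line out to a text file and read it back in, but that increases my run time, which is not good.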

f = open('new_textfile.txt').read()
A = f.split()
B = list(map(float, A))
list_1.extend(B)

There is also a solution that uses a regex, but it drops the 3*2. I want that to be processed as 2 2 2.

import re

with open('old_textfile.txt', 'r') as infile1:
    lines = infile1.read()

nums = re.findall(r'\d+\.\d+', lines)
print(nums)

Answer 1

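I'm not entirely sure I fully understand what you are trying to do, but as I understand it, you want to extract all the numbers, which are either in decimal form, \d+\.\d+, or an integer multiplied by another integer with an asterisk, \d+\*\d+. You want all the results in one list of floats, where the decimals go into the list directly and, for the integer pairs, the second integer is repeated as many times as the first one says.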

One way to do this is:

import re

lines = """
-- Generated ]
FILEUNIT
  METRIC /

Hello
-- timestep: Jan 01,2017 00:00:00
  3*2 344.0392 343.4564 343.7741
  343.9302 343.3884 343.7685 0.0000 341.0843
  342.2441 342.5899 343.0728 343.4850 342.8882
  342.0056 342.0564 341.9619 341.8840 342.0447 /
"""

nums = []
for n in re.findall(r'(\d+\.\d+|\d+\*\d+)', lines):
    split_by_ast = n.split("*")
    if len(split_by_ast) == 1:
        nums += [float(split_by_ast[0])]
    else:
        nums += [float(split_by_ast[1])] * int(split_by_ast[0])

print(nums)

which returns:

[2.0, 2.0, 2.0, 344.0392, 343.4564, 343.7741, 343.9302, 343.3884, 343.7685, 0.0, 341.0843, 342.2441, 342.5899, 343.0728, 343.485, 342.8882, 342.0056, 342.0564, 341.9619, 341.884, 342.0447]

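The regex searches for numbers that match one of the two formats (decimal or int*int). A decimal is appended to the list directly; an int*int match is expanded into a smaller list that repeats the second int the first int times, and that list is then concatenated onto the result.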
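
As for the empty list itself, one likely cause (an assumption, since the question does not show where the split() call sits relative to the loop) is that new_line is reassigned on every iteration, so it only ever holds the most recently processed line. A blank line contains no letters, passes the filter, and ends up as "\n", and "\n".split() is []. The sketch below keeps the question's line filter but collects the floats inside the loop, so no intermediate file is needed. The names parse_numbers, LETTERS, and sample are hypothetical and introduced only for illustration.

```python
import string

# Same filter as in the question: any ASCII letter except lowercase "e"
# (kept so that values such as 1e-3 would survive) marks a non-data line.
LETTERS = set(string.ascii_letters.replace("e", ""))


def parse_numbers(lines):
    """Return all numbers found in an iterable of lines as one flat list of floats."""
    numbers = []
    for line in lines:
        if LETTERS & set(line):
            continue  # header or comment line
        # split() without arguments discards the leading spaces, the
        # trailing newline, and yields nothing at all for a blank line.
        for token in line.split():
            if token == "/":
                continue  # record terminator, not a number
            # "3*2" -> ("3", "*", "2"); "344.0392" -> ("", "", "344.0392")
            count, star, value = token.rpartition("*")
            repeat = int(count) if star else 1
            numbers.extend([float(value)] * repeat)
    return numbers


# Shortened sample in the same layout as the file in the question.
sample = """\
-- timestep: Jan 01,2017 00:00:00
  3*2 344.0392 343.4564

  342.0447 /
"""

print(parse_numbers(sample.splitlines()))
# [2.0, 2.0, 2.0, 344.0392, 343.4564, 342.0447]
```

A file object can be passed in directly, for example parse_numbers(infile1) inside the with block, because iterating over an open file yields its lines. Unlike the regex version, any token that is neither "/" nor a valid number raises ValueError here, which may or may not be what you want for your real files.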
