How can I parse text files with scientific notation (in tensor format) and convert the values to floats?

Question

I have multiple txt files in the following format:

[tensor([[1.7744e+02, 4.7730e+02, 1.2396e+02, 1.1678e+02, 5.9988e-01],
         [7.8410e+02, 1.7532e+02, 6.2769e+02, 2.1083e+02, 9.9969e-01],
         device='cuda:0')]

I want to remove tensor, [], (), and device='cuda:0', and convert the scientific notation to decimal to get this output:

177.44 477.30 123.96 116.78 0.59988
784.10 175.32 627.69 210.83 0.99969

This is my program:

import os

for i in os.listdir():
    if i.endswith(".txt"):
        with open(i, "r+") as f:
            content = f.readlines()

            f.truncate(0)
            f.seek(0)

            for line in content:
                if not line.startswith("[tensor(["):
                    f.write(line)
                elif not line.startswith('        '):
                    f.write(line)
                elif not line.startswith("device='"):
                    f.write(line)

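The word tensor is gone, but everything else is still there. How can I remove the other characters (as well as the whitespace at the start of each line)?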

Answer 1

Hi, you can use numpy.matrix, which can parse an array-shaped string, to build a matrix, and then convert it with numpy.array if you need an array rather than a matrix.
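As a small illustration of the string parsing mentioned above, here is a sketch with made-up values (not the data from the question). In the string form accepted by numpy.matrix, a semicolon separates rows, and commas or spaces separate columns.

```python
import numpy as np

# Illustrative values only: ';' separates rows, ',' or spaces separate columns.
m = np.matrix("1.5e+01, 2.0e+00; 3.0e-01, 4.0e+00")

# np.array returns a plain ndarray holding the same values.
a = np.array(m)

print(m.shape)           # (2, 2)
print(a[0, 0])           # 15.0
print(type(a).__name__)  # ndarray
```

Note that without a semicolon in the string, all values end up in a single row.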

# Data definition
data = """[tensor([[1.7744e+02, 4.7730e+02, 1.2396e+02, 1.1678e+02, 5.9988e-01],
         [7.8410e+02, 1.7532e+02, 6.2769e+02, 2.1083e+02, 9.9969e-01],
         device='cuda:0')]"""

# Cleaning step: remove "tensor", the device tag, and all whitespace
elementsToRemove = ['\n', ' ', '[tensor(', 'device=', "'cuda:0')"]

cleanData = data
for el in elementsToRemove:
    cleanData = cleanData.replace(el, '')

# np.matrix separates rows with ';', so turn each "],[" row break into ';'
# (without this step all values would end up in a single row)
cleanData = cleanData.replace('],[', ';')

# Convert to numeric using np.matrix
import numpy as np

numericData_matrix = np.matrix(cleanData)       # shape (2, 5)
numericData_array = np.array(numericData_matrix)
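If the goal is the plain-text output shown in the question rather than a NumPy object, a regular expression may be enough. The following is only a sketch using the standard library. The helper names (extract_rows, to_floats, to_plain_text, convert_file) are made up for illustration, and it assumes every number in the files is written in scientific notation, as in the sample. Decimal is used for printing so that the digits are kept exactly as written (477.30 rather than 477.3).

```python
import re
from decimal import Decimal
from pathlib import Path

# Matches numbers in scientific notation such as 1.7744e+02 or 5.9988e-01.
SCI_NUMBER = re.compile(r"[-+]?\d+(?:\.\d*)?[eE][-+]?\d+")


def extract_rows(text):
    """Return the scientific-notation tokens found on each line of text."""
    rows = []
    for line in text.splitlines():
        tokens = SCI_NUMBER.findall(line)
        if tokens:  # skips lines without numbers, e.g. "device='cuda:0')]"
            rows.append(tokens)
    return rows


def to_floats(rows):
    """Convert the string tokens to Python floats."""
    return [[float(tok) for tok in row] for row in rows]


def to_plain_text(rows):
    """Render each row as space-separated decimals without exponents."""
    return "\n".join(
        " ".join(format(Decimal(tok), "f") for tok in row) for row in rows
    )


def convert_file(path):
    """Overwrite one text file with its cleaned-up numbers (hypothetical helper)."""
    path = Path(path)
    rows = extract_rows(path.read_text())
    path.write_text(to_plain_text(rows) + "\n")


sample = """[tensor([[1.7744e+02, 4.7730e+02, 1.2396e+02, 1.1678e+02, 5.9988e-01],
         [7.8410e+02, 1.7532e+02, 6.2769e+02, 2.1083e+02, 9.9969e-01],
         device='cuda:0')]"""

rows = extract_rows(sample)
print(to_floats(rows)[0])  # [177.44, 477.3, 123.96, 116.78, 0.59988]
print(to_plain_text(rows))
# 177.44 477.30 123.96 116.78 0.59988
# 784.10 175.32 627.69 210.83 0.99969
```

To process every txt file in a folder, one option is to loop over Path(".").glob("*.txt") and call convert_file on each path. Since convert_file overwrites the files in place, it is probably worth trying it on copies first.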

Hope this solves your problem!

Tags: python, floating-point, scientific-notation, type-conversion, tensor