How do I read a text file line by line using Python when the formatting is specific?

Question

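How do I read a text file line by line using Python when the formatting is specific? My data is space-delimited and looks like this. The file itself contains no blank lines, and there is no "end" card: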

The_Ark.top                                 0   -37.89541   37.89541    0.00000 449.75055 
8 
0.00000     0.00000     29  -37.59748     0.04690   26  -37.89541   449.75055   26 
-0.19951   449.70273     26   -0.15660     4.48848   29  -34.20844     4.80188   26 
-33.71897   443.53000     26   -0.45357   443.32349   26    0.00000     0.00000    0
{possibly more lines ... to the end}

Line 1: file name, mirror flag, xMin, xMax, yMin, yMax

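Line 2: number of points in the file

Line 3: x(0), y(0), pen(0), x(1), y(1), pen(1), x(2), y(2), pen(2)

Line 4: same layout as line 3, and so on to the end of the file

Note: a line does not necessarily hold three x, y, pen triples. It can hold 1, 2, or 3.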

Here is what I have so far:

import sys
import os
import numpy as np

filepath = 'The_Ark.top'
with open(filepath) as file:
    data = file.readlines()

lineCount = len(data)

# parse first line
firstLine = data[0]
words = firstLine.split()
objectName = words[0]
mirrorCard = int(words[1])
if mirrorCard == 0:
    mirrorFlag = "True"
else:
    mirrorFlag = "False"
    
xMin = float(words[2])
xMax = float(words[3])
yMin = float(words[4])
yMax = float(words[5])

xCenter = (xMax - xMin)/2 + xMin
yCenter = (yMax - yMin)/2 + yMin

# parse second line
secondLine = data[1]
words = secondLine.split()
numPoints = int(words[0])

# parse remaining lines
.
.
.
# having trouble here...
.
.
.

    
print ("\nRead %d lines\n" % lineCount)

print ("File Name: " + objectName + " and mirror set to: " + mirrorFlag)
print ("xMin: %7.3f  xMax: %7.3f" % (xMin, xMax))
print ("yMin: %7.3f  yMax: %7.3f" % (yMin, yMax))
print ("x Center: %7.3f  y Center: %7.3f" % (xCenter, yCenter))

Answer 1

def regular_line_parse(data, line_number):
    line_index = line_number - 1
    scope_data = data[line_index]
    line_parts = scope_data.split()
    cluster_size = len(line_parts) // 3  # integer division: range() needs an int in Python 3

    X, Y, PEN = [], [], []
    for i in range(cluster_size):
        X.append(float(line_parts[3 * i]))
        Y.append(float(line_parts[3 * i  + 1]))
        PEN.append(float(line_parts[3 * i + 2]))
   
    return X, Y, PEN

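This function should help with the problem area you marked. It parses one line of the data, identified by its 1-based line number (in your case, any line number greater than 2), and returns the x, y, and pen values as three separate lists so you can store them however you need.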
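To cover every data line, one option is to call the function for line 3 through the last line and concatenate the results. The sketch below repeats the function, written with integer division so that it runs under Python 3, and feeds it the sample lines from the question as an in-memory list. `read_all_points` is a hypothetical helper name, not part of the original answer.

```python
def regular_line_parse(data, line_number):
    """Parse one 1-based line of x, y, pen triples into three lists."""
    line_parts = data[line_number - 1].split()
    cluster_size = len(line_parts) // 3  # number of complete triples

    X, Y, PEN = [], [], []
    for i in range(cluster_size):
        X.append(float(line_parts[3 * i]))
        Y.append(float(line_parts[3 * i + 1]))
        PEN.append(float(line_parts[3 * i + 2]))
    return X, Y, PEN


def read_all_points(data):
    """Hypothetical helper: gather triples from line 3 to the last line."""
    all_x, all_y, all_pen = [], [], []
    for line_number in range(3, len(data) + 1):
        X, Y, PEN = regular_line_parse(data, line_number)
        all_x.extend(X)
        all_y.extend(Y)
        all_pen.extend(PEN)
    return all_x, all_y, all_pen


# Sample lines copied from the question, standing in for file.readlines()
data = [
    "The_Ark.top 0 -37.89541 37.89541 0.00000 449.75055\n",
    "8\n",
    "0.00000 0.00000 29 -37.59748 0.04690 26 -37.89541 449.75055 26\n",
    "-0.19951 449.70273 26 -0.15660 4.48848 29 -34.20844 4.80188 26\n",
    "-33.71897 443.53000 26 -0.45357 443.32349 26 0.00000 0.00000 0\n",
]

xs, ys, pens = read_all_points(data)
print(len(xs), xs[1], pens[-1])
```

Because the number of triples is computed per line, lines holding one or two triples are handled the same way as lines holding three.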

Answer 2

You can store all the points from line 3 onward in a list of lists.

Just replace:

# parse remaining lines
.
.
.
# having trouble here...
.
.
.

with:

line = list()
points = list()

for i in range(2,len(data)):
    line.extend(data[i].split())

points = [line[x:x+3] for x in range(0, len(line),3)]

Or, if you want to store x, y, and pen in separate lists, you can do the following:

x = list()
y = list()
pen = list()

for i in range(2,len(data)):
    line = data[i].split()
    for j in range(len(line)):
        if j%3 == 0:
            x.append(line[j])
        elif j%3 == 1:
            y.append(line[j])
        else:
            pen.append(line[j])

Either way, the points are then easy to plot.
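Because `split()` returns text, the lists built above hold strings rather than numbers. A minimal sketch of converting while grouping follows, assuming x and y are floats and pen is an integer code, as the sample data suggests. `parse_points` is a hypothetical name, and the check that the token count is a multiple of 3 is an added assumption about the format.

```python
def parse_points(data):
    """Return (x, y, pen) lists built from lines 3 onward of `data`.

    Assumes every token after line 2 belongs to an x, y, pen triple.
    """
    tokens = []
    for raw_line in data[2:]:
        tokens.extend(raw_line.split())

    if len(tokens) % 3 != 0:
        raise ValueError("token count %d is not a multiple of 3" % len(tokens))

    # Every third token, starting at offsets 0, 1, and 2
    x = [float(t) for t in tokens[0::3]]
    y = [float(t) for t in tokens[1::3]]
    pen = [int(t) for t in tokens[2::3]]
    return x, y, pen


# Sample lines copied from the question, standing in for file.readlines()
data = [
    "The_Ark.top 0 -37.89541 37.89541 0.00000 449.75055\n",
    "8\n",
    "0.00000 0.00000 29 -37.59748 0.04690 26 -37.89541 449.75055 26\n",
    "-0.19951 449.70273 26 -0.15660 4.48848 29 -34.20844 4.80188 26\n",
    "-33.71897 443.53000 26 -0.45357 443.32349 26 0.00000 0.00000 0\n",
]

x, y, pen = parse_points(data)
print(list(zip(x, y, pen))[:2])
```

Flattening all tokens first means the number of triples per line does not matter, which fits the note that a line can hold 1, 2, or 3 of them.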
