Python: iteration problems

Question

I am doing some text processing on a text file, and I have been trying to iterate over it in a for loop.

fields = [1, 2, 3, 4, 5]

i = 0
with open('file path', 'r') as f:
    for line in f:
        # while i is smaller than the number of fields (=5)
        while i <= len(fields)-1:
            currentfield = fields[i]
            # if the first character of the line matches currentfield
            # (that being a number)
            if line[0] == currentfield:
                print(line[4:])  # print the value in the "third column"
            i += 1

The text file "f" looks something like this (the numbers between the dashes are years, and each year has its own "entry"):

-------------2000--------------
1        17824
2        20131125192004.9
3        690714s1969    dcu           000 0 eng
4    a       75601809 
4    a    DLC
4    b    eng
4    c    DLC
5    a    WA 750
-------------2001--------------
1        3224
2        20w125192004.9
3        690714s1969    dcu           000 0 eng
5    a    WA 120
-------------2002--------------
1        6563453
2        2013341524626245.9
3        484914s1969    dcu           000 0 eng
4    a       75601809 
4    a    eng
4    c    DLC
5    a    WA 345

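There are no real columns in the text file, but there are two tab spaces between the field number (i.e. 1, 2, 3, 4, 5) and the value that follows it (i.e. 17824). I just didn't know what else to call 17824.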

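What I want to do is iterate over all the fields of each entry/year, but the output only gives me the value of the first field, 1. Thus I get output like this: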

17824    
3224     
6563453

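It does not iterate over all the fields, only the first one. How can I fix my code so that the output is produced in a table-like form that also covers fields 2, 3, 4, and 5? Like this: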

17824    20131125192004.9    690714s1969    dcu           000 0 eng  ...and so on
3224     20w125192004.9      690714s1969    dcu           000 0 eng  ...and so on
6563453  2013341524626245.9  484914s1969    dcu           000 0 eng  ...and so on

Edit: I realize I wasn't clear, so I have added some parts.

Answer 1

This should help you:

for line in f:
    print('\nline[0] is %s' % line[0])
    for currentfield in fields:  # loop through all fields
        # convert currentfield to a string before comparing;
        # line[0] is the first character of the line
        if line[0] == str(currentfield):
            print('Printing field %d' % currentfield)  # debugging
            print(line[4:])  # print the value in the "third column"

This gives me:

line[0] is -

line[0] is 1
Printing field 1
    17824

line[0] is 2
Printing field 2
    20131125192004.9

line[0] is 3
Printing field 3
    690714s1969    dcu           000 0 eng

line[0] is 4
Printing field 4
a      75601809 

line[0] is 4
Printing field 4
a   DLC

line[0] is 4
Printing field 4
b   eng

line[0] is 4
Printing field 4
c   DLC

line[0] is 5
Printing field 5
a   WA 750

line[0] is -

line[0] is 1
Printing field 1
    3224

line[0] is 2
Printing field 2
    20w125192004.9

line[0] is 3
Printing field 3
    690714s1969    dcu           000 0 eng

line[0] is 5
Printing field 5
a   WA 120

line[0] is -

line[0] is 1
Printing field 1
    6563453

line[0] is 2
Printing field 2
    2013341524626245.9

line[0] is 3
Printing field 3
    484914s1969    dcu           000 0 eng

line[0] is 4
Printing field 4
a      75601809 

line[0] is 4
Printing field 4
a   eng

line[0] is 4
Printing field 4
c   DLC

line[0] is 5
Printing field 5
a   WA 345
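
A short illustration of why the snippet above converts each field with str() and loops over fields directly, rather than using the counter i from the question's code as posted. The sample lines are made up for the demonstration and assume tab-separated values.

```python
fields = [1, 2, 3, 4, 5]
lines = ["1\t\t17824\n", "2\t\t20131125192004.9\n"]  # made-up sample lines

# A str never compares equal to an int, so the unconverted test is always False.
print(lines[0][0] == fields[0])       # False
print(lines[0][0] == str(fields[0]))  # True

# A counter that is never reset is used up while processing the first line.
i = 0
checked_with_counter = []
for line in lines:
    while i <= len(fields) - 1:
        checked_with_counter.append((line[0], fields[i]))
        i += 1

# Iterating over fields directly starts again for every line.
checked_with_for = [(line[0], field) for line in lines for field in fields]

print(len(checked_with_counter))  # 5: only the first line was compared
print(len(checked_with_for))      # 10: both lines were compared
```

Resetting i to 0 at the top of the outer loop would also work, but the for loop avoids the bookkeeping altogether.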

By the way, changing line[4:] to line[8:] would give you the third column, based on the data you pasted above.

You can then use a regular expression to remove everything after the whitespace that follows the third-column data.
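
As a rough sketch of that regular-expression step, assuming the value of interest is the first whitespace-delimited token after the single-digit field number. The helper name first_value is made up for illustration, and the sample lines assume tab separators.

```python
import re


def first_value(line):
    """Return the first whitespace-delimited token after the field number."""
    # Drop the leading field digit, then trim surrounding whitespace
    # (including the trailing newline).
    rest = line[1:].strip()
    # Remove everything from the first remaining whitespace character onward.
    return re.sub(r"\s.*", "", rest)


print(first_value("1\t\t17824\n"))
print(first_value("3\t\t690714s1969    dcu           000 0 eng\n"))
```

For lines that carry a subfield letter (such as the field 4 and 5 lines), this returns the letter rather than the value, so those lines would need separate handling.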

Edit for your changed question

Here, I split each line on whitespace to get its columns as a list with l = line.split(). You can then index a column directly, e.g. l[1] for the second column, or join everything after the field number with ' '.join(l[1:]).

for line in f:
    l = line.split()  # split on any run of whitespace (spaces or tabs)
    if not l:
        continue  # skip blank lines
    print('\nline[0] is %s' % line[0])
    for currentfield in fields:  # loop through all fields
        # convert currentfield to a string before comparing with the first column
        if l[0] == str(currentfield):
            print('Printing field %d' % currentfield)  # debugging
            print(' '.join(l[1:]))  # print every column after the field number
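
To get closer to the table-like output the question asks for, one possible approach is to group the lines by entry first and print one row per entry afterwards. This is only a sketch under assumptions: a line starting with a dash begins a new entry, the field number is the first whitespace-delimited token, and runs of whitespace inside a value are collapsed to single spaces. The names SAMPLE, read_entries, and format_row are made up for illustration, and the sample is a shortened version of the data above.

```python
import io

# Shortened, tab-separated sample in the shape of the question's file.
SAMPLE = (
    "-------------2000--------------\n"
    "1\t\t17824\n"
    "2\t\t20131125192004.9\n"
    "5\ta\tWA 750\n"
    "-------------2001--------------\n"
    "1\t\t3224\n"
    "2\t\t20w125192004.9\n"
    "5\ta\tWA 120\n"
)


def read_entries(lines):
    """Group lines into one dict per entry, keyed by field number (a string)."""
    entries = []
    for line in lines:
        if line.startswith("-"):
            entries.append({})  # a dashed year line starts a new entry
            continue
        parts = line.split(None, 1)  # field number, then the rest of the line
        if len(parts) < 2 or not entries:
            continue  # skip blank or malformed lines
        number, rest = parts
        # Repeated fields (such as field 4) are collected in a list.
        entries[-1].setdefault(number, []).append(" ".join(rest.split()))
    return entries


def format_row(entry, fields):
    """Join the values of the requested fields into one tab-separated row."""
    return "\t".join("; ".join(entry.get(str(n), [])) for n in fields)


entries = read_entries(io.StringIO(SAMPLE))
for entry in entries:
    print(format_row(entry, [1, 2, 3, 4, 5]))
```

With a real file, the io.StringIO(SAMPLE) argument would be replaced by the open file object. Fields missing from an entry come out as empty cells, so the columns stay aligned across rows.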
