How to take multiple slices of a line from a file in Python?

Question

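Is there a way in Python to specify multiple slices so that only certain columns are read from a csv file?

For example, the data file could look like this: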

col1,col2,col3,...col20
1,1,1,....,1
2,2,2,....,2
3,3,3,....,3
etc

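Is there a single command that grabs the first 4 columns and the last 2? I tried the following two approaches, but I was just stabbing in the dark, so I did not expect them to work. One gives me a ValueError, the other a TypeError.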

for line in fileObj:
   date, name, time, data1, data2, data3 = line.rstrip().split(',')[0:4][18:20]  # got ValueError

for line in fileObj:
   date, name, time, data1, data2, data3 = line.rstrip().split(',')[0:4,18:20]   # got TypeError

If there is no easy way, could someone give me a hint in the right direction?

Answer 1

You're on the right track...

for line in fileObj:
   splitline = line.rstrip().split(',')
   date, name, time, data1 = splitline[0:4]
   data2, data3 = splitline[18:20]

Or, if you want to combine the 2 lines:

for line in fileObj:
   splitline = line.rstrip().split(',')
   date, name, time, data1, data2, data3 = splitline[0:4] + splitline[18:20]
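
For anyone wondering why the two attempts in the question fail, here is a small sketch run on a made-up 20-column line (the sample data is an assumption, not the asker's file). It shows the chained slice coming back empty, the tuple-of-slices index raising TypeError, and the concatenation giving the six values.

```python
# Hypothetical 20-column line standing in for one row of the file.
line = ",".join(str(n) for n in range(1, 21)) + "\n"
fields = line.rstrip().split(",")

# Chained slices: the second slice is applied to the 4-item result of the
# first, so it is empty, and unpacking six names from it raises ValueError.
chained = fields[0:4][18:20]

# A plain list cannot be indexed with a tuple of slices, hence the TypeError.
try:
    fields[0:4, 18:20]
    tuple_index_error = None
except TypeError as exc:
    tuple_index_error = exc

# Concatenating two separate slices yields the six wanted values.
date, name, time, data1, data2, data3 = fields[0:4] + fields[18:20]
```

If the column count can vary, fields[-2:] is probably the safer way to take the last two.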

Answer 2

Use the csv module:

import csv

with open(filename, newline='') as openfile:  # newline='' is what the csv docs recommend
    reader = csv.reader(openfile)
    for line in reader:
        date, name, time, data1 = line[:4]
        data2, data3 = line[-2:]

This unpacks the first four columns and the last two.
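
A self-contained sketch of the same idea, in case it helps to try it without a file on disk. The sample data is made up and io.StringIO stands in for the opened file; it also skips the header row, which I am assuming you would want when the first line holds column names.

```python
import csv
import io

# Made-up sample data; io.StringIO stands in for the opened file.
sample = io.StringIO(
    "c1,c2,c3,c4,c5,c6,c7,c8\n"
    "a,b,c,d,e,f,g,h\n"
    "i,j,k,l,m,n,o,p\n"
)

reader = csv.reader(sample)
header = next(reader)  # consume the header row before the data loop

rows = []
for line in reader:
    # line[-2:] counts from the end, so the total column count does not matter.
    rows.append(line[:4] + line[-2:])
```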

Answer 3

Pandas is arguably the best library for working with .csv files in Python. For example, with the file:

col1,col2,col3,col4,col5,col6
1,1,1,1,1,1
2,2,2,2,2,2
3,3,3,3,3,3

To get the first 4 columns and the last 2, all you need is:

import pandas as pd

df = pd.read_csv('csvtest.csv')
first_four_columns = df.iloc[:, :4]  # .ix was removed in pandas 1.0; .iloc selects by position
last_two_columns = df.iloc[:, -2:]

I really recommend taking a look at the pandas library: http://pandas.pydata.org/pandas-docs/stable/10min.html
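
If you want both groups of columns in a single DataFrame, something like the sketch below should work. It uses positional indexing with .iloc on a made-up eight-column sample (the data and the names wanted and combined are only for illustration), so that dropping the middle columns is visible.

```python
import io

import pandas as pd

# Made-up eight-column sample; io.StringIO stands in for the csv file.
csv_text = (
    "c1,c2,c3,c4,c5,c6,c7,c8\n"
    "1,2,3,4,5,6,7,8\n"
    "11,12,13,14,15,16,17,18\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# One positional selection: the first four columns plus the last two.
wanted = df.iloc[:, [0, 1, 2, 3, -2, -1]]

# Alternative: take the two slices separately and join them side by side.
combined = pd.concat([df.iloc[:, :4], df.iloc[:, -2:]], axis=1)
```

Both forms select the same columns here; the list form is shorter, while the concat form keeps the two slices readable when the ranges are long.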

python

slice