Transpose a CSV in Python with different row lengths

Question

I have many CSV files whose rows vary in length. For example, the following:

Time,0,8,18,46,132,163,224,238,267,303
X,0,14,14,14,15,16,17,15,15,15
Time,0,4,13,22,32,41,50,59,69,78,87,97,106,115,125,127,137,146,155,165,174,183,192,202,211,220,230,239,248,258,267,277,289,298,308
Y,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
Time,0,4,13,22,32,41,50,59,69,78,87,97,106,115,125,127,137,146,155,165,174,183,192,202,211,220,230,239,248,258,267,277,289,298,308
Z,0,1,2,1,1,1,1,1,1,2,2,1,0,1,1,2,2,2,2,2,1,1,2,2,2,1,1,1,1,1,2,2,2,2,2
Time,0,308
W,0,0

becomes:

Time,X,Time,Y,Time,Z,Time,W
0,0,0,0,0,0,0,0
8,14,4,0,4,1,308,0

A lot of data is lost: only the first 2 values of each row are kept.

I want to transpose this CSV in Python. I have the following program:

import csv
import os
from itertools import izip
import sys

try:
    filename = sys.argv[1]
except IndexError:
    print 'Please add a filename'
    exit(-1)
with open(os.path.splitext(filename)[0] + '_t.csv', 'wb') as outfile, open(filename, 'rb') as infile:
    a = izip(*csv.reader(infile))
    csv.writer(outfile).writerows(a)

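However, it seems to trim a lot of data: the file shrinks from 20 KB to 6 KB, and only as many values as the shortest row has are kept from every row.

Any ideas on how to avoid losing data?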

Answer 1

Here is a way to do it without itertools.izip:

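import csv

# newline='' lets the csv module handle line endings itself (Python 3)
with open('transpose.csv', newline='') as infile, \
        open('out.csv', 'w', newline='') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    while True:
        try:
            # Rows come in pairs: a Time row, then its value row
            index = next(reader)
            data = next(reader)
        except StopIteration:
            break
        # Write the pair as two columns, one line per sample
        writer.writerows(zip(index, data))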

Given your input, this snippet produces the following out.csv:

Time,X
568,0
573,0
577,1
581,1
585,0
590,2
594,0
599,0
603,0
Time,Y
590,0
594,3
599,3
603,0
Time,Z
599,0
603,1

Is this what you want?
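To make the pairwise behaviour easier to check, here is a minimal in-memory sketch of the same idea. It is not the answerer's code: it swaps the while/try loop for `zip(reader, reader)`, which likewise consumes two rows per step, and it uses a made-up two-series input. The name `pairwise_transpose` and the `SAMPLE` data are assumptions for illustration only.

```python
import csv
import io

# Hypothetical two-series input; the values are made up for illustration.
SAMPLE = "Time,0,8,18\nX,0,14,14\nTime,0,308\nY,0,0\n"


def pairwise_transpose(text):
    """Write each (index row, data row) pair as a two-column block."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # zip(reader, reader) pulls two consecutive rows per iteration;
    # a trailing unpaired row is dropped, as in the while/try version.
    for index, data in zip(reader, reader):
        writer.writerows(zip(index, data))
    return out.getvalue()


print(pairwise_transpose(SAMPLE))
```

Each series ends up stacked below the previous one rather than side by side, which is why this form never has to pad short rows.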

Update

This modified example should match your updated question:

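import csv
from itertools import zip_longest  # izip_longest in Python 2

# newline='' is the Python 3 idiom for csv files
# (on Python 2, open in 'rb'/'wb' mode instead)
with open('transpose.csv', newline='') as infile, \
        open('out.csv', 'w', newline='') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)

    # Transpose all rows at once, padding short rows with fillvalue
    writer.writerows(zip_longest(*reader, fillvalue=0))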

Change fillvalue to whatever you want missing values to be replaced with.
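The difference between truncating and padding can be seen on a small ragged input. The following is only an illustrative sketch: the `SAMPLE` data is made up (shortened from the question's layout), and an empty string is used as the fill value so that padded cells come out blank rather than as 0.

```python
import csv
import io
from itertools import zip_longest

# Hypothetical ragged input, shortened from the question's layout.
SAMPLE = "Time,0,8,18\nX,0,14,14\nTime,0,308\nW,0,0\n"

rows = list(csv.reader(io.StringIO(SAMPLE)))

# zip stops at the shortest row (3 cells here), so trailing values are lost.
truncated = list(zip(*rows))

# zip_longest runs to the longest row and pads short ones with fillvalue.
padded = list(zip_longest(*rows, fillvalue=""))

print(len(truncated), len(padded))
for row in padded:
    print(",".join(row))
```

Whether 0, an empty string, or some other marker is the right fill depends on how the transposed file is consumed; a 0 is indistinguishable from a real measurement, while an empty cell is not.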

Answer 2

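izip stops at the shortest iterable, so you only get as many values from each row as the shortest row contains.

Use izip_longest instead (zip_longest in Python 3). It runs to the length of the longest iterable and puts None wherever a value is missing; csv.writer writes None as an empty field.

Example: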

import csv
import os
from itertools import izip_longest
import sys

try:
    filename = sys.argv[1]
except IndexError:
    print 'Please add a filename'
    exit(-1)
with open(os.path.splitext(filename)[0] + '_t.csv', 'wb') as outfile, open(filename, 'rb') as infile:
    a = izip_longest(*csv.reader(infile))
    csv.writer(outfile).writerows(a)

The result I get from this:

Time,X,Time,Y,Time,Z,Time,W
0,0,0,0,0,0,0,0
8,14,4,0,4,1,308,0
18,14,13,1,13,2,,
46,14,22,1,22,1,,
132,15,32,1,32,1,,
163,16,41,1,41,1,,
224,17,50,1,50,1,,
238,15,59,1,59,1,,
267,15,69,1,69,1,,
303,15,78,1,78,2,,
,,87,1,87,2,,
,,97,1,97,1,,
,,106,1,106,0,,
,,115,1,115,1,,
,,125,1,125,1,,
,,127,1,127,2,,
,,137,1,137,2,,
,,146,1,146,2,,
,,155,1,155,2,,
,,165,1,165,2,,
,,174,1,174,1,,
,,183,1,183,1,,
,,192,1,192,2,,
,,202,1,202,2,,
,,211,1,211,2,,
,,220,1,220,1,,
,,230,1,230,1,,
,,239,1,239,1,,
,,248,1,248,1,,
,,258,1,258,1,,
,,267,1,267,2,,
,,277,1,277,2,,
,,289,1,289,2,,
,,298,1,298,2,,
,,308,1,308,2,,
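For readers on Python 3, the same approach might look roughly like the sketch below. This is an assumed port, not the answerer's code: the function name `transpose_csv`, its `fillvalue` parameter, and the temporary-file self-check with made-up data are all hypothetical. It uses `zip_longest` (the Python 3 name for `izip_longest`) and opens files in text mode with `newline=""`, as the csv module documentation recommends.

```python
import csv
import os
import tempfile
from itertools import zip_longest


def transpose_csv(src, dst, fillvalue=""):
    """Transpose src into dst, padding short rows with fillvalue."""
    # newline="" lets the csv module control line endings (Python 3).
    with open(src, newline="") as infile, \
            open(dst, "w", newline="") as outfile:
        csv.writer(outfile).writerows(
            zip_longest(*csv.reader(infile), fillvalue=fillvalue)
        )


# Small self-check in a temporary directory with made-up data.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.csv")
    dst = os.path.join(tmp, "in_t.csv")
    with open(src, "w", newline="") as f:
        f.write("Time,0,8,18\nX,0,14,14\nTime,0,308\nW,0,0\n")
    transpose_csv(src, dst)
    with open(dst, newline="") as f:
        result = f.read().splitlines()
    print(result)
```

Note that unpacking with `*` reads every row into memory before writing, which is fine for files of the size mentioned in the question but may not suit very large inputs.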
