Splitting csv comma delimited values

Question

I have a csv file whose rows hold an id followed by comma-separated text values, as in the example below:

Field1:
Text:
1, A,A,A,B

Field2:
Text:
2, A,B,C,C

My desired output:

Field1: Field2:    Field1:   Field2:
ID:      Text:       ID:      Text:
1         A           2         A
1         A           2         B
1         A           2         C
1         B           2         C

How can I do this in Python? Thanks!

Answer 1

Here is one way to do what you asked:

with open('infoo.txt', 'r', encoding="utf-8") as f:
    # Split every line on commas and strip whitespace around each value.
    rows = [[x.strip() for x in row.split(',')] for row in f]

# Each record is a list of [id, text] pairs built from one data row.
records = []
i = 0
while i < len(rows):
    # A record is a 'Field...' line, then a 'Text:' line, then the data row.
    is_record_start = (
        i + 2 < len(rows)
        and len(rows[i]) == 1 and rows[i][0].startswith('Field')
        and len(rows[i + 1]) == 1 and rows[i + 1][0] == 'Text:'
    )
    if is_record_start:
        row = rows[i + 2]  # e.g. ['1', 'A', 'A', 'A', 'B']
        records.append([[row[0], text] for text in row[1:]])
        i += 3
    else:
        i += 1

tab = '\t'
with open('outfoo.txt', 'w', encoding="utf-8") as g:
    # Two header lines, with one column pair per record.
    g.write(tab.join("Field 1:" + tab + "Field 2:" for _ in records) + '\n')
    g.write(tab.join("ID:" + tab + "Text:" for _ in records) + '\n')
    # Assumes every record has as many text values as the first one.
    for i in range(len(records[0])):
        g.write(tab.join(colPair[i][0] + tab + colPair[i][1] for colPair in records) + '\n')

# check the output file:
with open('outfoo.txt', 'r', encoding="utf-8") as f:
    print('contents of output file:')
    for row in f:
        print(row.rstrip('\n'))

Output:

contents of output file:
Field 1:        Field 2:        Field 1:        Field 2:
ID:     Text:   ID:     Text:
1       A       2       A
1       A       2       B
1       A       2       C
1       B       2       C
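
The writing loop above takes its row count from the first record, so it only works when every record has the same number of text values. If your records can differ in length, one option is to pad the shorter ones. This is a minimal sketch, assuming blank cells are acceptable as padding; `side_by_side` and the sample `records` are hypothetical names made up for illustration:

```python
from itertools import zip_longest

def side_by_side(records, fill=''):
    """Yield tab-separated lines, padding shorter records with `fill`."""
    blank = (fill, fill)
    # zip_longest walks all records in parallel and fills the gaps.
    for pairs in zip_longest(*records, fillvalue=blank):
        yield '\t'.join(f'{id_}\t{text}' for id_, text in pairs)

# Hypothetical records: the second has fewer text values than the first.
records = [
    [('1', 'A'), ('1', 'A'), ('1', 'B')],
    [('2', 'C')],
]
lines = list(side_by_side(records))
for line in lines:
    print(line)
```

Each yielded line can be written to the output file with a trailing newline, in place of the `for i in range(len(records[0]))` loop.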

Note that this kind of thing would probably be easier with pandas.
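
As a rough illustration of the pandas route, here is a sketch that assumes the file looks exactly like the example in the question, so that only the data lines contain commas. The `sample` text and the `to_long_frame` helper are made up for this example; `DataFrame.explode` is what turns each list of text values into one row per value:

```python
import io
import pandas as pd

# Hypothetical sample mirroring the layout shown in the question.
sample = """Field1:
Text:
1, A,A,A,B

Field2:
Text:
2, A,B,C,C
"""

def to_long_frame(lines):
    """Return one row per (ID, Text) pair found in the data lines."""
    records = []
    for line in lines:
        if ',' in line:  # assumption: only data lines contain commas
            id_, *texts = [part.strip() for part in line.split(',')]
            records.append({'ID': id_, 'Text': texts})
    df = pd.DataFrame(records, columns=['ID', 'Text'])
    # explode repeats the ID for every element of the Text list.
    return df.explode('Text', ignore_index=True)

long_df = to_long_frame(io.StringIO(sample))

# Side-by-side layout: one (ID, Text) column pair per id.
wide = pd.concat(
    [g.reset_index(drop=True) for _, g in long_df.groupby('ID', sort=False)],
    axis=1,
)
print(wide.to_string(index=False))
```

The long form (`long_df`) is usually the easier one to work with afterwards. `wide` reproduces the side-by-side layout from the question and can be saved with `wide.to_csv('outfoo.txt', sep='\t', index=False)`.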

python

csv