How to create a class to define a CSV file layout including a header?

Question

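I would like to create an approach where I can define the structure of a CSV file (an obvious extension to Excel should follow), with a row definition as well as a header. In this approach, simply re-ordering the definition would move the columns in the output.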

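My first attempt was to use a namedtuple. It actually handled most of my needs, but I cannot create an empty row and populate it as needed. I tried recordclass but ran into the same problem.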

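My output file might have more than 30 columns, so having to create each new instance with a bunch of None values would get very sloppy. I would also like to be able to add a column to the structure without having to update __init__ and so on.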
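To make the limitation concrete, here is a minimal sketch of the two failures, assuming the four-field layout used in this question. The flag names `empty_ok` and `assign_ok` are only for illustration.

```python
from collections import namedtuple

TableRow = namedtuple("TableRow", "id name password hostip")

# 1) A plain namedtuple cannot be created empty: every field is required.
try:
    TableRow()
    empty_ok = True
except TypeError:
    empty_ok = False

# 2) Fields cannot be assigned after creation: instances are immutable.
row = TableRow(1, "Matt", "obvious", "10.0.0.1")
try:
    row.name = "Maria"
    assign_ok = True
except AttributeError:
    assign_ok = False

print(empty_ok, assign_ok)
```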

My idea in pseudo-code (using namedtuple for illustration) is:

import csv
from collections import namedtuple


class TableRow(namedtuple("TableRow", "id name password hostip")):
    __slots__ = ()


class TableRowHeader:
    def __init__(self):
        header = TableRow()
        header.id = 'ID'
        header.name = "Name"
        header.password = "Password"
        header.hostip = "Host IP"


class OutputTable():
    def __init__(self):
        self.header = TableRowHeader()
        self.rows = list()

    def add(self, new_row):
        # Example assumes new_row is an instance of TableRow
        self.rows.append(new_row)

    def to_csv(self, file_name):
        with open(file_name, 'w', newline='') as csv_file:
            # creating a csv writer object
            csv_writer = csv.writer(csv_file)

            # writing the fields
            csv_writer.writerow(self.header)

            for row in sorted(self.rows):
                csv_writer.writerow(row)  


outtable = OutputTable()
row = TableRow()
row.id = 1
row.name = 'Matt'
row.hostip = '10.0.0.1'
row.password = 'obvious'      
outtable.add(row)

outtable.to_csv('./example.csv')

I like this pattern, but I cannot think of a clean way to handle it in Python.

Answer 1

Do you want something like this?

import csv
from collections import namedtuple

TableRowShort = namedtuple('TableRow', "id name password hostip")
TableRowFull = namedtuple('TableRowFull', "id name password hostip description source admin_name")


class TableRowOptional:
    def __init__(self, id, name, password=None, hostip=None, description=None, source=None, admin_name=None):
        super().__init__()

        self.id = id
        self.name = name
        self.password = password
        self.hostip = hostip
        self.description = description
        self.source = source
        self.admin_name = admin_name


class OutputTable():
    def __init__(self):
        self.headers = []
        self.rows = list()

    def add(self, row):
        if hasattr(row, '_asdict'):
            value = row._asdict()
        elif hasattr(row, '__dict__'):
            value = row.__dict__
        elif isinstance(row, dict):
            value = row
        else:
            raise ValueError('Not supported row type: {}'.format(type(row)))

        for header in value.keys():
            if header not in self.headers:
                self.headers.append(header)

        self.rows.append(value)

    def to_csv(self, file_name):
        with open(file_name, 'w', newline='') as csv_file:
            # creating a csv writer object
            csv_writer = csv.writer(csv_file)

            # writing the fields
            csv_writer.writerow(self.headers)

            for row in self.rows:
                csv_writer.writerow([row.get(header, None) for header in self.headers])


outtable = OutputTable()
outtable.add(TableRowShort(1, 'Matt', 'obvious', '10.0.0.1'))
outtable.add(TableRowFull(2, 'Maria', 'obvious as usual', '10.1.0.1', 'some description', 'localnet', 'super_admin'))
outtable.add(TableRowOptional(3, 'Maria', hostip='10.1.0.1', description='some description', source='localnet'))
outtable.add({
    'id': 1337,
    'name': 'hacker',
    'hostip': '127.0.0.1',
    'extra': "I've hacked you guys lol!",
})

outtable.to_csv('./example.csv')

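This solution gives you an interface for storing prepared namedtuples, ordinary objects (via the __dict__ interface), and raw dict objects as rows. It manages the CSV headers automatically, based on the structure of the rows provided :)

It looks clear and useful to me. What do you think?

Output CSV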

# > cat example.csv
id,name,password,hostip,description,source,admin_name,extra
1,Matt,obvious,10.0.0.1,,,,
2,Maria,obvious as usual,10.1.0.1,some description,localnet,super_admin,
3,Maria,,10.1.0.1,some description,localnet,,
1337,hacker,,127.0.0.1,,,,I've hacked you guys lol!
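The header-union step above can probably also be sketched with the standard library's csv.DictWriter, which fills missing keys through its `restval` parameter. This is only a sketch under assumptions: the sample rows are shortened, the in-memory `buffer` stands in for a real file, and `lineterminator="\n"` is set just to keep the output easy to compare.

```python
import csv
import io

rows = [
    {"id": 1, "name": "Matt", "password": "obvious", "hostip": "10.0.0.1"},
    {"id": 1337, "name": "hacker", "hostip": "127.0.0.1", "extra": "note"},
]

# Build the header as the union of all keys, kept in first-seen order.
fieldnames = list(dict.fromkeys(key for row in rows for key in row))

buffer = io.StringIO()  # stands in for open(file_name, 'w', newline='')
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="",
                        lineterminator="\n")
writer.writeheader()
writer.writerows(rows)  # keys missing from a row are written as restval

csv_text = buffer.getvalue()
print(csv_text)
```

Objects and namedtuples would still need converting to dicts first (for example with `_asdict()` or `vars()`), as the `add` method above does.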

Answer 2

The initial code can be rewritten using the recordclass library as follows:

import csv
from recordclass import make_dataclass

TableRow = make_dataclass(
            'TableRow', 
            "id name password hostip description source admin_name",
            defaults=5*(None,),
            iterable=True)

class OutputTable():
    def __init__(self):
        self.header = TableRow(*TableRow.__fields__)
        self.rows = list()

    def add(self, new_row):
        # Example assumes new_row is an instance of TableRow
        self.rows.append(new_row)

    def to_csv(self, file_name):
        with open(file_name, 'w', newline='') as csv_file:
            # creating a csv writer object
            csv_writer = csv.writer(csv_file)

            # writing the fields
            csv_writer.writerow(self.header)

            for row in sorted(self.rows):
                csv_writer.writerow(row)

outtable = OutputTable()
outtable.add(TableRow(1, 'Matt', 'obvious', '10.0.0.1'))
outtable.add(TableRow(2, 'Maria', 'obvious as usual', '10.1.0.1', 'some description', 'localnet', 'super_admin'))
outtable.add(TableRow(3, 'Maria', hostip='10.1.0.1', description='some description', source='localnet'))

outtable.to_csv('./example.csv')

The result will be:

id,name,password,hostip,description,source,admin_name
1,Matt,obvious,10.0.0.1,,,
2,Maria,obvious as usual,10.1.0.1,some description,localnet,super_admin
3,Maria,,10.1.0.1,some description,localnet,
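If an extra dependency is not wanted, a similar layout can probably be sketched with the standard library's dataclasses module. This is not the recordclass approach above, just an alternative under assumptions: the header titles are taken from the question, while the `title` metadata key and the `header()` helper are names made up for this sketch.

```python
import csv
import io
from dataclasses import astuple, dataclass, field, fields
from typing import Optional


@dataclass
class TableRow:
    # Column order follows declaration order, so re-ordering these lines
    # moves the columns. "title" is a hypothetical metadata key for the header.
    id: Optional[int] = field(default=None, metadata={"title": "ID"})
    name: Optional[str] = field(default=None, metadata={"title": "Name"})
    password: Optional[str] = field(default=None, metadata={"title": "Password"})
    hostip: Optional[str] = field(default=None, metadata={"title": "Host IP"})

    @classmethod
    def header(cls):
        """Return the header titles in declaration order."""
        return [f.metadata["title"] for f in fields(cls)]


row = TableRow()  # empty row: every column starts as None
row.id = 1
row.name = "Matt"
row.hostip = "10.0.0.1"

buffer = io.StringIO()  # stands in for open(file_name, 'w', newline='')
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(TableRow.header())
writer.writerow(astuple(row))  # None is written as an empty field

csv_text = buffer.getvalue()
print(csv_text)
```

Adding a column is then a single new line in the class body. Sorting rows that contain None would need an explicit key function, since None does not compare with strings or integers.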

Tags: python, export-to-csv, namedtuple