How to create a class to define a CSV file layout including a header?

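I would like to create an approach where I can define the structure of a CSV file (an obvious extension to Excel should follow), with a row definition as well as the header. In this approach, simply re-ordering the definition would move the columns in the output.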

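My first attempt was to use a namedtuple. It actually handled most of my needs, but I can't create an empty row and populate it as needed. I tried recordclass but ran into much the same problem.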
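To make the problem concrete, here is a minimal sketch of where a plain namedtuple gets in the way (the field names are the ones from my example further down; the variable names are only for illustration):

```python
from collections import namedtuple

TableRow = namedtuple("TableRow", "id name password hostip")

# An "empty" row is not possible: every field is a required argument.
try:
    TableRow()
except TypeError as exc:
    empty_row_error = exc

# Filling a row in afterwards is not possible either: namedtuples are immutable.
row = TableRow(1, "Matt", None, None)
try:
    row.hostip = "10.0.0.1"
except AttributeError as exc:
    assign_error = exc

# The only way to "change" a field is to build a whole new tuple.
row = row._replace(hostip="10.0.0.1")
```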

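My output file might have more than 30 columns, so having to create a new instance with a bunch of None values gets very sloppy. I would also like to be able to add a column to the structure without having to update __init__.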

My idea, in pseudo-code (using namedtuple for illustration), is:

class TableRow(namedtuple("TableRow", "id name password hostip")):
    __slots__ = ()


class TableRowHeader:
    def __init__(self):
        header = TableRow()
        header.id = 'ID'
        header.name = "Name"
        header.password = "Password"
        header.hostip = "Host IP"


class OutputTable():
    def __init__(self):
        self.header = TableRowHeader()
        self.rows = list()

    def add(self, new_row):
        # Example assumes new_row is an instance of TableRow
        self.rows.append(new_row)

    def to_csv(self, file_name):
        with open(file_name, 'w') as csv_file:
            # creating a csv writer object
            csv_writer = csv.writer(csv_file)

            # writing the fields
            csv_writer.writerow(self.header)

            for row in sorted(self.rows):
                csv_writer.writerow(row)  


outtable = OutputTable()
row = TableRow()
row.id = 1
row.name = 'Matt'
row.hostip = '10.0.0.1'
row.password = 'obvious'      
outtable.add(row)

outtable.to_csv('./example.csv') 

I like this pattern, but I can't figure out a clean way to handle it in Python.

Do you want something like this?

import csv
from collections import namedtuple

TableRowShort = namedtuple('TableRowShort', "id name password hostip")
TableRowFull = namedtuple('TableRowFull', "id name password hostip description source admin_name")


class TableRowOptional:
    def __init__(self, id, name, password=None, hostip=None, description=None, source=None, admin_name=None):
        super().__init__()

        self.id = id
        self.name = name
        self.password = password
        self.hostip = hostip
        self.description = description
        self.source = source
        self.admin_name = admin_name


class OutputTable():
    def __init__(self):
        self.headers = []
        self.rows = list()

    def add(self, row):
        if hasattr(row, '_asdict'):
            value = row._asdict()
        elif hasattr(row, '__dict__'):
            value = row.__dict__
        elif isinstance(row, dict):
            value = row
        else:
            raise ValueError('Not supported row type: {}'.format(type(row)))

        for header in value.keys():
            if header not in self.headers:
                self.headers.append(header)

        self.rows.append(value)

    def to_csv(self, file_name):
        with open(file_name, 'w', newline='') as csv_file:
            # creating a csv writer object
            csv_writer = csv.writer(csv_file)

            # writing the fields
            csv_writer.writerow(self.headers)

            for row in self.rows:
                csv_writer.writerow([row.get(header, None) for header in self.headers])


outtable = OutputTable()
outtable.add(TableRowShort(1, 'Matt', 'obvious', '10.0.0.1'))
outtable.add(TableRowFull(2, 'Maria', 'obvious as usual', '10.1.0.1', 'some description', 'localnet', 'super_admin'))
outtable.add(TableRowOptional(3, 'Maria', hostip='10.1.0.1', description='some description', source='localnet'))
outtable.add({
    'id': 1337,
    'name': 'hacker',
    'hostip': '127.0.0.1',
    'extra': "I've hacked you guys lol!",
})

outtable.to_csv('./example.csv')


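This solution gives you an interface for storing "prepared" namedtuples, plain objects (through the __dict__ interface) and raw dict objects as rows. It manages the CSV headers automatically, based on the structure of the rows provided :)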

It looks pretty clear and useful to me. What do you think?

Output CSV

# > cat example.csv
id,name,password,hostip,description,source,admin_name,extra
1,Matt,obvious,10.0.0.1,,,,
2,Maria,obvious as usual,10.1.0.1,some description,localnet,super_admin,
3,Maria,,10.1.0.1,some description,localnet,,
1337,hacker,,127.0.0.1,,,,I've hacked you guys lol!
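The same header handling can probably be had from the standard library's csv.DictWriter, which writes restval for any key a row is missing. A rough sketch, where rows_to_csv is just a name made up for this example and the rows are assumed to have been converted to plain dicts already:

```python
import csv
import io


def rows_to_csv(rows, out):
    """Write a list of dicts to `out`, collecting headers in first-seen order."""
    headers = []
    for row in rows:
        for key in row:
            if key not in headers:
                headers.append(key)

    # restval is what gets written when a row lacks one of the headers
    writer = csv.DictWriter(out, fieldnames=headers, restval="")
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {"id": 1, "name": "Matt", "password": "obvious", "hostip": "10.0.0.1"},
    {"id": 1337, "name": "hacker", "hostip": "127.0.0.1", "extra": "hi"},
]

# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
rows_to_csv(rows, buffer)
print(buffer.getvalue())
```

When writing to a real file instead of a StringIO, the csv documentation recommends opening it with newline=''.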

The initial code can be rewritten with the recordclass library as follows:

import csv
from recordclass import make_dataclass

TableRow = make_dataclass(
            'TableRow', 
            "id name password hostip description source admin_name",
            defaults=5*(None,),
            iterable=True)

class OutputTable():
    def __init__(self):
        self.header = TableRow(*TableRow.__fields__)
        self.rows = list()

    def add(self, new_row):
        # Example assumes new_row is an instance of TableRow
        self.rows.append(new_row)

    def to_csv(self, file_name):
        with open(file_name, 'w', newline='') as csv_file:
            # creating a csv writer object
            csv_writer = csv.writer(csv_file)

            # writing the fields
            csv_writer.writerow(self.header)

            for row in sorted(self.rows):
                csv_writer.writerow(row)

outtable = OutputTable()
outtable.add(TableRow(1, 'Matt', 'obvious', '10.0.0.1'))
outtable.add(TableRow(2, 'Maria', 'obvious as usual', '10.1.0.1', 'some description', 'localnet', 'super_admin'))
outtable.add(TableRow(3, 'Maria', hostip='10.1.0.1', description='some description', source='localnet'))

outtable.to_csv('./example.csv')

The result will be:

id,name,password,hostip,description,source,admin_name
1,Matt,obvious,10.0.0.1,,,
2,Maria,obvious as usual,10.1.0.1,some description,localnet,super_admin
3,Maria,,10.1.0.1,some description,localnet,
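
If a third-party dependency is not wanted, something similar might be done with the standard library's dataclasses module. This is only a sketch under a few assumptions: column() and write_table() are hypothetical helpers made up here, every column defaults to None, and the header labels ("ID", "Host IP", ...) are the ones from the question. The field order in the class is the column order, so re-ordering the definition re-orders the output, and adding a column is one extra line with no __init__ to maintain.

```python
import csv
import io
from dataclasses import astuple, dataclass, field, fields
from typing import Optional


def column(label):
    """Hypothetical helper: an optional field that carries its header label."""
    return field(default=None, metadata={"label": label})


@dataclass
class TableRow:
    # Re-ordering these lines re-orders the columns in the output.
    id: Optional[int] = column("ID")
    name: Optional[str] = column("Name")
    password: Optional[str] = column("Password")
    hostip: Optional[str] = column("Host IP")


def write_table(rows, out):
    """Write the header labels, then one CSV line per row."""
    writer = csv.writer(out)
    writer.writerow([f.metadata["label"] for f in fields(TableRow)])
    for row in rows:
        writer.writerow(astuple(row))


# An empty row that is filled in as needed; untouched columns stay None.
row = TableRow()
row.id = 1
row.name = "Matt"
row.hostip = "10.0.0.1"

# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
write_table([row], buffer)
print(buffer.getvalue())
```

The csv writer emits None as an empty field, so the unset password simply shows up as an empty column.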