Reading CSV into a namedtuple

Question

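I am trying to load a CSV file that I got from here: http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data I have rewritten this a dozen times, and now I get an error saying the list index is out of range. This completely baffles me, because len(row) is 15. I must be missing something obvious here.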

import csv
from collections import namedtuple

fields = ('age', 
      'workclass', 
      'fnlwgt', 
      'education', 
      'education_num', 
      'marital_status', 
      'occupation', 
      'relationship', 
      'race', 
      'sex', 
      'capital_gain', 
      'capital_loss', 
      'hours_per_week', 
      'native_country', 
      'target')

CensusRecord = namedtuple('CensusRecord', fields)

data = []

with open("./data/adult_data.csv", "r") as f:
    r = csv.reader(f, delimiter=',')

    for row in r:
        data.append(CensusRecord(
            age              = int(row[0]),
            workclass        = row[1].strip(),
            fnlwgt           = float(row[2].strip()),
            education        = row[3].strip(),
            education_num    = int(row[4]),
            marital_status   = row[5].strip(),
            occupation       = row[6].strip(),
            relationship     = row[7].strip(),
            race             = row[8].strip(),
            sex              = row[9].strip(),
            capital_gain     = int(row[10]),
            capital_loss     = int(row[11]),
            hours_per_week   = int(row[12]),
            native_country   = row[13].strip(),
            target           = row[14].strip()))

Answer 1

I think this is a syntax error. You should write

data.append(CensusRecord("age" = <your_data>, ...))

instead of

data.append(CensusRecord(age = <your_data>, ...))
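The two call forms can be checked directly. The sketch below uses a hypothetical two-field `Point` namedtuple and a hypothetical helper `is_valid_syntax` (neither is from the question) and asks Python to compile each form. As far as I can tell, the unquoted keyword form used in the question is the one that parses, while the quoted form is rejected with a SyntaxError.

```python
from collections import namedtuple

# Hypothetical two-field record, used only to keep the check short.
Point = namedtuple('Point', ['x', 'y'])

# Keyword arguments use bare field names, as in the question's code.
p = Point(x=1, y=2)


def is_valid_syntax(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, '<example>', 'eval')
        return True
    except SyntaxError:
        return False


bare_ok = is_valid_syntax('Point(x=1, y=2)')        # unquoted keyword names
quoted_ok = is_valid_syntax('Point("x"=1, "y"=2)')  # quoted keyword names
print(bare_ok, quoted_ok)
```

If this holds on your interpreter too, the keyword syntax is probably not the source of the list index error.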

Answer 2

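Open the dataset in a text editor and delete the blank line at the end of the file, then run your code again. csv.reader yields an empty list for a blank line, so row[0] raises an IndexError on that row even though every other row has 15 fields.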
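If you would rather not edit the file by hand, an alternative is to skip empty rows in the loop. This is only a minimal sketch: the helper name `read_records`, the `INT_FIELDS` set, and the inline sample text (one row in the dataset's format followed by a blank line) are my own assumptions for illustration, not part of the question.

```python
import csv
import io
from collections import namedtuple

fields = ('age', 'workclass', 'fnlwgt', 'education', 'education_num',
          'marital_status', 'occupation', 'relationship', 'race', 'sex',
          'capital_gain', 'capital_loss', 'hours_per_week',
          'native_country', 'target')
CensusRecord = namedtuple('CensusRecord', fields)

# Assumed set of integer columns, mirroring the int() calls in the question.
INT_FIELDS = {'age', 'education_num', 'capital_gain',
              'capital_loss', 'hours_per_week'}


def convert(name, value):
    """Convert one raw CSV cell according to its column name."""
    value = value.strip()
    if name in INT_FIELDS:
        return int(value)
    if name == 'fnlwgt':
        return float(value)  # the question parses this column as float
    return value


def read_records(f):
    """Build CensusRecord tuples from an open file, skipping blank lines."""
    records = []
    for row in csv.reader(f, delimiter=','):
        if not row:  # csv.reader returns [] for a blank line
            continue
        values = [convert(name, cell) for name, cell in zip(fields, row)]
        records.append(CensusRecord._make(values))
    return records


# Illustrative sample: one data row, then a trailing blank line.
sample = io.StringIO(
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
    "\n"
)
data = read_records(sample)
print(len(data), data[0].age, data[0].race)
```

With the real file, you would pass the handle from `open("./data/adult_data.csv", "r")` to `read_records` in place of `sample`.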

python

csv

namedtuple