Python XLRD Get Column Values by Column Names into List of Dictionaries

Question

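I have an xlsx file in which the data does not start at the first row or the first column. It looks like this.

Only the column names are known here. The data ends wherever the first column contains "************". I need the output as a list of dictionaries, like this: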

'ListOfDict': [{'A': 1, 'B': 'ABC', 'C': 'Very Good', 'D': 'Hardware', 'E': 200.2}, {'A': 2, 'B': 'DEF', 'C': 'Not so good', 'D': 'Software', 'E': 100.1}]

I can locate the column names, but I cannot get the values. Here is my code.

import xlrd
from itertools import product

wb = xlrd.open_workbook(filename)
ws = wb.sheet_by_index(0)

for row_index, col_index in product(xrange(ws.nrows), xrange(ws.ncols)):
    if ws.cell(row_index, col_index).value == 'A':
        print "({}, {})".format(row_index, col_index)
        break

key1 = [ws.cell(row_index, col_ind).value for col_ind in range(col_index, ws.ncols)]

val = [ws.cell(row_index + i, col_ind).value 
       for i in range(row_index + 1, ws.nrows) 
       for col_ind in range(col_index, ws.ncols)]

But this gives the error "list index out of range".

Please help. Thanks.

Answer 1

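The problem is that your loop variable i is already a row index, not an offset, so row_index + i runs past the last row of the sheet.

So you just need to use i as the cell's row index: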

val = [ws.cell(i, col_ind).value 
       for i in range(row_index + 1, ws.nrows) 
       for col_ind in range(col_index, ws.ncols)]

Or keep row_index + i and fix the range so that i really is an offset:

val = [ws.cell(row_index + i, col_ind).value 
       for i in range(1, ws.nrows - row_index) 
       for col_ind in range(col_index, ws.ncols)]
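
To make the difference concrete, here is a minimal sketch that swaps the worksheet for a plain list of lists. The `grid` data and the `cell` helper are made up for illustration and are not part of xlrd, and the sketch is written for Python 3 rather than the Python 2 used in the question.

```python
# Stand-in for the worksheet: a hypothetical 4x2 grid with the header on row 1.
grid = [
    ["", ""],
    ["A", "B"],
    [1, "ABC"],
    [2, "DEF"],
]
nrows, ncols = len(grid), len(grid[0])
row_index, col_index = 1, 0  # position of the header cell 'A'


def cell(r, c):
    """Mimic ws.cell(r, c).value on the stand-in grid."""
    return grid[r][c]


# Original version: i is already an absolute row, so row_index + i overshoots.
try:
    buggy = [cell(row_index + i, c)
             for i in range(row_index + 1, nrows)
             for c in range(col_index, ncols)]
except IndexError as exc:
    buggy = None
    error_message = str(exc)

# Fix 1: use i directly as the row index.
fixed_abs = [cell(i, c)
             for i in range(row_index + 1, nrows)
             for c in range(col_index, ncols)]

# Fix 2: keep row_index + i, but make i a true offset starting at 1.
fixed_off = [cell(row_index + i, c)
             for i in range(1, nrows - row_index)
             for c in range(col_index, ncols)]

print(fixed_abs)
```

Both fixes visit the same rows. Either way the result is a flat list of values, which still has to be grouped into one dictionary per row.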

I would first find the last row according to your end condition, then build the dictionaries with a nested loop. Something like:

import xlrd
from itertools import product

wb = xlrd.open_workbook(filename)
ws = wb.sheet_by_index(0)

# Find the row whose first column holds the end-of-data marker
# (the string must match the marker in the file exactly).
for row_index in xrange(ws.nrows):
    if ws.cell(row_index, 0).value == '**********':
        last_row = row_index
        break

# Find the header cell 'A'; the data starts on the row below it.
for row_index, col_index in product(xrange(ws.nrows), xrange(ws.ncols)):
    if ws.cell(row_index, col_index).value == 'A':
        first_row, first_col = row_index, col_index
        print "({}, {})".format(row_index, col_index)
        break

# Build one dictionary per data row, keyed by the header row.
list_of_dicts = []
for row in range(first_row + 1, last_row):
    row_dict = {}  # named row_dict to avoid shadowing the built-in dict
    for col in range(first_col, ws.ncols):
        key = ws.cell(first_row, col).value
        val = ws.cell(row, col).value
        row_dict[key] = val
    list_of_dicts.append(row_dict)

And in a shorter, harder-to-read way (just for fun...):

list_of_dicts = [{ws.cell(first_row, col).value: ws.cell(row, col).value for col in range(first_col, ws.ncols)} for row in range(first_row+1, last_row)]
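
The same idea can be wrapped in a reusable function. The following is only a sketch in Python 3, under a few assumptions: the names `sheet_to_dicts`, `header`, and `end_marker` are invented for this example, the default marker string has to be adjusted to whatever the file actually contains, and `FakeSheet` is a hypothetical stand-in that exists only so the demo runs without a workbook. It relies on the sheet exposing `nrows`, `ncols`, and `cell_value(row, col)`, which xlrd sheets provide.

```python
def sheet_to_dicts(ws, header="A", end_marker="************"):
    """Return the rows between the header row and the end marker as dicts.

    `ws` is any object exposing nrows, ncols and cell_value(row, col).
    """
    # Locate the header cell, scanning row by row.
    start = next(
        ((r, c) for r in range(ws.nrows) for c in range(ws.ncols)
         if ws.cell_value(r, c) == header),
        None,
    )
    if start is None:
        raise ValueError("header cell %r not found" % header)
    first_row, first_col = start

    # Column names come from the header row, starting at the header cell.
    keys = [ws.cell_value(first_row, c) for c in range(first_col, ws.ncols)]

    rows = []
    for r in range(first_row + 1, ws.nrows):
        # Stop at the end marker, which sits in the first column.
        if ws.cell_value(r, 0) == end_marker:
            break
        values = [ws.cell_value(r, c) for c in range(first_col, ws.ncols)]
        rows.append(dict(zip(keys, values)))
    return rows


class FakeSheet:
    """Hypothetical stand-in for an xlrd sheet, used only for this demo."""

    def __init__(self, grid):
        self.grid = grid
        self.nrows = len(grid)
        self.ncols = len(grid[0])

    def cell_value(self, row, col):
        return self.grid[row][col]


demo = FakeSheet([
    ["", "", ""],
    ["", "A", "B"],
    ["", 1, "ABC"],
    ["", 2, "DEF"],
    ["************", "", ""],
])
result = sheet_to_dicts(demo)
print(result)
```

Checking for the marker inside the loop means a missing marker simply reads to the end of the sheet instead of leaving `last_row` undefined. With a real workbook, note that xlrd returns numeric cells as floats, so a cell showing 1 comes back as 1.0.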
