Python XLRD: Get Column Values by Column Names into a List of Dictionaries
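I have an xlsx file in which the data does not start at the first row or the first column. It looks like this.
Only the column names are known here. The data ends at the row where the first column contains "************".
I need the output as a list of dictionaries, like this: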
'ListOfDict':
[ { 'A':1, 'B':'ABC', 'C':'Very Good', 'D':'Hardware', 'E':200.2 },
{ 'A':2, 'B':'DEF', 'C':'Not so good', 'D':'Software', 'E' :100.1}]
I can find the column names, but I cannot get the values. Here is my code.
import xlrd
from itertools import product
wb = xlrd.open_workbook(filename)
ws = wb.sheet_by_index(0)
# Locate the header cell 'A'
for row_index, col_index in product(range(ws.nrows), range(ws.ncols)):
    if ws.cell(row_index, col_index).value == 'A':
        print("({}, {})".format(row_index, col_index))
        break
# Header names, from the 'A' cell to the last column
key1 = [ws.cell(row_index, col_ind).value for col_ind in range(col_index, ws.ncols)]
# Values below the header row
val = [ws.cell(row_index + i, col_ind).value
       for i in range(row_index + 1, ws.nrows)
       for col_ind in range(col_index, ws.ncols)]
But this gives the error "list index out of range".
Please help.
Thanks.
Your problem is that the loop variable i is already the row index, not an offset from row_index.
So you only need to change the cell's row index to i:
val = [ws.cell(i, col_ind).value
       for i in range(row_index + 1, ws.nrows)
       for col_ind in range(col_index, ws.ncols)]
Or keep the addition and make i a real offset:
val = [ws.cell(row_index + i, col_ind).value
       for i in range(1, ws.nrows - row_index)
       for col_ind in range(col_index, ws.ncols)]
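To see why the original expression runs past the end of the sheet, it can help to print the row numbers each variant produces. The sketch below uses made-up numbers (a 10-row sheet with the header on row 4) purely for illustration; they are not taken from the asker's file.

```python
# Hypothetical sizes for illustration: 10 rows (indices 0..9), header on row 4.
nrows, header_row = 10, 4

# Original expression: i is already an absolute row, so adding header_row overshoots.
buggy_rows = [header_row + i for i in range(header_row + 1, nrows)]

# Fix 1: use i directly as the row index.
absolute_rows = [i for i in range(header_row + 1, nrows)]

# Fix 2: treat i as an offset from the header row.
offset_rows = [header_row + i for i in range(1, nrows - header_row)]

print(buggy_rows)     # [9, 10, 11, 12, 13]
print(absolute_rows)  # [5, 6, 7, 8, 9]
print(offset_rows)    # [5, 6, 7, 8, 9]
```

Any row number of 10 or more does not exist in this hypothetical sheet, which is what triggers "list index out of range". Both fixes visit the same rows.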
I would first find the last row based on your end-of-data condition. Then build the dictionaries with a nested loop. Something like:
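import xlrd
from itertools import product
wb = xlrd.open_workbook(filename)  # filename: path to the workbook
ws = wb.sheet_by_index(0)
# Find the end-of-data marker in the first column.
# The marker string must match the cell text in the sheet exactly.
last_row = ws.nrows  # fall back to the end of the sheet if no marker is found
for row_index in range(ws.nrows):
    if ws.cell(row_index, 0).value == '**********':
        last_row = row_index
        break
# Find the header cell 'A'; the column names run rightwards from it.
for row_index, col_index in product(range(ws.nrows), range(ws.ncols)):
    if ws.cell(row_index, col_index).value == 'A':
        first_row, first_col = row_index, col_index
        print("({}, {})".format(row_index, col_index))
        break
# Build one dictionary per data row, keyed by the column names.
list_of_dicts = []
for row in range(first_row + 1, last_row):
    record = {}  # named record to avoid shadowing the built-in dict
    for col in range(first_col, ws.ncols):
        key = ws.cell(first_row, col).value
        val = ws.cell(row, col).value
        record[key] = val
    list_of_dicts.append(record)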
And in a shorter, less readable way (just for fun...):
list_of_dicts = [{ws.cell(first_row, col).value: ws.cell(row, col).value for col in range(first_col, ws.ncols)} for row in range(first_row + 1, last_row)]
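The same idea can be tried without a workbook by operating on a plain list of rows. The following is only a sketch under stated assumptions: the helper name `rows_to_dicts`, its parameters, and the sample grid are hypothetical, and the end marker is matched by prefix because the exact number of asterisks in the sheet is not certain.

```python
def rows_to_dicts(rows, first_header="A", end_prefix="*"):
    """Turn a grid of cell values into a list of dicts keyed by header names.

    rows:         list of rows, each a list of cell values
    first_header: text of the leftmost header cell (assumed to be known)
    end_prefix:   assumption: the end marker in the first column starts with this
    """
    # Locate the header cell; stop at the first match.
    position = next(
        ((r, c) for r, row in enumerate(rows)
         for c, value in enumerate(row) if value == first_header),
        None,
    )
    if position is None:
        raise ValueError("header cell {!r} not found".format(first_header))
    first_row, first_col = position
    headers = rows[first_row][first_col:]

    records = []
    for row in rows[first_row + 1:]:
        # Stop at the end-of-data marker in the first column.
        if row and str(row[0]).startswith(end_prefix):
            break
        values = row[first_col:]
        # Skip columns whose header cell is empty.
        records.append({h: v for h, v in zip(headers, values) if h != ""})
    return records


# Made-up grid mirroring the layout described in the question.
sample = [
    ["", "", "", "", "", ""],
    ["", "A", "B", "C", "D", "E"],
    ["", 1.0, "ABC", "Very Good", "Hardware", 200.2],
    ["", 2.0, "DEF", "Not so good", "Software", 100.1],
    ["************", "", "", "", "", ""],
    ["", "ignored", "", "", "", ""],
]
result = rows_to_dicts(sample)
print(result)
```

With xlrd, the grid could be built as `rows = [ws.row_values(r) for r in range(ws.nrows)]`. Two things to keep in mind: xlrd returns numeric cells as floats, so column A would hold 1.0 rather than 1, and xlrd 2.0 and later read only .xls files, so an .xlsx workbook needs an older xlrd or another library such as openpyxl.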