Sort data in a list to columns for xlsx files
I have a list that looks like this (the data comes from several xlsx files):
[['A B 10', 2, 'A B 10', 3, 1, 'AC'], ['A B 104', 3, 'A B 104', 2, -1, 'AC']]
[['D B 126', 3, 'D B 126', 2, -1, 'EFG 1'], ['D B 15', 3, 'D B 15', 2, -1, 'EFG 1']]
[]
[]
[['D B 544', 2, 'D B 544', 1, -1, 'EFG 11'], ['D B 152', 3, 'D B 152', 2, -1, 'EFG 11'], ['D B 682', 3, 'D B 682', 2, -1, 'EFG 11']]
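I want to put this information into a new xlsx file, but first I need to arrange the data into rows and columns. The first string of every sublist should go into the first column, the number after it into the second column, and so on, so the columns are wherever the commas are. I also don't want the list to contain sublists; everything should be in one single list. Like this: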
['A B 10', 2, 'A B 10', 3, 1, 'AC',
'A B 104', 3, 'A B 104', 2, -1, 'AC',
'D B 126', 3, 'D B 126', 2, -1, 'EFG 1',
'D B 15', 3, 'D B 15', 2, -1, 'EFG 1',
'D B 544', 2, 'D B 544', 1, -1, 'EFG 11',
'D B 152', 3, 'D B 152', 2, -1, 'EFG 11',
'D B 682', 3, 'D B 682', 2, -1, 'EFG 11']
This is how far I've got with the code:
import glob
import logging
import os
import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

numbers = []
os.chdir(r'C:Myfolder')
files = glob.glob('*.xlsx')
print(files)

for file in files: #Getting the data from xlsx files and to the numbers-list
    df = pd.read_excel(file)
    m = (df.iloc[:,4] - df.iloc[:,1]) != 0
    pos = [0,1,3,4,6,7]
    numbers = (df.loc[m, df.columns[pos]].values.tolist())
    print(numbers)

excel_input = load_workbook(rapp) #Going from using pandas and dataframes to working with openpyxl (Just because I am not that familiar with pandas).
ws = excel_input.active

for r in dataframe_to_rows(df, index=True, header=True):
    ws.append(r)
else:
    pass

col1 = [] #Creating open lists to put the data into columns
col2 = []
col4 = []
col5 = []
col7 = []
col8 = []
mainlist = []

try:
    for row in numbers: #Putting the data into the columns lists
        col1.append(ws.cell(row=row, column=1).value) #This is wrong, and is throwing the error
        col2.append(ws.cell(row=row, column=2).value) #Wrong
        col4.append(ws.cell(row=row, column=4).value) #Wrong
        col5.append(ws.cell(row=row, column=5).value) #Wrong
        col7.append(ws.cell(row=row, column=7).value) #Wrong
        col8.append(ws.cell(row=row, column=8).value) #Wrong
except AttributeError:
    logging.error('Something is wrong')
finally:
    columns = zip(col1, col2, col4, col5, col7, col8) #Zipping the lists
    for column in columns:
        mainlist.append(column)
Traceback:
    col1.append(ws.cell(row=row, column=1).value)
  File "C:\Python3\lib\site-packages\openpyxl\worksheet\worksheet.py", line 306, in cell
    if row < 1 or column < 1:
TypeError: unorderable types: list() < int()
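Does anyone know how to do this without getting the error? I have marked the places that go wrong with comments in the code.
Edit:
Using @stovfl's approach I was able to get some data into the xlsx file, but because my list is a list of lists, only the last list gets added to my xlsx file.
I used this code: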
from openpyxl import load_workbook

report = load_workbook(r"C:\Myworkbook.xlsx")
ws = report.create_sheet('My sheet')
for _list in numbers:
    for row_data in _list:
        ws.append([row_data])
        print(row_data)
report.save(r"C:\Myworkbook.xlsx")
Prints:
A B 10
2
A B 10
3
1
AC
Output in the xlsx file:
The problem is that only the last list is added rather than the whole list of lists. I want the output to look like this:
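I'm not sure I understood correctly, but if you have a list like this:
l = [rows, rows, rows, ...]
then to build a DataFrame you can iterate over each element of the list l, for example: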
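import pandas as pd

df = pd.DataFrame()
for rows in l:
    # each element of l is a list of rows; wrap it in a DataFrame before concatenating
    df = pd.concat([df, pd.DataFrame(rows)])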
With your example, it gives me the following output:
         0  1        2  3   4       5
0   A B 10  2   A B 10  3   1      AC
1  A B 104  3  A B 104  2  -1      AC
0  D B 126  3  D B 126  2  -1   EFG 1
1   D B 15  3   D B 15  2  -1   EFG 1
0  D B 544  2  D B 544  1  -1  EFG 11
1  D B 152  3  D B 152  2  -1  EFG 11
2  D B 682  3  D B 682  2  -1  EFG 11
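If a single continuous index is preferred over the repeating 0, 1, 0, 1, ... shown above, one option is to flatten the groups into one list of rows before building anything from it. The following is only a minimal sketch using the standard library; the name groups is hypothetical and stands in for the list l, filled with a few of the sample values from the question:

```python
from itertools import chain

# Hypothetical stand-in for the list `l`: one group of rows per input file.
groups = [
    [['A B 10', 2, 'A B 10', 3, 1, 'AC'], ['A B 104', 3, 'A B 104', 2, -1, 'AC']],
    [['D B 126', 3, 'D B 126', 2, -1, 'EFG 1']],
    [],  # a file with no matching rows contributes nothing
    [['D B 544', 2, 'D B 544', 1, -1, 'EFG 11']],
]

# chain.from_iterable removes exactly one level of nesting:
# list of groups of rows -> list of rows.
flat_rows = list(chain.from_iterable(groups))

for index, row in enumerate(flat_rows):
    print(index, row)
```

Passing flat_rows to pd.DataFrame should then give one frame with a continuous index; pd.concat(..., ignore_index=True) is another way to get a similar result.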
Question: ... only the last list was added to my xlsx file
The first for loop is your "for file in files:" loop.
I've updated the code below accordingly.
Question: how to do this without getting the error?
A solution that writes a list of lists using openpyxl, for example:
import pandas as pd
from openpyxl import Workbook, load_workbook

newWorkbook = False
newWorksheet = False

if newWorkbook:
    wb = Workbook()
    # Select First Worksheet
    ws = wb.worksheets[0]
else:
    wb = load_workbook("mySortData.xlsx")
    if newWorksheet:
        # Create a New Worksheet in this Workbook
        ws = wb.create_sheet('Sheet 2')
    else:
        # Select a Worksheet by Name in this Workbook
        ws = wb['Sheet']

# files is the list built with glob.glob('*.xlsx') in the question
for file in files:  # Getting the data from xlsx files and to the numbers-list
    df = pd.read_excel(file)
    m = (df.iloc[:, 4] - df.iloc[:, 1]) != 0
    pos = [0, 1, 3, 4, 6, 7]
    numbers = (df.loc[m, df.columns[pos]].values.tolist())
    # numbers is a List[Row Data] of List[Columns]
    # Iterate List of Row Data
    for row_data in numbers:
        ws.append(row_data)

wb.save("mySortData.xlsx")
Output:
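The "only the last list" symptom in the question is consistent with numbers being reassigned on every pass of the "for file in files:" loop, so that only the last file's rows survive once the loop has finished. Appending inside the loop, as in the code above, avoids that; collecting all rows first is another option. A small sketch of the difference, where per_file_rows is a hypothetical stand-in for the filtered rows that each pd.read_excel call would produce:

```python
# Hypothetical per-file results, standing in for the filtered rows of each workbook.
per_file_rows = [
    ('a.xlsx', [['A B 10', 2, 'A B 10', 3, 1, 'AC']]),
    ('b.xlsx', []),
    ('c.xlsx', [['D B 544', 2, 'D B 544', 1, -1, 'EFG 11'],
                ['D B 152', 3, 'D B 152', 2, -1, 'EFG 11']]),
]

# Reassigning: after the loop, only the last file's rows remain.
numbers = []
for _name, rows in per_file_rows:
    numbers = rows

# Accumulating: every file's rows are kept, in file order.
all_rows = []
for _name, rows in per_file_rows:
    all_rows.extend(rows)

print(len(numbers), len(all_rows))
```

Each entry of all_rows could then be passed to ws.append(...) one at a time once the loop is done.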
A solution that writes the list of lists directly to CSV, for example:
import csv

# Data == List of List
data = [[['A B 10', 2, 'A B 10', 3, 1, 'AC'], ['A B 104', 3, 'A B 104', 2, -1, 'AC']],
        [['D B 126', 3, 'D B 126', 2, -1, 'EFG 1'], ['D B 15', 3, 'D B 15', 2, -1, 'EFG 1']],
        [],
        [],
        [['D B 544', 2, 'D B 544', 1, -1, 'EFG 11'], ['D B 152', 3, 'D B 152', 2, -1, 'EFG 11'], ['D B 682', 3, 'D B 682', 2, -1, 'EFG 11']],
        ]

# Write to File (newline='' lets the csv module control line endings)
with open('Output.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    for _list in data:
        for row_data in _list:
            writer.writerow(row_data)
Output:
A B 10,2,A B 10,3,1,AC
A B 104,3,A B 104,2,-1,AC
D B 126,3,D B 126,2,-1,EFG 1
D B 15,3,D B 15,2,-1,EFG 1
D B 544,2,D B 544,1,-1,EFG 11
D B 152,3,D B 152,2,-1,EFG 11
D B 682,3,D B 682,2,-1,EFG 11
Tested with Python: 3.4.2 - openpyxl: 2.4.1
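One thing to keep in mind with the CSV route: the file stores text only, so the numbers come back as strings when the file is read again. A minimal sketch that writes to an in-memory buffer instead of Output.csv; the buffer and the shortened sample data are assumptions made purely for illustration:

```python
import csv
import io

# Shortened sample data in the same nested shape as above.
data = [
    [['A B 10', 2, 'A B 10', 3, 1, 'AC']],
    [],
    [['D B 544', 2, 'D B 544', 1, -1, 'EFG 11']],
]

# In-memory stand-in for the output file.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
for _list in data:
    for row_data in _list:
        writer.writerow(row_data)

# Read the text back: every field is now a string.
buffer.seek(0)
rows_read = list(csv.reader(buffer))
print(rows_read[0])
```

If the numeric columns are needed as integers after reading, they have to be converted explicitly, for example with int(...).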
It looks like you are able to get the data into the Excel file just fine. So why not use something like:
col1 = [cell.value for cell in ws['A']]
or
ws.iter_cols(min_col=x, max_col=y)
If this is not what you want, please rephrase the question to point out where the problem lies.
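To make the two suggestions above concrete, here is a small sketch on an in-memory workbook. The shortened sample rows are borrowed from the question, the workbook is never saved, and the sketch assumes openpyxl is installed:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# Shortened sample rows, appended one worksheet row at a time.
for row_data in [['A B 10', 2, 'AC'], ['A B 104', 3, 'AC'], ['D B 126', 3, 'EFG 1']]:
    ws.append(row_data)

# ws['A'] returns the cells of column A, top to bottom.
col1 = [cell.value for cell in ws['A']]

# iter_cols yields one tuple of cells per column in the requested range.
first_two = [[cell.value for cell in column]
             for column in ws.iter_cols(min_col=1, max_col=2)]

print(col1)
print(first_two)
```

Reading whole columns this way avoids calling ws.cell(...) with a row argument that is not an integer, which is what the traceback in the question points at.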