How do I drop certain columns by column name from workbooks using Python?
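I'm trying to work out how to extend my current script so that I can make changes at the sheet level. I want to be able to drop columns from the worksheets in my flat files. For example, if a column is named 'company', I want to remove it so that the file written by my final wb.save no longer contains it. I have several column names that I'd like to drop from every sheet in the wb:
cols_to_drop = ['Company','Type','Firstname','lastname']
So far my code removes specific sheets from each file and updates the column names, as shown below: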
from openpyxl import load_workbook
import os

column_name_update_map = {'LocationName': 'Company Name','StreetAddress':'Address','City':'City','State':'State',
                          'Zip':'Zip','GeneralPhone':'Phone Number','GeneralEmail':'Email','DateJoined':'Status Date',
                          'Date Removed':'Status Date'}

for file in os.listdir("C:/Users/hhh/Desktop/aaa/python/Matching"):
    if file.startswith("TVC"):
        wb = load_workbook(file)
        if 'Opt-Ins' in wb.sheetnames:
            wb.remove(wb['Opt-Ins'])
        wb.remove(wb['New Voting Members'])
        wb.remove(wb['Temporary Members'])
        for ws in wb:
            for header in next(ws.rows):
                try:
                    header.value = column_name_update_map[header.value]
                except KeyError:
                    pass
        wb.save(file + " (updated headers).xlsx")
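This part of the code works perfectly and gives me the result I want. However, I can't apply dataframe logic such as df.drop(['Company', 'Type', 'Firstname'], axis=1), because I'm working with a workbook rather than a dataframe.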
Since you've tagged the question with pandas, you can just use pandas to read the sheets and drop the columns:
import pandas as pd

# cols_to_drop and column_name_update_map are the ones defined in the question
for file in os.listdir("C:/Users/hhh/Desktop/aaa/python/Matching"):
    if file.startswith("TVC"):
        # sheet_name=None returns a dict of {sheet name: DataFrame}
        dfs = pd.read_excel(file, sheet_name=None)
        output = dict()
        for ws, df in dfs.items():
            if ws in ["Opt-Ins", "New Voting Members", "Temporary Members"]:
                continue
            # drop unneeded columns; errors="ignore" skips names a sheet doesn't have
            temp = df.drop(cols_to_drop, errors="ignore", axis=1)
            # rename columns
            temp = temp.rename(columns=column_name_update_map)
            # drop empty columns
            temp = temp.dropna(how="all", axis=1)
            output[ws] = temp
        # the context manager saves and closes the file on exit
        # (ExcelWriter.save() was removed in pandas 2.0)
        with pd.ExcelWriter(f'{file.replace(".xlsx","")} (updated headers).xlsx') as writer:
            for ws, df in output.items():
                df.to_excel(writer, index=False, sheet_name=ws)
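If you'd rather stay in openpyxl (a pandas round trip writes plain values and does not carry over cell formatting), the same idea can probably be sketched with ws.delete_cols. This is a minimal sketch, not a tested drop-in for your files: drop_columns_by_header is a hypothetical helper name, it assumes the headers sit in row 1 of every sheet, and the match is exact and case-sensitive ('company' would not match 'Company'). The small in-memory workbook is only there to make the example runnable.

```python
from openpyxl import Workbook


def drop_columns_by_header(ws, names):
    """Delete every column of `ws` whose row-1 header is in `names`.

    Hypothetical helper for illustration; assumes headers are in row 1.
    """
    wanted = set(names)
    # Collect the 1-based indexes of the matching header cells.
    to_delete = [cell.column for cell in ws[1] if cell.value in wanted]
    # Delete from right to left so earlier deletions don't shift later indexes.
    for idx in sorted(to_delete, reverse=True):
        ws.delete_cols(idx)


# Tiny in-memory demo workbook (made-up data, not the asker's files).
wb = Workbook()
ws = wb.active
ws.append(["Company", "City", "Type", "Zip"])
ws.append(["Acme", "Austin", "A", "73301"])

cols_to_drop = ["Company", "Type", "Firstname", "lastname"]
for sheet in wb:
    drop_columns_by_header(sheet, cols_to_drop)

headers = [cell.value for cell in ws[1]]
first_row = [cell.value for cell in ws[2]]
```

In your script you would call the helper inside the existing `for ws in wb:` loop, before wb.save. Keep in mind that delete_cols only moves cell values; formulas, charts or tables that refer to the shifted columns are not adjusted.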