How do I create an if statement that checks for sheet names that start with a certain string in Python?
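My end goal is to create a column named 'Status' that indicates, based on the sheet name, whether a record is active or cancelled. The code needs to check whether the sheet name starts with 'Full Member List'. If it does, the status should be Active; otherwise the Status column should be Cancelled. How do I do that in the code below?
I only need help with one line of this code, which I have marked with the comment #need help. I get an invalid syntax error on that line.
My attempt: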
import pandas as pd
import os
from openpyxl import load_workbook

cols_to_drop = ['PSI ID','PSIvet Region','PSIvet region num','Fax','County','Ship state']
column_name_update_map = {'Account name': 'Company Name','Billing address':'Address','Billing city':'City','Billing State':'State','Billing state':'State'}

for file in os.listdir("C:/Users/hh/Desktop/autotranscribe/python/Matching"):
    if file.startswith("PSI"):
        dfs = pd.read_excel(file, sheet_name=None,skiprows=5)
        output = dict()
        for ws, df in dfs.items():
            if any(ws.startswith(x) for x in ["New Members", "PVCC"]):
                continue
            temp = df
            #need help with below line
            temp['Status'] = "Active" if any(ws.startwith(x) for x in == "Full Member List" else "Cancelled" )
            #drop unneeded columns
            temp = df.drop(cols_to_drop, errors="ignore", axis=1)
            #rename columns
            temp = temp.rename(columns=column_name_update_map)
            #drop empty columns
            temp = temp.dropna(how="all", axis=1)
            temp['Partner'] = "PSI"
            output[ws] = temp
        writer = pd.ExcelWriter(f'{file.replace(".xlsx","")} (updated headers).xlsx')
        for ws, df in output.items():
            df.to_excel(writer, index=None, sheet_name=ws)
        writer.save()
        writer.close()
If you want to check for the complete string "Full Member List" at the start of the sheet name:
temp['Status'] = "Active" if ws.startswith("Full Member List") else "Cancelled"
To check whether the sheet name starts with any one of the words "Full", "Member", or "List":
prefixes = "Full Member List".split(" ")
# any() yields a single bool, so the whole column gets one value
temp["Status"] = "Active" if any(ws.startswith(x) for x in prefixes) else "Cancelled"
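As a side note, `str.startswith` also accepts a tuple of prefixes, so both variants above can be expressed with one small function. This is only a minimal sketch; `status_for_sheet` and its `prefixes` parameter are hypothetical names introduced here for illustration, not part of the original code.

```python
def status_for_sheet(sheet_name, prefixes=("Full Member List",)):
    """Return "Active" if sheet_name starts with any prefix, else "Cancelled".

    Hypothetical helper: the name and the default prefix are assumptions.
    """
    # str.startswith accepts a tuple and is True if any entry matches.
    # The comparison is case-sensitive.
    return "Active" if sheet_name.startswith(tuple(prefixes)) else "Cancelled"


print(status_for_sheet("Full Member List 001"))
print(status_for_sheet("Cancelled Members"))
print(status_for_sheet("Member updates", ("Full", "Member", "List")))
```

Inside the loop this would be used as `temp['Status'] = status_for_sheet(ws)`.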
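I think you need to:
- Fix the indentation of the line reading temp = df in your code
- Correct the typo "startwith" to startswith
- Consider adding logic to ignore files whose names contain (updated headers)
- Change the line you asked about (the syntax error comes from the stray == after "for x in" and the misplaced closing parenthesis) to:
temp['Status'] = "Active" if ws.startswith("Full Member List") else "Cancelled"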
An updated version of your code looks like this:
import pandas as pd
import os
from openpyxl import load_workbook

cols_to_drop = ['PSI ID','PSIvet Region','PSIvet region num','Fax','County','Ship state']
column_name_update_map = {'Account name': 'Company Name','Billing address':'Address','Billing city':'City','Billing State':'State','Billing state':'State'}

for file in os.listdir("."):
    # skip files written by a previous run of this script
    if file.startswith("PSI") and "(updated headers)" not in file:
        dfs = pd.read_excel(file, sheet_name=None,skiprows=5)
        output = dict()
        for ws, df in dfs.items():
            if any(ws.startswith(x) for x in ["New Members", "PVCC"]):
                continue
            temp = df
            temp['Status'] = "Active" if ws.startswith("Full Member List") else "Cancelled"
            #drop unneeded columns
            temp = df.drop(cols_to_drop, errors="ignore", axis=1)
            #rename columns
            temp = temp.rename(columns=column_name_update_map)
            #drop empty columns
            temp = temp.dropna(how="all", axis=1)
            temp['Partner'] = "PSI"
            output[ws] = temp
        writer = pd.ExcelWriter(f'{file.replace(".xlsx","")} (updated headers).xlsx')
        for ws, df in output.items():
            df.to_excel(writer, index=None, sheet_name=ws)
        # close() also saves the file; writer.save() was removed in pandas 2.0
        writer.close()
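One thing worth keeping in mind: `os.listdir` returns bare file names, so `pd.read_excel(file, ...)` only finds the workbook when the script runs from the folder being listed, which is why the version above lists ".". The file selection and output naming could also be sketched with `pathlib`, which keeps the full path attached to each file. The helper names `files_to_process` and `output_path` are hypothetical, and the `.xlsx` suffix check is an added assumption rather than something the original code does.

```python
from pathlib import Path


def files_to_process(folder):
    """Yield full paths of PSI workbooks that are not earlier outputs."""
    for path in sorted(Path(folder).iterdir()):
        name = path.name
        if (path.suffix == ".xlsx"
                and name.startswith("PSI")
                and "(updated headers)" not in name):
            yield path


def output_path(path):
    """Build the output file path next to the input workbook."""
    # path.stem is the file name without its final extension
    return path.with_name(f"{path.stem} (updated headers).xlsx")


for workbook in files_to_process("."):
    print(workbook, "->", output_path(workbook))
```

Each yielded path can be passed directly to `pd.read_excel`, and `output_path(workbook)` to `pd.ExcelWriter`, regardless of the current working directory.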
To test this, I created an xlsx file named PSI 001.xlsx containing a sheet named Full Member List 001, with the following content anchored at cell A1:
will skip
will skip
will skip
will skip
will skip
foo
1
2
3
The output is stored in a file named PSI 001 (updated headers).xlsx, in a sheet that keeps the input sheet's name, with the following content anchored at cell A1:
foo Status Partner
1 Active PSI
2 Active PSI
3 Active PSI