How do I create an if statement that checks for sheet names that start with a certain string in Python?

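My end objective is to create a column called 'Status' that indicates Active or Cancelled based on the sheet's name. I need it to check whether the sheet name starts with the words 'Full Member List'. If it does, the status is Active; otherwise the Status column should be Cancelled. How do I do this below? I only need help with one line of this code, the one under the comment #need help with below line. I am getting an invalid syntax error on that line.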

My attempt:

import pandas as pd
import os
from openpyxl import load_workbook

cols_to_drop =  ['PSI ID','PSIvet Region','PSIvet region num','Fax','County','Ship state']              
column_name_update_map = {'Account name': 'Company Name','Billing address':'Address','Billing city':'City','Billing State':'State','Billing state':'State'} 

for file in os.listdir("C:/Users/hh/Desktop/autotranscribe/python/Matching"):
    if file.startswith("PSI"):
        dfs = pd.read_excel(file, sheet_name=None,skiprows=5)
        output = dict()
        for ws, df in dfs.items():
            if any(ws.startswith(x) for x in ["New Members", "PVCC"]):
                continue  
                temp = df
                #need help with below line
                temp['Status'] = "Active" if any(ws.startwith(x) for x in == "Full Member List" else "Cancelled" )   
            #drop unneeded columns
            temp = df.drop(cols_to_drop, errors="ignore", axis=1)
            #rename columns
            temp = temp.rename(columns=column_name_update_map)
            #drop empty columns
            temp = temp.dropna(how="all", axis=1)
            temp['Partner'] = "PSI"
            output[ws] = temp
        writer = pd.ExcelWriter(f'{file.replace(".xlsx","")} (updated headers).xlsx')
        for ws, df in output.items():
            df.to_excel(writer, index=None, sheet_name=ws)
        writer.save()
        writer.close()

If you want to check whether the sheet name starts with the full string "Full Member List":

temp['Status'] = "Active" if ws.startswith("Full Member List") else "Cancelled"
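As a quick sanity check, the same conditional expression can be wrapped in a small helper and tried on a few names. This is only a sketch: `status_for_sheet` and the sample sheet names are made up for illustration and do not come from the original workbook.

```python
def status_for_sheet(sheet_name: str) -> str:
    """Return "Active" if the name starts with "Full Member List", else "Cancelled"."""
    # A conditional expression has the form: value_if_true if condition else value_if_false
    return "Active" if sheet_name.startswith("Full Member List") else "Cancelled"


# Hypothetical sheet names, for illustration only
for name in ["Full Member List 001", "Cancelled Members", "full member list"]:
    print(name, "->", status_for_sheet(name))
```

Note that `str.startswith` is case-sensitive, so a sheet named "full member list" would be marked Cancelled unless the name is normalised first.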

To check whether the sheet name starts with any one of "Full", "Member", or "List":

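# str.startswith accepts a tuple of prefixes, so one call tests every word.
# This assigns a single string, which pandas broadcasts to the whole column,
# and avoids comparing the Status column inside an if statement.
words = tuple("Full Member List".split(" "))
temp["Status"] = "Active" if ws.startswith(words) else "Cancelled"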
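The word-by-word check is looser than the full-string check, which may or may not be what you want. The sketch below contrasts the two on sample names; the helper names `status_any_word` and `status_full_phrase` and the sheet names are hypothetical.

```python
PHRASE = "Full Member List"


def status_full_phrase(sheet_name: str) -> str:
    """Active only when the name starts with the whole phrase."""
    return "Active" if sheet_name.startswith(PHRASE) else "Cancelled"


def status_any_word(sheet_name: str) -> str:
    """Active when the name starts with any single word of the phrase."""
    words = tuple(PHRASE.split(" "))  # ("Full", "Member", "List")
    return "Active" if sheet_name.startswith(words) else "Cancelled"


# Hypothetical sheet names, for illustration only
for name in ["Full Member List 001", "Member Roster", "PVCC"]:
    print(name, "->", status_full_phrase(name), "/", status_any_word(name))
```

A sheet called "Member Roster" counts as Active under the word-by-word check but as Cancelled under the full-string check.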

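I think you need to:

  • Fix the indentation of the line reading temp = df
  • Correct the typo "startwith" to startswith
  • Consider adding logic to ignore files whose names contain (updated headers)
  • Change the line you asked about to:
                temp['Status'] = "Active" if ws.startswith("Full Member List") else "Cancelled"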
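The file-filtering point can be sketched on its own. Without it, a second run of the script would pick up its own output files, because their names also start with "PSI". The helper name `should_process` and the file names below are assumptions for illustration.

```python
def should_process(file_name: str) -> bool:
    """Keep PSI input workbooks, skip files this script has already written."""
    return file_name.startswith("PSI") and "(updated headers)" not in file_name


# Hypothetical directory listing, for illustration only
files = ["PSI 001.xlsx", "PSI 001 (updated headers).xlsx", "notes.txt"]
print([f for f in files if should_process(f)])
```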

An updated version of your code looks like this:

import pandas as pd
import os
from openpyxl import load_workbook

cols_to_drop =  ['PSI ID','PSIvet Region','PSIvet region num','Fax','County','Ship state']              
column_name_update_map = {'Account name': 'Company Name','Billing address':'Address','Billing city':'City','Billing State':'State','Billing state':'State'} 

for file in os.listdir("."):
    if file.startswith("PSI") and "(updated headers)" not in file:
        dfs = pd.read_excel(file, sheet_name=None,skiprows=5)
        output = dict()
        for ws, df in dfs.items():
            if any(ws.startswith(x) for x in ["New Members", "PVCC"]):
                continue  
            temp = df
            temp['Status'] = "Active" if ws.startswith("Full Member List") else "Cancelled"   
            #drop unneeded columns
            temp = df.drop(cols_to_drop, errors="ignore", axis=1)
            #rename columns
            temp = temp.rename(columns=column_name_update_map)
            #drop empty columns
            temp = temp.dropna(how="all", axis=1)
            temp['Partner'] = "PSI"
            output[ws] = temp
        # The context manager saves and closes the workbook on exit
        # (ExcelWriter.save() was removed in pandas 2.0)
        with pd.ExcelWriter(f'{file.replace(".xlsx","")} (updated headers).xlsx') as writer:
            for ws, df in output.items():
                df.to_excel(writer, index=False, sheet_name=ws)

To test this, I created an xlsx file named PSI 001.xlsx containing a sheet named Full Member List 001, with the following content anchored at cell A1:

will skip
will skip
will skip
will skip
will skip
foo
1
2
3

The output was stored in a file named PSI 001 (updated headers).xlsx, in a sheet with the same name, with the following content anchored at cell A1:

foo Status  Partner
1   Active  PSI
2   Active  PSI
3   Active  PSI