KeyError : 0 - When looping through a dataframe

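Thanks in advance for any help with this. I'm very new to this and I really don't know what I'm doing. I've tried several ways of doing it, but I keep getting errors. I've tried iterrows, iloc, loc and so on, with no success. I don't understand how to take each row of data and send an email using the values in that row.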

Code:

email_list 
+---+-------------+-----------------+------+------------+---------------------------+---------------+----------------------------------+
|   | Client Name | Staff Name      | Role | Due Date   | Submission ID             | Staff Email   | Generated Due Docs ID            |
+---+-------------+-----------------+------+------------+---------------------------+---------------+----------------------------------+
| 1 | H.Pot       | JohannaNameLast | IP   | 2020-04-01 | H.POT-Johanna-IP-4/1/2020 | xyz@gmail.com | h.potjohannanamelastip2020-04-01 |
+---+-------------+-----------------+------+------------+---------------------------+---------------+----------------------------------+
| 2 | S.Man       | DaveSmith       | TS   | 2020-04-01 | S.MAN-David-TS-4/1/2020   | abc@gmail.com | s.mandabc2020-04-01              |
+---+-------------+-----------------+------+------------+---------------------------+---------------+----------------------------------+
| 3 | S.Man       | LouisLastName   | IP   | 2020-04-01 | S.MAN-Louis-IP-4/1/2020   | def@gmail.com | s.manlouislastnameip2020-04-01   |
+---+-------------+-----------------+------+------------+---------------------------+---------------+----------------------------------+
| 5 | T.Hul       | KellyDLastName  | IP   | 2020-04-01 | T.HUL-Kelly-IP-4/1/2020   | ghi@gmail.com | t.hulkelleydlastnameip2020-04-01 |
+---+-------------+-----------------+------+------------+---------------------------+---------------+----------------------------------+
# Get all the Names, Email Addresses, roles and due dates.
all_clients = email_list['Client Name']
all_staff = email_list['Staff Name']
all_roles = email_list['Role']
#all_types = email_list['Form Type']
all_due_dates = email_list['Due Date']
all_emails = email_list['Staff Email']

for idx in range(len(email_list)):
    # Get each records name, email, subject and message
    client = all_clients[idx]
    staff = all_staff[idx]
    role = all_roles[idx]
    #form_type = all_types[idx]
    due_date = all_due_dates[idx]
    email_address = all_emails[idx]

    # Get all the Names, Email Addresses, roles and due dates.
    subject = f"Your monthly summary was Due on {due_date} Days For {client.upper()}"
    message = f"Hi {staff.title()}, \n\nThe {form_type} is due in {due_date} days for {client.upper()}.  Please turn it in before the due date. \n\nThanks, \n\nJudy"


    full_email = ("From: {0} <{1}>\n"
                  "To: {2} <{3}>\n"
                  "Subject: {4}\n\n"
                  "{5}"
                  .format(your_name, your_email, staff, email_address, subject, message))
    # In the email field, you can add multiple other emails if you want
    # all of them to receive the same text
    try:
        server.sendmail(your_email, [email_address], full_email)
        print('Email to {} successfully sent!\n\n'.format(email_address))
    except Exception as e:
        print('Email to {} could not be sent :( because {}\n\n'.format(email_address, str(e)))

# Close the smtp server
server.close()

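email_list.iterrows() returns an iterator that yields each index label together with the row at that label in the DataFrame. So the iteration can be written like this: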

for idx, row in email_list.iterrows():
    # Get each records name, email, subject and message
    client = row['Client Name']
    staff = row['Staff Name']
    role = row['Role']
    #form_type = row['Form Type']
    due_date = row['Due Date']
    email_address = row['Staff Email']

    # Build the subject and message for this row.
    subject = f"Your monthly summary was Due on {due_date} Days For {client.upper()}"
    # form_type is commented out above, so role is used here instead (an assumption).
    message = f"Hi {staff.title()}, \n\nThe {role} is due in {due_date} days for {client.upper()}.  Please turn it in before the due date. \n\nThanks, \n\nJudy"

You can read more about this in the pandas.DataFrame.iterrows() documentation.
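As a minimal, self-contained sketch of this loop: the sample rows below are copied from the question's table, the helper name build_message is made up for illustration, and nothing is actually sent.

```python
import pandas as pd

# Sample data mirroring the question's table; note the index labels 1, 2, 3, 5.
email_list = pd.DataFrame(
    {
        "Client Name": ["H.Pot", "S.Man", "S.Man", "T.Hul"],
        "Staff Name": ["JohannaNameLast", "DaveSmith", "LouisLastName", "KellyDLastName"],
        "Role": ["IP", "TS", "IP", "IP"],
        "Due Date": ["2020-04-01"] * 4,
        "Staff Email": ["xyz@gmail.com", "abc@gmail.com", "def@gmail.com", "ghi@gmail.com"],
    },
    index=[1, 2, 3, 5],
)


def build_message(row):
    """Hypothetical helper: return (address, subject) for one row."""
    subject = f"Monthly summary due on {row['Due Date']} for {row['Client Name'].upper()}"
    return row["Staff Email"], subject


# iterrows() yields (index label, row as a Series); the label is unused here.
messages = [build_message(row) for _, row in email_list.iterrows()]

for address, subject in messages:
    print(address, "->", subject)
```

Because the loop never looks anything up by idx, it does not matter that the index starts at 1 or skips 4.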

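Extracting every column into its own variable and then fetching the corresponding element from each of those variables is a bad pattern.

Use the following pattern instead: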

for idx, row in email_list.iterrows():
    row.Role
    row['Staff Name']

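If you don't use idx, write _ in its place.

This variant should also be much faster than yours. The code above performs a single iteration (over the rows), whereas your code performs:

  • also a single iteration, over row numbers,
  • but then, on every pass, a separate lookup of the element with that particular index in each column.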

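Let's get back to my code sample. There are 2 ways to access an element of the current row:

  • row.Role - if the column name contains no "special" characters (e.g. spaces).
  • row['Staff Name'] - in other (more complex) cases.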
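A small sketch of both access styles, using a two-row DataFrame invented for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {"Role": ["IP", "TS"], "Staff Name": ["JohannaNameLast", "DaveSmith"]},
    index=[1, 2],
)

roles = []
names = []
for _, row in df.iterrows():
    roles.append(row.Role)           # attribute access: the name is a valid identifier
    names.append(row["Staff Name"])  # bracket access: the name contains a space

print(roles, names)
```

Bracket access is the safer default: attribute access also misbehaves when a column name collides with an existing Series attribute such as name or index.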

And now the reason why you get KeyError: 0.

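Note that:

  • the index of your rows starts at 1 (the leftmost column, the one without a header),
  • but in your loop idx starts at 0,
  • each "column variable" is actually accessed by index value, not by the integer position of the "wanted" element.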

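So the error occurs on the first pass of the loop, when:

  • idx == 0,
  • none of the column variables (each is actually a Series) contains an element with index == 0.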
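The failure can be reproduced in a few lines. The Series below stands in for one of the question's "column variables", with values and index labels taken from the question's table:

```python
import pandas as pd

# One column of the question's data, with its index labels 1, 2, 3, 5.
all_clients = pd.Series(["H.Pot", "S.Man", "S.Man", "T.Hul"], index=[1, 2, 3, 5])

try:
    all_clients[0]  # treated as the label 0, which is not in the index
    error = None
except KeyError as exc:
    error = exc

by_label = all_clients.loc[1]      # label 1 is the first row
by_position = all_clients.iloc[0]  # position 0 is also the first row

print(repr(error), by_label, by_position)
```

With an integer index, plain square brackets look the value up by label, so .iloc is the way to ask for "the first element" regardless of how the rows are labelled.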

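Pandas actually uses 2 different names here (key and index value) for the same thing, so it is debatable how readable this message is. There is nothing you can do about it. You just have to know it.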

Alternatively, if for some reason you want to keep the current version of your code, change only the for statement to:

for idx in range(1, len(email_list) + 1):
    ...

Then the loop starts from idx == 1 and should raise no error, as long as your index consists of consecutive numbers starting from 1.

But as I noticed, your index:

  • starts with 1, 2, 3 (so far so good),
  • but then has a "gap": there is no row with index 4.
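To illustrate what the gap does to the range(1, len(email_list) + 1) loop, here is a sketch on a single column with the question's index labels. The two workarounds at the end (.iloc and reset_index) are suggestions for illustration, not something the question's code already uses:

```python
import pandas as pd

all_clients = pd.Series(["H.Pot", "S.Man", "S.Man", "T.Hul"], index=[1, 2, 3, 5])

seen = []
missing = None
for idx in range(1, len(all_clients) + 1):  # tries the labels 1, 2, 3, 4
    try:
        seen.append(all_clients[idx])
    except KeyError:
        missing = idx  # label 4 is not in the index; label 5 is never tried

# Position-based access is unaffected by the gap.
by_position = [all_clients.iloc[i] for i in range(len(all_clients))]

# Alternatively, renumber the rows 0..n-1 so that labels and positions coincide.
renumbered = all_clients.reset_index(drop=True)

print(seen, missing, by_position, list(renumbered.index))
```

After reset_index(drop=True), the original range(len(...)) loop from the question would find every label it asks for.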