KeyError : 0 - When looping through a dataframe
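Thanks in advance for your help with this. I'm very new to this and I really don't know what I'm doing. I've tried several ways of doing this, but I keep getting errors. I've tried using iterrows, iloc, loc, and so on, with no success. I don't understand how to take each row of data and send an email using the values in that row.
Code: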
email_list
+---+-------------+-----------------+------+------------+---------------------------+---------------+----------------------------------+
| | Client Name | Staff Name | Role | Due Date | Submission ID | Staff Email | Generated Due Docs ID |
+---+-------------+-----------------+------+------------+---------------------------+---------------+----------------------------------+
| 1 | H.Pot | JohannaNameLast | IP | 2020-04-01 | H.POT-Johanna-IP-4/1/2020 | xyz@gmail.com | h.potjohannanamelastip2020-04-01 |
+---+-------------+-----------------+------+------------+---------------------------+---------------+----------------------------------+
| 2 | S.Man | DaveSmith | TS | 2020-04-01 | S.MAN-David-TS-4/1/2020 | abc@gmail.com | s.mandabc2020-04-01 |
+---+-------------+-----------------+------+------------+---------------------------+---------------+----------------------------------+
| 3 | S.Man | LouisLastName | IP | 2020-04-01 | S.MAN-Louis-IP-4/1/2020 | def@gmail.com | s.manlouislastnameip2020-04-01 |
+---+-------------+-----------------+------+------------+---------------------------+---------------+----------------------------------+
| 5 | T.Hul | KellyDLastName | IP | 2020-04-01 | T.HUL-Kelly-IP-4/1/2020 | ghi@gmail.com | t.hulkelleydlastnameip2020-04-01 |
+---+-------------+-----------------+------+------------+---------------------------+---------------+----------------------------------+
# Get all the Names, Email Addresses, roles and due dates.
all_clients = email_list['Client Name']
all_staff = email_list['Staff Name']
all_roles = email_list['Role']
#all_types = email_list['Form Type']
all_due_dates = email_list['Due Date']
all_emails = email_list['Staff Email']
for idx in range(len(email_list)):
    # Get each records name, email, subject and message
    client = all_clients[idx]
    staff = all_staff[idx]
    role = all_roles[idx]
    #form_type = all_types[idx]
    due_date = all_due_dates[idx]
    email_address = all_emails[idx]
    # Get all the Names, Email Addresses, roles and due dates.
    subject = f"Your monthly summary was Due on {due_date} Days For {client.upper()}"
    message = f"Hi {staff.title()}, \n\nThe {form_type} is due in {due_date} days for {client.upper()}. Please turn it in before the due date. \n\nThanks, \n\nJudy"
    full_email = ("From: {0} <{1}>\n"
                  "To: {2} <{3}>\n"
                  "Subject: {4}\n\n"
                  "{5}"
                  .format(your_name, your_email, staff, email_address, subject, message))
    # In the email field, you can add multiple other emails if you want
    # all of them to receive the same text
    try:
        server.sendmail(your_email, [email_address], full_email)
        print('Email to {} successfully sent!\n\n'.format(email_address))
    except Exception as e:
        print('Email to {} could not be sent :( because {}\n\n'.format(email_address, str(e)))
# Close the smtp server
server.close()
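email_list.iterrows() returns an iterator that yields each index label together with the row at that label in the DataFrame. So the iteration can be written like this: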
for idx, row in email_list.iterrows():
    # Get each records name, email, subject and message
    client = row['Client Name']
    staff = row['Staff Name']
    role = row['Role']
    #form_type = row['Form Type']
    due_date = row['Due Date']
    email_address = row['Staff Email']
    # Get all the Names, Email Addresses, roles and due dates.
    subject = f"Your monthly summary was Due on {due_date} Days For {client.upper()}"
    # form_type must be defined before this line (its assignment above is commented out)
    message = f"Hi {staff.title()}, \n\nThe {form_type} is due in {due_date} days for {client.upper()}. Please turn it in before the due date. \n\nThanks, \n\nJudy"
You can read more about pandas.DataFrame.iterrows() in the pandas documentation.
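Extracting all the columns and then retrieving the respective element from each such variable (each holding one column) is a wrong pattern.
Use the following pattern instead: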
for idx, row in email_list.iterrows():
    row.Role
    row['Staff Name']
If you don't use idx, write _ in its place.
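This variant is much faster than yours. The code above performs a single iteration (over rows), whereas your code performs:
- also a single iteration, over row numbers,
- but then, in each column, n lookups of a single element with a particular index value.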
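Let's get back to my code sample. There are 2 variants to access an element of the current row:
- row.Role - if the column name contains no "special" characters (e.g. spaces),
- row['Staff Name'] - in other (more complex) cases.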
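A minimal sketch of the two access styles, assuming a hypothetical two-row sample that only borrows the column names from the question:

```python
import pandas as pd

# Hypothetical two-row sample mirroring the question's column names.
email_list = pd.DataFrame(
    {"Staff Name": ["JohannaNameLast", "DaveSmith"], "Role": ["IP", "TS"]},
    index=[1, 2],
)

pairs = []
for _, row in email_list.iterrows():
    # Attribute access works for "Role"; "Staff Name" contains a space,
    # so it has to be read with bracket access.
    pairs.append((row.Role, row["Staff Name"]))

print(pairs)
```

Bracket access works for every column name, so it is the safer default when in doubt.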
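And now the reason why you got KeyError: 0.
Note that:
- the index of your rows starts from 1 (the leftmost column, the one without a title),
- but in your loop idx starts from 0,
- access to each "column variable" is actually by index value, not by the integer position of the "wanted" element.
So the error occurs in the first turn of the loop, when:
- you have idx == 0,
- no column variable (each is actually a Series) contains an element with index == 0.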
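Actually, pandas uses 2 different names here (key and index value) for the same thing, so it is debatable how readable this message is. There is nothing you can do about it. You just have to know it.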
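The lookup that fails can be reproduced in isolation. This is a sketch under assumptions: the Series below is a hypothetical stand-in for one "column variable", carrying the same index labels as the table in the question.

```python
import pandas as pd

# Hypothetical stand-in for one "column variable" from the question:
# a Series whose index labels are 1, 2, 3, 5 (there is no label 0).
all_clients = pd.Series(["H.Pot", "S.Man", "S.Man", "T.Hul"], index=[1, 2, 3, 5])

try:
    all_clients[0]  # with an integer index, [] looks up the label 0
    error = None
except KeyError as exc:
    error = exc

print(type(error).__name__)

first_by_label = all_clients.loc[1]      # lookup by index label
first_by_position = all_clients.iloc[0]  # lookup by integer position
print(first_by_label, first_by_position)
```

Writing .loc or .iloc explicitly makes it clear at the call site whether a label or a position is meant.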
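Alternatively, if for some reason you want to keep the current version of your code, change only the for statement to:
for idx in range(1, len(email_list) + 1):
    ...
Then the loop starts from idx == 1 and should not fail, as long as your index consists of consecutive numbers starting from 1.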
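But as I noticed, your index:
- starts with 1, 2 and 3 (so far so good),
- but then there is a "gap": you have no row with index 4.
So on your data this range-based loop would still fail, this time with KeyError: 4.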
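The effect of the gap can be checked with a short sketch. The frame below is a hypothetical reduction of the question's data to a single column, with the same index labels (1, 2, 3, 5). It also shows two ways to loop that do not depend on the labels being consecutive; these are suggestions, not part of the original answer.

```python
import pandas as pd

# Hypothetical one-column reduction of the question's table.
email_list = pd.DataFrame(
    {"Client Name": ["H.Pot", "S.Man", "S.Man", "T.Hul"]},
    index=[1, 2, 3, 5],
)
all_clients = email_list["Client Name"]

# The range-based loop visits labels 1, 2, 3, 4 and stops at the missing one.
failed_at = None
for idx in range(1, len(email_list) + 1):
    try:
        all_clients[idx]
    except KeyError:
        failed_at = idx
        break

# Option 1: positional access, which ignores the index labels entirely.
by_position = [all_clients.iloc[pos] for pos in range(len(email_list))]

# Option 2: iterate over the labels that actually exist.
by_label = [all_clients.loc[label] for label in email_list.index]

print(failed_at, by_position, by_label)
```

Calling email_list.reset_index(drop=True) before the loop is another option: it replaces the labels with 0, 1, 2, ... so the original range(len(email_list)) loop would work unchanged.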