How can I pair every string in list x with an item in list y even when x is longer than y? I want the extra x values paired with None
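I'm trying to set up a way to match a list of emails with a list of names as tuples. However, once it reaches the last name, the emails that have no name to pair with are left out of my tuples. How can I have these extra emails simply paired with an empty string ("")?
Basically, I have rows in this Excel format, which I load into a pandas DataFrame: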
cust_ID | buyer_names | buyer_emails |
---|---|---|
1234 | name 1; name 2; name 3 | email1; email2; email3; email 4 |
..... | ..... | ...... |
I tried this:

    import re

    import pandas as pd

    # Regular expression to catch emails (the dot before the TLD is escaped
    # and the local part must be non-empty)
    regex = r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]+"
    # Initialise empty list to collect query-ready emails
    emails_query_format = []
    # df is the DataFrame loaded from the Excel sheet.
    # Iterate over the retailer_id / emails template rows and append formatted emails to the list
    for i, row in df.iterrows():
        # Put all emails in the row into a list
        emails = re.findall(regex, row['additional_emails'])
        emails = [email.strip() for email in emails]
        # Put all additional buyers into a list
        buyer_names = row['additional_buyers']
        buyers = re.split(r";", buyer_names)
        buyers = [buyer.strip() for buyer in buyers]
        # Pair each email with a buyer name
        buyer_email_tuple = [*zip(emails, buyers)]
Finally, I iterate over these tuples and put each pair into the query format, like so:

        # Still inside the loop over rows, since this uses row['retailer_id'].
        # For each pair, create a row in the query format
        for email, buyer in buyer_email_tuple:
            # Put the pair into the specific format to copy-paste into the query template
            query_format = f"({row['retailer_id']},'{buyer}','{email}'),"
            emails_query_format.append(query_format)

    # New DataFrame holding the query-ready emails
    query_df = pd.DataFrame(emails_query_format, columns=['query_ready'])
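This way, the tuples do not include the extra 'email4'. I thought of the containers in the collections module, but I don't really see a clear way to use defaultdict for this.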
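The truncation can be reproduced in isolation. Here is a minimal sketch with placeholder values (the names and addresses below are made up for illustration and are not taken from the real sheet):

```python
# Placeholder data mirroring the example row: four emails, three names
emails = [
    "email1@example.com",
    "email2@example.com",
    "email3@example.com",
    "email4@example.com",
]
buyers = ["name 1", "name 2", "name 3"]

# zip() stops as soon as the shorter input is exhausted
pairs = list(zip(emails, buyers))

print(len(pairs))  # 3
print(pairs[-1])   # ('email3@example.com', 'name 3')
```

The fourth email never makes it into `pairs`, which is the behaviour I see in my real data.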
How can I make the tuples include email4 and pair it with just "" as the name?
Thanks in advance.
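Solved the problem:

    buyer_emails_tuple_list = []
    for idx in range(len(emails)):
        if idx < len(buyers):
            # A buyer name exists at this position, so pair it with the email
            buyer_emails_tuple_list.append((buyers[idx], emails[idx]))
        else:
            # No buyer name left for this email, so pair it with ""
            buyer_emails_tuple_list.append(("", emails[idx]))

Now I can make sure that emails without a corresponding buyer name are paired with an empty string, like so:

    ("", email4)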
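For what it's worth, the standard library's `itertools.zip_longest` appears to cover the same case without manual index checks. This is a sketch rather than a drop-in replacement: it assumes `buyers` and `emails` are the per-row lists built earlier, and the sample values here are placeholders.

```python
from itertools import zip_longest

# Placeholder data: three names, four emails
buyers = ["name 1", "name 2", "name 3"]
emails = [
    "email1@example.com",
    "email2@example.com",
    "email3@example.com",
    "email4@example.com",
]

# zip_longest keeps going until the longer input is exhausted,
# padding the shorter one with fillvalue
buyer_emails_tuple_list = list(zip_longest(buyers, emails, fillvalue=""))

print(buyer_emails_tuple_list[-1])  # ('', 'email4@example.com')
```

One difference from the index loop: if a row ever had more names than emails, `zip_longest` would pad the email slot with "" instead of dropping the extra names. Leaving out `fillvalue` pads with `None`, which matches the pairing asked for in the title.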