How to write a DRY Python for loop
I have a cannabis dataset with an "Effects" column, and I'm trying to add a binary "nice_buds" column for strains that don't include certain effects. This is the code:
nice_buds = []
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]
for row in sample["Effects"]:
    if "Sleepy" not in row and "Hungry" not in row and "Giggly" not in row and "Tingly" not in row and "Aroused" not in row and "Talkative" not in row:
        nice_buds.append(1)
    else:
        nice_buds.append(0)
sample["nice_buds"] = nice_buds
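As of now, the undesired_effects list does nothing, and the code works fine in terms of giving me the output I want.
My question is whether there is a more "Pythonic" or "DRY" way to go about this problem...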
You can use all() with a generator expression to simplify the if statement:
nice_buds = []
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]
for row in sample["Effects"]:
    if all(effect not in row for effect in undesired_effects):
        nice_buds.append(1)
    else:
        nice_buds.append(0)
sample["nice_buds"] = nice_buds
Or use any() and check whether an undesired effect is present:
nice_buds = []
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]
for row in sample["Effects"]:
    if any(effect in row for effect in undesired_effects):
        nice_buds.append(0)
    else:
        nice_buds.append(1)
sample["nice_buds"] = nice_buds
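Since both versions only append 1 or 0 depending on a boolean, the loop could also be collapsed into a list comprehension. This is a minimal sketch, not part of the answer above: it assumes the Effects values are plain strings, the `effects` list is made-up stand-in data for sample["Effects"], and `is_nice` is a hypothetical helper name.

```python
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]

# Made-up stand-in for sample["Effects"]; the real column comes from the dataset.
effects = ["Happy,Relaxed", "Sleepy,Happy", "Creative,Giggly", "Uplifted"]


def is_nice(row):
    """Return 1 if none of the undesired effects appears in row, else 0."""
    # int() turns the boolean into the 1/0 flag used in the question.
    return int(not any(effect in row for effect in undesired_effects))


nice_buds = [is_nice(row) for row in effects]
print(nice_buds)  # [1, 0, 0, 1]
```

With the dataframe, the same comprehension would iterate over sample["Effects"], and the resulting list could be assigned to sample["nice_buds"] as before.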
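Given a dataframe sample:
- Use np.where
- Use pandas.Series.str.contains
- The strings may be upper or lower case, so it is best to force a single case, because Giggly != giggly
for row in sample["Effects"] tells me you are working with a dataframe. You generally should not use a for-loop to iterate through a dataframe; vectorized methods are usually faster and clearer.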
import pandas as pd
import numpy as np
# create dataframe
data = {'Effects': ['I feel great', 'I feel sleepy', 'I fell hungry', 'I feel giggly', 'I feel tingly', 'I feel aroused', 'I feel talkative']}
sample = pd.DataFrame(data)
| | Effects |
|---:|:-----------------|
| 0 | I feel great |
| 1 | I feel sleepy |
| 2 | I fell hungry |
| 3 | I feel giggly |
| 4 | I feel tingly |
| 5 | I feel aroused |
| 6 | I feel talkative |
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]
# words should be 1 case for matching, lower in this instance
undesired_effects = [effect.lower() for effect in undesired_effects]
# values to match as string with | (or)
match_vals = '|'.join(undesired_effects)
# create the nice buds column
sample['nice buds'] = np.where(sample['Effects'].str.lower().str.contains(match_vals), 0, 1)
display(sample)
| | Effects | nice buds |
|---:|:-----------------|------------:|
| 0 | I feel great | 1 |
| 1 | I feel sleepy | 0 |
| 2 | I fell hungry | 0 |
| 3 | I feel giggly | 0 |
| 4 | I feel tingly | 0 |
| 5 | I feel aroused | 0 |
| 6 | I feel talkative | 0 |
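Note that str.contains treats match_vals as a regular expression by default, which is why joining the words with | behaves as "or". The same pattern can be tried with the standard re module. This is a sketch under assumptions: re.escape and re.IGNORECASE are additions that do not appear in the answer above (escaping guards against effect names containing regex metacharacters, and the flag stands in for the explicit lower-casing), and the `texts` list reuses a few rows from the example data.

```python
import re

undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]

# Escape each word so regex metacharacters are matched literally,
# then join with | so the pattern matches if any one word is present.
pattern = re.compile("|".join(re.escape(e) for e in undesired_effects), re.IGNORECASE)

# A few rows copied from the example dataframe.
texts = ["I feel great", "I feel sleepy", "I fell hungry"]

# 0 when an undesired effect is found, 1 otherwise (same convention as above).
nice_buds = [0 if pattern.search(text) else 1 for text in texts]
print(nice_buds)  # [1, 0, 0]
```

In pandas, str.contains also accepts case=False, which may be an alternative to lower-casing both the column and the word list.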