How to write a DRY Python for loop

Question

I have a cannabis dataset with an "Effects" column, and I'm trying to add a binary "nice_buds" column for strains that don't include certain effects. This is the code:

nice_buds = []
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]

for row in sample["Effects"]:
    if "Sleepy" not in row and "Hungry" not in row and "Giggly" not in row and "Tingly" not in row and "Aroused" not in row and "Talkative" not in row:
        nice_buds.append(1)
    else:
        nice_buds.append(0)

sample["nice_buds"] = nice_buds

As it stands, the undesired_effects list does nothing, and the code works fine in terms of giving me the output I want.

My question is whether there is a more "Pythonic" or "DRY" way to approach this problem...

Answer 1

You can use all() with a generator expression to simplify the if statement:

nice_buds = []
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]

for row in sample["Effects"]:
    if all(effect not in row for effect in undesired_effects):
        nice_buds.append(1)
    else:
        nice_buds.append(0)

sample["nice_buds"] = nice_buds

Or use any() and check whether any undesired effect is present:

nice_buds = []
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]

for row in sample["Effects"]:
    if any(effect in row for effect in undesired_effects):
        nice_buds.append(0)
    else:
        nice_buds.append(1)

sample["nice_buds"] = nice_buds
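
Either loop can also collapse into a single list comprehension, because int(True) is 1 and int(False) is 0. The following is a minimal sketch on plain Python strings rather than the asker's dataframe; the helper name `flag_nice_buds` and the sample rows are made up for illustration.

```python
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]


def flag_nice_buds(rows, undesired):
    """Return 1 for each row containing none of the undesired effects, else 0."""
    return [int(not any(effect in row for effect in undesired)) for row in rows]


# Hypothetical rows standing in for sample["Effects"]
rows = ["Happy,Relaxed,Euphoric", "Sleepy,Happy", "Creative,Giggly", "Focused,Uplifted"]
nice_buds = flag_nice_buds(rows, undesired_effects)
print(nice_buds)  # [1, 0, 0, 1]
```

With a dataframe, the result could be assigned the same way as in the loops above, for example sample["nice_buds"] = flag_nice_buds(sample["Effects"], undesired_effects).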

Answer 2

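Given a dataframe `sample`:

Use np.where
Use pandas.Series.str.contains
The strings may be upper or lower case, so it is better to force a single case, because Giggly != giggly
for row in sample["Effects"] tells me you are using a dataframe. You should avoid iterating through a dataframe with a for loop when a vectorized operation is available.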

import pandas as pd
import numpy as np

# create dataframe
data = {'Effects': ['I feel great', 'I feel sleepy', 'I fell hungry', 'I feel giggly', 'I feel tingly', 'I feel aroused', 'I feel talkative']}

sample = pd.DataFrame(data)

|    | Effects          |
|---:|:-----------------|
|  0 | I feel great     |
|  1 | I feel sleepy    |
|  2 | I fell hungry    |
|  3 | I feel giggly    |
|  4 | I feel tingly    |
|  5 | I feel aroused   |
|  6 | I feel talkative |

undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]

# use a single case for matching (lowercase in this instance)
undesired_effects = [effect.lower() for effect in undesired_effects]

# join the values to match into one regex pattern separated by | (or)
match_vals = '|'.join(undesired_effects)

# create the nice buds column
sample['nice buds'] = np.where(sample['Effects'].str.lower().str.contains(match_vals), 0, 1)

`display(sample)`

|    | Effects          |   nice buds |
|---:|:-----------------|------------:|
|  0 | I feel great     |           1 |
|  1 | I feel sleepy    |           0 |
|  2 | I fell hungry    |           0 |
|  3 | I feel giggly    |           0 |
|  4 | I feel tingly    |           0 |
|  5 | I feel aroused   |           0 |
|  6 | I feel talkative |           0 |
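
The same result can be obtained without NumPy by negating the boolean mask and casting it to integers. The sketch below also passes `case=False` to `str.contains` instead of lowercasing both sides, and escapes each word with `re.escape` in case a word ever contains a regex metacharacter. Both are optional variations and not part of the answer above; the sample rows are invented, and the column name `nice_buds` follows the question.

```python
import re

import pandas as pd

# Hypothetical rows with mixed case, for illustration only
sample = pd.DataFrame({
    "Effects": ["I feel great", "I feel sleepy", "I feel Hungry", "I feel GIGGLY"]
})
undesired_effects = ["Sleepy", "Hungry", "Giggly", "Tingly", "Aroused", "Talkative"]

# Build one regex alternation: Sleepy|Hungry|...
pattern = "|".join(re.escape(effect) for effect in undesired_effects)

# True where any undesired effect appears, ignoring case
has_undesired = sample["Effects"].str.contains(pattern, case=False, regex=True)

# Invert the mask and store it as 0/1
sample["nice_buds"] = (~has_undesired).astype(int)
print(sample["nice_buds"].tolist())  # [1, 0, 0, 0]
```

If the "Effects" column can contain missing values, `str.contains` also accepts an `na` argument to decide how those rows are treated.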
