How to avoid coding an if statement twice

Question

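I mostly program on my own, so nobody reviews my code, and I think I have picked up a bunch of bad habits.

The code I've pasted here works, but I'd like to hear some other solutions.

I create a dictionary called teams_shots. I iterate over a pandas DataFrame in which each row holds an away team name and a home team name. I want to keep track of the shots of every team that appears in the DataFrame. That is why I check whether home_team_name or away_team_name has no entry in the dictionary yet, and create one if so.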

for index,match in df.iterrows():
    if match['home_team_name'] not in teams_shots:
        #we have to setup an entry in the dictionary
        teams_shots[match['home_team_name']]=[]
        teams_shots[match['home_team_name']].append(match['home_team_shots'])
        home_shots_avg.append(None)
    else:
        home_shots_avg.append(np.mean(teams_shots[match['home_team_name']]))
        teams_shots[match['home_team_name']].append(match['home_team_shots'])

    if match['away_team_name'] not in teams_shots:
        teams_shots[match['away_team_name']]=[]
        teams_shots[match['away_team_name']].append(match['away_team_shots'])
        away_shots_avg.append(None)
    else:
        away_shots_avg.append(np.mean(teams_shots[match['away_team_name']])) 
        teams_shots[match['away_team_name']].append(match['away_team_shots'])

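As you can see, almost the same code is written twice, which is not a sign of good programming. I considered using an or operator in the if statement, but then one of the two teams might already have an entry, and I would overwrite it. Any ideas on how to write this code better?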

Answer 1

In this case, I think an extra for loop should do the trick:

for index, match in df.iterrows():
    # Each tuple: name column, shots column, list that collects the averages
    for name, shots, averages in (
            ('home_team_name', 'home_team_shots', home_shots_avg),
            ('away_team_name', 'away_team_shots', away_shots_avg)):
        team = match[name]
        if team not in teams_shots:
            # we have to set up an entry in the dictionary
            teams_shots[team] = [match[shots]]
            averages.append(None)
        else:
            averages.append(np.mean(teams_shots[team]))
            teams_shots[team].append(match[shots])

But there may well be a way to handle this in a vectorized manner.
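One possible vectorized version is sketched below. It is an illustration rather than a drop-in replacement: the function name prior_shot_averages and the sample data are made up for this example, and only the four column names come from the question. It reshapes the matches into one row per team appearance and uses groupby with cumsum and cumcount to get each team's average over its earlier matches. Note that a first appearance yields NaN here instead of None.

```python
import numpy as np
import pandas as pd


def prior_shot_averages(df):
    """Return (home_avg, away_avg): each team's mean shots over its earlier matches."""
    n = len(df)
    # Long format: one row per team appearance, tagged with match position and side.
    home = pd.DataFrame({'match': np.arange(n), 'side': 'home',
                         'team': df['home_team_name'].to_numpy(),
                         'shots': df['home_team_shots'].to_numpy()})
    away = pd.DataFrame({'match': np.arange(n), 'side': 'away',
                         'team': df['away_team_name'].to_numpy(),
                         'shots': df['away_team_shots'].to_numpy()})
    long = pd.concat([home, away], ignore_index=True)
    long = long.sort_values('match', kind='stable').reset_index(drop=True)

    grouped = long.groupby('team')['shots']
    prior_sum = grouped.cumsum() - long['shots']   # shots before this appearance
    prior_count = grouped.cumcount()               # appearances before this one
    # Dividing by NaN where there is no history gives NaN for a first appearance.
    long['avg'] = prior_sum / prior_count.where(prior_count > 0)

    wide = long.pivot(index='match', columns='side', values='avg')
    return wide['home'], wide['away']


# Hypothetical sample data for illustration only.
df = pd.DataFrame({'home_team_name': ['A', 'B', 'A'],
                   'away_team_name': ['B', 'A', 'C'],
                   'home_team_shots': [10, 6, 12],
                   'away_team_shots': [4, 8, 2]})
home_avg, away_avg = prior_shot_averages(df)
```

The two returned Series are indexed by match position, so they can be assigned back with df['home_shots_avg'] = home_avg.to_numpy() if the DataFrame uses a different index.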

Answer 2

I would use get as a quick lookup. It does not raise a KeyError, and its default return value of None acts as False in a truthiness test:
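As a small illustration of that lookup behaviour (the team names and shot counts here are invented for the example), note that get returns a falsy value both for a missing key and for a key whose list is empty:

```python
# Hypothetical dictionary state for illustration only.
teams_shots = {"A": [10, 8], "B": []}

present = teams_shots.get("A")   # the stored list
empty = teams_shots.get("B")     # key exists, but its list is empty
missing = teams_shots.get("C")   # no such key: returns None, no KeyError

# A missing key and an empty list are both falsy,
# so `not teams_shots.get(team)` is true in both cases.
first_time = [team for team in ("A", "B", "C") if not teams_shots.get(team)]
```

Here first_time ends up containing "B" and "C", which is the behaviour wanted when an empty history should be treated like no history.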

for index, match in df.iterrows():
    # Parentheses are required for a tuple that spans several lines
    home, away, home_shots, away_shots = (match['home_team_name'],
                                          match['away_team_name'],
                                          match['home_team_shots'],
                                          match['away_team_shots'])

    if not teams_shots.get(home):
        # No need to separately allocate the list
        teams_shots[home] = [home_shots]
        home_shots_avg.append(None)
    else:
        home_shots_avg.append(np.mean(teams_shots[home]))
        teams_shots[home].append(home_shots)

    if not teams_shots.get(away):
        teams_shots[away] = [away_shots]
        away_shots_avg.append(None)
    else:
        away_shots_avg.append(np.mean(teams_shots[away]))
        teams_shots[away].append(away_shots)
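The remaining duplication between the home and away branches could also be moved into a small helper. The following is only a sketch under some assumptions: the helper name prior_average_then_record is made up, the matches are plain tuples instead of DataFrame rows, and collections.defaultdict replaces the explicit "create an entry" step.

```python
from collections import defaultdict
from statistics import mean


def prior_average_then_record(teams_shots, team, shots):
    """Return the team's mean shots so far (None if no history), then store `shots`."""
    history = teams_shots[team]   # defaultdict creates [] on first access
    average = mean(history) if history else None
    history.append(shots)
    return average


teams_shots = defaultdict(list)
home_shots_avg, away_shots_avg = [], []

# Hypothetical sample rows: (home team, away team, home shots, away shots)
matches = [("A", "B", 10, 4), ("B", "A", 6, 8), ("A", "C", 12, 2)]
for home, away, home_shots, away_shots in matches:
    home_shots_avg.append(prior_average_then_record(teams_shots, home, home_shots))
    away_shots_avg.append(prior_average_then_record(teams_shots, away, away_shots))
```

With a DataFrame, the same helper would be called inside the df.iterrows() loop with match['home_team_name'] and match['home_team_shots'], and likewise for the away columns.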

dictionary

code-duplication

logical-operators

dataframe

pandas