Create an array from a data frame based on conditions
I have a sample data frame:
import pandas as pd

df = pd.DataFrame({
    'degree_awarded': ['yes', 'no', 'yes', 'yes', 'yes', 'yes', 'yes', 'no'],
    'avg_score': [78, 87, 94, 55, 68, 76, 78, 8],
})
degree_awarded | avg_score |
---|---|
yes | 78 |
no | 87 |
yes | 94 |
yes | 55 |
etc. | etc. |
I want to split the 'degree_awarded' column into 'degree_awarded' and 'no_degree_awarded' arrays holding the associated scores, e.g.
degree_awarded: [78, 94, 55, etc.]
no_degree_awarded: [87, etc.]
But I can't figure out how to do this.
Any help would be appreciated. Thank you for your time.
listScoreAwarded=list(df[df['degree_awarded']=='yes']['avg_score'])
listScoreNotAwarded=list(df[df['degree_awarded']=='no']['avg_score'])
Both of these lists should work.
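Since the question asks for arrays rather than lists, the same boolean-mask idea can be sketched with `.loc` and `.to_numpy()`. This is a minimal sketch, not the answerer's code: the names `awarded_mask`, `scores_awarded`, and `scores_not_awarded` are illustrative, and negating the mask with `~` assumes the column contains only 'yes' and 'no'.

```python
import pandas as pd

df = pd.DataFrame({
    'degree_awarded': ['yes', 'no', 'yes', 'yes', 'yes', 'yes', 'yes', 'no'],
    'avg_score': [78, 87, 94, 55, 68, 76, 78, 8],
})

# Boolean mask: True for rows where a degree was awarded
awarded_mask = df['degree_awarded'] == 'yes'

# .loc selects the rows by mask and the column in a single step;
# ~awarded_mask assumes every other row is 'no' (an assumption here)
scores_awarded = df.loc[awarded_mask, 'avg_score'].to_numpy()
scores_not_awarded = df.loc[~awarded_mask, 'avg_score'].to_numpy()

print(scores_awarded.tolist())      # [78, 94, 55, 68, 76, 78]
print(scores_not_awarded.tolist())  # [87, 8]
```

If the column could hold other values (or missing data), comparing against 'no' explicitly would be the safer choice.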
You can assign the labels you want, then use groupby.agg(list).
As a Series:
(df
 .assign(group=df['degree_awarded'].map({'yes': 'degree_awarded',
                                         'no': 'no_degree_awarded'}))
 .groupby('group')['avg_score'].agg(list)
)
Output:
group
degree_awarded       [78, 94, 55, 68, 76, 78]
no_degree_awarded                     [87, 8]
Name: avg_score, dtype: object
As a dictionary:
(df
 .assign(group=df['degree_awarded'].map({'yes': 'degree_awarded',
                                         'no': 'no_degree_awarded'}))
 .groupby('group')['avg_score'].agg(list)
 .to_dict()
)
Output: {'degree_awarded': [78, 94, 55, 68, 76, 78], 'no_degree_awarded': [87, 8]}
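If NumPy arrays are preferred over lists, one possible variation is to iterate over the groups directly and convert each one. This is only a sketch under the same yes/no assumption; the `labels` and `arrays` names are illustrative and not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'degree_awarded': ['yes', 'no', 'yes', 'yes', 'yes', 'yes', 'yes', 'no'],
    'avg_score': [78, 87, 94, 55, 68, 76, 78, 8],
})

# Map the raw column values to the desired output names
labels = {'yes': 'degree_awarded', 'no': 'no_degree_awarded'}

# Iterating over a grouped Series yields (key, Series) pairs;
# each group's scores are converted to a NumPy array
arrays = {
    labels[key]: group.to_numpy()
    for key, group in df.groupby('degree_awarded')['avg_score']
}

print(arrays['degree_awarded'].tolist())     # [78, 94, 55, 68, 76, 78]
print(arrays['no_degree_awarded'].tolist())  # [87, 8]
```

A value in the column that is missing from `labels` would raise a `KeyError` here, which may or may not be the behaviour you want.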