How to calculate specific cell values in Python pandas?
I have a pandas DataFrame as follows:
import pandas as pd

raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'age': [20, 19, 22, 21],
            'favorite_color': ['blue', 'blue', 'yellow', "green"],
            'grade': [88, 92, 95, 70]}
df = pd.DataFrame(raw_data)
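I want to divide the numeric values in the age and grade columns by 125.0 in rows where favorite_color is blue, by 130.0 where it is yellow, and by 135.0 where it is green. The results must be stored in new columns age_new and grade_new.
The code below raises an error.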
df['age_new'] =(df.loc[df['favorite_color']=='blue']/125.0)
df['age_new'] =(df.loc[df['favorite_color']=='yellow']/130.0)
df['age_new'] =(df.loc[df['favorite_color']=='green']/135.0)
df['grade_new'] =(df.loc[df['favorite_color']=='blue']/125.0)
df['grade_new'] =(df.loc[df['favorite_color']=='yellow']/130.0)
df['grade_new'] =(df.loc[df['favorite_color']=='green']/135.0)
Error:
TypeError: unsupported operand type(s) for /: 'str' and 'int'
Use map to look up the divisor for each row:
mods = {'blue': 125, 'yellow': 130, 'green': 135}
df.assign(
    mods=df.favorite_color.map(mods),
    age_new=lambda d: d.age / d.mods,
    grade_new=lambda d: d.grade / d.mods
)
               name  age favorite_color  grade  mods   age_new  grade_new
0    Willard Morris   20           blue     88   125  0.160000   0.704000
1       Al Jennings   19           blue     92   125  0.152000   0.736000
2      Omar Mullins   22         yellow     95   130  0.169231   0.730769
3  Spencer McDaniel   21          green     70   135  0.155556   0.518519
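Note that assign returns a new DataFrame and leaves df untouched, and the output above still carries the helper column mods. A minimal sketch of one way to keep the result and drop the helper, assuming you do not need mods afterwards (the name result is only illustrative):

```python
import pandas as pd

raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'age': [20, 19, 22, 21],
            'favorite_color': ['blue', 'blue', 'yellow', 'green'],
            'grade': [88, 92, 95, 70]}
df = pd.DataFrame(raw_data)
mods = {'blue': 125, 'yellow': 130, 'green': 135}

# assign does not modify df in place, so bind its return value to a name.
result = (
    df.assign(
        mods=df.favorite_color.map(mods),      # per-row divisor
        age_new=lambda d: d.age / d.mods,
        grade_new=lambda d: d.grade / d.mods,
    )
    .drop(columns='mods')                      # helper column no longer needed
)
print(result)
```

If you would rather modify df itself, reassign the result back to df.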
Similarly, with div and join:
mods = {'blue': 125, 'yellow': 130, 'green': 135}
df.join(df[['age', 'grade']].div(df.favorite_color.map(mods), axis=0).add_suffix('_new'))
               name  age favorite_color  grade   age_new  grade_new
0    Willard Morris   20           blue     88  0.160000   0.704000
1       Al Jennings   19           blue     92  0.152000   0.736000
2      Omar Mullins   22         yellow     95  0.169231   0.730769
3  Spencer McDaniel   21          green     70  0.155556   0.518519
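For reference, the original attempt fails because df.loc[mask] selects whole rows, including the string columns name and favorite_color, so the division is applied to strings as well. If you want to stay with .loc, a minimal sketch is to select only the numeric column on the right-hand side and assign to the same rows on the left. The names divisors and mask are illustrative, not from the question:

```python
import pandas as pd

raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'age': [20, 19, 22, 21],
            'favorite_color': ['blue', 'blue', 'yellow', 'green'],
            'grade': [88, 92, 95, 70]}
df = pd.DataFrame(raw_data)

divisors = {'blue': 125.0, 'yellow': 130.0, 'green': 135.0}

# Create the target columns up front so each pass only fills its own rows.
df['age_new'] = float('nan')
df['grade_new'] = float('nan')

for color, divisor in divisors.items():
    mask = df['favorite_color'] == color
    # Restrict both sides to the masked rows and to a single numeric column.
    df.loc[mask, 'age_new'] = df.loc[mask, 'age'] / divisor
    df.loc[mask, 'grade_new'] = df.loc[mask, 'grade'] / divisor

print(df)
```

This makes one pass per color, whereas the map-based versions above compute everything in a single vectorized step.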
You can use .replace instead of .loc, so the operation is performed only once.
import pandas as pd

raw_data = {
    'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
    'age': [20, 19, 22, 21],
    'favorite_color': ['blue', 'blue', 'yellow', "green"],
    'grade': [88, 92, 95, 70]}
df = pd.DataFrame(raw_data)

color_d = {
    "blue": 125,
    "yellow": 130,
    "green": 135
}
df[["age_new", "grade_new"]] = df[["age", "grade"]].div(
    df['favorite_color'].replace(color_d),
    axis=0)
df.head()
This gives:
               name  age favorite_color  grade   age_new  grade_new
0    Willard Morris   20           blue     88  0.160000   0.704000
1       Al Jennings   19           blue     92  0.152000   0.736000
2      Omar Mullins   22         yellow     95  0.169231   0.730769
3  Spencer McDaniel   21          green     70  0.155556   0.518519
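One difference between map and replace matters if a color is missing from the dictionary: map turns an unmapped value into NaN, while replace leaves the original string in place, which would then break the division. A small sketch of that behavior, using a hypothetical extra color 'purple' that does not appear in the question's data:

```python
import pandas as pd

color_d = {'blue': 125, 'yellow': 130, 'green': 135}

# 'purple' is a made-up value with no entry in color_d; values are stored as Python objects.
colors = pd.Series(['blue', 'yellow', 'purple'], dtype=object)

mapped = colors.map(color_d)        # unmapped entry becomes NaN
replaced = colors.replace(color_d)  # unmapped entry stays 'purple'

print(mapped.tolist())
print(replaced.tolist())
```

With the four rows in the question every color has an entry, so both approaches produce the same result.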