Take average for A,B,C,D,E,F and add a column based on average in python
Take the average of A, B, C, D, E and F, and add a column based on that average.
x = ['A', 'B', 'C', 'D', 'E', 'F']
def avgfun(x):
    if np.mean(x, axis=0) >= 4.5:
        return "High"
    elif (np.mean(x, axis=0) >= 3.5) & (np.mean(x, axis=0) < 4.5):
        return "Moderate"
    elif (np.mean(x, axis=0) >= 2.5) & (np.mean(x, axis=0) < 3.5):
        return "Passive"
    else:
        return "Low"
df["Average"] = df[x].mean(axis=1)
It is easier to compute the average first and then apply the function to it to get the new value.
import numpy as np
import pandas as pd

# Sample data: 100 rows of random integers in [0, 10) for each column
df = pd.DataFrame({'A': np.random.randint(0, 10, 100),
                   'B': np.random.randint(0, 10, 100),
                   'C': np.random.randint(0, 10, 100),
                   'D': np.random.randint(0, 10, 100),
                   'E': np.random.randint(0, 10, 100),
                   'F': np.random.randint(0, 10, 100)})

def avgfun(x):
    # x is a single row mean (a scalar), not a whole row or column
    if x >= 4.5:
        return "High"
    elif 3.5 <= x < 4.5:
        return "Moderate"
    elif 2.5 <= x < 3.5:
        return "Passive"
    else:
        return "Low"

cols = ['A', 'B', 'C', 'D', 'E', 'F']
# Row-wise mean first, then map each mean to its label
df['average'] = df[cols].mean(axis=1).apply(avgfun)
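Because the sample data above is random, the result is hard to check by eye. The following is a minimal sketch on a small hand-made frame, assuming the same thresholds as in the answer. The name `demo` and the extra `mean` column are only for illustration and are not part of the original answer.

```python
import pandas as pd

def avgfun(x):
    # Same thresholds as in the answer, applied to one scalar mean
    if x >= 4.5:
        return "High"
    elif x >= 3.5:
        return "Moderate"
    elif x >= 2.5:
        return "Passive"
    return "Low"

cols = ['A', 'B', 'C', 'D', 'E', 'F']

# Hypothetical rows chosen so each row mean sits exactly on a threshold
demo = pd.DataFrame(
    [[5, 5, 5, 4, 4, 4],   # row mean 4.5
     [4, 4, 4, 3, 3, 3],   # row mean 3.5
     [3, 3, 3, 2, 2, 2],   # row mean 2.5
     [1, 1, 1, 2, 2, 2]],  # row mean 1.5
    columns=cols,
)

demo['mean'] = demo[cols].mean(axis=1)        # average across the six columns
demo['average'] = demo['mean'].apply(avgfun)  # label derived from that average
print(demo[['mean', 'average']])
```

Keeping the numeric mean in its own column makes it easy to confirm that each label matches its threshold.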
You can try pd.cut:
import numpy as np
import pandas as pd

x = ['A', 'B', 'C', 'D', 'E', 'F']
# right=False makes each bin closed on the left, e.g. [2.5, 3.5)
df['col'] = pd.cut(df[x].mean(axis=1), [-np.inf, 2.5, 3.5, 4.5, np.inf],
                   labels=['Low', 'Passive', 'Moderate', 'High'],
                   include_lowest=True, right=False)
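The `right=False` argument is what makes the bins agree with the `>=` comparisons in the question. As a rough illustration, the sketch below runs `pd.cut` on a few made-up mean values (the Series `means` is hypothetical) and compares the result with the default `right=True`.

```python
import numpy as np
import pandas as pd

# Hypothetical row means, including values that sit exactly on a bin edge
means = pd.Series([1.0, 2.5, 3.49, 3.5, 4.5, 9.0])

edges = [-np.inf, 2.5, 3.5, 4.5, np.inf]
names = ['Low', 'Passive', 'Moderate', 'High']

# Left-closed bins: [-inf, 2.5), [2.5, 3.5), [3.5, 4.5), [4.5, inf)
left_closed = pd.cut(means, edges, labels=names,
                     include_lowest=True, right=False)

# Default right-closed bins: (-inf, 2.5], (2.5, 3.5], (3.5, 4.5], (4.5, inf]
right_closed = pd.cut(means, edges, labels=names)

print(left_closed.tolist())
print(right_closed.tolist())
```

With the default setting, a mean of exactly 2.5, 3.5 or 4.5 falls into the lower category, which would not match the thresholds in the question.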