How to create a new variable in a dataframe using .startswith?

Question

I have a dataframe in Python like this:

import pandas as pd

data = [['a_subj.163', 1], ['b_subj.164', 2], ['c_subj.165', 3]]
df = pd.DataFrame(data, columns=['subj', 'mean'])

    subj       mean
0   a_subj.163  1
1   b_subj.164  2
2   c_subj.165  3

I need to take the mean for the subjects whose name starts with 'a_subj' and add it to a new variable called mean_a.

I tried the following but got TypeError: 'DataFrame' object is not callable:

df['mean_a'] = np.where(df(subj.startswith("a_subj")), mean, '')

I also tried this; I did not get an error, but no new variable was created:

for subj in df:
    if subj.startswith('a_subj'):
        df['mean_a'] = mean

Any suggestions on where I am going wrong?

Answer 1

Here you are calling the DataFrame instead of indexing into it:

np.where(df(subj.startswith("a_subj")), mean, '')

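To index into it you need square brackets. The columns also have to be referenced through the DataFrame, and string methods on a column go through the .str accessor:

np.where(df['subj'].str.startswith("a_subj"), df['mean'], '')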
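A minimal runnable sketch of the np.where approach, rebuilding the dataframe from the question. It assumes NaN is acceptable for the non-matching rows instead of the empty string used in the question, since mixing numbers with '' generally prevents the column from staying numeric. The name mask is only illustrative.

```python
import numpy as np
import pandas as pd

data = [['a_subj.163', 1], ['b_subj.164', 2], ['c_subj.165', 3]]
df = pd.DataFrame(data, columns=['subj', 'mean'])

# Boolean Series: True where the subj string starts with 'a_subj'
mask = df['subj'].str.startswith('a_subj')

# Keep the mean where the mask is True, NaN elsewhere (assumption: NaN
# instead of '' so the new column stays numeric)
df['mean_a'] = np.where(mask, df['mean'], np.nan)
```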

Answer 2

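You say you want it in a "new variable", but your code seems to be trying to put the mean into a new column. If your goal is to put it into a variable, try: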

mean_a = df['mean'][df.subj.str.startswith('a_subj')].mean()
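The same computation can be sketched with the mask pulled out and a single .loc lookup, which may be easier to read than chained indexing. This rebuilds the dataframe from the question; mask is just an illustrative name.

```python
import pandas as pd

data = [['a_subj.163', 1], ['b_subj.164', 2], ['c_subj.165', 3]]
df = pd.DataFrame(data, columns=['subj', 'mean'])

# True for rows whose subj starts with 'a_subj'
mask = df['subj'].str.startswith('a_subj')

# Select the 'mean' column for the matching rows and average it into a scalar
mean_a = df.loc[mask, 'mean'].mean()
```

With the sample data only one row matches, so mean_a is simply that row's value.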

Answer 3

If you want to use a for loop, you can do it like this:

df["mean_a"] = "" # remove this line if you want nan in the rest of the values
for i, row in df.iterrows():
    if row.subj.startswith('a_subj'):
        df.at[i, 'mean_a'] = row["mean"]
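For context, a short sketch of why the loop in the question did nothing, plus a vectorized alternative to the iterrows loop. Iterating directly over a DataFrame yields its column labels, so the startswith test was applied to the strings 'subj' and 'mean' and never matched. The Series.where line is one possible equivalent, assuming NaN is fine for the non-matching rows; labels is an illustrative name.

```python
import pandas as pd

data = [['a_subj.163', 1], ['b_subj.164', 2], ['c_subj.165', 3]]
df = pd.DataFrame(data, columns=['subj', 'mean'])

# Iterating over a DataFrame yields column labels, not rows
labels = [label for label in df]

# Vectorized alternative to the row loop: keep 'mean' where the condition
# holds, NaN elsewhere
df['mean_a'] = df['mean'].where(df['subj'].str.startswith('a_subj'))
```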
