How to count data in a certain column in Python (pandas)?

I hope you are doing well. I am trying to count the green rows that come directly after another green row in the table below.

In [1]: df = pd.DataFrame([['green'], ['red'], ['red']], columns=['A'])

The code I tried for counting greengreen:

 for index,row in data.iterrows():
   if finalData['Color'].loc[i]=='green' & finalData['Color'].loc[i+1]=='green':
    greengreen+=1
    i+=1

But it did not work; I hope you can help. Note: I am new to data science.

import pandas as pd

data = {'col1': ['A', 'B', 'C', 'D'],
        'col2': ['green', 'green', 'red', 'green']}
df = pd.DataFrame(data)
df
index col1 col2
0 A green
1 B green
2 C red
3 D green
df.col2.values
greengreen = 0
greenred = 0
redgreen = 0

# compare each row with the next one; stop one row early so i+1 stays in range
colors = df.col2.values
for i in range(len(colors) - 1):
  if colors[i] == 'green' and colors[i+1] == 'green':
    greengreen += 1
  elif colors[i] == 'green' and colors[i+1] == 'red':
    greenred += 1
  elif colors[i] == 'red' and colors[i+1] == 'green':
    redgreen += 1
  else:
    print('?')  # any other pair, e.g. red followed by red

print(greengreen, greenred, redgreen)
1 1 1
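
As for why the attempt in the question fails, a few likely causes: `&` binds more tightly than `==`, so the condition is parsed as a comparison against `'green' & finalData['Color'].loc[i+1]` and raises a TypeError; `i` and `greengreen` are never initialised before the loop; and `i+1` runs past the last row. Below is a minimal corrected sketch of that loop. It assumes the column is called `Color` as in the question, and the sample data is made up for illustration.

```python
import pandas as pd

# Sample data made up for illustration; the column name follows the question.
finalData = pd.DataFrame({'Color': ['green', 'green', 'red', 'green']})

colors = finalData['Color'].tolist()
greengreen = 0

# Stop one row early so that i + 1 is always a valid position.
for i in range(len(colors) - 1):
    # Use `and` for scalar comparisons; `&` is the element-wise operator for Series.
    if colors[i] == 'green' and colors[i + 1] == 'green':
        greengreen += 1

print(greengreen)
```

With this sample data only the first two rows form a green-green pair, so the counter ends at 1.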

You can use:

# is the color green?
m = df['color'].eq('green')
# count the green rows that directly follow another green row
greengreen = (m & m.shift(fill_value=False)).sum()

As a one-liner (Python ≥ 3.8):

greengreen = ((m := df['color'].eq('green')) & m.shift(fill_value=False)).sum()

Example input:

df = pd.DataFrame({'color': ['green', 'green', 'green', 'red', 'green', 'red', 'green', 'green']})

Output: 3
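
To see where the 3 comes from, it may help to lay the intermediate masks side by side. This is only an illustrative sketch: the `steps` table and the `prev` name are introduced here for display, and `fill_value=False` is added so that the first row, which has no predecessor, stays boolean instead of becoming NaN.

```python
import pandas as pd

df = pd.DataFrame({'color': ['green', 'green', 'green', 'red',
                             'green', 'red', 'green', 'green']})

m = df['color'].eq('green')        # is this row green?
prev = m.shift(fill_value=False)   # was the previous row green?

# Helper table for inspection only (hypothetical name, not part of the answer).
steps = pd.DataFrame({
    'color': df['color'],
    'is_green': m,
    'prev_is_green': prev,
    'both': m & prev,
})
print(steps)

greengreen = int(steps['both'].sum())
print(greengreen)
```

The `both` column is True at positions 1, 2 and 7, which are the three green rows that directly follow another green row.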

IIUC,

count = (df['Color'].eq('green') & df['Color'].shift().eq('green')).sum()
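
If the other transitions (green to red, red to green, and so on) are also of interest, the same `shift` idea can be extended to count every pair at once. A rough sketch, assuming a `Color` column as above; the sample data and the `pairs` and `counts` names are made up for illustration.

```python
import pandas as pd

# Sample data made up for illustration.
df = pd.DataFrame({'Color': ['green', 'green', 'red', 'green']})

# Put each row's color next to the color of the row before it.
pairs = pd.DataFrame({'previous': df['Color'].shift(),
                      'current': df['Color']})

# The first row has no predecessor (NaN), so drop it before counting.
counts = pairs.dropna().groupby(['previous', 'current']).size()
print(counts)
```

Each (previous, current) combination that occurs gets its own count, so `counts.loc[('green', 'green')]` gives the green-after-green total. Combinations that never occur are simply absent from the result.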