
pandas: increment based on a condition in another column


import pandas as pd

dataframe = pd.DataFrame({'text': ['##weather', 'how is today?', 'we go out', '##rain',
                                   'my day is rainy', 'I am not feeling well', 'rainy blues',
                                   '##flower', 'the blue flower', 'she likes red',
                                   'this flower is nice']})

I want to add a second column named 'id' that increments each time a row contains "##". So my desired output is:

                     text   id
0               ##weather  100
1           how is today?  100
2               we go out  100
3                  ##rain  101
4         my day is rainy  101
5   I am not feeling well  101
6             rainy blues  101
7                ##flower  102
8         the blue flower  102
9           she likes red  102
10    this flower is nice  102


# My attempt: this only adds 1 on the '##' rows themselves; the increment is not carried forward
dataframe['id'] = 100
dataframe.loc[dataframe['text'].str.contains('##'), 'id'] += 1

You can try groupby with ngroup:

m = dataframe['text'].str.contains('##').cumsum()

dataframe['id'] = dataframe.groupby(m).ngroup() + 100

                     text   id
0               ##weather  100
1           how is today?  100
2               we go out  100
3                  ##rain  101
4         my day is rainy  101
5   I am not feeling well  101
6                   rainy  101
7                   blues  101
8                ##flower  102
9         the blue flower  102
10          she likes red  102
11    this flower is nice  102
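
To see why this works, it may help to look at the intermediate values. The following is a minimal sketch on a shortened version of the sample data; the names `df`, `is_header`, `block` and `id_ngroup` are made up for illustration and are not part of the original code. It also shows that, when the first row is a "##" row, adding an offset to the cumulative sum directly gives the same ids as `ngroup`.

```python
import pandas as pd

# Shortened sample data (illustrative only)
df = pd.DataFrame({'text': ['##weather', 'how is today?', 'we go out',
                            '##rain', 'my day is rainy',
                            '##flower', 'the blue flower']})

# True on every row that contains the literal marker '##'
is_header = df['text'].str.contains('##', regex=False)

# Running count of markers seen so far: 1, 1, 1, 2, 2, 3, 3
block = is_header.cumsum()

# Shift the counter so that the first block gets id 100
df['id'] = block + 99

# Same ids via groupby(...).ngroup(), which numbers the groups from 0
df['id_ngroup'] = df.groupby(block).ngroup() + 100

print(df)
```

The two variants differ only if the frame does not start with a "##" row: the leading rows then have a cumulative sum of 0, so `block + 99` would give them 99, while `ngroup` would still start at 100.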