Creating a unique identifier for a given category (Python)

My data looks like this:

   category2  cat-ID
0  cat1       0000
1  cat1       0000
2  cat2       0000
3  cat2       0000
4  cat2       0000
5  cat3       0000
6  cat4       0000
7  cat4       0000

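The goal is to get an ID made up of the category number followed by a counter of the elements within that category. It should look like this: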

   category2  cat-ID
0  cat1       1001
1  cat1       1002
2  cat2       2001
3  cat2       2002
4  cat2       2003
5  cat3       3001
6  cat4       4001
7  cat4       4002

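Extract the integer part of the category string and multiply it by 1000. Then group by category and give each row its row number within the group. Adding the two columns together gives the final ID.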

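import pandas as pd

# Sample data: every row starts with the placeholder ID 0
df = pd.DataFrame(
    [['cat1', 0], ['cat1', 0], ['cat2', 0], ['cat2', 0],
     ['cat2', 0], ['cat3', 0], ['cat4', 0], ['cat4', 0]],
    columns=['category2', 'cat-ID'])

# Category number taken from the label, times 1000 (cat2 -> 2000)
df['cat_helper'] = df['category2'].str.extract(r'(\d+)', expand=False).astype(int) * 1000
# 1-based running counter within each category
df['row_number'] = df.groupby('category2').cumcount() + 1
# Final ID = category block + position within the category
df['final_cat_ID'] = df['row_number'] + df['cat_helper']
df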

Output:

category2   cat-ID  cat_helper  row_number  final_cat_ID
cat1           0       1000          1           1001
cat1           0       1000          2           1002
cat2           0       2000          1           2001
cat2           0       2000          2           2002
cat2           0       2000          3           2003
cat3           0       3000          1           3001
cat4           0       4000          1           4001
cat4           0       4000          2           4002
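
One thing to watch with this scheme: with a multiplier of 1000, each category only has room for 999 counters, so a category with 1000 or more rows would run into the next category's block (row 1000 of cat1 would get 2000). Below is a rough sketch of the same steps wrapped in a function that checks for this. The function name add_category_ids and the slots parameter are made up for illustration and are not part of pandas. It also assumes every label contains a number; a label without digits would make the astype(int) step fail.

```python
import pandas as pd

def add_category_ids(df, category_col="category2", id_col="cat-ID", slots=1000):
    """Return a copy of df where id_col = category number * slots + running counter."""
    out = df.copy()
    # Category number: the first run of digits in the label, e.g. "cat2" -> 2.
    cat_number = out[category_col].str.extract(r"(\d+)", expand=False).astype(int)
    # 1-based position of each row within its category.
    counter = out.groupby(category_col).cumcount() + 1
    # A counter that reaches `slots` would spill into the next category's block.
    if (counter >= slots).any():
        raise ValueError(
            f"largest category has {counter.max()} rows; slots must be larger than that"
        )
    out[id_col] = cat_number * slots + counter
    return out

df = pd.DataFrame(
    {"category2": ["cat1", "cat1", "cat2", "cat2", "cat2", "cat3", "cat4", "cat4"]}
)
result = add_category_ids(df)
print(result["cat-ID"].tolist())
# [1001, 1002, 2001, 2002, 2003, 3001, 4001, 4002]
```

If a category might grow past 999 rows, passing a larger value such as slots=10000 keeps the blocks apart, at the cost of longer IDs.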