Creating a unique identifier for a given category (Python)
My data looks like this:
  category2  cat-ID
0      cat1    0000
1      cat1    0000
2      cat2    0000
3      cat2    0000
4      cat2    0000
5      cat3    0000
6      cat4    0000
7      cat4    0000
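The goal is to build an ID that combines the category number with a counter for the elements within that category. It should look like this: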
  category2  cat-ID
0      cat1    1001
1      cat1    1002
2      cat2    2001
3      cat2    2002
4      cat2    2003
5      cat3    3001
6      cat4    4001
7      cat4    4002
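Extract the integer part of the category string and multiply it by 1000, then group by category and assign each row its row number within the group. Add the two columns together to get the ID.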
import pandas as pd

df = pd.DataFrame(
    [['cat1', 0], ['cat1', 0], ['cat2', 0], ['cat2', 0],
     ['cat2', 0], ['cat3', 0], ['cat4', 0], ['cat4', 0]],
    columns=['category2', 'cat-ID'])
# Numeric part of the category label times 1000, e.g. 'cat2' -> 2000
df['cat_helper'] = df['category2'].str.extract(r'(\d+)', expand=False).astype(int) * 1000
# 1-based row counter within each category
df['row_number'] = df.groupby(['category2', 'cat_helper']).cumcount() + 1
# Final ID = category base + counter
df['final_cat_ID'] = df['row_number'] + df['cat_helper']
df
Output:
category2  cat-ID  cat_helper  row_number  final_cat_ID
cat1            0        1000           1          1001
cat1            0        1000           2          1002
cat2            0        2000           1          2001
cat2            0        2000           2          2002
cat2            0        2000           3          2003
cat3            0        3000           1          3001
cat4            0        4000           1          4001
cat4            0        4000           2          4002
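If the helper columns are not needed, the same idea can be wrapped in a small function. This is only a sketch: the name add_category_ids and the width parameter are made up for illustration, and it assumes every category label contains a number. It also adds a check, because IDs built this way are only unique while each category has fewer than 1000 rows.

```python
import pandas as pd

def add_category_ids(df, col="category2", width=1000):
    """Hypothetical helper: return a copy of df with a 'final_cat_ID' column."""
    # Numeric part of the label as a Series, e.g. "cat2" -> 2
    cat_num = df[col].str.extract(r"(\d+)", expand=False).astype(int)
    # 1-based position of each row within its category
    counter = df.groupby(col).cumcount() + 1
    # IDs from neighbouring categories would collide once a counter reaches `width`
    if (counter >= width).any():
        raise ValueError("a category has too many rows for this ID width")
    out = df.copy()
    out["final_cat_ID"] = cat_num * width + counter
    return out

df = pd.DataFrame(
    {"category2": ["cat1", "cat1", "cat2", "cat2", "cat2", "cat3", "cat4", "cat4"]}
)
result = add_category_ids(df)
print(result["final_cat_ID"].tolist())
# [1001, 1002, 2001, 2002, 2003, 3001, 4001, 4002]
```

For categories with more rows, a larger width (for example 10000) should keep the IDs distinct.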