Renaming categories
text      category
-----------------------------------------------
nike      shoes from nike bought by ankit
flour     grocery
rice      grocery
adidas    shoes from adidas are cool
The above is the format of my dataset. How can I generalize the category when classifying?
For example, I want the output to be:
text      category
-----------------------------------------------
nike      shoes from brand
flour     grocery
rice      grocery
adidas    shoes from brand
One way is to use a custom function with pd.DataFrame.apply:
import pandas as pd

df = pd.DataFrame({'text': ['nike', 'flour', 'rice', 'adidas'],
                   'category': ['shoes from nike bought by ankit', 'grocery', 'grocery',
                                'shoes from adidas are cool']})

def converter(row):
    # If the value in 'text' appears in 'category', keep the part before
    # ' from ' and replace the rest with the generic 'from brand'.
    if row['text'] in row['category']:
        return row['category'].split(' from ')[0] + ' from brand'
    else:
        return row['category']

df['category'] = df.apply(converter, axis=1)

#      text          category
# 0    nike  shoes from brand
# 1   flour           grocery
# 2    rice           grocery
# 3  adidas  shoes from brand
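
If the brand could also show up somewhere other than right after "from", a regular expression may be a bit safer than the plain substring test. The following is only a sketch under that assumption: the helper name generalize and the exact pattern are my own choices for illustration, not something taken from the question.

```python
import re

import pandas as pd

df = pd.DataFrame({
    "text": ["nike", "flour", "rice", "adidas"],
    "category": [
        "shoes from nike bought by ankit",
        "grocery",
        "grocery",
        "shoes from adidas are cool",
    ],
})


def generalize(text, category):
    # Hypothetical helper: replace "from <text>" and everything after it
    # with "from brand". re.escape guards against regex metacharacters
    # that might appear in the brand name.
    pattern = r"\bfrom\s+" + re.escape(text) + r"\b.*"
    return re.sub(pattern, "from brand", category)


# Apply the helper row by row without DataFrame.apply.
df["category"] = [generalize(t, c) for t, c in zip(df["text"], df["category"])]
```

Rows whose category does not contain "from <text>" are left unchanged, so the grocery rows pass through as they are.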