How to One Hot Encode Dataframe for Association Rule Analysis (apriori)
I have a dataframe that simulates a set of shopping lists:
import pandas as pd

data = {
    'Produce': ['Brocolli', 'Spinach', 'Spinach', 'Lettuce', 'Brocolli', 'Lettuce', 'Lettuce'],
    'Dairy': ['Milk', '', 'Milk', 'Cheese', 'Milk', 'Yogurt', 'Yogurt'],
    'Beverage': ['', '', 'Orange Juice', 'Soda', 'Soda', 'Orange juice', ''],
    'Fruit': ['Brocolli', 'Spinach', 'Spinach', 'Lettuce', 'Brocolli', 'Lettuce', 'Lettuce'],
    'Poultry': ['Chicken Tender', 'Chicken Breasts', 'Chicken Tender', 'Chicken Thigh', 'Chicken Breasts', '', 'Chicken Breasts'],
    'Deli': ['Turkey Breasts', 'Ham', 'Ham', '', '', 'Turkey Breasts', ''],
}
# 'Poultry' is not listed in columns, so it is left out of df.
df = pd.DataFrame(data, columns=['Produce', 'Dairy', 'Beverage', 'Fruit', 'Deli'])
df
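How can I one-hot encode this dataframe so that I can run apriori on it? As far as I understand, every unique value should become a column label and the cell values should be replaced with booleans.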
You can try:
pd.get_dummies(df)
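Note that a plain pd.get_dummies(df) prefixes each new column with its source column (for example Produce_Brocolli and Fruit_Brocolli stay separate) and also creates a dummy column for the empty string (for example Dairy_). If what you want is one boolean column per item regardless of which source column it came from, a minimal sketch along the following lines may be closer. The names items and onehot are only illustrative, and the data is copied from the question.

```python
import pandas as pd

data = {
    'Produce': ['Brocolli', 'Spinach', 'Spinach', 'Lettuce', 'Brocolli', 'Lettuce', 'Lettuce'],
    'Dairy': ['Milk', '', 'Milk', 'Cheese', 'Milk', 'Yogurt', 'Yogurt'],
    'Beverage': ['', '', 'Orange Juice', 'Soda', 'Soda', 'Orange juice', ''],
    'Fruit': ['Brocolli', 'Spinach', 'Spinach', 'Lettuce', 'Brocolli', 'Lettuce', 'Lettuce'],
    'Poultry': ['Chicken Tender', 'Chicken Breasts', 'Chicken Tender', 'Chicken Thigh', 'Chicken Breasts', '', 'Chicken Breasts'],
    'Deli': ['Turkey Breasts', 'Ham', 'Ham', '', '', 'Turkey Breasts', ''],
}
# Same columns argument as in the question, so 'Poultry' is not part of df.
df = pd.DataFrame(data, columns=['Produce', 'Dairy', 'Beverage', 'Fruit', 'Deli'])

# Collect every distinct item across all columns, skipping the empty placeholder.
items = sorted({value for value in df.to_numpy().ravel() if value != ''})

# One boolean column per item: True if the item appears anywhere in that row.
onehot = pd.DataFrame({item: (df == item).any(axis=1) for item in items})

print(onehot)
```

Each row of onehot is one shopping list and each column is one item. 'Orange Juice' and 'Orange juice' differ only in case, so they end up as two separate columns unless the values are normalized first. Assuming the apriori implementation you have in mind is the one in the mlxtend package, a boolean frame of this shape is the kind of input it expects.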