Convert a column's data to enumerated dictionary key values

Is there a better way (in the sense of the least code) to do the following: convert a column to enumerated numeric values? The steps are:

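  1. Get the set of distinct values from the column
  2. Build an enumerated dictionary (key, value) from that set
  3. Swap the keys and values
  4. Use the resulting numeric values in a new column in place of the original data.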

This is what I do today. I am wondering whether someone can show a more canonical way to do it, so that I can avoid writing the function get_color_val:

import pandas as pd
cars = pd.DataFrame({"car_name": ["BMW","BMW","ACCURA","ACCURA","ACCURA","BMW","BMW","BMW"],"color":["RED","RED","RED","RED","GREEN","BLACK","BLUE","BLUE"]})

# Number the distinct colors, then swap keys and values to get color -> number.
# Set ordering is arbitrary, so the numbers can differ between runs.
color_dict = dict(enumerate(set(cars["color"])))
color_dict = dict((y,x) for x,y in color_dict.items())

def get_color_val(row):
    # Look up the number assigned to this row's color
    my_key = row["color"]
    my_value = color_dict.get(my_key)
    return my_value

cars["color_val"] = cars.apply(get_color_val, axis=1)
cars = cars.drop(columns="color")
print(cars)

Result

Before------------
  car_name  color
0      BMW    RED
1      BMW    RED
2   ACCURA    RED
3   ACCURA    RED
4   ACCURA  GREEN
5      BMW  BLACK
6      BMW   BLUE
7      BMW   BLUE


After------------
  car_name  color_val
0      BMW          3
1      BMW          3
2   ACCURA          3
3   ACCURA          3
4   ACCURA          2
5      BMW          1
6      BMW          0
7      BMW          0

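In this case I would use pd.factorize(). It returns a tuple of (codes, uniques); the [0] below selects the integer codes, which are assigned in order of first appearance, so the numbering differs from the set-based numbering in the question: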

In [8]: cars['color_val'] = pd.factorize(cars.color)[0]

In [9]: cars
Out[9]:
  car_name  color  color_val
0      BMW    RED          0
1      BMW    RED          0
2   ACCURA    RED          0
3   ACCURA    RED          0
4   ACCURA  GREEN          1
5      BMW  BLACK          2
6      BMW   BLUE          3
7      BMW   BLUE          3
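
If you also need the color-to-number dictionary from the question, or want to get the original labels back later, the second element returned by pd.factorize() can be kept as well. The following is only a minimal sketch under that assumption; the names codes, uniques and restored are illustrative and not part of the original post.

```python
import pandas as pd

cars = pd.DataFrame({
    "car_name": ["BMW", "BMW", "ACCURA", "ACCURA", "ACCURA", "BMW", "BMW", "BMW"],
    "color": ["RED", "RED", "RED", "RED", "GREEN", "BLACK", "BLUE", "BLUE"],
})

# factorize returns the integer codes and the distinct values they refer to;
# codes are numbered in order of first appearance in the column.
codes, uniques = pd.factorize(cars["color"])
cars["color_val"] = codes

# Dictionary equivalent of the question's color_dict (color -> number).
color_dict = {color: code for code, color in enumerate(uniques)}

# The mapping is reversible: each code indexes back into uniques.
restored = [uniques[code] for code in codes]

print(cars)
print(color_dict)
```

Keeping uniques around is what makes the encoding reversible, which the one-liner with [0] discards.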