How to select a specific category of bins in Python?
I have a list of numbers that I split into bins using pandas.cut(). How can I select a single category of bins?
manhattanBedrmsPrice.head()
0 859
5 1055
9 615
11 663
13 1317
Name: Price Value, dtype: int64
bins = [400,600,800,1000,1200, 1400,1600,1800,2000,2200,2400,2600,2800,3000]
manPriceCategories = pd.cut(manhattanBedrmsPrice, bins)
I get the following categories:
Categories (13, interval[int64]): [(400, 600] < (600, 800] < (800, 1000] < (1000, 1200] ... (2200, 2400] < (2400, 2600] < (2600, 2800] < (2800, 3000]]
How do I select a specific category?
In your case:
manPriceCategories.loc[manPriceCategories.isin([pd.Interval(600,800)])]
Or use the integer codes of the categorical data:
manPriceCategories[manPriceCategories.cat.codes==1]
1 (600, 800]
2 (600, 800]
Name: 1, dtype: category
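For reference, here is a self-contained sketch of both selections, using only the five prices shown in the question as sample data. The names prices, categories, target, by_interval and by_code are placeholders of mine, not from the original post.

```python
import pandas as pd

# Sample data: the five prices shown by head() in the question.
prices = pd.Series([859, 1055, 615, 663, 1317],
                   index=[0, 5, 9, 11, 13], name="Price Value")
bins = list(range(400, 3200, 200))  # 400, 600, ..., 3000
categories = pd.cut(prices, bins)

# Option 1: match the interval itself. pd.Interval is right-closed by
# default, like the bins produced by pd.cut.
target = pd.Interval(600, 800)
by_interval = categories[categories.isin([target])]

# Option 2: match the integer code of the category. (600, 800] is the
# second category, so its code is 1. Values outside all bins get code -1.
by_code = categories[categories.cat.codes == 1]

print(by_interval.index.tolist())
print(by_code.index.tolist())
```

With this sample data both selections return the rows at index 9 and 11 (prices 615 and 663).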
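You can assign the categories to a variable (e.g. cats) and then use boolean indexing to check whether the values in your series equal the category of interest.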
cats = manPriceCategories.cat.categories
>>> manPriceCategories.loc[manPriceCategories.eq(cats[1])]
9 (600, 800]
11 (600, 800]
Name: Price Value, dtype: category
Categories (13, interval[int64]): [(400, 600] < (600, 800] < (800, 1000] < (1000, 1200] ... (2200, 2400] < (2400, 2600] < (2600, 2800] < (2800, 3000]]
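The same boolean mask can also be applied to the original series if you want the prices themselves rather than their bin labels. A minimal sketch, assuming the five sample prices from the question (the names prices, mask and prices_in_bin are placeholders of mine):

```python
import pandas as pd

# Sample data: the five prices shown by head() in the question.
prices = pd.Series([859, 1055, 615, 663, 1317],
                   index=[0, 5, 9, 11, 13], name="Price Value")
bins = list(range(400, 3200, 200))  # 400, 600, ..., 3000
categories = pd.cut(prices, bins)
cats = categories.cat.categories

# True where the value fell into the second bin, (600, 800].
mask = categories.eq(cats[1])

# The mask shares the index of prices, so it can filter the raw values.
prices_in_bin = prices[mask]
print(prices_in_bin.tolist())
```

For the sample data this keeps the prices 615 and 663.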
You can use a dictionary comprehension to enumerate your categories so that you know their index positions (e.g. 1 corresponds to (600, 800] above).
>>> {n: cat for n, cat in enumerate(cats)}
{0: Interval(400, 600, closed='right'),
1: Interval(600, 800, closed='right'),
2: Interval(800, 1000, closed='right'),
3: Interval(1000, 1200, closed='right'),
4: Interval(1200, 1400, closed='right'),
5: Interval(1400, 1600, closed='right'),
6: Interval(1600, 1800, closed='right'),
7: Interval(1800, 2000, closed='right'),
8: Interval(2000, 2200, closed='right'),
9: Interval(2200, 2400, closed='right'),
10: Interval(2400, 2600, closed='right'),
11: Interval(2600, 2800, closed='right'),
12: Interval(2800, 3000, closed='right')}
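If you would rather not read the position off the dictionary, the categories form an IntervalIndex, whose get_loc method can look the position up for you. A small sketch under the same sample-data assumption as before (pos_by_interval, pos_by_value and pos_at_edge are placeholder names of mine):

```python
import pandas as pd

# Sample data: the five prices shown by head() in the question.
prices = pd.Series([859, 1055, 615, 663, 1317],
                   index=[0, 5, 9, 11, 13], name="Price Value")
bins = list(range(400, 3200, 200))  # 400, 600, ..., 3000
cats = pd.cut(prices, bins).cat.categories  # an IntervalIndex

# Position of an exact interval (raises KeyError if no category matches).
pos_by_interval = cats.get_loc(pd.Interval(600, 800))

# Position of the interval that contains a given number.
pos_by_value = cats.get_loc(700)

# The bins are right-closed, so 800 still belongs to (600, 800].
pos_at_edge = cats.get_loc(800)

print(pos_by_interval, pos_by_value, pos_at_edge)
```

All three lookups give position 1 here, which can then be used with cats[1] or with cat.codes as shown above.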