Re-order a Pandas Series by weekday
Using Pandas, I loaded a CSV file and then created a Series to find out which days of the week have the most crashes:
crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
I then plotted it, but of course the bars appear in the same rank order as the Series, that is, sorted by count.
crashes_by_day.plot(kind='bar')
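What is the most efficient way to re-order these as Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday?
Do I have to break it out into a list? Thanks.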
You can use an ordered Categorical and then sort_index:
print(bc)
DAY_OF_WEEK a b
0 Sunday 0.7 0.5
1 Monday 0.4 0.1
2 Tuesday 0.3 0.2
3 Wednesday 0.4 0.1
4 Thursday 0.3 0.6
5 Friday 0.4 0.9
6 Saturday 0.3 0.2
7 Sunday 0.7 0.5
8 Monday 0.4 0.1
9 Tuesday 0.3 0.2
10 Wednesday 0.4 0.1
11 Thursday 0.3 0.6
12 Friday 0.4 0.9
13 Saturday 0.3 0.2
14 Sunday 0.7 0.5
15 Monday 0.4 0.1
16 Tuesday 0.3 0.2
17 Wednesday 0.4 0.1
18 Thursday 0.3 0.6
19 Friday 0.4 0.9
20 Saturday 0.3 0.2
bc['DAY_OF_WEEK'] = pd.Categorical(bc['DAY_OF_WEEK'], categories=
['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday'],
ordered=True)
print(bc['DAY_OF_WEEK'])
0 Sunday
1 Monday
2 Tuesday
3 Wednesday
4 Thursday
5 Friday
6 Saturday
7 Sunday
8 Monday
9 Tuesday
10 Wednesday
11 Thursday
12 Friday
13 Saturday
14 Sunday
15 Monday
16 Tuesday
17 Wednesday
18 Thursday
19 Friday
20 Saturday
Name: DAY_OF_WEEK, dtype: category
Categories (7, object): [Monday < Tuesday < Wednesday < Thursday < Friday < Saturday < Sunday]
crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
crashes_by_day = crashes_by_day.sort_index()
print(crashes_by_day)
Monday 3
Tuesday 3
Wednesday 3
Thursday 3
Friday 3
Saturday 3
Sunday 3
dtype: int64
crashes_by_day.plot(kind='bar')
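For anyone who wants to try the Categorical approach end to end, here is a minimal self-contained sketch. The sample rows are made up for illustration (they stand in for the CSV from the question), and the `days` list is just a convenience name that does not appear in the original code.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV in the question.
bc = pd.DataFrame({
    "DAY_OF_WEEK": [
        "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Monday", "Friday", "Monday",
    ]
})

# Desired calendar order, Monday first.
days = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Turn the column into an ordered categorical so that sorting follows
# the calendar order rather than alphabetical order.
bc["DAY_OF_WEEK"] = pd.Categorical(bc["DAY_OF_WEEK"], categories=days, ordered=True)

# value_counts() sorts by count; sort_index() restores the category order.
crashes_by_day = bc["DAY_OF_WEEK"].value_counts().sort_index()
print(crashes_by_day)
```

Calling `crashes_by_day.plot(kind='bar')` on the result should then draw the bars from Monday through Sunday, assuming matplotlib is installed.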
Without Categorical, another possible solution is to set the ordering through a mapping:
crashes_by_day = bc['DAY_OF_WEEK'].value_counts().reset_index()
crashes_by_day.columns = ['DAY_OF_WEEK', 'count']
print(crashes_by_day)
DAY_OF_WEEK count
0 Thursday 3
1 Wednesday 3
2 Friday 3
3 Tuesday 3
4 Monday 3
5 Saturday 3
6 Sunday 3
days = ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday']
mapping = {day: i for i, day in enumerate(days)}
key = crashes_by_day['DAY_OF_WEEK'].map(mapping)
print(key)
0 3
1 2
2 4
3 1
4 0
5 5
6 6
Name: DAY_OF_WEEK, dtype: int64
crashes_by_day = crashes_by_day.iloc[key.argsort()].set_index('DAY_OF_WEEK')
print(crashes_by_day)
count
DAY_OF_WEEK
Monday 3
Tuesday 3
Wednesday 3
Thursday 3
Friday 3
Saturday 3
Sunday 3
crashes_by_day.plot(kind='bar')
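The same mapping idea can probably be written more compactly. The sketch below is not the code from the answer above: it assumes pandas 1.1 or later (where `sort_index` accepts a `key` argument) and also shows `reindex` as a different shortcut. The sample data and the names `by_key` and `by_reindex` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV in the question.
bc = pd.DataFrame({
    "DAY_OF_WEEK": [
        "Thursday", "Wednesday", "Friday", "Tuesday", "Monday",
        "Saturday", "Sunday", "Monday", "Friday",
    ]
})

days = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
mapping = {day: i for i, day in enumerate(days)}

counts = bc["DAY_OF_WEEK"].value_counts()

# Variant 1: sort the index by each day's position in `days`
# (the `key` argument assumes pandas >= 1.1).
by_key = counts.sort_index(key=lambda idx: idx.map(mapping))

# Variant 2: reindex directly to the desired order; days that never
# occur in the data get a count of 0 instead of NaN.
by_reindex = counts.reindex(days, fill_value=0)

print(by_key)
print(by_reindex)
```

Both variants leave the column as plain strings, which may be preferable when the categorical dtype is not wanted elsewhere in the analysis.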