Not able to understand the use of .mode() in Python
I have a requirement where I need to find the most popular start hour.
Below is the code that helped me arrive at the correct solution.
import time
import pandas as pd
import numpy as np
# bunch of code comes
# here
# that help in reaching the following steps
df = pd.read_csv(CITY_DATA[selected_city])
# convert the Start Time column to datetime
df['Start Time'] = pd.to_datetime(df['Start Time'])
# extract hour from the Start Time column to create an hour column
df['hour'] = df['Start Time'].dt.hour
# extract month and day of week from Start Time to create new columns
df['month'] = df['Start Time'].dt.month
df['day_of_week'] = df['Start Time'].dt.day_name()
# find the most popular hour
popular_hour = df['hour'].mode()[0]
This is a sample of the output I get when I run
print(df['hour'])
0 15
1 17
2 8
3 13
4 14
5 9
6 9
7 17
8 16
9 17
10 7
11 17
Name: hour, Length: 300000, dtype: int64
And this is the output I get from
print(type(df['hour']))
<class 'pandas.core.series.Series'>
The most popular start hour is stored in popular_hour and equals 17 (which is the correct value).
However, I cannot understand the .mode()[0] part.
What does .mode() do, and why the [0]?
And will the same concept apply to calculating the most popular month and the most popular day of the week, irrespective of their datatype?
mode returns a Series:
df['hour'].mode()
0 17
dtype: int64
From this, you get the first item by calling
df['hour'].mode()[0]
17
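To make the two steps concrete, here is a minimal sketch on a small made-up Series (the values below are hypothetical, not your real data). It shows that mode() gives back a Series and that [0] picks the entry labelled 0, which is the same as taking the first entry by position with .iloc[0].

```python
import pandas as pd

# Hypothetical sample of start hours, made up for illustration.
hours = pd.Series([15, 17, 8, 13, 14, 9, 9, 17, 16, 17, 7, 17], name="hour")

modes = hours.mode()        # a Series holding every most-frequent value
popular_hour = modes[0]     # label 0 is the first (here, the only) mode
same_value = modes.iloc[0]  # positional access, equivalent in this case

print(type(modes))
print(popular_hour, same_value)
```

The result of mode() gets a fresh 0, 1, 2, ... index, so [0] and .iloc[0] agree here.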
Note that a Series is always returned, and if there are multiple modes, all of them are returned:
pd.Series([1, 1, 2, 2, 3, 3]).mode()
0 1
1 2
2 3
dtype: int64
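With [0] you would still take the first value every time and discard the rest. Note that when multiple modes are returned, they are always sorted.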
Read the documentation on mode for more information.
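As for the month and the day of the week: the same call should work on those columns too, since mode() is not limited to integers. Below is a rough sketch with a handful of made-up timestamps (hypothetical data; it assumes a pandas version that has .dt.day_name()). The day names tie, so both are returned in sorted order and [0] picks the alphabetically first one.

```python
import pandas as pd

# Hypothetical start times, made up for illustration only.
df = pd.DataFrame({"Start Time": pd.to_datetime([
    "2017-01-02 09:00",  # Monday
    "2017-01-09 17:30",  # Monday
    "2017-06-07 17:05",  # Wednesday
    "2017-06-14 08:15",  # Wednesday
    "2017-06-16 17:45",  # Friday
])})

df["month"] = df["Start Time"].dt.month             # integer column
df["day_of_week"] = df["Start Time"].dt.day_name()  # string column

popular_month = df["month"].mode()[0]  # 6 occurs three times
day_modes = df["day_of_week"].mode()   # Monday and Wednesday tie
popular_day = day_modes[0]             # first of the sorted modes

print(popular_month)
print(list(day_modes))
print(popular_day)
```

If a tie matters for your report, look at the whole Series returned by mode() instead of only its first element.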