Frequency bar chart of a date column in ascending order of dates
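I have a dataset (its first few rows are pasted below). My goal is to plot the frequency distribution of the 'sample_date' column. It seemed simple at first: convert the column to datetime, sort the values (dates) in ascending order, and plot a bar chart. The problem is that the bar chart does not show the dates in ascending order (which is what I want). Instead, the bars appear in descending order of the value counts for those dates.
Here is the code: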
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('dataset.csv')
data['sample_date'] = pd.to_datetime(data['sample_date'])
data = data.sort_values(by='sample_date')
data['sample_date'].value_counts().plot(kind='bar')
Here is dataset.csv:
,sequence_name,sample_date,epi_week,epi_date,lineage
1,England/MILK-1647769/2021,2021-06-07,76,2021-06-06,C.37
2,England/MILK-156082C/2021,2021-05-06,71,2021-05-02,C.37
3,England/CAMC-149B04F/2021,2021-03-30,66,2021-03-28,C.37
4,England/CAMC-13962F4/2021,2021-03-04,62,2021-02-28,C.37
5,England/CAMC-13238EB/2021,2021-02-23,61,2021-02-21,C.37
0,England/PHEC-L304L78C/2021,2021-05-12,72,2021-05-09,B.1.617.3
1,England/MILK-15607D4/2021,2021-05-06,71,2021-05-02,B.1.617.3
2,England/MILK-156C77E/2021,2021-05-05,71,2021-05-02,B.1.617.3
4,England/PHEC-K305K062/2021,2021-04-25,70,2021-04-25,B.1.617.3
5,England/PHEC-K305K080/2021,2021-04-25,70,2021-04-25,B.1.617.3
6,England/ALDP-153351C/2021,2021-04-23,69,2021-04-18,B.1.617.3
7,England/PHEC-30C13B/2021,2021-04-22,69,2021-04-18,B.1.617.3
8,England/PHEC-30AFE8/2021,2021-04-22,69,2021-04-18,B.1.617.3
9,England/PHEC-30A935/2021,2021-04-21,69,2021-04-18,B.1.617.3
10,England/ALDP-152BC6D/2021,2021-04-21,69,2021-04-18,B.1.617.3
11,England/ALDP-15192D9/2021,2021-04-17,68,2021-04-11,B.1.617.3
12,England/ALDP-1511E0A/2021,2021-04-15,68,2021-04-11,B.1.617.3
13,England/PHEC-306896/2021,2021-04-12,68,2021-04-11,B.1.617.3
14,England/PORT-2DFB70/2021,2021-04-06,67,2021-04-04,B.1.617.3
This is what I get, which is not what I want:
[Figure: bar chart for the 'sample_date' column in descending order of the value counts of the dates]
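value_counts() has an ascending flag. Set it to True and the bars are drawn in ascending order of counts. Note that this flag orders by frequency, not by date. You do not need sort_values() at all, because value_counts() re-sorts its result by count anyway.
See the value_counts() documentation: https://pandas.pydata.org/docs/reference/api/pandas.Series.value_counts.html
Code: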
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('dataset.csv')
data['sample_date'] = pd.to_datetime(data['sample_date'])
data['sample_date'].value_counts(ascending=True).plot(kind='bar')
plt.show()
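Output: [figure: the resulting bar chart]

Alternatively, to order the bars by date, sort the counts by their index with sort_index():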
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('dataset.csv')
data['sample_date'] = pd.to_datetime(data['sample_date'])
data['sample_date'].value_counts().sort_index().plot(kind='bar') # Use sort_index()
plt.tight_layout()
plt.show()
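To see why the two suggestions differ, here is a minimal sketch that compares the orderings without plotting anything. It uses a handful of sample_date values copied from the dataset.csv excerpt in the question. The variable names (dates, by_count, by_count_asc, by_date) are made up for illustration.

```python
import pandas as pd

# A few sample_date values taken from the dataset.csv excerpt in the question.
dates = pd.to_datetime(pd.Series([
    "2021-06-07", "2021-05-06", "2021-05-06",
    "2021-04-25", "2021-04-25", "2021-04-23",
]))

# Default: rows ordered by count, largest first. Any earlier sort of the
# column is lost here, which is why sort_values() had no visible effect.
by_count = dates.value_counts()

# ascending=True still orders by count, just smallest first.
by_count_asc = dates.value_counts(ascending=True)

# sort_index() orders by the index, which holds the dates themselves.
by_date = dates.value_counts().sort_index()

print(by_count)
print(by_count_asc)
print(by_date)
```

Only by_date is in chronological order, so that is the series to pass to .plot(kind='bar'). Bar plots treat the index as categories, so the tick labels may include a time component. If that is unwanted, one option is to convert the index to strings first, for example by_date.index = by_date.index.strftime('%Y-%m-%d').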