How to sum positive cases by date in Python
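I am new to coding and I am working with COVID positive cases. I have a DataFrame containing all positive cases (e.g. case 1 : positive : 2020-04-30), and I need to compute the total number of positive cases per day (e.g. 3 positive cases : 2020-04-30) in order to create a plot. I don't know how to do this in Python. I know I have to use the groupby() and sum() functions, but I don't know how to go about it. Can you help me?
Thanks!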
Here is a table which represents my DataFrame
Here is a table of what I want to have as a result
My actual data is confidential, but here is a sample of it.
You should aggregate by column and then sum the results. Try this:
Note that the patient name should be a numeric counter for keeping track.
import pandas as pd
import datetime
import numpy as np
# This is a dummy data set; you should already have this in your DataFrame
dict_df = {
    'Patient': [1, 2, 3, 4, 5],
    'Positive': ['Positive'] * 5,
    'Date': [
        datetime.date(2020, 4, 21),
        datetime.date(2020, 4, 22),
        datetime.date(2020, 4, 21),
        datetime.date(2020, 4, 23),
        datetime.date(2020, 4, 22),
    ],
}
df = pd.DataFrame(dict_df)
# create a numeric counter: 1 for each positive row, 0 otherwise
cases = df['Positive'].to_numpy()
counter = np.where(cases == 'Positive', 1, 0)
# add column to data frame
df['counter'] = counter
# use groupby and sum
results = df['counter'].groupby(df['Date']).sum()
print(results)
# Results (index: Date, values: summed counter)
#2020-04-21 2
#2020-04-22 2
#2020-04-23 1
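Since the goal is a table of daily counts to plot, the counter column may not be strictly necessary. Below is a minimal alternative sketch, not the answer's original method: it assumes the same column names as the dummy data ('Patient', 'Positive', 'Date'), filters the positive rows, and counts rows per date with groupby() and size(). The output column name 'Cases' and the variable name daily are made up for illustration.

```python
import pandas as pd

# Hypothetical sample mirroring the dummy data above
df = pd.DataFrame({
    "Patient": [1, 2, 3, 4, 5],
    "Positive": ["Positive"] * 5,
    "Date": pd.to_datetime(
        ["2020-04-21", "2020-04-22", "2020-04-21", "2020-04-23", "2020-04-22"]
    ),
})

# Keep only the positive rows, then count how many rows fall on each date.
# reset_index turns the result into a two-column DataFrame: Date, Cases.
daily = (
    df[df["Positive"] == "Positive"]
    .groupby("Date")
    .size()
    .reset_index(name="Cases")
)
print(daily)
```

If matplotlib is installed, something like daily.plot(x="Date", y="Cases") should then draw the cases per day, though the exact plotting call depends on the kind of chart you want.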