How to find how many rows were merged per day
I have a dataframe:
tickers dt AAPL AMC AMZN ... TH TSLA VIAC WKHS
0 2021-03-19 15:11:11+00:00 0 0 0 ... 0 0 0 0
1 2021-03-22 12:43:45+00:00 0 0 0 ... 0 0 0 0
2 2021-03-22 13:07:46+00:00 0 0 0 ... 0 1 0 0
3 2021-03-22 13:55:05+00:00 0 0 0 ... 0 2 0 0
4 2021-03-23 04:59:01+00:00 0 0 0 ... 0 0 0 0
.. ... ... ... ... ... .. ... ... ...
835 2021-07-29 23:05:30+00:00 0 0 0 ... 0 0 0 0
836 2021-07-30 01:52:35+00:00 0 0 1 ... 0 0 0 0
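I want to merge the whole dataframe into 1-day intervals. After that, I want to divide each merged number in every column by the number of rows for that day.
I tried merging with: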
bullish_comments_df1 = bullish_comments_df1.resample('1D').sum()
But I don't know how to divide by the number of rows merged per day.
Thanks for your help.
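Resample by day using mean instead of sum and count. The daily mean is the daily sum divided by the number of rows in that day, which is the value you are after.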
import pandas as pd
# Use the timestamp column as the index so resample can bin rows by day
df = df.reset_index(drop=True)
df.index = pd.DatetimeIndex(df['dt'])
df.resample('1D').mean()
AAPL AMC AMZN TH TSLA VIAC WKHS
dt
2021-03-19 00:00:00+00:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2021-03-20 00:00:00+00:00 NaN NaN NaN NaN NaN NaN NaN
2021-03-21 00:00:00+00:00 NaN NaN NaN NaN NaN NaN NaN
2021-03-22 00:00:00+00:00 0.0 0.0 0.0 0.0 1.0 0.0 0.0
2021-03-23 00:00:00+00:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0
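To see why mean answers the question, here is a minimal sketch on a made-up frame. The sample values and the names `counts`, `sums`, `manual` and `means` are illustrative assumptions, not the asker's data. It counts the rows that fall into each day with `resample('1D').size()` and checks that dividing the daily sum by that count gives the same result as `mean()`.

```python
import pandas as pd

# Hypothetical sample data shaped like the frame in the question.
df = pd.DataFrame(
    {
        "dt": pd.to_datetime(
            [
                "2021-03-19 15:11:11+00:00",
                "2021-03-22 12:43:45+00:00",
                "2021-03-22 13:07:46+00:00",
                "2021-03-22 13:55:05+00:00",
                "2021-03-23 04:59:01+00:00",
            ]
        ),
        "AAPL": [0, 0, 0, 0, 0],
        "TSLA": [0, 0, 1, 2, 0],
    }
).set_index("dt")

daily = df.resample("1D")
counts = daily.size()  # number of rows merged into each day (0 for empty days)
sums = daily.sum()     # per-day totals, as in the question's attempt

# Divide every column by that day's row count; empty days become NaN (0 / 0).
manual = sums.div(counts, axis=0)

# mean() performs the same sum / count in one step.
means = daily.mean()
```

If the row count itself is needed, `counts` holds it per day, so it can be kept next to the averaged columns.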
You can use dropna to remove the days that have no samples:
df.resample('1D').mean().dropna(axis=0, how='all')
AAPL AMC AMZN TH TSLA VIAC WKHS
dt
2021-03-19 00:00:00+00:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2021-03-22 00:00:00+00:00 0.0 0.0 0.0 0.0 1.0 0.0 0.0
2021-03-23 00:00:00+00:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0
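As an alternative to resampling and then dropping empty days, the rows can be grouped by calendar day directly, which only produces days that actually occur in the data. This is a sketch under assumptions and not the method used above: the sample values and the names `day`, `grouped`, `daily_mean` and `rows_per_day` are hypothetical, and `dt` is assumed to be a timezone-aware datetime column.

```python
import pandas as pd

# Hypothetical sample data shaped like the frame in the question.
df = pd.DataFrame(
    {
        "dt": pd.to_datetime(
            [
                "2021-03-19 15:11:11+00:00",
                "2021-03-22 12:43:45+00:00",
                "2021-03-22 13:07:46+00:00",
                "2021-03-22 13:55:05+00:00",
                "2021-03-23 04:59:01+00:00",
            ]
        ),
        "AAPL": [0, 0, 0, 0, 0],
        "TSLA": [0, 0, 1, 2, 0],
    }
)

# Truncate each timestamp to midnight of its day (in the column's own timezone).
day = df["dt"].dt.floor("D")

# Group the ticker columns by day; days without rows never appear.
grouped = df.drop(columns="dt").groupby(day)

daily_mean = grouped.mean()    # per-day sum divided by per-day row count
rows_per_day = grouped.size()  # how many rows were merged into each day
```

One difference to keep in mind: `dropna(how='all')` also removes a day whose rows exist but contain only NaN, while the grouped version keeps such a day.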
Hope this helps.