How to plot cumulative data using plotly in Python?

Question

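I am using Google Colab to generate graphs and charts with plotly in Python. The CSV file I am analyzing contains 6,97,000 rows of data. I am using the following code to generate a bar chart, and it works fine.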

fig = px.bar(df, x='IP', y="Epid_ID")
fig.update_traces(marker=dict(line=dict(width=3,color='blue')))
fig.show()

Now I want a chart that shows cumulative data. Below is a sample of my dataset.

IP             Epid_ID
05/08/2021     COV-NEP-PR4-LAM-21-01936
05/08/2021     COV-NEP-PR4-LAM-21-01937
06/08/2021     COV-NEP-PR4-LAM-21-01938
06/08/2021     COV-NEP-PR4-LAM-21-01939
07/08/2021     COV-NEP-PR4-LAM-21-01940

My expected output is a bar chart showing cumulative data. Current output:

Expected output:

I tried to use cumsum by following this link: https://www.codegrepper.com/code-examples/python/cumulative+chart+python+plotly

I also tried the following code, keeping the date variable as x.

 x = df['IP']
 y = df['Epid_ID']
 cumsum = np.cumsum(x)

However, my runtime crashed when I ran this code. Please help!

Answer 1

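So my interpretation is that you want the output sorted by count in ascending order? Have you tried sorting the DataFrame or sub-DataFrame with df['Epid_ID'].sort("Epid_ID",ascending=False)? You can also try aggregating the DataFrame with a group-by before calling .count():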

df.groupBy("salutation").count().sort("count",ascending=True).show()
+------------+------+
|  salutation| count|
+------------+------+
|not reported|   255|
|     Company|   321|
|      Family|  1467|
|          Mr| 12012|
|         Mrs|382567|
+------------+------+
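
The snippet above uses PySpark-style calls (groupBy, show). Assuming df is a pandas DataFrame, as the question suggests, a rough equivalent might look like the sketch below. The sample rows are copied from the question, and the column name "count" is only an illustrative choice.

```python
import pandas as pd

# Sample rows copied from the question; the real data would come from the CSV.
df = pd.DataFrame({
    "IP": ["05/08/2021", "05/08/2021", "06/08/2021", "06/08/2021", "07/08/2021"],
    "Epid_ID": [
        "COV-NEP-PR4-LAM-21-01936",
        "COV-NEP-PR4-LAM-21-01937",
        "COV-NEP-PR4-LAM-21-01938",
        "COV-NEP-PR4-LAM-21-01939",
        "COV-NEP-PR4-LAM-21-01940",
    ],
})

# Count the rows for each date, then sort by that count in ascending order.
counts = (
    df.groupby("IP")["Epid_ID"]
      .count()
      .reset_index(name="count")   # "count" is an illustrative column name
      .sort_values("count", ascending=True)
)
print(counts)
```

Note that sorting by count only reorders the bars; it does not by itself produce a running total.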

Answer 2

Building a histogram will give you the expected output, since it distributes the data into bins.

Try this:

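import plotly.express as px
import plotly.graph_objects as go

# Example dataset bundled with plotly
df = px.data.iris()

# cumulative_enabled=True makes each bin include the counts of the bins before it
fig = go.Figure(data=[go.Histogram(y=df['sepal_width'], cumulative_enabled=True)])
fig.show()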
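
For the data in the question, another option is to count the rows per date first and then take a running total with pandas. The sketch below assumes the IP column holds day-first date strings, as in the sample. The names daily, cases, cumulative, and plot_cumulative are illustrative and not from the original post. As far as I can tell, calling np.cumsum on the raw date strings does not count rows; on an object-dtype column it concatenates the strings cumulatively, which may be why the runtime ran out of memory on 6,97,000 rows.

```python
import pandas as pd

# Sample rows copied from the question; the real data would come from the CSV.
df = pd.DataFrame({
    "IP": ["05/08/2021", "05/08/2021", "06/08/2021", "06/08/2021", "07/08/2021"],
    "Epid_ID": [
        "COV-NEP-PR4-LAM-21-01936",
        "COV-NEP-PR4-LAM-21-01937",
        "COV-NEP-PR4-LAM-21-01938",
        "COV-NEP-PR4-LAM-21-01939",
        "COV-NEP-PR4-LAM-21-01940",
    ],
})

# Assumption: dates are day/month/year, so parse them to sort chronologically.
df["IP"] = pd.to_datetime(df["IP"], format="%d/%m/%Y")

# One row per date: number of cases that day, then the running total.
daily = (
    df.groupby("IP")["Epid_ID"]
      .count()
      .sort_index()
      .reset_index(name="cases")
)
daily["cumulative"] = daily["cases"].cumsum()


def plot_cumulative(frame):
    """Return a plotly bar chart of the running total per date."""
    import plotly.express as px  # imported here so the aggregation runs without plotly
    return px.bar(frame, x="IP", y="cumulative")
```

In Colab, plot_cumulative(daily).show() should then draw one bar per date with the cumulative count. Aggregating first also keeps the figure small, since plotly receives one row per date rather than one per record.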


python

bar-chart

cumulative-frequency

plotly