Plotting a CDF from a multiclass pandas dataframe

I understand from the documentation that the empiricaldist package provides a Cdf function.

However, I find it tricky to plot my dataframe when one of its columns holds several classes.

df.head()
    +------+---------+---------------+-------------+----------+----------+-------+--------------+-----------+-----------+-----------+-----------+------------+
    |      | trip_id | seconds_start | seconds_end | duration | distance | speed | acceleration | lat_start | lon_start |  lat_end  |  lon_end  | travelmode |
    +------+---------+---------------+-------------+----------+----------+-------+--------------+-----------+-----------+-----------+-----------+------------+
    | 0    |  318410 |    1461743310 |  1461745298 |     1988 | 5121.49  | 2.58  | 0.00130      | 41.162687 | -8.615425 | 41.177888 | -8.597549 | car        |
    | 1    |  318411 |    1461749359 |  1461750290 |      931 | 1520.71  | 1.63  | 0.00175      | 41.177949 | -8.597074 | 41.177839 | -8.597574 | bus        |
    | 2    |  318421 |    1461806871 |  1461806941 |       70 | 508.15   | 7.26  | 0.10370      | 37.091240 | -8.211239 | 37.092322 | -8.206681 | foot       |
    | 3    |  318422 |    1461837354 |  1461838024 |      670 | 1207.39  | 1.80  | 0.00269      | 37.092082 | -8.205060 | 37.091659 | -8.206462 | car        |
    | 4    |  318425 |    1461852790 |  1461853845 |     1055 | 1470.49  | 1.39  | 0.00132      | 37.091628 | -8.202143 | 37.092095 | -8.205070 | foot       |
    +------+---------+---------------+-------------+----------+----------+-------+--------------+-----------+-----------+-----------+-----------+------------+

I would like to plot a CDF for each travel mode in the travelmode column.

groups = df.groupby('travelmode')

However, I really don't understand from the documentation how to do this.

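You can plot them in a loop like this. Iterating over the groupby object gives you the rows for each travel mode, so each CDF is built from that mode's data only: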

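import matplotlib.pyplot as plt
from empiricaldist import Cdf

def decorate_plot(title):
    ''' Adds labels to plot '''
    plt.xlabel('Outcome')
    plt.ylabel('CDF')
    plt.title(title)

# Iterate over each travel mode together with its subset of rows
for tm, group in df.groupby('travelmode'):
    for col in df.columns:
        if col != 'travelmode':
            # Create a new figure for each (travel mode, column) pair
            fig, ax = plt.subplots()
            # Build the CDF from this travel mode's rows only
            d4 = Cdf.from_seq(group[col])
            d4.plot()
            decorate_plot(f"{tm} - {col}")

plt.show()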
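
If the goal is to compare travel modes, it may be easier to read one figure per column with one curve per mode. The following is a minimal sketch of that idea, not part of empiricaldist: it computes the empirical CDF by hand with NumPy, and the helper names `empirical_cdf` and `plot_cdf_by_group` are made up for illustration. The sample frame reuses the speed and travelmode values from the `df.head()` output above.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def empirical_cdf(values):
    """Return sorted values and the fraction of observations <= each one."""
    x = np.sort(np.asarray(values, dtype=float))
    p = np.arange(1, len(x) + 1) / len(x)
    return x, p

def plot_cdf_by_group(df, value_col, group_col="travelmode"):
    """Draw one CDF curve per group on a shared axis (hypothetical helper)."""
    fig, ax = plt.subplots()
    for name, group in df.groupby(group_col):
        x, p = empirical_cdf(group[value_col].dropna())
        # A step curve reflects that the empirical CDF jumps at each observation
        ax.step(x, p, where="post", label=name)
    ax.set_xlabel(value_col)
    ax.set_ylabel("CDF")
    ax.legend(title=group_col)
    return ax

# The five rows shown in the question, reduced to the two columns needed here
sample = pd.DataFrame({
    "speed": [2.58, 1.63, 7.26, 1.80, 1.39],
    "travelmode": ["car", "bus", "foot", "car", "foot"],
})
ax = plot_cdf_by_group(sample, "speed")
```

Outside a notebook, call `plt.show()` afterwards to display the figure. With the full dataframe you would call `plot_cdf_by_group(df, "speed")`, or loop over the numeric columns to get one comparison figure per column.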