How to remove repeating and empty or unmarked values on the x-axis of subplots

I am developing a set of charts to plot some values from a pandas DataFrame. To do this I use various pandas, NumPy and matplotlib modules and functions, with the following code:

    import pandas as pd
    import numpy as np
    from matplotlib import pyplot as plt
    import matplotlib.ticker as ticker
    
    data = {'Name': ['immoControlCmd', 'BrkTerrMde', 'GlblClkYr', 'HsaStat', 'TesterPhysicalResGWM', 'FapLc','FirstRowBuckleDriver', 'GlblClkDay'],
            'Value': [0, 5, 0, 4, 0, 1, 1, 1],
            'Id_Par': [0, 0, 3, 3, 3, 3, 0, 0]
            }
    
    signals_df = pd.DataFrame(data)
    
    
    def plot_signals(signals_df):
        # Count signals by par
        signals_df['Count'] = signals_df.groupby('Id_Par').cumcount().add(1).mask(signals_df['Id_Par'].eq(0), 0)
        # Subtract Par values from the index column
        signals_df['Sub'] = signals_df.index - signals_df['Count']
        id_par_prev = signals_df['Id_Par'].unique()
        id_par = np.delete(id_par_prev, 0)
        signals_df['Prev'] = [1 if x in id_par else 0 for x in signals_df['Id_Par']]
        signals_df['Final'] = signals_df['Prev'] + signals_df['Sub']
        # signals_df['Finall'] = signals_df['Final'].unique()
        # print(signals_df['Finall'])
        # Convert and set Subtract to index
        signals_df.set_index('Final', inplace=True)
        # pos_x = len(signals_df.index.unique()) - 1
        # print(pos_x)
    
        # Get individual names and variables for the chart
        names_list = [name for name in signals_df['Name'].unique()]
        num_names_list = len(names_list)
        num_axis_x = len(signals_df["Name"])
    
        # Creation Graphics
        fig, ax = plt.subplots(nrows=num_names_list, figsize=(10, 10), sharex=True)
        plt.xticks(np.arange(0, num_axis_x), color='SteelBlue', fontweight='bold')
        for pos, (a_, name) in enumerate(zip(ax, names_list)):
            # Get data
            data = signals_df[signals_df["Name"] == name]["Value"]
            # Get values axis-x and axis-y
            x_ = np.hstack([-1, data.index.values, len(signals_df) - 1])
            # print(data.index.values)
            y_ = np.hstack([0, data.values, data.iloc[-1]])
            # Plotting the data by position
            ax[pos].plot(x_, y_, drawstyle='steps-post', marker='*', markersize=8, color='k', linewidth=2)
            ax[pos].set_ylabel(name, fontsize=8, fontweight='bold', color='SteelBlue', rotation=30, labelpad=35)
            ax[pos].yaxis.set_major_formatter(ticker.FormatStrFormatter('%0.1f'))
            ax[pos].yaxis.set_tick_params(labelsize=6)
            ax[pos].grid(alpha=0.4, color='SteelBlue')
        plt.show()
    
    
    plot_signals(signals_df)

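What I want is to remove the points or positions on the x-axis that are not drawn or not marked on the plot, while keeping the values and names shown in the last image. In pandas terms, this is the 'Final' column, which is set as the index before the subplots are drawn and which contains some repeated values. In other words, I want to remove the values inside the red box from the plot, but keep the values and names in the last image: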

                            Name  Value  Id_Par  Count  Sub  Prev
     Final                                                       
     0            immoControlCmd      0       0      0    0     0
     1                BrkTerrMde      5       0      0    1     0
     2                 GlblClkYr      0       3      1    1     1
     2                   HsaStat      4       3      2    1     1
     2      TesterPhysicalResGWM      0       3      3    1     1
     2                     FapLc      1       3      4    1     1
     6      FirstRowBuckleDriver      1       0      0    6     0
     7                GlblClkDay      1       0      0    7     0

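I have tried to take the unique values of the last column, which are the values the x-axis should show, but because the result has a different length from the DataFrame I get an error: ValueError: Length of values (5) does not match length of index (8). I would then also have to resize the chart, but I don't know how to do that in this case: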

        signals_df['Final'] = signals_df['Prev'] + signals_df['Sub']
        signals_df['Finall'] = signals_df['Final'].unique()
        print(signals_df['Finall'])
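
The error arises because unique() returns one entry per distinct value (5 here), while a new column needs one entry per row (8). A minimal sketch of this, using a hypothetical stand-alone Series named final in place of signals_df['Final'], keeps the unique values in a separate variable rather than assigning them as a column:

```python
import pandas as pd

# Hypothetical stand-in for signals_df['Final'] from the question
final = pd.Series([0, 1, 2, 2, 2, 2, 6, 7], name="Final")
df = pd.DataFrame({"Final": final})

unique_final = final.unique()  # one entry per distinct value
n_unique = len(unique_final)
n_rows = len(df)

try:
    # A column must have one value per row, so this assignment fails
    df["Finall"] = unique_final
    assigned = True
except ValueError as err:
    assigned = False
    print(err)
```

The unique values are still usable on their own, for example as the list of tick labels, without being stored in the DataFrame.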

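I also tried taking the number of unique index values and subtracting it from data.index.values, which was previously assigned to x_, but this does not give what I want, because it subtracts the same amount from all the values at once instead of adjusting each one individually: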

    signals_df.set_index('Final', inplace=True)
    pos_x = len(signals_df.index.unique()) - 1
    ...
    ..
    .
         x_ = np.hstack([-1, data.index.values - pos_x, len(signals_df) - 1])

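Is there a pandas and/or matplotlib function that would let me do this? Or can someone give me a suggestion that helps me better understand how to do it? What I hope to achieve is the following plot: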

Thank you very much for your help; any comment helps. I am using Python 3.6.5, pandas 1.1.5 and matplotlib 3.3.2.

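One possible approach is to convert the x-axis values to strings, which makes matplotlib draw a categorical plot. The matplotlib gallery has examples of plotting categorical variables.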
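
To illustrate the difference between a numeric and a categorical x-axis, here is a minimal sketch. The sample values and the variable names are made up for illustration, and the Agg backend is selected only so that the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
from matplotlib import pyplot as plt

# Illustrative x values with a gap: 3, 4 and 5 are missing
x_numeric = [0, 1, 2, 6]
y = [0, 5, 4, 1]

fig, (ax_num, ax_cat) = plt.subplots(nrows=2)

# Numeric x: matplotlib keeps the true spacing, so the gap between 2 and 6 remains
ax_num.plot(x_numeric, y, drawstyle="steps-post", marker="*")

# String x: each distinct string becomes one evenly spaced category
ax_cat.plot([str(v) for v in x_numeric], y, drawstyle="steps-post", marker="*")

fig.canvas.draw()
cat_positions = [float(t) for t in ax_cat.get_xticks()]
cat_labels = [t.get_text() for t in ax_cat.get_xticklabels()]
plt.close(fig)
```

On the categorical axis the four labels sit at consecutive positions, so the unused positions between 2 and 6 no longer appear.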

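In your case, because the subplots may each have different values, and those values do not always arrive in the right order, we first need a small trick to make sure the ticks appear in the correct order. For that, we can plot something that uses all of the x values in the correct order, and then remove it.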

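To collect all of the xtick values, you can do the following: build a list of the values, reduce it to the unique values with set, sort them, and then convert them to strings using a list comprehension and str():
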
# First make a list of all the xticks we want
xvals = [-1,]
for name in names_list:
    xvals.append(signals_df[signals_df["Name"] == name]["Value"].index.values[0])
xvals.append(len(signals_df)-1)

# Reduce to only unique values, sorted, and then convert to strings
xvals = [str(i) for i in sorted(set(xvals))]
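
The order of operations in that last line matters: the values are sorted while they are still numbers and only then converted to strings. A small sketch, assuming the same 'Final' index values as in the question (rebuilt here as a hypothetical stand-alone pandas Index) plus a made-up list containing a two-digit value, shows why:

```python
import pandas as pd

# Hypothetical stand-in for the 'Final' index from the question
final_index = pd.Index([0, 1, 2, 2, 2, 2, 6, 7], name="Final")
n_rows = len(final_index)

# Same result as the loop above, built from the unique index values
xvals = [str(i) for i in sorted({-1, *final_index.unique(), n_rows - 1})]

# Made-up values to show the pitfall once a value has two digits
values = [9, 10, 2, 10]
numeric_then_str = [str(i) for i in sorted(set(values))]  # numeric order
str_then_sorted = sorted({str(i) for i in values})        # lexicographic order
```

With the question's data both orders happen to agree, but a longer DataFrame would put '10' before '2' if the strings were sorted directly.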

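Once you have those, you can make a dummy plot and then remove it, like this (this fixes the tick positions in the correct order). Note that for matplotlib 3.3.4 and earlier, this needs to be done inside your plotting loop: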

# To get the ticks in the right order on all subplots, we need to make
# a dummy plot here and then remove it.
# The dummy y values are numeric zeros so that only the x-axis becomes categorical.
dummy, = ax[0].plot(xvals, np.zeros(len(xvals)))
dummy.remove()
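
To see what the dummy line changes, the following sketch compares one axis without it and one with it. The two short data series, the helper tick_labels and the Agg backend are assumptions made for this illustration only:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
from matplotlib import pyplot as plt
import numpy as np

xvals = ["-1", "0", "1", "2", "6", "7"]


def tick_labels(ax):
    """Return the x tick label texts of ax after drawing the figure."""
    ax.figure.canvas.draw()
    return [t.get_text() for t in ax.get_xticklabels()]


fig, (ax_plain, ax_fixed) = plt.subplots(nrows=2)

# Without the dummy line, categories are registered in order of first appearance
ax_plain.plot(["-1", "2", "7"], [0, 4, 4])
ax_plain.plot(["-1", "0", "7"], [0, 1, 1])

# With the dummy line, every category is registered in sorted order first;
# the registration stays on the axis after the line itself is removed
dummy, = ax_fixed.plot(xvals, np.zeros(len(xvals)))
dummy.remove()
ax_fixed.plot(["-1", "2", "7"], [0, 4, 4])
ax_fixed.plot(["-1", "0", "7"], [0, 1, 1])

labels_plain = tick_labels(ax_plain)
labels_fixed = tick_labels(ax_fixed)
plt.close(fig)
```

On the first axis '0' ends up after '7' because it was seen last, while on the second axis the labels follow the order given in xvals.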

Finally, when you plot the real data inside the loop, you only need to convert x_ to strings as you plot:

ax[pos].plot(x_.astype('str'), y_, drawstyle='steps-post', marker='*', markersize=8, color='k', linewidth=2)

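Note that the only other change I made is not setting the xtick positions explicitly (which you did with plt.xticks), but you can still use that command to set the font colour and weight: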

plt.xticks(color='SteelBlue', fontweight='bold')

Here is the output:

For completeness, here is everything put together in your script:

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import matplotlib.ticker as ticker

import matplotlib
print(matplotlib.__version__)

data = {'Name': ['immoControlCmd', 'BrkTerrMde', 'GlblClkYr', 'HsaStat', 'TesterPhysicalResGWM', 'FapLc',
                 'FirstRowBuckleDriver', 'GlblClkDay'],
        'Value': [0, 5, 0, 4, 0, 1, 1, 1],
        'Id_Par': [0, 0, 3, 3, 3, 3, 0, 0]
        }

signals_df = pd.DataFrame(data)


def plot_signals(signals_df):
    # Count signals by par
    signals_df['Count'] = signals_df.groupby('Id_Par').cumcount().add(1).mask(signals_df['Id_Par'].eq(0), 0)
    # Subtract Par values from the index column
    signals_df['Sub'] = signals_df.index - signals_df['Count']
    id_par_prev = signals_df['Id_Par'].unique()
    id_par = np.delete(id_par_prev, 0)
    signals_df['Prev'] = [1 if x in id_par else 0 for x in signals_df['Id_Par']]
    signals_df['Final'] = signals_df['Prev'] + signals_df['Sub']
    # signals_df['Finall'] = signals_df['Final'].unique()
    # print(signals_df['Finall'])
    # Convert and set Subtract to index
    signals_df.set_index('Final', inplace=True)
    # pos_x = len(signals_df.index.unique()) - 1
    # print(pos_x)

    # Get individual names and variables for the chart
    names_list = [name for name in signals_df['Name'].unique()]
    num_names_list = len(names_list)
    num_axis_x = len(signals_df["Name"])

    # Creation Graphics
    fig, ax = plt.subplots(nrows=num_names_list, figsize=(10, 10), sharex=True)

    # No longer any need to define where the ticks go, but still set the colour and weight here
    plt.xticks(color='SteelBlue', fontweight='bold')

    # First make a list of all the xticks we want
    xvals = [-1, ]
    for name in names_list:
        xvals.append(signals_df[signals_df["Name"] == name]["Value"].index.values[0])
    xvals.append(len(signals_df) - 1)

    # Reduce to only unique values, sorted, and then convert to strings
    xvals = [str(i) for i in sorted(set(xvals))]

    for pos, (a_, name) in enumerate(zip(ax, names_list)):
    
        # To get the ticks in the right order on all subplots,
        # we need to make a dummy plot here and then remove it
        # (numeric zeros for y, so that only the x-axis becomes categorical)
        dummy, = ax[pos].plot(xvals, np.zeros(len(xvals)))
        dummy.remove()
        # Get data
        data = signals_df[signals_df["Name"] == name]["Value"]
        # Get values axis-x and axis-y
        x_ = np.hstack([-1, data.index.values, len(signals_df) - 1])
        y_ = np.hstack([0, data.values, data.iloc[-1]])
        # Plotting the data by position
        # NOTE: here we convert x_ to strings as we plot, to make sure they are plotted as categorical values
        ax[pos].plot(x_.astype('str'), y_, drawstyle='steps-post', marker='*', markersize=8, color='k', linewidth=2)
        ax[pos].set_ylabel(name, fontsize=8, fontweight='bold', color='SteelBlue', rotation=30, labelpad=35)
        ax[pos].yaxis.set_major_formatter(ticker.FormatStrFormatter('%0.1f'))
        ax[pos].yaxis.set_tick_params(labelsize=6)
        ax[pos].grid(alpha=0.4, color='SteelBlue')

    plt.show()


plot_signals(signals_df)
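
As a possible alternative to string categories, and not part of the script above, the index values could be mapped to consecutive integer positions and the original values used only as tick labels. This is a rough sketch under that assumption, with a hypothetical stand-alone Index named final in place of the 'Final' index:

```python
import pandas as pd

# Hypothetical stand-in for the 'Final' index from the question
final = pd.Index([0, 1, 2, 2, 2, 2, 6, 7], name="Final")

# codes: consecutive position of each row's value
# uniques: the sorted distinct values, one per position
codes, uniques = pd.factorize(final, sort=True)

positions = [int(c) for c in codes]
labels = [str(v) for v in uniques]
```

The positions could then be plotted as ordinary numbers, followed by ax.set_xticks(range(len(labels))) and ax.set_xticklabels(labels) to show the original values on the axis.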