Keep the original order of the categorical variables

Question

I want to plot confidence intervals for a categorical variable. Here is my data:

Cluster pairs                           coef    Conf. Int. Low      Conf. Int. Upp.
Strong sci-tech – Strong science    0.656977    0.470414            0.843541
Weak science – Strong science      -0.060731   -0.238301            0.116839
Weak sci-tech – Strong science     -0.238147   -0.424907           -0.051388
Weak science – Strong sci-tech     -0.717708   -0.880094           -0.555322

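I use the following code to plot these intervals:

import matplotlib.pyplot as plt

# Draw one horizontal segment per row, at y = 0, 1, 2, ...
for lower, upper, y in zip(
    confidence_interval["Conf. Int. Low"],
    confidence_interval["Conf. Int. Upp."],
    range(len(confidence_interval)),
):
    # "o-" draws a marker at each end joined by a line; the color is set once
    plt.plot((lower, upper), (y, y), "o-", color="blue")

# Axis decoration only needs to run once, after the loop
plt.yticks(
    range(len(confidence_interval)),
    list(confidence_interval["Cluster pairs"]),
)
plt.ylabel("Cluster pairs", fontsize=20)
plt.xlabel("Coefficient differences", fontsize=20)
plt.axvline(x=0, linestyle="--", color="black")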

The problem is that my categorical variable gets reordered in the plot. I want to keep the original order.

Answer 1

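I found a simple way to solve this problem:

# One label per row, in the original order (the list length must match the number of rows)
confidence_interval["cat"] = ["1", "2", "3", "4", "5", "6"]
# Descending sort puts the first row at the highest y position, i.e. at the top of the plot
confidence_interval = confidence_interval.sort_values("cat", ascending=False)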

It gives the following output:

[Figure: Keep the original order of the categorical variables]
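
A possible alternative, sketched here under an assumption: if the "reordering" is simply that matplotlib draws y = 0 at the bottom, so the first row ends up lowest, then flipping the y-axis with `Axes.invert_yaxis()` puts the rows back in table order without adding a helper column. The DataFrame construction below is an assumption made to keep the example self-contained; the values are copied from the question.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Values copied from the question; building the DataFrame this way is an assumption.
confidence_interval = pd.DataFrame(
    {
        "Cluster pairs": [
            "Strong sci-tech – Strong science",
            "Weak science – Strong science",
            "Weak sci-tech – Strong science",
            "Weak science – Strong sci-tech",
        ],
        "coef": [0.656977, -0.060731, -0.238147, -0.717708],
        "Conf. Int. Low": [0.470414, -0.238301, -0.424907, -0.880094],
        "Conf. Int. Upp.": [0.843541, 0.116839, -0.051388, -0.555322],
    }
)

fig, ax = plt.subplots()
positions = list(range(len(confidence_interval)))

# One horizontal segment per row, drawn at y = 0, 1, 2, ...
for y, lower, upper in zip(
    positions,
    confidence_interval["Conf. Int. Low"],
    confidence_interval["Conf. Int. Upp."],
):
    ax.plot((lower, upper), (y, y), "o-", color="blue")

ax.set_yticks(positions)
ax.set_yticklabels(list(confidence_interval["Cluster pairs"]))
ax.set_ylabel("Cluster pairs")
ax.set_xlabel("Coefficient differences")
ax.axvline(x=0, linestyle="--", color="black")

# Flip the y-axis so that row 0 is drawn at the top instead of the bottom.
ax.invert_yaxis()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result. Unlike the sort-based approach, this leaves the DataFrame itself untouched.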

python

matplotlib

confidence-interval