Show only the most frequent elements in seaborn

Question

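Good afternoon, everyone. I'm a Python beginner and hope someone can help! I have a DataFrame containing a list of Netflix movies and the number of votes each movie received. For example: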

Title : The100 Votes : 1500
Title : Marania Votes : 2000

My question is simple:

I want to use seaborn and matplotlib to plot the 5 movies with the most votes, each shown with its own vote count.

My attempt:

import seaborn as sns

...

sns.catplot(x='title', y='votes_number', data=top5_series)

But I really don't understand how to plot only the top 5.

Thanks in advance!

Answer 1

You can do all of this with pandas:

import pandas as pd
import numpy as np
import string

np.random.seed(1)
df = pd.DataFrame(
    {
        "movie": [i for i in string.ascii_uppercase],
        "vote": np.random.randint(low=10, high=500, size=len(string.ascii_uppercase))
    }
)

# Change n_most to plot a different number of movies
n_most = 5
df.nlargest(n_most, "vote").plot(kind="bar", x="movie", y="vote", figsize=(15, 6), rot=45)
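
Since the question asks for seaborn specifically, here is a minimal sketch of the same `nlargest` idea drawn with `sns.barplot`, with each vote count written above its bar. The column names `title` and `votes_number` are taken from the question's attempt; the rows other than The100 and Marania are made-up placeholder data, and `bar_label` assumes matplotlib 3.4 or newer.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical data shaped like the question's frame (column names assumed;
# the "Film ..." rows are placeholders, not real data).
movies = pd.DataFrame(
    {
        "title": ["The100", "Marania", "Film C", "Film D", "Film E", "Film F", "Film G"],
        "votes_number": [1500, 2000, 300, 4200, 950, 2750, 120],
    }
)

# Keep only the five rows with the highest vote counts, largest first.
top5 = movies.nlargest(5, "votes_number")

# One bar per title; the bars follow the row order of top5.
ax = sns.barplot(data=top5, x="title", y="votes_number")

# Write each vote count above its bar (requires matplotlib >= 3.4).
ax.bar_label(ax.containers[0])
ax.set(xlabel="Title", ylabel="Votes")
plt.tight_layout()
```

In a plain script you would probably finish with `plt.show()` or `plt.savefig(...)`; in a notebook the figure should appear on its own.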

Answer 2

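I'll use one of seaborn's example datasets here, since I don't have your data. This alternative sorts the DataFrame and then plots only a subset of it.

I pulled a few pieces out of the plotting code and defined them as variables beforehand to make it easier to read, but those values could just as well be passed directly in the sns.catplot() call.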

import seaborn as sns
sns.set_theme(style="whitegrid")

# Get data
penguins = sns.load_dataset("penguins")

# Sort descending so the largest values come first and can be sliced off the top
# (`sort_values` puts `NaN` rows last by default, so they stay out of the top rows)
penguins.sort_values("bill_length_mm", ascending=False, inplace=True)


# Pull out only the first five rows
subset = penguins.iloc[ :5, :]

# Get index values to use for `x`, so it doesn't group them
# as it would if `x='species'` or `island`

idx = penguins.index.to_list()[:5]

# Plot
g = sns.catplot(
    data=subset, kind="bar",
    x=idx, y="body_mass_g",
)
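
Applied to data like the question's, where each title appears only once, the index trick should not be needed: the title column can be used for `x` directly. The sketch below follows the same sort-then-subset approach under that assumption. The column names come from the question's attempt, the "Film ..." rows are invented placeholders, and the missing value is there only to show the `dropna` step.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the question's data (column names assumed;
# the "Film ..." rows are placeholders, one of them with a missing count).
movies = pd.DataFrame(
    {
        "title": ["The100", "Marania", "Film C", "Film D", "Film E", "Film F", "Film G"],
        "votes_number": [1500, 2000, None, 4200, 950, 2750, 120],
    }
)

# Drop rows without a vote count, sort descending, and keep the first five rows.
subset = (
    movies.dropna(subset=["votes_number"])
    .sort_values("votes_number", ascending=False)
    .head(5)
)

# Titles are unique, so x="title" gives one bar per movie;
# `order` pins the bars to the sorted order.
g = sns.catplot(
    data=subset, kind="bar",
    x="title", y="votes_number",
    order=subset["title"].tolist(),
)
g.set_axis_labels("Title", "Votes")
```

If titles could repeat, the bar plot would aggregate them (by mean, by default), in which case the index-based `x` from the answer above, or a `groupby` beforehand, would be the safer choice.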


python

matplotlib

seaborn