TypeError: Cannot use method 'nlargest' with dtype object

Question

I have this data:

              ID      Date_utc Upvotes Number of Comments                                     Subthread name Post Author
0     sw73ml  1645266563.0       2                NaN  I fucking love cars, but actually driving, or ...         NaN
1     sw73sa  1645266581.0       3                NaN                               It's my birthday!!!!         NaN
2     sw73va  1645266588.0       3                NaN                            My bike just got stolen         NaN
3     sw73x0  1645266593.0       4                NaN                   I feel like an outsider socially         NaN
4     sw75gk  1645266754.0      10                NaN     Hallo? Ist dis Bert und Ernie’s BDSM emporium?         NaN
...      ...           ...     ...                ...                                                ...         ...
7703  uou8wd  1652455643.0       2                NaN                       Holy crap I forgot how good…         NaN
7704  uou8yy  1652455648.0       4                NaN                       Just got told to kill myself         NaN
7705  uou8zv  1652455650.0       4                NaN                                            hey YOU         NaN
7706  uouagi  1652455771.0       1                NaN                           STEVEN UNIVERSE IS GREAT         NaN
7707  uouaks  1652455780.0       1                NaN  drinking water after chewing gum is the cold e...         NaN

I want to get the 10 largest values in the ['Upvotes'] column, but I get this error: TypeError: Cannot use method 'nlargest' with dtype object

filelar = df_p.nlargest(10, "Upvotes" )

It raises that error even though the items in Upvotes should be numbers, so I tried this:

for i in df_p["Upvotes"].items():
        try: 
            df_p["Upvotes"] = df_p["Upvotes"].astype(float) 
            print('succsessful') 
        except: 
            pass 
            print('failed')

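But it just printed 'failed' for every item in Upvotes. I then printed i along with the 'failed' message, and noticed that index 7505 holds (7505, 'Upvotes'), the string 'Upvotes', instead of a number. I found 10 of these. I think they may be what causes the problem.

So if I'm right, that's the problem. Is there any way to skip the items that cause it? The way I tried it with try and except didn't go well.

Thanks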

Answer 1

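You can try pandas.to_numeric. With errors='coerce', any value that cannot be parsed as a number becomes NaN instead of raising an error: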

df['Upvotes'] = pd.to_numeric(df['Upvotes'], errors='coerce')

filelar = df.nlargest(10, "Upvotes" )
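If you also want to see which rows were the culprits and drop them before ranking, something along these lines may help. This is only a sketch: the sample frame below is made up to mimic a stray header row (the string 'Upvotes' in the Upvotes column), and the names bad_rows, clean and top are just for illustration. Note that isna() will also flag rows whose Upvotes value was already missing in the original data.

```python
import pandas as pd

# Hypothetical sample data: the row containing "ID"/"Upvotes" mimics a
# repeated header line that ended up inside the data.
df_p = pd.DataFrame({
    "ID": ["sw73ml", "sw73sa", "ID", "sw75gk", "uou8yy"],
    "Upvotes": ["2", "3", "Upvotes", "10", "4"],
})

# Unparseable values become NaN instead of raising an exception.
upvotes = pd.to_numeric(df_p["Upvotes"], errors="coerce")

# Rows whose Upvotes could not be parsed (or were already missing).
bad_rows = df_p[upvotes.isna()]

# Keep only the rows with a numeric Upvotes value.
clean = df_p.assign(Upvotes=upvotes).dropna(subset=["Upvotes"])

# nlargest now works because the column has a numeric dtype.
top = clean.nlargest(3, "Upvotes")
print(bad_rows)
print(top)
```

The loop in the question fails on every iteration because astype(float) is applied to the whole column each time, so a single bad value makes every attempt raise. Converting once with errors='coerce' avoids that.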
