Python: float() argument must be a string or a number, not 'pandas
I am trying to plot a graph with the following code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import mpld3
my_list = [1,2,3,4,5,7,8,9,11,23,56,78,3,3,5,7,9,12]
new_list = pd.Series(my_list)
df1 = pd.DataFrame({'Range1':new_list.value_counts().index, 'Range2':new_list.value_counts().values})
df1.sort_values(by=["Range1"],inplace=True)
df2 = df1.groupby(pd.cut(df1["Range1"], [0,1,2,3,4,5,6,7,8,9,10,11,df1['Range1'].max()])).sum()
objects = df2['Range2'].index
y_pos = np.arange(len(df2['Range2'].index))
plt.bar(df2['Range2'].index.values, df2['Range2'].values)
But I get the following error message:
TypeError: float() argument must be a string or a number, not 'pandas._libs.interval.Interval'
I don't know where this float error comes from. Any suggestions would be much appreciated.
The pd.cut operation produces intervals:
In [11]: pd.cut(df1["Range1"], [0,1,2,3,4,5,6,7,8,9,10,11,df1['Range1'].max()])
Out[11]:
12 (0, 1]
11 (1, 2]
0 (2, 3]
10 (3, 4]
3 (4, 5]
2 (6, 7]
9 (7, 8]
1 (8, 9]
8 (10, 11]
7 (11, 78]
5 (11, 78]
4 (11, 78]
6 (11, 78]
Name: Range1, dtype: category
Categories (12, interval[int64]): [(0, 1] < (1, 2] < (2, 3] < (3, 4] ... (8, 9] < (9, 10] < (10, 11] <
(11, 78]]
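To make the point above concrete, here is a minimal sketch that inspects what pd.cut returns. The variable names (values, bins, binned, first) and the shortened input list are illustrative assumptions, not part of the original question.

```python
import pandas as pd

# Shortened, hypothetical input; the bin edges mirror the ones in the question.
values = pd.Series([1, 2, 3, 7, 11, 23, 78])
bins = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, values.max()]

binned = pd.cut(values, bins)

# The result is a categorical Series whose categories are pandas Interval objects.
print(binned.dtype)
first = binned.iloc[0]
print(type(first))
print(first.left, first.right)
```

Each element is an Interval with left and right edges rather than a plain number, which is what matters for the plotting error discussed next.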
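When this is used in the groupby operation, rows are matched on the index of the cut result above and then grouped and summed as you specified.
As a result, the intervals end up as the index of df2: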
In [14]: df2
Out[14]:
          Range1  Range2
Range1
(0, 1]         1       1
(1, 2]         2       1
(2, 3]         3       3
(3, 4]         4       1
(4, 5]         5       2
(5, 6]         0       0
(6, 7]         7       2
(7, 8]         8       1
(8, 9]         9       2
(9, 10]        0       0
(10, 11]      11       1
(11, 78]     169       4
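When you use df2['Range2'].index.values, an array of these intervals is passed as the first argument to bar, and matplotlib cannot convert them to floats as it expects.
If you just want a bar chart of df2.Range2 and are happy to have the intervals as the axis labels, this will work: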
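The failure can be reproduced without matplotlib at all. The sketch below simply calls float() on a single Interval; the names interval, raised, and message are illustrative, and the exact wording of the error text may differ between Python and pandas versions.

```python
import pandas as pd

# A single interval like the ones that end up in the df2 index.
interval = pd.Interval(0, 1, closed="right")

raised = False
try:
    # An Interval has no float conversion, so this raises TypeError.
    float(interval)
except TypeError as exc:
    raised = True
    message = str(exc)
    print(message)
```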
plt.bar(range(len(df2)), df2.Range2.values, tick_label=df2.Range2.index.values)
For me, this produces a bar chart with the intervals as tick labels.
Matplotlib cannot plot the category data type. You need to convert it to strings.
plt.bar(df2['Range2'].index.astype(str), df2['Range2'].values)
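For reference, here is a self-contained sketch of the string-label approach. It is a simplification rather than the original code: it counts the values per bin directly with groupby(...).size() instead of going through df1, and it selects the non-interactive "Agg" backend on the assumption that no display is available. The names values, bins, counts, labels, and bars are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless environment; drop this for interactive use
import matplotlib.pyplot as plt
import pandas as pd

my_list = [1, 2, 3, 4, 5, 7, 8, 9, 11, 23, 56, 78, 3, 3, 5, 7, 9, 12]
values = pd.Series(my_list)
bins = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, values.max()]

# Count how many values fall into each bin; observed=False keeps empty bins.
counts = values.groupby(pd.cut(values, bins), observed=False).size()

# Convert the interval index to plain strings so matplotlib can use it as labels.
labels = counts.index.astype(str).tolist()

fig, ax = plt.subplots()
bars = ax.bar(labels, counts.values)
plt.setp(ax.get_xticklabels(), rotation=45)  # keep the interval labels readable
fig.tight_layout()
fig.savefig("bins.png")  # hypothetical output file name
```

Counting directly should give the same per-bin totals as the Range2 column of df2 for this input.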