Python 3.+, SciPy stats mode function gives TypeError: unorderable types: str() > float()
I am trying to solve the Kaggle Titanic disaster problem, specifically imputing missing values using the mode/mean/median. Here is a peek at my dataset:
   Parch            Ticket     Fare  Cabin  Embarked
0      0         A/5 21171   7.2500    NaN         S
1      0          PC 17599  71.2833    C85         C
2      0  STON/O2. 3101282   7.9250    NaN         S
3      0            113803  53.1000   C123         S
4      0            373450   8.0500    NaN         S
I am trying to get the mode of the 'Embarked' column, which has dtype 'object'. I am using Python 3. Here is the code snippet:
modeEmbarked = mode(df.Embarked)
Here is the error snippet:
<ipython-input-39-1b4237d65022> in clean(df)
18
19 # Cleaning Embarked column
---> 20 modeEmbarked = mode(df.Embarked)
21 # print(mode(df.Embarked))
22 # le_embarked = preprocessing.LabelEncoder()
/home/singhaniya/anaconda3/lib/python3.5/site-packages/scipy/stats/stats.py in mode(a, axis)
635 return np.array([]), np.array([])
636
--> 637 scores = np.unique(np.ravel(a)) # get ALL unique values
638 testshape = list(a.shape)
639 testshape[axis] = 1
/home/singhaniya/anaconda3/lib/python3.5/site-packages/numpy/lib/arraysetops.py in unique(ar, return_index, return_inverse, return_counts)
196 aux = ar[perm]
197 else:
--> 198 ar.sort()
199 aux = ar
200 flag = np.concatenate(([True], aux[1:] != aux[:-1]))
TypeError: unorderable types: str() > float()
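This is because you have mixed types in df.Embarked. Make sure all items are of the same type (or of types that can be compared with each other).
Alternatively, use Series.mode(), which can handle mixed types.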
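As a minimal sketch of the Series.mode() suggestion, the snippet below uses a small made-up Series standing in for df.Embarked (the sample values and the names embarked, mode_embarked, and filled are assumptions for illustration, not the asker's data). Series.mode() returns a Series because there can be ties, so the first entry is taken.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df.Embarked: strings plus NaN (a float).
embarked = pd.Series(["S", "C", "S", "S", np.nan, "Q"], dtype=object, name="Embarked")

# Series.mode() ignores NaN by default and returns a Series (ties are possible),
# so take the first entry to get a single value.
mode_embarked = embarked.mode().iloc[0]

# Impute the missing entries with the mode.
filled = embarked.fillna(mode_embarked)
```

With this approach no import from scipy.stats is needed for the imputation step.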
Using
modeEmbarked = mode(df.Embarked.dropna())
instead of
modeEmbarked = mode(df.Embarked)
solves the problem.
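To illustrate why dropping the missing values helps, here is a rough sketch on a made-up Series (the sample values and variable names are assumptions). The traceback shows the failure happening in a sort inside np.unique, so the sketch reproduces that sort directly with NumPy rather than calling scipy.stats.mode: NaN is a float, and Python 3 refuses to order a float against a str.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df.Embarked: strings plus NaN, which is a float.
embarked = pd.Series(["S", "C", "S", np.nan, "S"], dtype=object, name="Embarked")

# Sorting the raw values has to compare str with float, which Python 3 rejects.
try:
    np.sort(embarked.to_numpy())
    sort_failed = False
except TypeError:
    sort_failed = True

# After dropna() only strings remain, so sorting and counting both work.
values, counts = np.unique(embarked.dropna().to_numpy(), return_counts=True)
mode_embarked = values[counts.argmax()]
```

Here the mode is picked as the unique value with the highest count, which is one simple way to do it once the column holds a single comparable type.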