TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('<U1')

Question

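While trying to plot a histogram of a small toy dataset, I get a strange error from numpy via matplotlib. I'm not sure how to interpret the error, which makes it hard to see what to do next.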

I didn't find much related material, although this question and this gdsCAD question look superficially similar.

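I expect the debugging information at the bottom to be more helpful than the driver code, but if I've left anything out, please ask. This is reproducible as part of an existing test suite.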

        if n > 1:
            return diff(a[slice1]-a[slice2], n-1, axis=axis)
        else:
>           return a[slice1]-a[slice2]
E           TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('<U1')

../py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py:1567: TypeError
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py(1567)diff()
-> return a[slice1]-a[slice2]
(Pdb) bt
[...]
py2.7.11-venv/lib/python2.7/site-packages/matplotlib/axes/_axes.py(5678)hist()
-> m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
  py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py(606)histogram()
-> if (np.diff(bins) < 0).any():
> py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py(1567)diff()
-> return a[slice1]-a[slice2]
(Pdb) p numpy.__version__
'1.11.0'
(Pdb) p matplotlib.__version__
'1.4.3'
(Pdb) a
a = [u'A' u'B' u'C' u'D' u'E']
n = 1
axis = -1
(Pdb) p slice1
(slice(1, None, None),)
(Pdb) p slice2
(slice(None, -1, None),)
(Pdb)

Answer 1

Why are you applying diff to an array of strings?

I get an error at the same point, though with a different message:

In [23]: a=np.array([u'A', u'B', u'C', u'D', u'E'])

In [24]: np.diff(a)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-24-9d5a62fc3ff0> in <module>()
----> 1 np.diff(a)

C:\Users\paul\AppData\Local\Enthought\Canopy\User\lib\site-packages\numpy\lib\function_base.pyc in diff(a, n, axis)
   1112         return diff(a[slice1]-a[slice2], n-1, axis=axis)
   1113     else:
-> 1114         return a[slice1]-a[slice2]
   1115 
   1116 

TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'numpy.ndarray'

Is this array a the bins parameter? What does the documentation say bins should be?
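To make the point concrete, here is a minimal sketch, assuming the toy data are category labels rather than numbers (the `observations` array and the helper name are made up for illustration). It shows that np.diff rejects a string array, and that counting labels with np.unique sidesteps np.histogram altogether:

```python
import numpy as np

# Hypothetical toy data: category labels rather than numbers.
observations = np.array(["A", "B", "A", "C", "A"])


def diff_raises_type_error(arr):
    """Return True if np.diff rejects the array with a TypeError."""
    try:
        np.diff(arr)
    except TypeError:
        return True
    return False


# Strings have no subtraction loop, so np.diff fails on them.
print(diff_raises_type_error(observations))

# Counting each label avoids np.histogram (and np.diff) entirely.
labels, counts = np.unique(observations, return_counts=True)
print(dict(zip(labels.tolist(), counts.tolist())))
```

The counts can then be drawn with a bar chart instead of hist, if that is what the histogram was meant to show.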

Answer 2

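I'm fairly new to this myself, but I hit a similar error and found that it was caused by a type casting problem. I was trying to concatenate rather than take a difference, but I think the principle is the same here. I gave a similar answer on another question, so I hope that's OK.

In essence you need a different data type cast. In my case I needed str rather than float, and I suspect yours is the same, so my suggested solution is below. I'm sorry I couldn't test it before suggesting it, but it isn't clear to me from your example what you were doing.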

return diff(str(a[slice1])-str(a[slice2]), n-1, axis=axis)

See the example below for the fix to my own code; the change is on the third-to-last line. The code builds a basic random forest model.

import scipy
import math
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn import preprocessing, metrics, cross_validation

Data = pd.read_csv("Free_Energy_exp.csv", sep=",")
Data = Data.fillna(Data.mean()) # replace the NA values with the mean of the descriptor
header = Data.columns.values # Use the column headers as the descriptor labels
Data.head()
test_name = "Test.csv"

npArray = np.array(Data)
print header.shape
npheader = np.array(header[1:-1])
print("Array shape X = %d, Y = %d " % (npArray.shape))
datax, datay =  npArray.shape

names = npArray[:,0]
X = npArray[:,1:-1].astype(float)
y = npArray[:,-1] .astype(float)
X = preprocessing.scale(X)

XTrain, XTest, yTrain, yTest = cross_validation.train_test_split(X,y, random_state=0)

# Predictions results initialised 
RFpredictions = []
RF = RandomForestRegressor(n_estimators = 10, max_features = 5, max_depth = 5, random_state=0)
RF.fit(XTrain, yTrain)       # Train the model
print("Training R2 = %5.2f" % RF.score(XTrain,yTrain))
RFpreds = RF.predict(XTest)

with open(test_name,'a') as fpred :
    lenpredictions = len(RFpreds)
    lentrue = yTest.shape[0]
    if lenpredictions == lentrue :
            fpred.write("Names/Label,, Prediction Random Forest,, True Value,\n")
            for i in range(0,lenpredictions) :
                    fpred.write(RFpreds[i]+",,"+yTest[i]+",\n")
    else :
            print "ERROR - names, prediction and true value array size mismatch."

This results in the error:

Traceback (most recent call last):
  File "min_example.py", line 40, in <module>
    fpred.write(RFpreds[i]+",,"+yTest[i]+",\n")
TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('S32') dtype('S32') dtype('S32')

The solution is to wrap each variable in str() on the third-to-last line before writing to the file. No other changes were made to the code above.

import scipy
import math
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn import preprocessing, metrics, cross_validation

Data = pd.read_csv("Free_Energy_exp.csv", sep=",")
Data = Data.fillna(Data.mean()) # replace the NA values with the mean of the descriptor
header = Data.columns.values # Use the column headers as the descriptor labels
Data.head()
test_name = "Test.csv"

npArray = np.array(Data)
print header.shape
npheader = np.array(header[1:-1])
print("Array shape X = %d, Y = %d " % (npArray.shape))
datax, datay =  npArray.shape

names = npArray[:,0]
X = npArray[:,1:-1].astype(float)
y = npArray[:,-1] .astype(float)
X = preprocessing.scale(X)

XTrain, XTest, yTrain, yTest = cross_validation.train_test_split(X,y, random_state=0)

# Predictions results initialised 
RFpredictions = []
RF = RandomForestRegressor(n_estimators = 10, max_features = 5, max_depth = 5, random_state=0)
RF.fit(XTrain, yTrain)       # Train the model
print("Training R2 = %5.2f" % RF.score(XTrain,yTrain))
RFpreds = RF.predict(XTest)

with open(test_name,'a') as fpred :
    lenpredictions = len(RFpreds)
    lentrue = yTest.shape[0]
    if lenpredictions == lentrue :
            fpred.write("Names/Label,, Prediction Random Forest,, True Value,\n")
            for i in range(0,lenpredictions) :
                    fpred.write(str(RFpreds[i])+",,"+str(yTest[i])+",\n")
    else :
            print "ERROR - names, prediction and true value array size mismatch."

These examples come from a larger code base, so I hope they are clear enough.
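The failing write can also be reproduced without the dataset. This is only a sketch with made-up values; `format_row` is a hypothetical helper, not part of the code above:

```python
import numpy as np


def format_row(prediction, true_value):
    """Build one output line, converting each number to str first."""
    return str(prediction) + ",," + str(true_value) + ",\n"


# Made-up stand-ins for RFpreds[i] and yTest[i].
pred = np.float64(1.5)
true = np.float64(2.0)

# Adding a numpy float to a Python string raises a TypeError,
# because numpy has no 'add' loop for a float and a string.
try:
    pred + ",,"
    concatenation_failed = False
except TypeError:
    concatenation_failed = True

print(concatenation_failed)
print(format_row(pred, true))
```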

Answer 3

I had a similar problem where the integers in a row of a DataFrame I was iterating over were of type numpy.int64. I got the

TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('<U1')

error when trying to subtract a float from it.

The easiest fix for me was to convert the row with pd.to_numeric(row).
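A minimal sketch of that conversion, assuming a row whose values arrived as strings (the `row` contents here are invented for illustration):

```python
import pandas as pd

# Hypothetical row whose numbers are stored as strings.
row = pd.Series(["1", "2", "3"])

# pd.to_numeric parses the values into a numeric dtype,
# after which ordinary arithmetic works.
numeric_row = pd.to_numeric(row)
result = numeric_row - 0.5

print(result.tolist())
```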

Answer 4

I got the same error, but in my case I was subtracting a dict.key from a dict.value. I fixed it by subtracting the dict.value of the corresponding key from the other dict.value.

cosine_sim = cosine_similarity(e_b-e_a, w-e_c)

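I got the error here because e_a, e_b and e_c are the embedding vectors for the words a, b and c respectively, whereas 'w' was a string. I hadn't realised that, and once I found out that w was a string I fixed it with the following line: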

cosine_sim = cosine_similarity(e_b-e_a, word_to_vec_map[w]-e_c)

Instead of subtracting the dict.key, I now subtract the value that corresponds to the key.
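The same idea as a self-contained sketch. The tiny `word_to_vec_map` and the `cosine_similarity` helper below are stand-ins invented for illustration, not the real embeddings or function from my code:

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (illustrative helper)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Hypothetical two-dimensional "embeddings".
word_to_vec_map = {
    "a": np.array([1.0, 0.0]),
    "b": np.array([0.0, 1.0]),
    "c": np.array([1.0, 1.0]),
    "d": np.array([0.0, 2.0]),
}

e_a = word_to_vec_map["a"]
e_b = word_to_vec_map["b"]
e_c = word_to_vec_map["c"]

w = "d"  # w is a key (a string), not a vector

# w - e_c would raise the TypeError; look the vector up first.
cosine_sim = cosine_similarity(e_b - e_a, word_to_vec_map[w] - e_c)
print(cosine_sim)
```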

Answer 5

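I think @James is right. I ran into the same error while working with polyval(). Yes, the solution is to use variables of the same type. You can use type casting to convert all the variables to one type.

Here is an example: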

import numpy
P = numpy.array(input().split(), float)
x = float(input())
print(numpy.polyval(P,x))

Here I use float as the output type, so even if the user enters INT (integer) values, the final answer is converted to float.

Answer 6

I ran into the same problem, but in my case it was simply a Python list being used instead of a numpy array. Using two numpy arrays solved it for me.
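Roughly what that looks like, as a sketch with made-up numbers (the variable names are only for illustration):

```python
import numpy as np

# Hypothetical inputs that arrive as plain Python lists.
left = [1.0, 2.0, 3.0]
right = [0.5, 0.5, 0.5]

# Converting both operands to numeric arrays first makes the
# element-wise subtraction well defined.
left_arr = np.asarray(left, dtype=float)
right_arr = np.asarray(right, dtype=float)
difference = left_arr - right_arr

print(difference.tolist())
```

Passing dtype=float explicitly is a deliberate choice here: it fails early with a clear message if a non-numeric value slipped into either list.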

numpy

matplotlib

python-unicode