Length of histogram differs between images
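I am running the LBP algorithm to classify images by their texture features. The classifier is LinearSVC from the sklearn.svm package.
Computing the histograms and fitting the SVM both work, but sometimes the length of the histogram varies depending on the image.
An example follows: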
from skimage import feature
from scipy.stats import itemfreq
from sklearn.svm import LinearSVC
import numpy as np
import cv2
import cvutils
import csv
import os

def __get_hist(image, radius):
    NumPoint = radius*8
    lbp = feature.local_binary_pattern(image, NumPoint, radius, method="uniform")
    x = itemfreq(lbp.ravel())
    hist = x[:,1]/sum(x[:,1])
    return hist

def get_trainHist_list(train_txt):
    train_dic = {}
    with open(train_txt, 'r') as csvfile:
        reader = csv.reader(csvfile, delimiter = ' ')
        for row in reader:
            train_dic[row[0]] = int(row[1])

    hist_list=[]
    key_list=[]
    label_list=[]
    for key, label in train_dic.items():
        img = cv2.imread("D:/Python36/images/texture/%s" %key, cv2.IMREAD_GRAYSCALE)
        key_list.append(key)
        label_list.append(label)
        hist_list.append(__get_hist(img,3))
    bundle = [np.array(key_list), np.array(label_list), np.array(hist_list)]
    return bundle

train_txt = 'D:/Python36/images/class_train.txt'
train_hist = get_trainHist_list(train_txt)
model = LinearSVC(C=100.0, random_state=42)
model.fit(train_hist[2], train_hist[1])
for i in train_hist[2]:
    print(len(i))

test_img = cv2.imread("D:/Python36/images/texture_test/flat-3.png", cv2.IMREAD_GRAYSCALE)
hist= np.array(__get_hist(test_img, 3))
print(len(hist))
prediction = model.predict([hist])
print(prediction)
Result:
26
26
26
26
26
26
25
Traceback (most recent call last):
  File "D:\Python36\texture.py", line 44, in <module>
    prediction = model.predict([hist])
  File "D:\Python36\lib\site-packages\sklearn\linear_model\base.py", line 324, in predict
    scores = self.decision_function(X)
  File "D:\Python36\lib\site-packages\sklearn\linear_model\base.py", line 305, in decision_function
    % (X.shape[1], n_features))
ValueError: X has 25 features per sample; expecting 26
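As you can see, the histogram length is 26 for every training image but 25 for test_img. Because of that, predict on the SVM fails.
My guess is that the histogram of test_img has empty bins and that those empty bins are being skipped (I'm not sure).
Does anyone have a solution?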
There are 59 different uniform LBPs for a neighbourhood of 8 points. This should be the dimension of your feature vectors, but it is not, because you used itemfreq to compute the histograms (as a side note, itemfreq is deprecated). The length of a histogram obtained through itemfreq is the number of different uniform LBPs that actually occur in the image. If some uniform LBPs are not present in the image, the resulting histogram will have fewer than 59 bins. This issue can be easily fixed by using bincount, as shown in the following toy example:
import numpy as np
from skimage import feature
from scipy.stats import itemfreq
lbp = np.array([[0, 0, 0, 0],
                [1, 1, 1, 1],
                [8, 8, 9, 9]])
hi = itemfreq(lbp.ravel())[:, 1] # wrong approach
hb = np.bincount(lbp.ravel(), minlength=59) # proposed method
The output looks like this:
In [815]: hi
Out[815]: array([4, 4, 2, 2], dtype=int64)
In [816]: hb
Out[816]:
array([4, 4, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0], dtype=int64)
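The toy example uses minlength=59, the figure quoted for an 8-point neighbourhood. The code in the question calls local_binary_pattern with 24 points (radius*8 with radius 3) and method="uniform". Assuming that method produces codes in the range 0..P+1, the fixed length there would be P+2 = 26, which agrees with the 26 printed for the training images. The following is a minimal sketch of a histogram helper built on that assumption. The name lbp_histogram and the two small code arrays are hypothetical and stand in for real local_binary_pattern output. The cast to integers is there because local_binary_pattern returns a floating-point array, while np.bincount only accepts integers.

```python
import numpy as np

def lbp_histogram(lbp, n_points):
    """Return a normalized histogram with a fixed number of bins.

    Assumption: `lbp` holds codes in 0..n_points+1, the range expected
    from local_binary_pattern(image, n_points, radius, method="uniform").
    """
    n_bins = n_points + 2
    # np.bincount requires integer input, so cast the float codes first.
    codes = np.asarray(lbp).ravel().astype(np.int64)
    # minlength pads the bins of codes that never occur with zeros.
    counts = np.bincount(codes, minlength=n_bins)
    return counts / counts.sum()

# Hypothetical code maps standing in for the output of
# feature.local_binary_pattern(image, 24, 3, method="uniform").
lbp_a = np.array([[0.0, 1.0, 25.0], [24.0, 24.0, 3.0]])
lbp_b = np.array([[0.0, 0.0, 2.0], [2.0, 2.0, 2.0]])  # most codes absent

hist_a = lbp_histogram(lbp_a, 24)
hist_b = lbp_histogram(lbp_b, 24)
print(len(hist_a), len(hist_b))  # 26 26
```

Both histograms have the same length even though lbp_b contains only two distinct codes, so every row passed to fit and predict has the same number of features. If the codes could exceed n_points+1 (for example with a different method argument), n_bins would have to be adjusted accordingly.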