Image edge detection Keras model loss not improving
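I have a video of a water droplet. I took the first frame and manually labelled the edges. I split the image into smaller images. Then I tried to train a Keras model that maps the small unlabelled images to the small labelled images.
I tried using Dense layers. The model trains, but the loss does not improve. When I try to use the model, it only gives me a black image as output.
Labelled split images
Input image (frame 1)
Model summary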
#################### IMPORT AND SPLIT
from cam_img_split import cam_img_split
import cv2
import numpy as np
img_tr_in=cv2.imread('frame 1.png')
img_tr_out=cv2.imread('frame 1 so far.png')
seg_shape=[32,32]
tr_in=cam_img_split(img_tr_in,seg_shape)
tr_out=cam_img_split(img_tr_out,seg_shape)
pl=[4,20] #images selected for training
##################### NEURAL NETWORK
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import adam
b_sha=np.prod(tr_in.shape[2:5]) #batch shape
model = Sequential()
model.add(Dense(b_sha, activation='relu'))
model.add(Dense(3072, activation='softmax'))
model.add(Dense(3072, activation='softmax'))
model.add(Dense(3072, activation='softmax'))
model.add(Dense(np.prod(tr_out.shape[2:5]), activation='softmax'))
model.compile(optimizer=adam(lr=0.1), loss='mean_squared_error', metrics=['accuracy'])
tr_in_sel=tr_in[0:pl[0],0:pl[1],:,:,:]
tr_out_sel=tr_out[0:pl[0],0:pl[1],:,:,:]
tr_in_sel_flat=tr_in_sel.reshape([np.prod(pl),b_sha]) #Flattening
tr_out_sel_flat=tr_in_sel.reshape([np.prod(pl),b_sha])
tr_in_sel_flat_norm=tr_in_sel_flat/255
tr_out_sel_flat_norm=tr_out_sel_flat/255
model.fit(tr_in_sel_flat_norm, tr_out_sel_flat_norm, epochs=10, batch_size=pl[0])
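I expected the output to match the image with the labelled edges. Instead, I get a black image as output.
You are using the wrong loss/metric combination. Is your problem classification or regression? MSE is for regression, while categorical_crossentropy (or its sparse or binary variants) is for classification.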
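To make the point about matching the loss and output activation to the task a bit more concrete, here is a small NumPy-only sketch. It is not the asker's code: `softmax`, `sigmoid`, `binary_crossentropy` and `mse` are helper functions defined here purely for illustration, and the logits are random numbers. It shows two things: a softmax over 3072 outputs (the size of the last Dense layers in the question) forces all outputs to sum to 1, so every individual "pixel" is tiny and would render as nearly black, and binary cross-entropy penalises a confident wrong prediction far more strongly than MSE does.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy, clipped to avoid log(0)."""
    p = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))

def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))

rng = np.random.default_rng(0)
logits = rng.normal(size=3072)   # stand-in for one flattened 32x32x3 output

soft_out = softmax(logits)       # all 3072 values share a total of 1
sig_out = sigmoid(logits)        # each value is independently in (0, 1)

# One pixel whose true label is 1 but which is predicted as 0.01
y_true = np.array([1.0])
y_bad = np.array([0.01])
loss_mse = mse(y_true, y_bad)
loss_bce = binary_crossentropy(y_true, y_bad)

print("largest softmax output:", soft_out.max())
print("largest sigmoid output:", sig_out.max())
print("MSE:", loss_mse, "BCE:", loss_bce)
```

If the target is a per-pixel 0/1 label, a sigmoid output (or a two-class softmax per pixel) with a cross-entropy loss is the usual pairing, whereas a softmax across all pixels cannot produce more than one bright pixel.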
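I solved the problem by using a 7x7 section of the image to classify the centre pixel of that section as oil or water (1 or 0). I then trained the model with the binary_crossentropy loss function.
By sliding the 7x7 section across the main image one pixel at a time, I get far more training data than by just splitting the main image into tiles.
Previously I tried to predict a 7x7 image from another 7x7 image, which made the problem much harder.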
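The sliding-window idea described above can be sketched in plain NumPy as follows. This is an illustrative stand-in, not the helpers used in the answer: `centre_pixel_dataset` is a hypothetical function, zero padding via `np.pad` is an assumption about how the border is handled, the frame is synthetic, and `sliding_window_view` needs NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def centre_pixel_dataset(image, labels, size):
    """Return one (size x size) patch per pixel plus that pixel's label.

    image  : 2-D float array (grayscale frame)
    labels : 2-D array of 0/1 with the same shape as image
    size   : odd patch width, so every patch has a single centre pixel
    """
    if size % 2 == 0:
        raise ValueError("size must be odd")
    half = size // 2
    # Zero-pad the border so that edge pixels also get a full patch
    padded = np.pad(image, half, mode="constant")
    # Shape (H, W, size, size): the patch at [r, c] is centred on image[r, c]
    patches = sliding_window_view(padded, (size, size))
    x = patches.reshape(-1, size, size, 1)   # add a channel axis for Conv2D
    y = labels.reshape(-1)
    return x, y

# Synthetic stand-in for a frame: a bright disc on a dark background
rows, cols = np.mgrid[0:28, 0:28]
frame = ((rows - 14) ** 2 + (cols - 14) ** 2 < 64).astype(np.float64)
mask = frame.astype(np.uint8)

x, y = centre_pixel_dataset(frame, mask, size=7)
n_tiles = (28 // 7) ** 2    # non-overlapping 7x7 tiles of a 28x28 image
n_windows = x.shape[0]      # one sliding window per pixel

print("tiles:", n_tiles, "sliding windows:", n_windows)
```

On a 28x28 image this gives one training sample per pixel instead of one per tile, which is where the extra training data comes from.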
#IMPORT AND SPLIT
from cam_img_split import cam_img_split
from cam_pad import cam_pad
from cam_img_bow import cam_img_bow
import cv2
import numpy as np
img_tr_in=cv2.imread('frame 1.png',0)[0:767,0:767]/255
img_tr_out=cv2.imread('frame 1 so far bnw 2.png',0)[0:767,0:767]/255
img_tr_out=(cam_img_bow(img_tr_out,0.5)).astype(np.uint8)
seg_shape=[15,15] #needs to be odd and equal to each other
pl_max=img_tr_in.shape[0:2]
pl=np.array([0.15*pl_max[0],pl_max[1]]).astype(np.uint32)
pad_in=int(np.floor(seg_shape[0]/2))
img_tr_in_pad=cam_pad(img_tr_in,pad_in)
tr_in=np.zeros([pl[0],pl[1],seg_shape[0],seg_shape[1]])
for n1 in range(0,pl[0]):
    for n2 in range(0,pl[1]):
        tr_in[n1,n2]=img_tr_in_pad[n1:n1+seg_shape[0],n2:n2+seg_shape[1]]
##################### NEURAL NETWORK
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense,Dropout,Conv2D, MaxPooling2D, Flatten
from keras.optimizers import adam
from keras.utils import to_categorical
import matplotlib.pyplot as plt
pad=4
input_shape=(seg_shape[0]+2*pad,seg_shape[1]+2*pad,1)
output_shape=(1,1,1)
model = Sequential()
model.add(Conv2D(32, (3, 3),input_shape=input_shape, activation='relu'))
model.add(Conv2D(64,(3, 3), activation='relu'))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(units=2, activation='softmax'))
model.compile(optimizer=adam(lr=0.001), loss='binary_crossentropy', metrics=['accuracy'])
##################### FITTING THE MODEL
tr_in_flat=tr_in.reshape([pl[0]*pl[1],seg_shape[0],seg_shape[1],1])
tr_out_flat=img_tr_out.reshape([pl_max[0]*pl_max[1]])
tr_in_flat_pad=np.zeros(tr_in_flat.shape+np.array([0,2*pad,2*pad,0]))
for n3 in range(0,tr_in_flat.shape[0]):
    tr_in_flat_pad[n3,:,:,0]=cam_pad(tr_in_flat[n3,:,:,0], pad)
model.fit(tr_in_flat_pad, to_categorical(tr_out_flat[0:pl[0]*pl[1]]), epochs=5, batch_size=int(16*pl[0]),shuffle=True)
##################### PLOTTING PREDICTIONS
tr_in_full=np.zeros([pl_max[0],pl_max[1],seg_shape[0]+2*pad,seg_shape[1]+2*pad])
for n1 in range(0,pl_max[0]):
    for n2 in range(0,pl_max[1]):
        tr_in_full[n1,n2]=cam_pad(img_tr_in_pad[n1:n1+seg_shape[0],n2:n2+seg_shape[1]],pad)
tr_in_full_flat=tr_in_full.reshape([pl_max[0]*pl_max[1],seg_shape[0]+2*pad,seg_shape[1]+2*pad,1])
pred = model.predict(tr_in_full_flat)
pred_img=np.zeros(pred.shape[0])
for n1 in range(0,pred.shape[0]):
    pred_img[n1]=round(pred[n1,0])
pred_img_out=(pred_img.reshape([pl_max[0],pl_max[1]]))
plt.subplot(1,2,1)
plt.imshow(pred_img_out)
plt.subplot(1,2,2)
plt.imshow(img_tr_in)
plt.show()
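One note on reading the predictions: `to_categorical` puts class k in column k, so with a two-unit softmax, column 0 is the probability of label 0. Rounding `pred[:, 0]` therefore gives the inverse of the 0/1 label image. A possible alternative is to take the argmax over the class axis. The sketch below uses made-up probabilities rather than real model output, and `predictions_to_mask` is a hypothetical helper name.

```python
import numpy as np

def predictions_to_mask(pred, height, width):
    """Convert (height*width, n_classes) class probabilities into a label image."""
    if pred.shape[0] != height * width:
        raise ValueError("pred must have one row per pixel")
    # argmax picks the most probable class index for every pixel
    return np.argmax(pred, axis=1).reshape(height, width).astype(np.uint8)

# Made-up probabilities for a 2x3 image; columns are [P(class 0), P(class 1)]
fake_pred = np.array([
    [0.9, 0.1], [0.2, 0.8], [0.6, 0.4],
    [0.3, 0.7], [0.7, 0.3], [0.1, 0.9],
])

mask = predictions_to_mask(fake_pred, 2, 3)
# What rounding column 0 produces instead: the complement of the label image
inverted = np.round(fake_pred[:, 0]).reshape(2, 3)

print(mask)
print(inverted)
```

For a plot either version shows the same boundary, but the argmax form keeps the same 0/1 convention as the training labels and avoids the Python-level loop.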