Python: store a cv image in MongoDB GridFS

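For testing purposes, we want to store our labeled image data in a MongoDB database.

At some point in our image pipeline we have the labeled image as an OpenCV image, represented as a numpy ndarray.

How can we store the image? Since the images are fairly large, we are considering GridFS.

Our simple code so far: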

from pymongo import MongoClient
import gridfs
import cv2

# access our image collection
client = MongoClient('localhost', 27017)
db = client['testDatabaseONE']
testCollection = db['myImageCollection']

fs = gridfs.GridFS(db)

# read the image and convert it to RGB
image = cv2.imread('./testImage.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# store the image
imageID = fs.put(image)

# create our image meta data
meta = {
    'imageID': imageID,
    'name': 'testImage1'
}

# insert the meta data
testCollection.insert_one(meta)

Unfortunately, imageID = fs.put(image) throws this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/gridfs/grid_file.py", line 337, in write
    read = data.read
AttributeError: 'numpy.ndarray' object has no attribute 'read'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/johann/PycharmProjects/mongoTesting/mongoTesting.py", line 17, in <module>
    imageID = fs.put(image)
  File "/usr/local/lib/python3.6/dist-packages/gridfs/__init__.py", line 121, in put
    grid_file.write(data)
  File "/usr/local/lib/python3.6/dist-packages/gridfs/grid_file.py", line 341, in write
    raise TypeError("can only write strings or file-like objects")
TypeError: can only write strings or file-like objects

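Any hints or ideas on how to store the image using GridFS, or is there a better way?

The problem clearly has nothing to do with the image size. There are two exceptions, and the first one has to be resolved first.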

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/gridfs/grid_file.py", line 337, in write
    read = data.read
AttributeError: 'numpy.ndarray' object has no attribute 'read'

Look at line 337 of grid_file.py: numpy.ndarray has no method named read. To read data from such an array, you simply slice it, for example:

>>> import numpy as np
>>> def f(x, y):
...     return 10 * x + y
...
>>> b = np.fromfunction(f, (5, 4), dtype=int)
>>> b
array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])

>>> b[0:5, 1]  # each row in the second column of b
array([ 1, 11, 21, 31, 41])
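
One caveat: a slice is still an ndarray, so it cannot be handed to fs.put either. Judging from the traceback, the write call wants either a string or an object that has a read method. The sketch below needs no MongoDB connection; it only checks which representations of an array would meet that condition. The small array is made-up stand-in data, not a real image.

```python
import io
import numpy as np

# Made-up 2x3 RGB "image" standing in for the output of cv2.imread
image = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)

# A slice is still an ndarray, and ndarrays have no read() method
row = image[0]
print(type(row).__name__, hasattr(row, "read"))   # ndarray False

# Raw bytes: the pixel buffer only, without shape or dtype information
raw = image.tobytes()
print(type(raw).__name__, len(raw))               # bytes 18

# A file-like wrapper around the same bytes does have read()
stream = io.BytesIO(raw)
print(hasattr(stream, "read"))                    # True
```

Either raw or stream is the kind of object the error message asks for.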

I solved the problem by converting the ndarray to a string of bytes.

Store the new image and its metadata in the database:

# convert ndarray to bytes (tostring() is a deprecated alias of tobytes())
imageBytes = image.tobytes()

# store the image; bytes need no encoding argument
imageID = fs.put(imageBytes)

# create our image meta data
meta = {
    'name': 'myTestSet',
    'images': [
        {
            'imageID': imageID,
            'shape': image.shape,
            'dtype': str(image.dtype)
        }
    ]
}

# insert the meta data
testCollection.insert_one(meta)
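
The shape and dtype fields are stored because the bytes written to GridFS are a flat buffer that carries neither. A minimal sketch of what gets lost, using only NumPy and a made-up 16-bit array as hypothetical example data:

```python
import numpy as np

# Made-up 4x5 RGB image with 16-bit channels (hypothetical example data)
image = np.arange(60, dtype=np.uint16).reshape(4, 5, 3)
raw = image.tobytes()

# The byte string holds only the pixel values: 60 values * 2 bytes each
print(len(raw) == image.nbytes)          # True

# Decoded without metadata, the buffer is flat, and a wrong dtype guess
# even changes the number of elements
flat_wrong = np.frombuffer(raw, dtype=np.uint8)
print(flat_wrong.shape)                  # (120,)

# With the recorded dtype and shape the original array comes back
restored = np.frombuffer(raw, dtype=str(image.dtype)).reshape(image.shape)
print(np.array_equal(restored, image))   # True
```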

Retrieve the image:

import numpy as np

# get the image meta data
image = testCollection.find_one({'name': 'myTestSet'})['images'][0]

# get the image from gridfs
gOut = fs.get(image['imageID'])

# convert bytes to ndarray, using the dtype stored in the meta data
img = np.frombuffer(gOut.read(), dtype=image['dtype'])

# reshape to match the image size
img = np.reshape(img, image['shape'])
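
A possible alternative, not taken from the answers above, is NumPy's .npy format, which records shape and dtype in its own header so they need not be tracked separately. The sketch below serializes into an in-memory buffer; the helper names are hypothetical, and it assumes the resulting bytes would be passed to fs.put and later read back via fs.get, a step not exercised here.

```python
import io
import numpy as np

def array_to_npy_bytes(arr):
    """Serialize an ndarray to .npy bytes (hypothetical helper name)."""
    buf = io.BytesIO()
    np.save(buf, arr, allow_pickle=False)
    return buf.getvalue()

def npy_bytes_to_array(data):
    """Rebuild the ndarray from .npy bytes (hypothetical helper name)."""
    return np.load(io.BytesIO(data), allow_pickle=False)

# Made-up 2x4 RGB image standing in for real pipeline output
image = np.arange(24, dtype=np.uint8).reshape(2, 4, 3)

payload = array_to_npy_bytes(image)      # bytes that could go to fs.put
restored = npy_bytes_to_array(payload)   # e.g. bytes from fs.get(...).read()

print(restored.shape, restored.dtype)    # (2, 4, 3) uint8
print(np.array_equal(restored, image))   # True
```

The payload is slightly larger than the raw buffer because of the header, in exchange for a self-describing file.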