Python3 Key error when Key exists in dictionary of image files

Question

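I am using Python and os to build a key/value dictionary of the files in a directory, and OpenCV and Tesseract to preprocess the images and extract/print the text.
End goal: create a for loop that takes each image in the directory, appends the file name as a string to the grocery_cve_project path, processes each image, and extracts the text so it can be read in the console.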

import os
print('os imported')
    
# import packages
from PIL import Image
import pytesseract
import cv2
    
print('packages imported')
    
### Part 1: store image names in dictionary
    
dir_name = ".\grocery_cve_project"
# This is where we get our array
# of file names and store in results
result = os.listdir(dir_name)
    
key_index_store = {}
for i, e in enumerate(result):
    key_index_store[i] = e
    #print(i, e)
    
#print("Our key value store is: ")
#print(key_index_store)
    
#  The types of file names we care about.
photo_extensions = [".jpg", ".png"]

# declare the tesseract executable path
pytesseract.pytesseract.tesseract_cmd = 'C:\Program Files\Tesseract-OCR\tesseract.exe'

### Part 2: image processing

for e in key_index_store[e]:
    image_to_ocr = cv2.imread('grocery_cve_project_\%s' % 'e')
    print(image_to_ocr)
        
    # convert to gray
    preprocessed_img = cv2.cvtColor(image_to_ocr, cv2.COLOR_BGR2GRAY)
   
    # step 2: do binary and Otsu thresholding
    preprocessed_img = cv2.threshold(preprocessed_img, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    
    # step 3: Median Blur to remove noise in image
    preprocessed_img = cv2.medianBlur(preprocessed_img, 3)
    
    '''Step 4: SAVE AND LOAD IMAGE AS PIL image'''
    
    # step 1: Save the processed image to convert to PIL image
    for i in key_index_store[i]:
        cv2.imwrite(("tempdir\temp_img_%s.jpg" % 'i'), preprocessed_img)
        # step 2: load the image as a PIL/Pillow image
        preprocessed__pil_img = Image.open('temp_img.jpg')
    
    # step 1: do OCR of image using Tesseract
    text_extracted = pytesseract.image_to_string(preprocessed__pil_img)
    #Step 2: print the text
    print(text_extracted)

(Grocery_env) D:\Documents\Python\Multiple file array>"1. grocery tesseract.py"
    os imported
    packages imported
    Traceback (most recent call last):
      File "D:\Documents\Python\Multiple file array. grocery tesseract.py", line 44, in <module>
        for e in key_index_store[e]:
    KeyError: 'file_99.png'

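From what I have read, this error occurs when the requested item does not exist in the dictionary. However, if I run the code with the print(i, e) on line 21 uncommented, it prints the key/value pairs for every file in the directory, and 'file_99' does exist at index 236 and really is in the given directory.
The directory of image files is in the same folder as the source code.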

Answer 1

In part one, you populate the dictionary using numeric indices:

key_index_store = {}
for i, e in enumerate(result):
    key_index_store[i] = e

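This is somewhat redundant, since your result list is already indexed by number. Then, in part two, you iterate over key_index_store[e]. At that point e still holds the last file name left over from the first loop ('file_99.png'), while the dictionary's keys are the integers from enumerate, so the lookup raises KeyError. This is most likely a mistake; just remove the [e].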
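
To illustrate why the lookup fails even though the file shows up in the printed listing, here is a minimal sketch. The three file names are placeholders, not the asker's actual directory. It also shows one caveat: iterating a dict directly yields its keys (here the integers), so to get the file names you would iterate over .values() or .items(), or simply over the original list.

```python
# Hypothetical directory listing standing in for os.listdir(dir_name).
result = ["file_1.jpg", "file_2.jpg", "file_99.png"]

# Same construction as in the question: integer keys, file-name values.
key_index_store = {}
for i, e in enumerate(result):
    key_index_store[i] = e

# After the loop, e is still bound to the last file name.
# The dict has keys 0, 1, 2, so using a file name as the key fails.
try:
    key_index_store[e]
    missing_key = None
except KeyError as err:
    missing_key = err.args[0]

# Iterating the dict itself yields the integer keys...
keys_seen = list(key_index_store)

# ...while .values() yields the file names, and .items() yields both.
names_from_dict = list(key_index_store.values())
pairs = list(key_index_store.items())

print("KeyError for:", missing_key)
print("keys:", keys_seen)
print("values:", names_from_dict)
```

The file name is a value in the dictionary, not a key, which is why it appears in the printed pairs and still triggers the error.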

Answer 2

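If I understand your code correctly, I think you may be a little confused about how to get key/value pairs out of a dictionary. In this case, though, a dict isn't even needed.

You can do all of this in a single loop: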

for idx, filename in enumerate(result):
    image_to_ocr = cv2.imread(os.path.join(dir_name, filename))
    # ... your image processing code ...
    out_filename = os.path.join("tempdir", f"temp_img_{idx}.jpg")
    cv2.imwrite(out_filename, preprocessed_img)
    preprocessed_pil_img = Image.open(out_filename)
    # ... the rest ...
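
As a side note on why os.path.join and f-strings are used above, the sketch below shows two pitfalls in the question's path strings: "\t" inside an ordinary string literal is a tab character, and % 'i' substitutes the literal letter "i" rather than the loop variable. The helpers image_paths and temp_name are hypothetical names introduced only for illustration. The extension filter is an assumption based on the photo_extensions list that the question defines but never uses.

```python
import os

# The question's pattern: "\t" becomes a TAB, and 'i' is a literal letter,
# so every iteration would produce the same (malformed) file name.
broken = "tempdir\temp_img_%s.jpg" % 'i'


def image_paths(dir_name, extensions=(".jpg", ".png")):
    """Return full paths of files in dir_name with a matching extension.

    Hypothetical helper; the comparison is case-insensitive.
    """
    paths = []
    for filename in sorted(os.listdir(dir_name)):
        ext = os.path.splitext(filename)[1].lower()
        if ext in extensions:
            # os.path.join inserts the right separator, no backslash escapes.
            paths.append(os.path.join(dir_name, filename))
    return paths


def temp_name(idx, out_dir="tempdir"):
    """Build a distinct temp file path per image index (hypothetical helper)."""
    return os.path.join(out_dir, f"temp_img_{idx}.jpg")


print(repr(broken))   # repr makes the embedded tab visible as \t
print(temp_name(0))
```

With helpers like these, the loop could iterate over enumerate(image_paths(dir_name)) and pass each path straight to cv2.imread, skipping files that are not images.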
