PyTesseract works great with this code, but not my code (with minimal differences)

I'm trying to set up PyTesseract OCR on my desktop using this tutorial. It works when I run that script, as you can see in this image:


The code from the tutorial:

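# Imports the snippet needs (not shown in the excerpt)
from pytesseract import Output
import pytesseract
import argparse
import cv2

# Construct the argument parser and parse the arguments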
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True, help="path to input image to be OCR'd")
        # '--image' refers to the path of the input image that will be OCR'd

ap.add_argument("-c", "--min-conf", type=int, default=0, help="min conf value to filter weak text detection")
        # sets a min conf to filter weak detections
args = vars(ap.parse_args())


#Load input image, convert from BGR to RGB ch ordering, and
# use Tesseract to localize each area of text in the input image
image = cv2.imread(args["image"] )
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
results = pytesseract.image_to_data(rgb, output_type=Output.DICT)
    # 'image_to_data' detects and localizes text 


#Loop over each indiv text localizations
for i in range(0,  len(results["text"] )  ):
    #extract bounding box coordinates of the text region from the current result
    x = results["left"][i]
    y = results["top"][i]
    w = results["width"][i]
    h = results["height"][i]

    #extract OCR itself along with conf of text localztn
    text = results["text"][i]
    print(results["conf"][i])
    conf = int( results["conf"][i] )


    # Filter out weak-confidence text localizations
    if conf > args["min_conf"]:

        #display conf and text to terminal
        print("Confidence: {}".format(conf) )
        print("Text: {}".format(text) )
        print("")

        #remove non-ASCII text so we can draw text on image using OpenCV, then draw bounding box around text with text itself
        text = "".join( [c if ord(c) < 128 else "" for c in text] ).strip()
        cv2.rectangle(image,  (x,y),  (x+w, y+h),  (0, 255, 0), 2 )
        cv2.putText(image, text, (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 0, 255), 3)

# Show the output image once all boxes have been drawn
cv2.imshow("Image", image)
cv2.waitKey(0)      # wait for a key press before continuing

But when I try to implement it in another project, it doesn't work. Here is my code:

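# Imports the snippet needs (not shown in the excerpt)
import cv2
import pyautogui
import pytesseract
from pytesseract import Output

# Take a screenshot of the desktop and save it to disk
screenshotOfDesktop = pyautogui.screenshot('screenshotOfDesktop.png')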

# Load the saved screenshot with OpenCV
readDesktop_SAP = cv2.imread('screenshotOfDesktop.png')

# Convert from BGR to RGB channel ordering
rgb = cv2.cvtColor(readDesktop_SAP, cv2.COLOR_BGR2RGB)

results = pytesseract.image_to_data(rgb, config='--psm 7', output_type=Output.DICT)
    # "config= '--psm 7' " makes it so that PyTesseract reads everything as a single line of text
print(results)

# Iterating through the list of results
for i in range(0,  len(results["text"] )  ):
    if "Description" not in results["text"]:

        print("Didn't find description on screen. Please check that the SAP 'find document' page is open on the screen. ")
        input('Press ENTER to exit now. ')
        exit()
    
    if "Description" in results["text"]:
        print("Found 'Description' on screen! ")

        # Gating by confidence (Tesseract reports conf on a 0-100 scale, -1 for non-word rows)
        conf = int(results["conf"][i])
        if conf < 0.2:
            print("Confidence is less than 0.2. Moving on. ")
            continue

        elif conf >= 0.2:
            # Getting the coordinates of the result
            Desc_x = results["left"][i]
            Desc_y = results["top"][i]
            Desc_w = results["width"][i]
            Desc_h = results["height"][i]
            # Printing everything
            print("The coordinates are: ")
            print(Desc_x, Desc_y, Desc_w, Desc_h)
            print(f"Confidence = {conf}") 
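For comparison, here is a minimal sketch of the lookup the loop above seems to be aiming for: check each row once, compare the confidence on Tesseract's 0-100 scale, and return the box of the first match. The function name `find_word`, the default threshold of 20, and the `sample` data are all hypothetical and only illustrate the shape of the `image_to_data` dict.

```python
def find_word(results, target, min_conf=20):
    """Return (left, top, width, height, conf) of the first match, or None.

    `results` is assumed to be shaped like the dict returned by
    pytesseract.image_to_data(..., output_type=Output.DICT).
    `min_conf` is a hypothetical threshold on the 0-100 scale.
    """
    for i, text in enumerate(results["text"]):
        if text.strip() != target:
            continue
        # conf can arrive as a str, int, or float depending on the row
        conf = int(float(results["conf"][i]))
        if conf < min_conf:
            continue
        return (results["left"][i], results["top"][i],
                results["width"][i], results["height"][i], conf)
    return None


# Hypothetical data for illustration only, not real OCR output
sample = {
    "left": [0, 40, 200],
    "top": [0, 12, 12],
    "width": [1920, 150, 90],
    "height": [1080, 22, 22],
    "conf": ["-1", 91, 35],
    "text": ["", "Description", "Number"],
}

print(find_word(sample, "Description"))           # (40, 12, 150, 22, 91)
print(find_word(sample, "Number", min_conf=50))   # None
```

Returning `None` instead of calling `exit()` inside the loop leaves it to the caller to decide what happens when the word is not on screen.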

Instead, my code only spits out this for "results":

{'level': [1, 2, 3, 4, 5, 5], 'page_num': [1, 1, 1, 1, 1, 1], 'block_num': [0, 1, 1, 1, 1, 1], 'par_num': [0, 0, 1, 1, 1, 1], 'line_num': [0, 0, 0, 1, 1, 1], 'word_num': [0, 0, 0, 0, 1, 2], 'left': [0, 0, 0, 0, 0, 1451], 'top': [0, 4, 4, 4, 4, 145], 'width': [1920, 1727, 1912, 1727, 891, 276], 'height': [1080, 1061, 1070, 1061, 1061, 8], 'conf': ['-1', '-1', '-1', '-1', 11, 0], 'text': ['', '', '', '', 'fe', '~']}
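To make that output easier to read, the sketch below pairs each entry of `text` with its `conf` and keeps only the rows Tesseract treated as words (rows with a conf of -1 are page, block, paragraph, and line containers). Only the two keys it needs are copied from the output above; the variable name `recognized` is just for illustration.

```python
# 'conf' and 'text' copied from the output shown above
results = {
    "conf": ["-1", "-1", "-1", "-1", 11, 0],
    "text": ["", "", "", "", "fe", "~"],
}

# Keep only word-level rows: conf is -1 for structural (non-word) rows
recognized = [
    (text, int(float(conf)))
    for text, conf in zip(results["text"], results["conf"])
    if int(float(conf)) >= 0 and text.strip()
]

print(recognized)                          # [('fe', 11), ('~', 0)]
print("Description" in results["text"])    # False
```

So the OCR pass found just two low-confidence fragments on the whole screenshot, and "Description" is not among them.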

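Does anyone know why this is happening? I know I'm not using argparse like the author did, but it should give the same result, shouldn't it? I also checked to make sure it's looking at the correct screenshot.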

Relevant info:

  1. Tesseract v4.1.0.20190314
  2. Python 3.9.2

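I hadn't realized that the tutorial code converted the image to grayscale before running OCR with PyTesseract. I implemented the grayscale conversion, and after that it was able to find the text.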
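A minimal sketch of that change, assuming OpenCV and pytesseract are installed and the screenshot has already been saved to disk. The function name `ocr_grayscale` and its `config` parameter are hypothetical. The imports live inside the function so the snippet can be pasted into a larger script without side effects.

```python
def ocr_grayscale(path, config=""):
    """Load an image, convert it to grayscale, and run image_to_data on it.

    Hypothetical helper: `path` is the screenshot file, and `config` is
    passed straight through to pytesseract (empty string by default).
    """
    import cv2
    import pytesseract
    from pytesseract import Output

    image = cv2.imread(path)
    if image is None:
        # cv2.imread returns None instead of raising when the file can't be read
        raise FileNotFoundError(path)

    # OpenCV loads images in BGR order, so convert from BGR to a single gray channel
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return pytesseract.image_to_data(gray, config=config, output_type=Output.DICT)
```

It could be called as `results = ocr_grayscale('screenshotOfDesktop.png')`, after which `results["text"]` can be searched as before.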