How to detect an image and click it with pyautogui?

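I want to learn how to make a bot click on an image. I tried following a YouTube tutorial, but I can't find the error in my code because this is my first time using Python. I tried the following code: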

from pyautogui import *
import pyautogui
import time
import keyboard
import random
import win32api, win32con

time.sleep(5)

def click():
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTDOWN,0,0)
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTUP,0,0)

while keyboard.is_pressed('q') == False:
    flag = 0
    
    if pyautogui.locateOnScreen('benz.png', region=(0,0,1366,768), grayscale=True, confidence=0.5) != None:
        flag = 1
        click()
        time.sleep(0.05)
        break

But I keep getting:

Traceback (most recent call last):
  File "c:\Program Files\Karim\autoclicker\main+stickman.py", line 17, in <module>
    if pyautogui.locateOnScreen('benz.png', region=(0,0,1366,768), grayscale=True, confidence=0.5) != None:
  File "C:\Users\bayan\AppData\Local\Programs\Python\Python310\lib\site-packages\pyautogui\__init__.py", line 175, in wrapper
    return wrappedFunction(*args, **kwargs)
  File "C:\Users\bayan\AppData\Local\Programs\Python\Python310\lib\site-packages\pyautogui\__init__.py", line 213, in locateOnScreen
    return pyscreeze.locateOnScreen(*args, **kwargs)
  File "C:\Users\bayan\AppData\Local\Programs\Python\Python310\lib\site-packages\pyscreeze\__init__.py", line 373, in locateOnScreen
    retVal = locate(image, screenshotIm, **kwargs)
  File "C:\Users\bayan\AppData\Local\Programs\Python\Python310\lib\site-packages\pyscreeze\__init__.py", line 353, in locate
    points = tuple(locateAll(needleImage, haystackImage, **kwargs))
  File "C:\Users\bayan\AppData\Local\Programs\Python\Python310\lib\site-packages\pyscreeze\__init__.py", line 207, in _locateAll_opencv
    needleImage = _load_cv2(needleImage, grayscale)
  File "C:\Users\bayan\AppData\Local\Programs\Python\Python310\lib\site-packages\pyscreeze\__init__.py", line 170, in _load_cv2
    raise IOError("Failed to read %s because file is missing, "
OSError: Failed to read benz.png because file is missing, has improper permissions, or is an unsupported or invalid format

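Note: benz.png is in the same folder as the code, it is in PNG format, and it is a real image (double-clicking it opens the picture).

There may be a silly mistake in the code that I'm not aware of, since I know almost nothing about Python.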

This may be a permissions issue, for example if pyautogui is running in multiple script instances and cannot access the right file.

In any case, you can work around it by reading the file directly, for example:

from cv2 import imread
image = imread('benz.png')
if pyautogui.locateOnScreen(image,... # and so on 

I changed:

if pyautogui.locateOnScreen('benz.png', region=(0,0,1366,768), grayscale=True, confidence=0.5) != None:

to:

if pyautogui.locateOnScreen('C:/Program Files/Karim/Others/benz.png', region=(0,0,1366,768), grayscale=True, confidence=0.5) != None:

and it worked.
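The absolute path working suggests that the bare name 'benz.png' was not resolving to the folder where the file actually lives: a relative path is looked up from the current working directory, which is not necessarily the script's folder. Below is a minimal sketch of building the path from the script's own location instead. The helper names image_path and require_file are made up for this illustration and are not part of pyautogui.

```python
from pathlib import Path


def image_path(name, base_dir=None):
    """Return the path to `name` inside `base_dir` (default: this script's folder)."""
    if base_dir is None:
        # __file__ is the running script; its parent is the folder it lives in.
        base_dir = Path(__file__).resolve().parent
    return Path(base_dir) / name


def require_file(path):
    """Return `path` as a string, or raise FileNotFoundError if it is not a file."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    return str(path)
```

With these helpers, the first argument in the question's call could be written as require_file(image_path('benz.png')), so a missing file is reported with its full path before pyautogui gets involved.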

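PyAutoGUI has a built-in function called locateOnScreen() that returns the location of the image as a (left, top, width, height) box if it can find it on the current screen (it takes a screenshot and then analyzes it). Passing that box to pyautogui.click() clicks its center; locateCenterOnScreen() returns the center x, y coordinates directly.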
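For reference, the value locateOnScreen() returns can be indexed as (left, top, width, height), so the point at the middle of the match has to be computed from it. The sketch below shows only that arithmetic; box_center is a hypothetical helper written for this illustration, not PyAutoGUI's own code.

```python
def box_center(box):
    """Return the integer (x, y) center of a (left, top, width, height) box."""
    left, top, width, height = box
    return (left + width // 2, top + height // 2)


# The Windows 10 logo region used later in this answer.
print(box_center((0, 1041, 50, 39)))  # (25, 1060)
```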

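The image must match exactly for this to work; that is, if you want to click button.png, that picture of the button must have exactly the same size/resolution as the button on your screen for the program to recognize it. One way to achieve this is to take a screenshot, open it in Paint, and crop out only the button you want to press (or you can have PyAutoGUI do it for you, which I'll show in a later example).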

import pyautogui

question_list = ['greencircle', 'redcircle', 'bluesquare', 'redtriangle']

user_input = input('Where should I click? ')

while user_input not in question_list:
    print('Incorrect input, available options: greencircle, redcircle, bluesquare, redtriangle')
    user_input = input('Where should I click? ')

location = pyautogui.locateOnScreen(user_input + '.png')
pyautogui.click(location)

The example above requires that greencircle.png and all the other .png files already exist in your directory.

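PyAutoGUI can also take screenshots, and you can specify which region of the screen to capture: pyautogui.screenshot(region=(0, 0, 0, 0)). The first two values are the x, y coordinates of the top-left corner of the region you want to select, the third is how far to extend to the right (x), and the fourth is how far to extend down (y).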
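Because the last two values are a width and a height rather than a second corner, it is easy to pass the wrong numbers when you have measured two corner points on screen. Here is a small sketch of the conversion; region_from_corners is a hypothetical helper for illustration, not a PyAutoGUI function.

```python
def region_from_corners(left, top, right, bottom):
    """Convert two corner points into a (left, top, width, height) region tuple."""
    if right <= left or bottom <= top:
        raise ValueError("right/bottom must be greater than left/top")
    return (left, top, right - left, bottom - top)


# Corners (0, 1041) and (50, 1080) give the region (0, 1041, 50, 39).
region = region_from_corners(0, 1041, 50, 1080)
```

The resulting tuple can then be passed as the region argument, for example pyautogui.screenshot(region=region).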

The following example takes a screenshot of the Windows 10 logo, saves it to a file, and then clicks the logo using that .png file.

import pyautogui

pyautogui.screenshot('win10_logo.png', region=(0, 1041, 50, 39))
location = pyautogui.locateOnScreen('win10_logo.png')
pyautogui.click(location)

You also don't have to save the screenshot to a file; you can keep it in a variable.

import pyautogui

win10 = pyautogui.screenshot(region=(0, 1041, 50, 39))
location = pyautogui.locateOnScreen(win10)
pyautogui.click(location)

Having the program detect whether the user clicked a certain area (say, the Windows 10 logo) requires another library, such as pynput.

from pynput.mouse import Button, Listener

def on_click(x, y, button, pressed):
    # Region of the Windows 10 logo: x in (0, 50), y in (1041, 1080)
    if 0 < x < 50 and 1041 < y < 1080 and button == Button.left and pressed:
        print('You clicked on Windows 10 Logo')
        return False    # remove this return statement if you want a continuous loop

with Listener(on_click=on_click) as listener:
    listener.join()
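The hard-coded numbers in the condition above describe a region whose top-left corner is at (0, 1041) and which is 50 pixels wide and 39 pixels high. Here is a sketch of the same check written against a (left, top, width, height) tuple, so it can be reused for any region. point_in_box is a hypothetical helper, and it keeps the strict inequalities of the callback, so points exactly on the edge do not count.

```python
def point_in_box(x, y, box):
    """Return True if (x, y) lies strictly inside a (left, top, width, height) box."""
    left, top, width, height = box
    return left < x < left + width and top < y < top + height


logo_box = (0, 1041, 50, 39)
print(point_in_box(25, 1060, logo_box))  # True
print(point_in_box(60, 1060, logo_box))  # False
```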

Putting it all together

import pyautogui
from pynput.mouse import Button, Listener

win10 = pyautogui.screenshot(region=(0, 1041, 50, 39))
location = pyautogui.locateOnScreen(win10)

# location[0] is the top left x coord
# location[1] is the top left y coord
# location[2] is the width (distance from left x coord to right x coord)
# location[3] is the height (distance from top y coord to bottom y coord)

x_boundary_left = location[0]
y_boundary_top = location[1]
x_boundary_right = location[0] + location[2]
y_boundary_bottom = location[1] + location[3]


def on_click(x, y, button, pressed):
    inside_x = x_boundary_left < x < x_boundary_right
    inside_y = y_boundary_top < y < y_boundary_bottom
    if inside_x and inside_y and button == Button.left and pressed:
        print('You clicked on Windows 10 Logo')
        return False    # remove this return statement if you want a continuous loop


with Listener(on_click=on_click) as listener:
    listener.join()