Wait for an element to be displayed and not displayed, using Selenium, Python and WhatsApp Web

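I am trying to use "whatsapp-web", "selenium" and "python 3" to find out when a WhatsApp user comes online or goes offline.

To explain further, this is how I want the script to work:

The script listens for a span (title=online) to be displayed. When the span is displayed (which means the user came online), I want to print the current time. The script then keeps listening for the span to disappear; when it disappears, the script prints the time of disappearance, and so on.

This is my code: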

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import datetime

driver = webdriver.Chrome('C:/webdrivers/chromedriver.exe')
driver.get('https://web.whatsapp.com/')

# do nothing until QR code scanned and whatsapp-web is accessed
input('Enter anything after scanning QR code')

# Input the name of the user to track
name = input('Enter the name of the user : ')

# find the whatsapp user to be tracked then a click to enter the conversation
user = driver.find_element_by_xpath("//span[@title = '{}']".format(name))
user.click()

while True:
   # in the conversation page, a span with title online is displayed when the user is online.
   # the web driver will wait 8hrs=28800s; if the user does not come online in that time, WebDriverWait raises a TimeoutException and the script stops
   element = WebDriverWait(driver, 28800).until(
      EC.visibility_of_element_located(
         (By.XPATH, "//span[@title = 'online']")))

   #Moment the user came online
   now = datetime.datetime.now()
   print("online at : ")
   print(now.strftime("%H:%M:%S"))

   element = WebDriverWait(driver, 28800).until(
      EC.invisibility_of_element_located(
         (By.XPATH, "//span[@title = 'online']")))

   #Moment the user went offline
   now = datetime.datetime.now()
   print("offline at : ")
   print(now.strftime("%H:%M:%S"))
   print("************")

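My script works, but I want it to run for several hours, say 8 hours or more, and I have read that it is bad practice to use WebDriverWait with a high number of seconds (28800s in my case).

So is there a better way to achieve this?

How can I write the output to a txt or Word file?

Any suggestions for improving my code?

How do I prevent CPU slamming, or any other problem that might occur?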

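WebDriverWait is nothing more than a (quite) fancy while/catch/sleep loop. In your particular case you may want to replicate it yourself for one simple reason: it polls every 500 ms, which is probably too fine-grained for this task. It also keeps you away from finer control.

Here is how to do the logic yourself: keep a boolean variable for whether the user is online; depending on its value, check whether the element is visible (.is_displayed()), sleep for X amount of time, and repeat. The exceptions NoSuchElementException and StaleElementReferenceException count as the user being offline, i.e. the boolean being false.

In the end your code will be very close to the logic inside WebDriverWait, but it is still your own code and is more flexible if you need it to be.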
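
The do-it-yourself loop described above could look roughly like the sketch below. It is only an illustration under some assumptions: the names make_online_probe and watch_status, the 5-second default interval and the max_polls parameter are hypothetical and are not part of Selenium or of the original script. Only find_element, is_displayed and the two exception classes come from Selenium.

```python
import datetime
import time


def make_online_probe(driver, xpath="//span[@title = 'online']"):
    """Return a zero-argument callable that reports whether the span is visible.

    Hypothetical helper. Selenium is imported lazily, so the polling loop
    below has no hard dependency on it.
    """
    from selenium.common.exceptions import (
        NoSuchElementException,
        StaleElementReferenceException,
    )
    from selenium.webdriver.common.by import By

    def probe():
        try:
            return driver.find_element(By.XPATH, xpath).is_displayed()
        except (NoSuchElementException, StaleElementReferenceException):
            # a missing or stale element counts as "offline"
            return False

    return probe


def watch_status(probe, poll_seconds=5, max_polls=None,
                 sleep=time.sleep, report=print):
    """Poll `probe` and report every change of state.

    Returns the list of (status, "HH:MM:SS") events that were seen.
    With max_polls=None the loop runs until the script is interrupted.
    """
    user_online = False
    events = []
    polls = 0
    while max_polls is None or polls < max_polls:
        current = probe()
        if current != user_online:
            # the status flipped since the last check
            user_online = current
            stamp = datetime.datetime.now().strftime("%H:%M:%S")
            status = "online" if user_online else "offline"
            events.append((status, stamp))
            report("{} at : {}".format(status, stamp))
        polls += 1
        sleep(poll_seconds)
    return events
```

With a real browser session this would be called as watch_status(make_online_probe(driver)). The sleep and report parameters are there only so that the interval and the output target can be swapped without touching the loop.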


Alternatively, just pass a larger polling interval to WebDriverWait in your current code; it is the poll_frequency argument of the call :)

WebDriverWait(driver, 28800, 5)  # the value is in seconds

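I don't know where you read that using WebDriverWait with a high number of seconds is bad practice; as you can see in its code, the timeout is just how much time the method is given to run.
I think the advice was meant along the lines of "it is a bad practice to use WebDriverWait with a high number of seconds, because if the condition is not fulfilled in X seconds, it won't be ever fulfilled and your code will just spin and spin." That is actually the behavior you want here :)

I also wouldn't worry about taxing the CPU; these checks are very lightweight and do no harm. With a runtime this long, what I would worry about is memory leaks in the browser itself ;)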


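As for optimizing the code: what I would do is cut down on the repeated statements; the downside is that it becomes slightly less readable. My take on the loop: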

user_online = False

while True:
    # we'll be checking for the reverse of the last status of the user
    check_method = EC.visibility_of_element_located if not user_online else EC.invisibility_of_element_located

    # in the conversation page, a span with title online is displayed when the user is online.
    # the web driver will wait 8hrs=28800s for the user status to change;
    # WebDriverWait will stop the script with a TimeoutException if that doesn't happen
    element = WebDriverWait(driver, 28800, 5).until(
            check_method((By.XPATH, "//span[@title = 'online']")))

    # The moment the user changed status
    now = datetime.datetime.now().strftime("%H:%M:%S")
    print("{} at : {}".format('online' if not user_online else 'offline', now))   # if you're using python v3.6 or more, the fstrings are much more convenient for this
    print("************")

    user_online = not user_online   # switch, to wait for the other status in the next cycle

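Finally, code-wise: the script cannot be left running "endlessly". Why? Because if the user does not change status within 8 hours, WebDriverWait will stop with a TimeoutException. To rescue that, wrap the loop body in try/except: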

from selenium.common.exceptions import TimeoutException  # put this in the beginning of the file

while True:
    try:
        ...  # the code from above goes here
    except TimeoutException:
        # the status did not change, repeat the cycle
        pass
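
The same retry idea can be sketched without any Selenium dependency, which makes it easy to try out. This is only an illustration: wait_until_change and its max_attempts parameter are hypothetical names, and the exception class is passed in, so with Selenium you would hand it TimeoutException.

```python
def wait_until_change(wait_once, timeout_exception, max_attempts=None):
    """Call wait_once() until it returns without raising timeout_exception.

    Returns whatever wait_once() returned, or None if max_attempts
    was reached first. With max_attempts=None it retries forever.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            return wait_once()
        except timeout_exception:
            # the status did not change within one wait; repeat the cycle
            continue
    return None
```

Inside the loop from above, wait_once would be something like lambda: WebDriverWait(driver, 28800, 5).until(check_method((By.XPATH, "//span[@title = 'online']"))), with TimeoutException as the second argument.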

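Writing to a file

You may need to read a bit on how to do that; it is a very simple operation.

Here is an example: open a file for appending (so that previous logs are kept), wrapping the while loop: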

with open("usermonitor.log", "a") as myfile:
    while True:
        # the other code is not repeated for brevity
        # ...
        output = "{} at : {}".format('online' if not user_online else 'offline', now)
        print(output)
        myfile.write(output + "\n")  # this will write (append as the last line) the same text in the file
        # write() does not append newlines by itself - you have to do it yourself
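
A possible variation, since the script is meant to run for many hours: open the file in append mode for each event instead of once around the whole loop, so every line is closed out to disk even if the script is killed later. This is just a sketch; log_status and its parameters are hypothetical names, and the line format simply mirrors the one used above.

```python
import datetime


def log_status(path, status, when=None):
    """Append one 'status at : HH:MM:SS' line to `path` and return that line."""
    if when is None:
        when = datetime.datetime.now()
    line = "{} at : {}".format(status, when.strftime("%H:%M:%S"))
    # "a" appends to the file and creates it if it does not exist yet;
    # leaving the with-block closes the file, which flushes the line
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(line + "\n")
    return line
```

In the loop it would replace the write() call, for example print(log_status("usermonitor.log", "online")).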

One thing I should point out: with your program you need to scan the WhatsApp QR code every time you run it. To avoid that, just replace this line

driver = webdriver.Chrome('C:/webdrivers/chromedriver.exe')

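with this


options = webdriver.ChromeOptions()
options.add_argument(r"user-data-dir=C:\Users\<username>\AppData\Local\Google\Chrome\User Data\whtsap")  # raw string keeps the backslashes literal
driver = webdriver.Chrome('C:/webdrivers/chromedriver.exe', options=options)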

This way you will still need to scan the QR code, but only once.