Wait for an element to be displayed and not displayed, using Selenium, Python and whatsapp-web
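I'm trying to use "whatsapp-web", "selenium" and "python 3" to find out when a WhatsApp user comes online or goes offline.
To explain more, this is how I want the script to work:
The script listens for a span (with title=online) to be displayed. When the span is displayed (meaning the user has come online), I want to print the time at that moment. The script then keeps listening for the span to disappear, and when it does, prints the time of disappearance, and so on.
This is my code: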
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import datetime

driver = webdriver.Chrome('C:/webdrivers/chromedriver.exe')
driver.get('https://web.whatsapp.com/')

# do nothing until the QR code is scanned and whatsapp-web is accessed
input('Enter anything after scanning QR code')

# input the name of the user to track
name = input('Enter the name of the user : ')

# find the whatsapp user to be tracked, then click to enter the conversation
user = driver.find_element_by_xpath("//span[@title = '{}']".format(name))
user.click()

while True:
    # in the conversation page, a span with title "online" is displayed when the user is online.
    # the web driver will wait 8 hrs = 28800 s; if the user is not online in all this time,
    # the script will be killed by WebDriverWait
    element = WebDriverWait(driver, 28800).until(
        EC.visibility_of_element_located(
            (By.XPATH, "//span[@title = 'online']")))

    # moment the user came online
    now = datetime.datetime.now()
    print("online at : ")
    print(now.strftime("%H:%M:%S"))

    element = WebDriverWait(driver, 28800).until(
        EC.invisibility_of_element_located(
            (By.XPATH, "//span[@title = 'online']")))

    # moment the user went offline
    now = datetime.datetime.now()
    print("offline at : ")
    print(now.strftime("%H:%M:%S"))
    print("************")
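My script works, but I want it to run for hours, say 8 hours or more, and I have read that it is bad practice to use WebDriverWait with a high number of seconds (28800 s in my case).
So is there a better way to achieve this?
How can I write the output to a txt or Word file?
Any suggestions for improving my code?
How can I prevent CPU slamming, or any other problem that might occur?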
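WebDriverWait is nothing more than a (quite) fancy while/catch/sleep loop. In your particular case you may want to replicate it yourself for one simple reason: it polls every 500 ms, which is probably too fine-grained for this task. It also shields you from finer-grained control.
Here's how to do the logic yourself: keep a boolean variable for whether the user is online or not; depending on its value, check whether the element is visible (.is_displayed()), sleep for X seconds, and repeat. The exceptions NoSuchElementException and StaleElementReferenceException count as the user being offline, i.e. the boolean being false.
In the end your code will be very close to the logic in WebDriverWait, but it is still your code, and more flexible if needed.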
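The hand-rolled loop described above could look roughly like the sketch below. The names is_online and poll_status, and the max_polls and sleep parameters, are made up for illustration; the XPath is the one from the question. The Selenium imports are placed inside is_online only so the polling logic can be tried without a browser.

```python
import time
from datetime import datetime


def is_online(driver, xpath="//span[@title = 'online']"):
    """Return True if the 'online' span is currently displayed."""
    # Imported here so poll_status() below can be exercised without Selenium.
    from selenium.common.exceptions import (
        NoSuchElementException,
        StaleElementReferenceException,
    )
    from selenium.webdriver.common.by import By

    try:
        return driver.find_element(By.XPATH, xpath).is_displayed()
    except (NoSuchElementException, StaleElementReferenceException):
        # A missing or stale element counts as "offline".
        return False


def poll_status(check, interval=5.0, max_polls=None, sleep=time.sleep):
    """Yield (status, timestamp) every time check() flips between True and False.

    check     -- zero-argument callable returning the current online state
    interval  -- seconds to sleep between checks
    max_polls -- stop after this many checks (None means run until interrupted)
    """
    user_online = False  # assume the user starts offline
    polls = 0
    while max_polls is None or polls < max_polls:
        current = bool(check())
        if current != user_online:
            user_online = current
            yield ("online" if current else "offline", datetime.now())
        polls += 1
        sleep(interval)
```

With a real driver, the loop might be consumed as: for status, when in poll_status(lambda: is_online(driver)): print(status, when.strftime("%H:%M:%S")).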
Alternatively, just pass a bigger polling interval to WebDriverWait in your current code; it is the poll_frequency parameter of the call :)
WebDriverWait(driver, 28800, 5) # the value is in seconds
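I don't know where you read that using WebDriverWait with a high number of seconds is bad practice; as you can see in its code, the timeout is simply how long the method is given to run.
I think the advice was along the lines of "it is a bad practice to use WebDriverWait with a high number of seconds, because if the condition is not fulfilled in X seconds, it won't be ever fulfilled and your code will just spin and spin." That is actually the behavior you want here :)
I also wouldn't worry about taxing the CPU: these checks are very lightweight and do no harm. For such a long runtime, what I would worry about is memory leaks in the browser itself ;)
As for optimizing the code, what I would do is cut down the statement repetition; the downside is that it slightly decreases readability. My take on the loop: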
user_online = False

while True:
    # we'll be checking for the reverse of the last status of the user
    check_method = EC.visibility_of_element_located if not user_online else EC.invisibility_of_element_located

    # in the conversation page, a span with title "online" is displayed when the user is online.
    # the web driver will wait 8 hrs = 28800 s for the user status to change;
    # the script will be killed by WebDriverWait if that doesn't happen
    element = WebDriverWait(driver, 28800, 5).until(
        check_method((By.XPATH, "//span[@title = 'online']")))

    # the moment the user changed status
    now = datetime.datetime.now().strftime("%H:%M:%S")
    print("{} at : {}".format('online' if not user_online else 'offline', now))  # if you're using python v3.6 or later, f-strings are much more convenient for this
    print("************")

    user_online = not user_online  # switch, to wait for the other status in the next cycle
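Finally, code-wise: the script cannot be left running "endlessly". Why? Because if the user doesn't change status within 8 hours, WebDriverWait raises a TimeoutException and the script stops. To salvage that, wrap the loop body in try/except: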
from selenium.common.exceptions import TimeoutException  # put this at the beginning of the file

while True:
    try:
        ...  # the loop body from above goes here
    except TimeoutException:
        # the status did not change, repeat the cycle
        pass
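The same retry idea can be factored into a small helper, which keeps the main loop flat. This is only a sketch: wait_until_changed and its parameters are hypothetical names, not part of Selenium, and wait_once stands for any zero-argument callable that performs one bounded wait.

```python
def wait_until_changed(wait_once, timeout_error, max_attempts=None):
    """Call wait_once() repeatedly until it returns instead of raising timeout_error.

    Returns whatever wait_once() returned, or None if max_attempts ran out.
    With max_attempts=None it retries until the wait succeeds.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            return wait_once()
        except timeout_error:
            # The status did not change within one timeout; wait again.
            continue
    return None
```

With Selenium this might be called as wait_until_changed(lambda: WebDriverWait(driver, 28800, 5).until(condition), TimeoutException), where condition is the expected condition built in the loop above.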
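Writing to a file
You may need to read a bit on how to do that, but it is a pretty simple operation.
Here's a sample: open a file for appending (so the previous logs are preserved), wrapping the while loop: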
with open("usermonitor.log", "a") as myfile:
    while True:
        # the other code is not repeated for brevity
        # ...
        output = "{} at : {}".format('online' if not user_online else 'offline', now)
        print(output)
        myfile.write(output + "\n")  # this will write (append as the last line) the same text to the file
        # write() does not append newlines by itself - you have to do it yourself
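Because the loop above runs for hours inside a single with block, written lines may sit in Python's buffer until the file is closed. A small helper that flushes after every line is one way around that. This is a sketch; log_status is a hypothetical name, and the line format simply mirrors the one used in the answer.

```python
from datetime import datetime


def log_status(logfile, status, when=None):
    """Print one 'status at : HH:MM:SS' line and append it to an open text file."""
    when = when or datetime.now()
    line = "{} at : {}".format(status, when.strftime("%H:%M:%S"))
    print(line)
    logfile.write(line + "\n")  # write() does not add the newline itself
    logfile.flush()  # hand the line to the OS now rather than at close time
    return line
```

Inside the loop it would replace the three output lines, for example log_status(myfile, 'online' if not user_online else 'offline').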
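One thing I would suggest: as written, your program requires scanning the WhatsApp QR code every time it runs. To avoid that, replace this line
driver = webdriver.Chrome('C:/webdrivers/chromedriver.exe')
with this
options = webdriver.ChromeOptions()
options.add_argument(r"user-data-dir=C:\Users\<username>\AppData\Local\Google\Chrome\User Data\whtsap")  # raw string, so the backslashes are kept as-is
driver = webdriver.Chrome('C:/webdrivers/chromedriver.exe', options=options)
This way you will still need to scan the QR code, but only once.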