Error trying to save scraped web data to a text file

I'm a beginner and recently started using Python. I'm trying to save the Twitter followers I retrieved from the web to a text file, but it doesn't work.

Here is my code:

for twusernames in driver.find_elements_by_xpath('//div[@aria-label="Timeline: Followers"]//a[@role="link"]'):
    print(twusernames.get_property('href'))
    file = open('links.txt', 'w')
    file.write(twusernames.get_property('href'))
    file.close()

What am I doing wrong? :( Thanks for your help.

Try this.

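with open('links.txt', 'w') as file:
    # Open the file once, outside the loop; the with block closes it automatically.
    for twusernames in driver.find_elements_by_xpath('//div[@aria-label="Timeline: Followers"]//a[@role="link"]'):
        file.write(twusernames.get_property('href') + '\n')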

This should work:

with open('links.txt', 'w') as file:
    for twusernames in driver.find_elements_by_xpath('//div[@aria-label="Timeline: Followers"]//a[@role="link"]'):
        content = twusernames.get_property('href')
        print(content)
        file.write(content + '\n')  # one link per line
# No explicit file.close() is needed; the with block closes the file.
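
For context on why the loop in the question leaves only one link in the file: opening a file in 'w' mode truncates it, so reopening it on every iteration discards whatever the previous iteration wrote. The sketch below reproduces that without a browser. The `links` list and both helper functions are hypothetical stand-ins for the Selenium results, not part of the original code.

```python
import os
import tempfile

# Hypothetical stand-ins for the href values returned by Selenium.
links = [
    "https://twitter.com/alice",
    "https://twitter.com/bob",
    "https://twitter.com/carol",
]


def write_reopening_each_time(path, items):
    """Mimics the question's loop: mode 'w' truncates the file on every open."""
    for item in items:
        f = open(path, "w")
        f.write(item)
        f.close()


def write_opening_once(path, items):
    """Opens the file a single time and writes one item per line."""
    with open(path, "w") as f:
        for item in items:
            f.write(item + "\n")


def read_text(path):
    with open(path) as f:
        return f.read()


tmp_dir = tempfile.mkdtemp()
bad_path = os.path.join(tmp_dir, "links_bad.txt")
good_path = os.path.join(tmp_dir, "links_good.txt")

write_reopening_each_time(bad_path, links)
write_opening_once(good_path, links)

bad_content = read_text(bad_path)                # only the last link survives
good_lines = read_text(good_path).splitlines()   # all links, one per line

print(bad_content)
print(good_lines)
```

Opening in 'a' (append) mode inside the loop would also keep every link, but it reopens the file once per element, so opening once outside the loop is usually the simpler choice.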

get_attribute()

get_attribute() gets the given attribute or property of the element. This method first tries to return the value of a property with the given name. If a property with that name doesn’t exist, it returns the value of the attribute with the same name. If there is no attribute with that name, None is returned. Values which are considered truthy, that is, equal to "true" or "false", are returned as booleans. All other non-None values are returned as strings. For attributes or properties which do not exist, None is returned. To obtain the exact value of the attribute or property you can also use the get_dom_attribute() or get_property() methods.

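Also, once you open a file handle, you need to close it. File operations are I/O operations, though, so they should be kept to a minimum: open the file once before the loop and close it once after. Your optimized code block would therefore be: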

  • Using get_attribute():

    file = open("links.txt", "w")
    for twusernames in driver.find_elements_by_xpath('//div[@aria-label="Timeline: Followers"]//a[@role="link"]'):
        file.write(twusernames.get_attribute("href") + "\n")
    file.close()
    
  • Using get_property():

    file = open("links.txt", "w")
    for twusernames in driver.find_elements_by_xpath('//div[@aria-label="Timeline: Followers"]//a[@role="link"]'):
        file.write(twusernames.get_property("href") + "\n")
    file.close()
    
  • Using get_dom_attribute():

    file = open("links.txt", "w")
    for twusernames in driver.find_elements_by_xpath('//div[@aria-label="Timeline: Followers"]//a[@role="link"]'):
        file.write(twusernames.get_dom_attribute("href") + "\n")
    file.close()
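
A possible variation on the three blocks above, as a minimal sketch: collect the href values first, then write them in a single pass inside a with block so the file is closed even if an error occurs. The `save_links` helper and the `sample` list are hypothetical names introduced for illustration. The helper skips None values, since the lookup methods can return None when the attribute is missing, and concatenating None with "\n" would raise a TypeError.

```python
import os
import tempfile


def save_links(hrefs, path):
    """Write each non-empty href on its own line; return how many were written."""
    cleaned = [h for h in hrefs if h]  # drop None and empty strings
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(h + "\n" for h in cleaned)
    return len(cleaned)


# Hypothetical sample data standing in for values collected from the page.
sample = ["https://twitter.com/alice", None, "https://twitter.com/bob"]
out_path = os.path.join(tempfile.mkdtemp(), "links.txt")

written = save_links(sample, out_path)

with open(out_path, encoding="utf-8") as f:
    saved = f.read()

print(written)
print(saved)
```

With Selenium, the list could be built as `[e.get_attribute("href") for e in elements]` and passed to the helper. Note that the find_elements_by_xpath helper was removed in Selenium 4.3; on newer versions the equivalent call is `driver.find_elements(By.XPATH, ...)` with `By` imported from `selenium.webdriver.common.by`.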