How do I get the inspect element code instead of the page source when they are different?

Question

I am trying to get all the links from the inspect element code of this website using the following code.

import requests
from bs4 import BeautifulSoup

url = 'https://chromedriver.storage.googleapis.com/index.html?path=97.0.4692.71/'
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')

for link in soup.find_all('a'):
    print(link)

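However, I get no links. I then checked what soup contained by printing it and comparing it with the code I see on the actual website, both in inspect element and in view page source. The code returned by print(soup) matches what is shown when I click view page source, but it does not match what is shown when I click inspect element. Firstly, how do I get the inspect element code instead of the page source? Secondly, why are the two different?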

Answer 1

Just use the other URL mentioned in the comments and parse the XML with BeautifulSoup.

For example:

import requests
from bs4 import BeautifulSoup

# XML listing of the files under the 97.0.4692.71/ prefix
url = "https://chromedriver.storage.googleapis.com/?delimiter=/&prefix=97.0.4692.71/"
response = requests.get(url)

# features="xml" needs the lxml package to be installed
key_tags = BeautifulSoup(response.text, features="xml").find_all("Key")
keys = [f"https://chromedriver.storage.googleapis.com/{k.get_text()}" for k in key_tags]
print("\n".join(keys))

Output:

https://chromedriver.storage.googleapis.com/97.0.4692.71/chromedriver_linux64.zip
https://chromedriver.storage.googleapis.com/97.0.4692.71/chromedriver_mac64.zip
https://chromedriver.storage.googleapis.com/97.0.4692.71/chromedriver_mac64_m1.zip
https://chromedriver.storage.googleapis.com/97.0.4692.71/chromedriver_win32.zip
https://chromedriver.storage.googleapis.com/97.0.4692.71/notes.txt
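
As for why the two differ: view page source shows the HTML exactly as the server sent it, which is also all that requests receives, while inspect element shows the live DOM after the browser has run the page's JavaScript. On a page like this one the links are most likely added by a script after loading, so the static HTML has no a tags to find. The sketch below illustrates that offline using only the standard library. STATIC_HTML and LISTING_XML are made-up stand-ins (not copied from the real site), and count_anchors and listing_urls are hypothetical helper names.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical stand-in for what "view page source" shows: markup plus a
# script reference, but no <a> tags (a browser would add them later).
STATIC_HTML = """<html><body>
<table id="listing"></table>
<script src="listing.js"></script>
</body></html>"""

# Hypothetical stand-in for the XML listing, trimmed to two entries.
# The namespace is an assumption about the real response.
LISTING_XML = """<ListBucketResult xmlns="http://doc.s3.amazonaws.com/2006-03-01">
  <Contents><Key>97.0.4692.71/chromedriver_linux64.zip</Key></Contents>
  <Contents><Key>97.0.4692.71/notes.txt</Key></Contents>
</ListBucketResult>"""

BASE_URL = "https://chromedriver.storage.googleapis.com/"


class AnchorCounter(HTMLParser):
    """Counts <a> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.count += 1


def count_anchors(html):
    """Return how many <a> tags appear in the given static HTML."""
    parser = AnchorCounter()
    parser.feed(html)
    return parser.count


def listing_urls(xml_text, base_url=BASE_URL):
    """Build download URLs from every <Key> element in the listing."""
    root = ET.fromstring(xml_text)
    # Compare the local tag name so the match works with or without
    # an XML namespace on the elements.
    return [
        base_url + el.text
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "Key"
    ]


if __name__ == "__main__":
    print("anchors in static HTML:", count_anchors(STATIC_HTML))
    for link in listing_urls(LISTING_XML):
        print(link)
```

If you really need the rendered DOM itself rather than the underlying data, you would have to drive a real browser (for example with Selenium), but when the data is available from a plain URL like the one above, fetching that directly is usually simpler.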

python

get

beautifulsoup

web-scraping

python-requests