How to click the CSV button on a website and download the data in Python
I am trying to download the CSV and JSON data from the following website: https://worldpopulationreview.com/countries/countries-by-gdp/#worldCountries
How can I simulate a click on the CSV button?
import pandas as pd
import requests
from lxml import html,etree
url = "https://worldpopulationreview.com/countries/countries-by-gdp/#worldCountries"
# I am not sure how to click the CSV button on the actual website,
# nor how the CSV file would then be saved to the Downloads folder,
# the way it is when I click the button on the page myself
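I can already scrape the web page, but I want to learn how to click the button:
from io import StringIO
import pandas as pd
import requests
url = "https://worldpopulationreview.com/countries/countries-by-gdp/#worldCountries"
r = requests.get(url)
# recent pandas versions expect a file-like object rather than a raw HTML string
df = pd.read_html(StringIO(r.text))[0]
df.to_csv('data.csv')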
You can use Selenium to simulate a click on the CSV download button:
https://selenium-python.readthedocs.io/getting-started.html#example-explained
You need to install it first: pip install selenium
If you use Chrome, download the Chrome driver (ChromeDriver) as well. Then find the XPath of the button/link; I found it with Inspect Element:
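import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
# Selenium 4 style: the driver path is passed through Service, elements are located with By
driver = webdriver.Chrome(service=Service('/Users/xxx/Downloads/chromedriver-1'))
driver.get('https://worldpopulationreview.com/countries/countries-by-gdp')  # put the address of your page here
# absolute XPath of the CSV link, copied from the browser's Inspect Element panel
btn = driver.find_element(By.XPATH, '/html/body/div[1]/div/div[1]/div[2]/div[2]/div[1]/div/div/div/div[2]/div[1]/a[2]')
btn.click()
# the file is saved to the browser's download folder and may take a moment to finish
df = pd.read_csv('/Users/xxx/Downloads/data.csv')
print(df.head())
driver.quit()  # end the browser session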
   rank         country        imfGDP           unGDP  gdpPerCapita          pop
0     1   United States  2.219812e+13  18624475000000    67063.2695   331002.651
1     2           China  1.546810e+13  11218281029298    10746.7828  1439323.776
2     3           Japan  5.495420e+12   4936211827875    43450.1405   126476.461
3     4         Germany  4.157120e+12   3477796274497    49617.1450    83783.942
4     6  United Kingdom  2.927080e+12   2647898654635    43117.5725    67886.011
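One thing to watch for with the click approach: pd.read_csv can run before the browser has finished writing the file. Below is a minimal sketch of a polling helper, not part of Selenium. The name wait_for_download and its parameters are made up for illustration, and it assumes that Chrome marks unfinished downloads with a .crdownload suffix and that the download folder holds no unrelated partial downloads.

```python
import os
import time


def wait_for_download(directory, filename, timeout=30.0, poll=0.5):
    """Wait until `filename` exists in `directory` and no partial Chrome
    download (*.crdownload) is left there, then return the full path.

    Hypothetical helper for illustration; raises TimeoutError on timeout.
    """
    target = os.path.join(directory, filename)
    deadline = time.monotonic() + timeout
    while True:
        # Chrome keeps a *.crdownload file while a download is in progress
        partial = [n for n in os.listdir(directory) if n.endswith(".crdownload")]
        if os.path.exists(target) and not partial:
            return target
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{filename} did not appear in {directory} within {timeout}s"
            )
        time.sleep(poll)
```

It would be called right after btn.click(), for example path = wait_for_download('/Users/xxx/Downloads', 'data.csv'), and the returned path passed to pd.read_csv. If the file should land somewhere other than the default Downloads folder, Chrome's download.default_directory preference can be set through ChromeOptions.add_experimental_option("prefs", {...}) before the driver is created.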