Scraping data from a Tableau chart with Python
I want to scrape the ownership data from the following website:
https://www.usnewsdeserts.com/states/california/#1536357227283-a4a9d6e4-ccf9
This is the code I am using:
import requests
from bs4 import BeautifulSoup
import json
import re
import random
# user_agent_list is not defined in this snippet; it is assumed to be a list of User-Agent strings.
url = "https://public.tableau.com/vizql/w/TopOwnersCalifornia/v/Owners/bootstrapSession/sessions/5E565C4C5F7D462BBE8DFEE9246F846E-0:0"
header = random.choice(user_agent_list)
HEADERS = {"User-Agent": header}
params = {"stickySessionKey": {"dataserverPermissions":"44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a"}}
r = requests.post(url, params=params, headers = HEADERS)
soup = BeautifulSoup(r.text, "html.parser")
print(soup)
This is what I get:
<br/>
2020-12-12 12:41:46.829
(X9S6ik90vQizHF9Qa-S@CwAAAUk,0:0)
How can I get this data?
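I've made a tableau scraper library to extract the data from Tableau worksheets. You only need to find the Tableau URL in the network tab of your browser's developer tools, which in this case is: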
GET https://public.tableau.com/views/NewspapersByCountyCalifornia/Newspaperbycounty
You can extract the data with the following code:
from tableauscraper import TableauScraper as TS
url = "https://public.tableau.com/views/NewspapersByCountyCalifornia/Newspaperbycounty"
ts = TS()
ts.loads(url)
dashboard = ts.getDashboard()
for t in dashboard.worksheets:
    # show worksheet name
    print(f"WORKSHEET NAME : {t.name}")
    # show dataframe for this worksheet
    print(t.data)
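The URL in the answer points at the newspapers-by-county workbook, while the session URL in the question names a workbook `TopOwnersCalifornia` and a view `Owners`. As a rough sketch, the view URL for that workbook could be derived from the session URL as shown below. This assumes that a `/vizql/w/<workbook>/v/<view>/` session path corresponds to a `/views/<workbook>/<view>` page, which the answer does not confirm; `vizql_to_view_url` is a hypothetical helper, not part of the tableau scraper library.

```python
import re

# Assumed URL layout (not confirmed by the answer above): a session URL of the form
#   .../vizql/w/<workbook>/v/<view>/bootstrapSession/...
# corresponds to the view page .../views/<workbook>/<view>.
VIZQL_PATTERN = re.compile(r"/vizql/w/(?P<workbook>[^/]+)/v/(?P<view>[^/]+)/")


def vizql_to_view_url(session_url, host="https://public.tableau.com"):
    """Hypothetical helper: derive a /views/ URL from a vizql session URL."""
    match = VIZQL_PATTERN.search(session_url)
    if match is None:
        raise ValueError(f"not a vizql session URL: {session_url}")
    return f"{host}/views/{match['workbook']}/{match['view']}"


# Session URL taken from the question.
session_url = (
    "https://public.tableau.com/vizql/w/TopOwnersCalifornia/v/Owners/"
    "bootstrapSession/sessions/5E565C4C5F7D462BBE8DFEE9246F846E-0:0"
)
view_url = vizql_to_view_url(session_url)
print(view_url)
```

If the assumption holds, the resulting URL could be passed to `ts.loads(...)` in place of the county URL; whether that workbook is still published has not been checked here.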