BeautifulSoup output gives an empty list
I am trying to scrape text from a website using BeautifulSoup and python-requests, but I only get [] as output.
from bs4 import BeautifulSoup
import requests
import urllib.request
page = requests.get("https://www.adsbhub.org/stations.php")
soup = BeautifulSoup(page.content, "lxml")
table = soup.find_all('table', id="jqGridUsers")
print(table)
The code above gives me the table whose values I need to scrape.
Output:
[<table id="jqGridUsers"></table>]
But when I try to extract the data by finding the tr elements inside the table, it returns an empty list.
What am I doing wrong?
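The table is empty in the HTML you download because the page fills it in with JavaScript after loading; the rows come from a separate request to stations_ctr.php, which you can call directly.
Note that the nd parameter is a Unix time in milliseconds (1653375896726), so you do not need to set it to a particular value to get the latest data.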
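As a minimal sketch of that point, the nd value could be generated from the current time instead of being hard-coded. The helper names current_nd and build_params are hypothetical, and the sketch assumes the server accepts any current millisecond timestamp; the other keys simply mirror the request parameters used in this answer.

```python
import time


def current_nd() -> str:
    """Return the current Unix time in milliseconds as a string."""
    return str(int(time.time() * 1000))


def build_params(webkey: str, rows: int = 3000) -> dict:
    """Build the query parameters; only "nd" is computed at call time.

    Hypothetical helper: the keys mirror the parameters in this answer,
    and the webkey must be supplied by the caller.
    """
    return {
        "cmd": "1",
        "webkey": webkey,
        "_search": "false",
        "nd": current_nd(),
        "rows": str(rows),
        "page": "1",
        "sidx": "",
        "sord": "asc",
    }
```

The resulting dict can be passed as params to requests.get in place of a hard-coded one.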
import pandas as pd
import requests
import json


def main(url):
    # Query parameters copied from the request the page itself makes.
    params = {
        "cmd": "1",
        "webkey": "e2321319bb42e360a23413q29772a2b2a2",
        "_search": "false",
        "nd": "1653375896726",
        "rows": "3000",
        "page": "1",
        "sidx": "",
        "sord": "asc"
    }
    r = requests.get(url, params=params)
    # The body has a 9-character prefix before the JSON starts; skip it.
    data = json.loads(r.text[9:])
    # Each row keeps its column values in the 'cell' list.
    target = [i['cell'] for i in data['rows']]
    df = pd.DataFrame(target)
    print(df)


main('https://www.adsbhub.org/stations_ctr.php')
Output:
0 1 2 ... 7 8 9
0 460 PD3RFR|460 Radar Maarssen ... 381434 1 460
1 2018 EDDN|2018 DerrChecker ... 506720 1 2018
2 291 Flightlive|291 flightlive ... 161816 1 291
3 3114 FachaRadar|3114 facha ... 31995 1 3114
4 3056 Fly Italy Adsb|3056 ... 328618 1 3056
... ... ... ... ... ... ... ...
2216 1829 India|1829 Sanket ... None None 1829
2217 2771 N-Eugene|2771 ... None None 2771
2218 791 RPI Geneva|791 Marclg ... None None 791
2219 1617 T-EDDV66|1617 ... None None 1617
2220 2533 Nijmegen-Radar|2533 Nijmegen-Radar ... None None 2533
[2221 rows x 10 columns]
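The fixed offset in r.text[9:] breaks if the length of the prefix ever changes. A slightly more defensive sketch is to look for the first opening brace and decode from there. The function names and the sample string below are made up for illustration, and the sketch assumes the prefix itself contains no "{".

```python
import json


def parse_prefixed_json(text: str) -> dict:
    """Decode the first JSON object in text, skipping any leading prefix.

    Assumes the prefix contains no '{'. Anything after the object is ignored.
    """
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in response text")
    obj, _end = json.JSONDecoder().raw_decode(text, start)
    return obj


def extract_cells(data: dict) -> list:
    """Collect the 'cell' list of every row; empty list if there are no rows."""
    return [row["cell"] for row in data.get("rows", [])]


# Made-up sample with a 9-character prefix, not real server output.
sample = (
    'some-junk{"rows": ['
    '{"id": "460", "cell": ["460", "PD3RFR|460"]}, '
    '{"id": "2018", "cell": ["2018", "EDDN|2018"]}]}'
)
cells = extract_cells(parse_prefixed_json(sample))
print(cells)
```

The list returned by extract_cells can go straight into pd.DataFrame, as in the code above.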