BeautifulSoup4 not able to scrape data from table

I want to scrape the 2nd and 3rd columns of the table on https://www.airvistara.com/fly/flightschedule. The code I am using is:

import bs4 as bs
from urllib2 import urlopen

sauce=urlopen('https://www.airvistara.com/fly/flightschedule').read()
soup=bs.BeautifulSoup(sauce,'lxml')
table=soup.table
table_body=table.find('tbody')
table_rows=table_body.find_all('tr')
for tr in table_rows:
    td=tr.find_all('td')
    row=[i.text for i in td]
    print row

But I am not getting the result I want.

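The content you are trying to parse is loaded via AJAX after the page loads, so it is not in the HTML that BeautifulSoup receives and bs cannot see it.
Here is working code that gets the Outbound Flights as a Python dictionary: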

import json
import requests

# The page fills its table by POSTing a JSON body to this endpoint.
post_fields = {"flightDate":"22/04/2017"}
headers = {'content-type': 'application/json'}
url = 'https://www.airvistara.com/fly/getFlightschedule'
json_response = requests.post(url, data=json.dumps(post_fields), headers=headers).text
# Decode the JSON text into a Python dictionary.
decoded_json = json.loads(json_response)
print(decoded_json)

Output:

{u'flightSchedule': [{u'effectiveFrom': u'19-APR-2017', u'flightCode': u'UK 0946', u'baseFareL1': 0, u'flightDate': u'Saturday, 28 October 2017',...

To get the details of each flight, you can use:

# Each entry in 'flightSchedule' is a dictionary describing one flight.
for flight in decoded_json['flightSchedule']:
    print(flight['effectiveFrom'])
    print(flight['flightCode'])
    print(flight['baseFareL1'])
    print(flight['flightDate'])
    print(flight['daysOfOperation'])
    print(flight['arrivalStation'])
    print(flight['departureStation'])
    print(flight['via'])
    print(flight['scheduledArrivalTime'])
    print(flight['departureCityName'])
    print(flight['effectiveTo'])
    print(flight['arrivalCityName'])
    print(flight['scheduledDepartureTime'])

This will output something like:

19-APR-2017
UK 0946
0
Saturday, 28 October 2017
Daily
DEL
AMD
-
10:25
Ahmedabad
28-OCT-2017
New Delhi
08:45
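
Since the question asks for only some columns, the loop above can be narrowed to the fields you care about. The following is a minimal sketch in Python 3 syntax, not part of the original answer: extract_columns is a hypothetical helper, and which JSON keys correspond to the 2nd and 3rd columns of the rendered table is an assumption you would need to check against the page. The sample record is copied from the output shown above so the snippet runs without network access.

```python
def extract_columns(decoded_json, fields):
    """Return one list per flight containing only the requested fields."""
    rows = []
    for flight in decoded_json.get('flightSchedule', []):
        # .get() yields None instead of raising if a record lacks a key
        rows.append([flight.get(field) for field in fields])
    return rows


# Sample record built from the values in the output above (illustration only).
sample = {
    'flightSchedule': [
        {
            'effectiveFrom': '19-APR-2017',
            'flightCode': 'UK 0946',
            'departureStation': 'AMD',
            'arrivalStation': 'DEL',
            'scheduledDepartureTime': '08:45',
            'scheduledArrivalTime': '10:25',
        }
    ]
}

# Assumed column choice: departure and arrival airport codes.
for row in extract_columns(sample, ['departureStation', 'arrivalStation']):
    print(row)
```

In real use you would pass decoded_json from the request instead of sample.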

Notes:
1 - If you need to specify arrivalStation and departureStation, use:

post_fields = {"flightDate":"22/04/2017","arrivalStation":"AIRPORTCODE","departureStation":"AIRPORTCODE"}
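
If you build this request body often, a small helper can assemble it. This is only a sketch in Python 3 syntax: build_post_fields is a hypothetical name, and the DD/MM/YYYY date format is an assumption inferred from the "22/04/2017" example above, not from any documentation of the endpoint.

```python
from datetime import date


def build_post_fields(flight_date, arrival_station=None, departure_station=None):
    """Build the JSON body for the schedule request (hypothetical helper)."""
    # Assumption: the endpoint expects DD/MM/YYYY, as in "22/04/2017".
    post_fields = {'flightDate': flight_date.strftime('%d/%m/%Y')}
    # Only include the station filters when they are supplied.
    if arrival_station:
        post_fields['arrivalStation'] = arrival_station
    if departure_station:
        post_fields['departureStation'] = departure_station
    return post_fields


print(build_post_fields(date(2017, 4, 22)))
print(build_post_fields(date(2017, 4, 22), 'DEL', 'AMD'))
```

The returned dictionary can be passed to json.dumps in the same way as post_fields in the request code above.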