Failed to extract HTML table data using Beautiful Soup in Python

I am trying to replicate this code and make some charts, but I can't get the CSV file. I ran exactly the same code to no avail: it prints an empty DataFrame.

Code:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import requests
from bs4 import BeautifulSoup
import geopandas as gpd
from prettytable import PrettyTable

url = 'https://www.mohfw.gov.in/'
# make a GET request to fetch the raw HTML content
web_content = requests.get(url).content

# parse the html content
soup = BeautifulSoup(web_content, "html.parser")

# remove any newlines and extra spaces from left and right
extract_contents = lambda row: [x.text.replace('\n', '').strip() for x in row]

# find all table rows and data cells within
stats = [] 
all_rows = soup.find_all('tr')
for row in all_rows:
    stat = extract_contents(row.find_all('td'))
    # notice that the data that we require is now a list of length 5
    if len(stat) == 5:
        stats.append(stat)

# now convert the data into a pandas dataframe for further processing
new_cols = ["Sr.No", "States/UT","Confirmed","Recovered","Deceased"]
state_data = pd.DataFrame(data = stats, columns = new_cols)
state_data.head()

Any help is appreciated.

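You can get all the data from a URI that returns JSON. You will need to map some of the column names, then use the returned columns to calculate the change since yesterday. The new_ columns are today's values.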

import pandas as pd
import requests

r = requests.get('https://www.mohfw.gov.in/data/datanew.json').json()
df = pd.DataFrame(r)
df
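
The mapping and the change calculation could look roughly like the sketch below. It runs on two made-up rows instead of the live feed, so it works offline. The column names (`state_name`, `positive`, `cured`, `death` and their `new_` counterparts) are assumptions about what the JSON contains, and `add_daily_change` and `RENAME` are hypothetical names of my own, so check `df.columns` on the real response before relying on any of it.

```python
import pandas as pd

# Made-up sample rows for illustration only; the column names are assumptions
# about the JSON feed and may differ from what the server actually returns.
sample = [
    {"state_name": "State A", "positive": "100", "new_positive": "120",
     "cured": "80", "new_cured": "90", "death": "5", "new_death": "6"},
    {"state_name": "State B", "positive": "40", "new_positive": "45",
     "cured": "30", "new_cured": "33", "death": "1", "new_death": "1"},
]


def add_daily_change(df, metrics=("positive", "cured", "death")):
    """Add a <metric>_change column: today's value (new_<metric>) minus
    the previous value (<metric>)."""
    out = df.copy()
    for m in metrics:
        # The values may arrive as strings, so convert them to numbers first.
        previous = pd.to_numeric(out[m], errors="coerce")
        today = pd.to_numeric(out[f"new_{m}"], errors="coerce")
        out[f"{m}_change"] = today - previous
    return out


# Map the assumed feed names onto the column names used in the question.
RENAME = {
    "state_name": "States/UT",
    "new_positive": "Confirmed",
    "new_cured": "Recovered",
    "new_death": "Deceased",
}

changes = add_daily_change(pd.DataFrame(sample))
state_data = changes.rename(columns=RENAME)
print(state_data[["States/UT", "Confirmed", "Recovered", "Deceased",
                  "positive_change", "cured_change", "death_change"]])
```

With the real data you would pass the `df` built from the JSON response to `add_daily_change` instead of the sample rows.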