BeautifulSoup4 extract and select data from pre style

Question

I want to extract all the short_name values from this link. I tried following this, but it failed: the result I get is 'None'.

Here is my code:

import requests
from bs4 import BeautifulSoup

# NOTE: `header` is not defined in this snippet; it must be a dict of request headers.
def checkStockIdExistOrNot(stockIdNumberOrName):

    BursaStockSearchIdURL = 'https://www.bursamalaysia.com/api/v1/search/stock_list?keyword=' + str(stockIdNumberOrName) + '&lang=EN&limit=99'
    BursaStockSearchIdRequest = requests.get(str(BursaStockSearchIdURL), headers=header)
    BursaStockSearchIdParser = BeautifulSoup(BursaStockSearchIdRequest.content, 'html.parser')
    BursaSelection = BursaStockSearchIdParser.find('pre')
    print(BursaSelection)

checkStockIdExistOrNot('SERBADK')

My intention is to get only the short_name values SERBADK and SERBADK-C17. However, because the result is 'None', I cannot select or pick any data from it.

Thanks!

Answer 1

The request returns its data in JSON format, so you can call the .json() method on the response directly to extract the data!

import requests
res=requests.get("https://www.bursamalaysia.com/api/v1/search/stock_list?keyword=SERBADK&lang=EN&limit=99")
main_data=res.json()['data']
for i in range(len(main_data)):
    print(main_data[i]['short_name'])

Output:

SERBADK
SERBADK-C16
SERBADK-C17
SERBADK-C20
SERBADK-C21
SERBADK-C22
SERBADK-C23
SERBADK-C24
SERBADK-C25
SERBADK-C26
SERBADK-WA

To get the first element, you can use

main_data[0]['short_name']

Because main_data is returned as a list, you can iterate over it using index values.
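The same extraction can also be written without index arithmetic. The following is only a minimal sketch that runs offline: extract_short_names and sample_payload are hypothetical names, and the sample dict merely imitates the {'data': [{'short_name': ...}]} shape that the code above relies on.

```python
def extract_short_names(payload):
    """Return every short_name found under the payload's 'data' list."""
    # Fall back to an empty list if 'data' is missing or None.
    records = payload.get("data") or []
    # Skip any record that happens to lack a 'short_name' key.
    return [rec["short_name"] for rec in records if "short_name" in rec]


# Hypothetical sample, shaped like the response used in this answer.
sample_payload = {
    "data": [
        {"short_name": "SERBADK"},
        {"short_name": "SERBADK-C16"},
        {"short_name": "SERBADK-WA"},
    ]
}

names = extract_short_names(sample_payload)
print(names)     # ['SERBADK', 'SERBADK-C16', 'SERBADK-WA']
print(names[0])  # SERBADK
```

With a live response, the call would presumably be extract_short_names(res.json()).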

Answer 2

Since the data is in JSON format, you don't need BeautifulSoup for this, and there is no pre element to select data from.

Just convert the response to JSON with response.json() and extract the data you need.
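A likely reason find('pre') returned 'None': when a browser displays a raw JSON response it typically wraps the text in a pre element of its own, whereas the body that requests receives is just the JSON text with no HTML around it. The sketch below illustrates this offline. Note that raw_body, parse_api_body and the Content-Type value are hypothetical stand-ins shaped like the output in this thread, not something captured from the real API.

```python
import json

# Hypothetical raw body, imitating what the endpoint appears to return.
raw_body = '{"data": [{"short_name": "SERBADK"}, {"short_name": "SERBADK-C17"}]}'


def parse_api_body(body, content_type):
    """Decode the body as JSON when the server labels it as JSON."""
    if "json" not in content_type.lower():
        raise ValueError(f"expected JSON, got {content_type!r}")
    return json.loads(body)


# There is no <pre> tag in the raw text, so an HTML parser has nothing to find.
has_pre_tag = "<pre" in raw_body.lower()
payload = parse_api_body(raw_body, "application/json; charset=utf-8")

print(has_pre_tag)                       # False
print(payload["data"][0]["short_name"])  # SERBADK
```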

This code will print all the short_name values.

import requests

def checkStockIdExistOrNot(stockIdNumberOrName):
    url = 'https://www.bursamalaysia.com/api/v1/search/stock_list?keyword=' + str(stockIdNumberOrName) + '&lang=EN&limit=99'
    response = requests.get(url)
    info = response.json()

    for i in info['data']:
        print(i['short_name'])

checkStockIdExistOrNot('SERBADK')

SERBADK
SERBADK-C16
SERBADK-C17
SERBADK-C20
SERBADK-C21
SERBADK-C22
SERBADK-C23
SERBADK-C24
SERBADK-C25
SERBADK-C26
SERBADK-WA
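One optional refinement: the function above builds the URL by string concatenation, which does not escape special characters in the keyword. A minimal sketch of an alternative using only the standard library follows; build_search_url is a hypothetical helper name, and the query parameters are simply the ones from the URL above.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.bursamalaysia.com/api/v1/search/stock_list"


def build_search_url(keyword, lang="EN", limit=99):
    """Build the search URL with a properly escaped query string."""
    query = urlencode({"keyword": keyword, "lang": lang, "limit": limit})
    return f"{BASE_URL}?{query}"


print(build_search_url("SERBADK"))
# https://www.bursamalaysia.com/api/v1/search/stock_list?keyword=SERBADK&lang=EN&limit=99
```

requests can do the same escaping itself if the parameters are passed as a dict through the params argument of requests.get.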

Since you intend to get only the short_name values SERBADK and SERBADK-C17, you can do this:

for i in info['data']:
    if i['short_name'] in ['SERBADK', 'SERBADK-C17']:
        print(i['short_name'])

SERBADK
SERBADK-C17
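If the filtered names are needed later rather than just printed, the same check can be wrapped in a small function that returns a list. This is only a sketch: pick_short_names and sample_records are hypothetical names, and the sample list stands in for info['data'].

```python
def pick_short_names(records, wanted):
    """Return the short_name of each record whose short_name is in `wanted`."""
    wanted = set(wanted)  # set membership keeps each lookup cheap
    return [rec["short_name"] for rec in records if rec.get("short_name") in wanted]


# Hypothetical stand-in for info['data'], using names from the output above.
sample_records = [
    {"short_name": "SERBADK"},
    {"short_name": "SERBADK-C16"},
    {"short_name": "SERBADK-C17"},
    {"short_name": "SERBADK-WA"},
]

print(pick_short_names(sample_records, ["SERBADK", "SERBADK-C17"]))
# ['SERBADK', 'SERBADK-C17']
```

The result keeps the order in which the records appear in the response, not the order of the wanted names.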
