Getting a parameter from a list or an HTML parser using bs4

Question

I am trying to get Google's suggested spelling for a search by entering misspelled words.

Here is my code. Input: Johnny walker rd lbl. Expected output: Johnny walker red label.

import requests
from bs4 import BeautifulSoup
from pprint import pprint

key = "Johnny walker rd lbl"
query = "https://www.google.com/search?q=" + key
r = requests.get(query)
html_doc = r.text
soup = BeautifulSoup(html_doc, 'html.parser')
# for s in soup.find_all(id="rhs_block"):
#     pprint(s.text)
# Collect the contents of every inline JavaScript <script> tag.
find = soup.find_all('script', attrs={'type': 'text/javascript'})
mylist = []
for x in find:
    mylist.append(str(x.string))
print(mylist)

Output:

['None', "(function(){var eventid='QO7rW5TtM5OYvQT516NY';google.kEI = 
eventid;})();", 'google.ac&&google.ac.c({"agen":true,"cgen":true,
"client":"heirloom-serp","dh":true,"dhqt":true,"ds":"","ffql":"en","fl":true,"host":"google.com","isbh":28,"jsonp":true,
"msgs":{"cibl":"Clear Search","dym":"Did you mean:","lcky":"I\u0026#39;m Feeling Lucky","lml":"Learn more",
"oskt":"Input tools","psrc":"This search was removed from your \u003Ca href=\"/history\"\u003EWeb History\u003C/a\u003E","psrl":"Remove",
"sbit":"Search by image","srch":"Google Search"},"ovr":{},"pq":"Johnny walker red label","refpd":true,"rfs":[],"sbpl":24,"sbpr":24,"scd":10,"sce":5,"stok":"7UqfdDr4nbKtZNfvytsBW8kPB9E","uhde":false})']

How can I get the value of the "pq" key from this output list? Please help.

Answer 1

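Use a regular expression on the raw HTML; there is no need to go through the list:

import re

....
html_doc = r.text
# Capture the text after "pq":" up to the next double quote.
match = re.search(r'"pq":"([^"]+)', html_doc)
output = match.group(1) if match else None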
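
If you would rather not depend on the exact text around "pq", another option is to pull out the object passed to google.ac.c(...) and parse it as JSON. The following is only a sketch: the helper name extract_pq is made up for illustration, the sample string is a shortened copy of the output in the question, and it assumes the argument is valid JSON, which may not hold if Google changes the page.

```python
import json
import re

# Shortened copy of the script text shown in the question's output.
script_text = (
    'google.ac&&google.ac.c({"agen":true,"cgen":true,'
    '"client":"heirloom-serp","ovr":{},'
    '"pq":"Johnny walker red label","refpd":true,"rfs":[],"uhde":false})'
)


def extract_pq(text):
    """Return the "pq" value from a google.ac.c({...}) call, or None.

    Hypothetical helper: assumes the call's argument is valid JSON.
    """
    # Grab everything between "google.ac.c(" and the final "})".
    match = re.search(r'google\.ac\.c\((\{.*\})\)', text, re.DOTALL)
    if match is None:
        return None
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    return payload.get("pq")


suggestion = extract_pq(script_text)
print(suggestion)
```

With the list from the question you could call extract_pq on each entry of mylist and keep the first result that is not None. The function returns None rather than raising when the call or the key is missing.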

