How can I scrape data from a JSON-LD block?

I'm trying to get the coordinates ('latitude' and 'longitude') from this JSON-LD code.

> <script type="application/ld+json">
> {"@context":"http://schema.org","@graph":[
>   {"@type":"Place","address":
>       {"@type":"PostalAddress","streetAddress":"XX, XX"},"geo":
>       {"@type":"GeoCoordinates","latitude":50.08872,"longitude":20.0297}}]}
> </script>

The closest I've got is:

req = requests.get(link)
soup = BeautifulSoup(req.text, 'html.parser')
text_ = json.loads("".join(soup.find("script", {"type":"application/ld+json"}).contents))

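But even this script gives me an earlier JSON-LD block (the first one in the full HTML), not the one shown above.

I'd appreciate it even if I could only get the JSON-LD block as a string.

Thanks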

import json
from bs4 import BeautifulSoup

data = """<script type="application/ld+json">
{"@context":"http://schema.org","@graph":[
  {"@type":"Place","address":
      {"@type":"PostalAddress","streetAddress":"XX, XX"},"geo":
      {"@type":"GeoCoordinates","latitude":50.08872,"longitude":20.0297}}]}
</script>"""


# Parse the HTML and take the text inside the first <script> tag.
soup = BeautifulSoup(data, 'html.parser')
goal = soup.select_one("script").string
# Decode the JSON-LD text into a Python dict.
match = json.loads(goal)
print(type(match))
print(match)

Output:

<class 'dict'>
{'@context': 'http://schema.org', '@graph': [{'@type': 'Place', 'address': {'@type': 'PostalAddress', 'streetAddress': 'XX, XX'}, 'geo': {'@type': 'GeoCoordinates', 'latitude': 50.08872, 'longitude': 20.0297}}]}
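
If the real page has more than one JSON-LD block, picking the first script tag will likely return the wrong one, which seems to be what happened in the question. Here is a rough sketch of one way around that: loop over every `application/ld+json` script and stop at the first node that has a `geo` entry. The sample `html` (including the extra `WebSite` block) and the helper name `find_coordinates` are made up for illustration, and I'm assuming the coordinates always sit under `geo` either at the top level or inside `@graph`.

```python
import json
from bs4 import BeautifulSoup

# Hypothetical page with two JSON-LD blocks; only the second one holds the Place.
html = """<html><head>
<script type="application/ld+json">
{"@context":"http://schema.org","@type":"WebSite","name":"Example"}
</script>
<script type="application/ld+json">
{"@context":"http://schema.org","@graph":[
  {"@type":"Place","address":
      {"@type":"PostalAddress","streetAddress":"XX, XX"},"geo":
      {"@type":"GeoCoordinates","latitude":50.08872,"longitude":20.0297}}]}
</script>
</head></html>"""


def find_coordinates(html_text):
    """Return (latitude, longitude) from the first JSON-LD node with a 'geo' entry, else None."""
    soup = BeautifulSoup(html_text, "html.parser")
    for script in soup.find_all("script", type="application/ld+json"):
        if not script.string:
            continue  # empty script tag, nothing to parse
        try:
            doc = json.loads(script.string)
        except json.JSONDecodeError:
            continue  # skip blocks that are not valid JSON
        # The nodes may be a top-level list, sit under "@graph", or be the block itself.
        if isinstance(doc, list):
            nodes = doc
        elif isinstance(doc, dict):
            nodes = doc.get("@graph", [doc])
        else:
            nodes = []
        for node in nodes:
            geo = node.get("geo") if isinstance(node, dict) else None
            if isinstance(geo, dict) and "latitude" in geo and "longitude" in geo:
                return geo["latitude"], geo["longitude"]
    return None


coords = find_coordinates(html)
print(coords)  # (50.08872, 20.0297)
```

With a live page you would pass `requests.get(link).text` instead of the sample string. If you only need the raw block as a string, `script.string` inside the loop is already that.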