How can I get a Wikipedia article in a specific language using Python?

I want to get a Wikipedia article in a specific language.

I tried the following code:

import requests
from bs4 import BeautifulSoup

URL = "https://en.wikipedia.org/w/api.php"
PARAMS = {
        "action": "query",
        "titles": "Python",
        "prop": "langlinks",
        "lllang": "de",
        "format": "json"
        }
results = requests.get(url=URL, params=PARAMS)
soup = BeautifulSoup(results.content, 'html.parser')
print(soup.prettify())

But I don't get the whole article. This is all I got:

{"batchcomplete":"","query":{"pages":{"46332325":{"pageid":46332325,"ns":0,"title":"Python","langlinks":[{"lang":"de","*":"Python"}]}}}}

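Can you help me understand what I'm doing wrong?

Your query uses prop=langlinks, which returns only the title of the linked German page, not its content. To get the German version of the article itself, change the URL to de.wikipedia.org and request the page there.

For example: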

import requests
from bs4 import BeautifulSoup

URL = "https://de.wikipedia.org/w/api.php"  # <-- note the de.
PARAMS = {
        "action": "parse",
        "page": "Python (Programmiersprache)",
        "prop": "text",
        "section": 0,  # lead section only; omit this key to get the whole article
        "format": "json"
        }

results = requests.get(url=URL, params=PARAMS).json()
soup = BeautifulSoup(results['parse']['text']['*'], 'html.parser')
print(soup.prettify())

Prints:

<div class="mw-parser-output">
 <table cellspacing="5" class="float-right infobox toccolours toptextcells" style="font-size:90%; margin-top:0; width:21em;">
  <tbody>
   <tr>
    <th class="hintergrundfarbe6" colspan="2" style="font-size:larger;">
     Python
    </th>
   </tr>
   <tr>

... and so on.

To get only the raw wiki markup (templates/tags) instead of rendered HTML, you can do this:

import requests

URL = "https://de.wikipedia.org/w/api.php"
PARAMS = {
        "action": "query",
        "titles": "Python (Programmiersprache)",
        "prop": "revisions",
        "rvprop": "content",  # return the wikitext of the revision
        "rvsection": 0,       # lead section only
        "format": "json"
        }

results = requests.get(url=URL, params=PARAMS).json()
print(results)
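
The response printed above is a nested dictionary, so the wikitext still has to be dug out of it. Below is a minimal sketch of one way to do that. The helper name `extract_wikitext` is hypothetical, and the `sample` dictionary is a hand-written stand-in shaped like the API response (its wikitext is a placeholder, not real page content). The sketch assumes the JSON layout of the default `format=json` output, where the content sits under the `"*"` key, either directly on the revision or under `slots` / `main` when `rvslots` is passed.

```python
def extract_wikitext(response):
    """Return {title: wikitext} from an action=query&prop=revisions response."""
    texts = {}
    for page in response.get("query", {}).get("pages", {}).values():
        revisions = page.get("revisions")
        if not revisions:
            # Missing page, or no revision content was returned for it.
            continue
        rev = revisions[0]
        # When "rvslots": "main" is passed, the content is nested under slots.
        if "slots" in rev:
            rev = rev["slots"]["main"]
        texts[page["title"]] = rev["*"]
    return texts


# Hand-written sample in the shape of the response; the wikitext is a placeholder.
sample = {
    "query": {
        "pages": {
            "1": {
                "pageid": 1,
                "ns": 0,
                "title": "Python (Programmiersprache)",
                "revisions": [{"*": "{{Infobox ...}}"}],
            }
        }
    }
}

print(extract_wikitext(sample))
```

With the real request above, calling `extract_wikitext(results)` should give a dictionary with one entry for the requested title.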

If you have the title of a Wikipedia page in one language and want to know its title in another language, you can use the "langlinks" property, like this:

https://en.wikipedia.org/w/api.php?action=query&prop=langlinks&titles=Python+(programming+language)&lllang=de

Note that "lllang" is set to "de".

This gives you:

{
    "batchcomplete": "",
    "query": {
        "pages": {
            "23862": {
                "pageid": 23862,
                "ns": 0,
                "title": "Python (programming language)",
                "langlinks": [
                    {
                        "lang": "de",
                        "*": "Python (Programmiersprache)"
                    }
                ]
            }
        }
    }
}

See here for more information: https://www.mediawiki.org/wiki/API:Langlinks
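
The two steps from the answers above (look up the title in the target language, then request the page from that language's wiki) can be combined. The following is only a sketch: the helper names `api_url`, `langlink_title`, and `fetch_article_html` are hypothetical, and it assumes every language edition is reachable at `https://<lang>.wikipedia.org/w/api.php`, as in the examples above. The `sample` dictionary is the langlinks response shown above.

```python
def api_url(lang):
    """Build the API endpoint for a language edition, e.g. 'de' or 'en'."""
    return f"https://{lang}.wikipedia.org/w/api.php"


def langlink_title(response, lang):
    """Return the title in `lang` from a prop=langlinks response, or None."""
    for page in response.get("query", {}).get("pages", {}).values():
        for link in page.get("langlinks", []):
            if link.get("lang") == lang:
                return link["*"]
    return None


def fetch_article_html(title, source_lang, target_lang):
    """Look up `title` in the target language and return the page HTML."""
    # Imported here so the helpers above work without requests installed.
    import requests

    lookup = requests.get(
        api_url(source_lang),
        params={
            "action": "query",
            "titles": title,
            "prop": "langlinks",
            "lllang": target_lang,
            "format": "json",
        },
        timeout=10,
    ).json()
    target_title = langlink_title(lookup, target_lang)
    if target_title is None:
        return None  # no article linked in the target language

    parsed = requests.get(
        api_url(target_lang),
        params={
            "action": "parse",
            "page": target_title,
            "prop": "text",
            "format": "json",
        },
        timeout=10,
    ).json()
    return parsed["parse"]["text"]["*"]


# The langlinks response shown above, used to check the parsing step offline.
sample = {
    "batchcomplete": "",
    "query": {
        "pages": {
            "23862": {
                "pageid": 23862,
                "ns": 0,
                "title": "Python (programming language)",
                "langlinks": [{"lang": "de", "*": "Python (Programmiersprache)"}],
            }
        }
    },
}

print(langlink_title(sample, "de"))  # Python (Programmiersprache)
```

A call such as `fetch_article_html("Python (programming language)", "en", "de")` would then return the rendered HTML of the German article, which can be passed to BeautifulSoup as in the first answer.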