How to integrate Wikidata queries in Python

I am currently using the Wikidata Query Service to run my Wikidata queries.

For example, one of my Wikidata queries looks like this.

SELECT ?sLabel {
    SERVICE wikibase:mwapi {
        bd:serviceParam wikibase:api "EntitySearch".
        bd:serviceParam wikibase:endpoint "www.wikidata.org".
        bd:serviceParam mwapi:search "natural language processing".
        bd:serviceParam mwapi:language "en".
        ?item wikibase:apiOutputItem mwapi:item.
        ?num wikibase:apiOrdinal true.
    }
    ?s wdt:P279|wdt:P31 ?item .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
ORDER BY ?num
LIMIT 10

I would like to know whether these queries can be used from a Python program. If so, how do we integrate them into Python?

I am happy to provide more details if needed.

SPARQLWrapper can handle this. More information is available in its documentation.
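A minimal sketch of the SPARQLWrapper route, assuming the package is installed (`pip install sparqlwrapper`). The query is the one from the question; the `USER_AGENT` string and the helper names `run_query` and `labels` are made up for illustration.

```python
ENDPOINT = "https://query.wikidata.org/sparql"
# Hypothetical identifier; replace it with something describing your own tool.
USER_AGENT = "wikidata-python-example/0.1"

QUERY = """
SELECT ?sLabel {
    SERVICE wikibase:mwapi {
        bd:serviceParam wikibase:api "EntitySearch".
        bd:serviceParam wikibase:endpoint "www.wikidata.org".
        bd:serviceParam mwapi:search "natural language processing".
        bd:serviceParam mwapi:language "en".
        ?item wikibase:apiOutputItem mwapi:item.
        ?num wikibase:apiOrdinal true.
    }
    ?s wdt:P279|wdt:P31 ?item .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
ORDER BY ?num
LIMIT 10
"""


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the parsed SPARQL JSON result as a dict."""
    # Imported lazily so that labels() below works without the dependency.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper(endpoint, agent=USER_AGENT)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()


def labels(results, var="sLabel"):
    """Collect the values bound to one variable, skipping unbound rows."""
    return [
        row[var]["value"]
        for row in results["results"]["bindings"]
        if var in row
    ]
```

Calling `labels(run_query(QUERY))` would then return the labels as a plain list of strings, at most ten because of the `LIMIT`.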

If you want to do this without a SPARQL-specific library:

import requests

url = 'https://query.wikidata.org/sparql'
query = '''
SELECT ?item ?itemLabel ?linkcount WHERE {
    ?item wdt:P31/wdt:P279* wd:Q35666 .
    ?item wikibase:sitelinks ?linkcount .
    FILTER (?linkcount >= 1) .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
GROUP BY ?item ?itemLabel ?linkcount
ORDER BY DESC(?linkcount)
'''
# Ask for JSON output and let requests URL-encode the query.
r = requests.get(url, params={'format': 'json', 'query': query})
r.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
data = r.json()
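The `data` dictionary follows the SPARQL JSON results layout: each row sits under `results.bindings`, and every value is wrapped in an object with `type`, `value` and, for typed literals, `datatype`. A small sketch of unwrapping that into plain Python rows; the helper name `to_rows` and the set of converted datatypes are choices made for this example only.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Only a few numeric datatypes are converted here; everything else stays a string.
CONVERTERS = {
    XSD + "integer": int,
    XSD + "decimal": float,
    XSD + "double": float,
}


def to_rows(data):
    """Turn a SPARQL JSON result into a list of plain dicts."""
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        row = {}
        for var in variables:
            cell = binding.get(var)
            if cell is None:
                row[var] = None  # variable unbound in this row
                continue
            convert = CONVERTERS.get(cell.get("datatype"), str)
            row[var] = convert(cell["value"])
        rows.append(row)
    return rows
```

With the request above, `to_rows(data)` would give one dict per result row, keyed by `item`, `itemLabel` and `linkcount`.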

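As a committer of http://wiki.bitplan.com/index.php/PyLoDStorage, I recommend this library because it wraps SPARQLWrapper in a way that gives you proper Python types right away. You can therefore feed the data into other libraries such as pandas, or use CSV, JSON, XML, SQL, or whatever you need next.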
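Once the result is a list of dicts, handing it to other tools is mostly mechanical. The sketch below writes such a list to CSV using only the standard library; it does not use pyLoDStorage itself, and `rows_to_csv` is a name invented for illustration.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a list of dicts to CSV text, using the union of keys as header."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # restval fills in columns that a particular row does not have.
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The same list of dicts can also be passed to `pandas.DataFrame(rows)` if pandas is available.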

I modified and extended your query slightly:

# WF 2021-10-30
# see 
SELECT ?s ?sLabel ?item ?itemLabel ?sourceCode ?webSite ?stackexchangeTag  {
    SERVICE wikibase:mwapi {
        bd:serviceParam wikibase:api "EntitySearch".
        bd:serviceParam wikibase:endpoint "www.wikidata.org".
        bd:serviceParam mwapi:search "natural language processing".
        bd:serviceParam mwapi:language "en".
        ?item wikibase:apiOutputItem mwapi:item.
        ?num wikibase:apiOrdinal true.
    }
    ?s wdt:P279|wdt:P31 ?item .
    OPTIONAL { 
      ?s wdt:P1324 ?sourceCode.
    }
    OPTIONAL {    
      ?s wdt:P856 ?webSite.
    }
    OPTIONAL {    
      ?s wdt:P1482 ?stackexchangeTag.
    }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
ORDER BY ?itemLabel ?sLabel


It is then used as a showcase for the pyLoDStorage library:

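# imports needed by the test method below
import copy
from lodstorage.query import Query
from lodstorage.sparql import SPARQL

def testStackoverflow55961615Query(self):
        '''
        Run the extended EntitySearch query and document the result
        in several table formats.
        '''
        endpoint="https://query.wikidata.org/sparql"
        wd=SPARQL(endpoint)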
        queryString="""SELECT ?s ?sLabel ?item ?itemLabel ?sourceCode ?webSite ?stackexchangeTag  {
    SERVICE wikibase:mwapi {
        bd:serviceParam wikibase:api "EntitySearch".
        bd:serviceParam wikibase:endpoint "www.wikidata.org".
        bd:serviceParam mwapi:search "natural language processing".
        bd:serviceParam mwapi:language "en".
        ?item wikibase:apiOutputItem mwapi:item.
        ?num wikibase:apiOrdinal true.
    }
    ?s wdt:P279|wdt:P31 ?item .
    OPTIONAL { 
      ?s wdt:P1324 ?sourceCode.
    }
    OPTIONAL {    
      ?s wdt:P856 ?webSite.
    }
    OPTIONAL {    
      ?s wdt:P1482 ?stackexchangeTag.
    }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
ORDER BY ?itemLabel ?sLabel"""
        qlod=wd.queryAsListOfDicts(queryString,fixNone=True)
        query=Query(name="EntitySearch",query=queryString,lang='sparql')
        debug=self.debug
        for tablefmt in ["github","mediawiki","latex"]:
            lod=copy.deepcopy(qlod)
            qdoc=query.documentQueryResult(lod,tablefmt=tablefmt)
            if debug:
                print(qdoc)

The output can be pasted here directly, as shown below. It is also available in any format supported by the tabulate library, such as mediawiki, latex, and others.

EntitySearch

Query

SELECT ?s ?sLabel ?item ?itemLabel ?sourceCode ?webSite ?stackexchangeTag  {
    SERVICE wikibase:mwapi {
        bd:serviceParam wikibase:api "EntitySearch".
        bd:serviceParam wikibase:endpoint "www.wikidata.org".
        bd:serviceParam mwapi:search "natural language processing".
        bd:serviceParam mwapi:language "en".
        ?item wikibase:apiOutputItem mwapi:item.
        ?num wikibase:apiOrdinal true.
    }
    ?s wdt:P279|wdt:P31 ?item .
    OPTIONAL { 
      ?s wdt:P1324 ?sourceCode.
    }
    OPTIONAL {    
      ?s wdt:P856 ?webSite.
    }
    OPTIONAL {    
      ?s wdt:P1482 ?stackexchangeTag.
    }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
ORDER BY ?itemLabel ?sLabel

Result

| s | sLabel | item | itemLabel | stackexchangeTag | sourceCode | webSite |
|---|--------|------|-----------|------------------|------------|---------|
| http://www.wikidata.org/entity/Q24841819 | Q24841819 | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q1898737 | Morphological analysis | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q6913444 | Morphological parsing | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q104415642 | Wikification | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q51751772 | biomedical natural language processing | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q46346005 | computer-based question classification | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q1513879 | natural language generation | http://www.wikidata.org/entity/Q30642 | natural language processing | https://stackoverflow.com/tags/nlg | | |
| http://www.wikidata.org/entity/Q1078276 | natural language understanding | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q1271424 | part-of-speech tagging | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q105171570 | speaker verification | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q189436 | speech recognition | http://www.wikidata.org/entity/Q30642 | natural language processing | https://ai.stackexchange.com/tags/speech-recognition | | |
| http://www.wikidata.org/entity/Q189436 | speech recognition | http://www.wikidata.org/entity/Q30642 | natural language processing | https://cs.stackexchange.com/tags/speech-recognition | | |
| http://www.wikidata.org/entity/Q189436 | speech recognition | http://www.wikidata.org/entity/Q30642 | natural language processing | https://dsp.stackexchange.com/tags/speech-recognition | | |
| http://www.wikidata.org/entity/Q189436 | speech recognition | http://www.wikidata.org/entity/Q30642 | natural language processing | https://linguistics.stackexchange.com/tags/speech-recognition | | |
| http://www.wikidata.org/entity/Q189436 | speech recognition | http://www.wikidata.org/entity/Q30642 | natural language processing | https://stackoverflow.com/tags/speech-recognition | | |
| http://www.wikidata.org/entity/Q189436 | speech recognition | http://www.wikidata.org/entity/Q30642 | natural language processing | https://unix.stackexchange.com/tags/speech-recognition | | |
| http://www.wikidata.org/entity/Q1948408 | text segmentation | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q3484781 | text simplification | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q2438971 | tokenization | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q105330879 | word segmentation | http://www.wikidata.org/entity/Q30642 | natural language processing | | | |
| http://www.wikidata.org/entity/Q7095836 | Apache OpenNLP | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | https://stackoverflow.com/tags/opennlp | http://svn.apache.org/repos/asf/opennlp | http://opennlp.apache.org |
| http://www.wikidata.org/entity/Q7095836 | Apache OpenNLP | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | https://stackoverflow.com/tags/opennlp | https://github.com/apache/opennlp | http://opennlp.apache.org |
| http://www.wikidata.org/entity/Q7095836 | Apache OpenNLP | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | https://stackoverflow.com/tags/opennlp | https://github.com/apache/opennlp.git | http://opennlp.apache.org |
| http://www.wikidata.org/entity/Q7095836 | Apache OpenNLP | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | https://stackoverflow.com/tags/opennlp | http://svn.apache.org/repos/asf/opennlp | https://opennlp.apache.org/ |
| http://www.wikidata.org/entity/Q7095836 | Apache OpenNLP | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | https://stackoverflow.com/tags/opennlp | https://github.com/apache/opennlp | https://opennlp.apache.org/ |
| http://www.wikidata.org/entity/Q7095836 | Apache OpenNLP | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | https://stackoverflow.com/tags/opennlp | https://github.com/apache/opennlp.git | https://opennlp.apache.org/ |
| http://www.wikidata.org/entity/Q975453 | General Architecture for Text Engineering | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | | https://github.com/GateNLP | http://gate.ac.uk |
| http://www.wikidata.org/entity/Q5533567 | Gensim | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | | https://github.com/RaRe-Technologies/gensim | https://radimrehurek.com/gensim/ |
| http://www.wikidata.org/entity/Q48688669 | LingPipe | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | https://stackoverflow.com/tags/lingpipe | | http://alias-i.com/lingpipe/ |
| http://www.wikidata.org/entity/Q6553971 | LinguaStream | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | | | |
| http://www.wikidata.org/entity/Q5396532 | MALLET | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | | | http://mallet.cs.umass.edu/ |
| http://www.wikidata.org/entity/Q6906675 | MontyLingua | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | | | http://web.media.mit.edu/~hugo/montylingua/ |
| http://www.wikidata.org/entity/Q3630063 | Moses | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | | https://github.com/moses-smt/mosesdecoder | http://www.statmt.org/moses |
| http://www.wikidata.org/entity/Q1635410 | Natural Language Toolkit | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | https://stackoverflow.com/tags/nltk | https://github.com/nltk/nltk | http://nltk.org/ |
| http://www.wikidata.org/entity/Q17071870 | NiuTrans | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | | | http://www.nlplab.com/NiuPlan/NiuTrans.html |
| http://www.wikidata.org/entity/Q47405152 | Pattern | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | | https://github.com/clips/pattern | http://www.clips.ua.ac.be/pages/pattern |
| http://www.wikidata.org/entity/Q32998961 | Stanford CoreNLP | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | | | https://stanfordnlp.github.io/CoreNLP/ |
| http://www.wikidata.org/entity/Q104840874 | Stanza | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | | https://stanfordnlp.github.io/stanza/ | https://stanfordnlp.github.io/stanza/ |
| http://www.wikidata.org/entity/Q2593410 | WordSmith | http://www.wikidata.org/entity/Q21129801 | natural language processing toolkit | | | http://www.lexically.net |
test testStackoverflow55961615Query, debug=False took   0.5 s
----------------------------------------------------------------------
Ran 1 test in 0.503s

OK
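In the result above, Apache OpenNLP and speech recognition each occupy several rows, because every combination of optional values yields its own row. One way to fold these back into a single record per entity on the Python side, assuming the rows are dicts keyed by the query's variable names (`merge_rows` is a hypothetical helper, not part of pyLoDStorage):

```python
def merge_rows(rows, key="s",
               multi=("sourceCode", "webSite", "stackexchangeTag")):
    """Merge rows sharing the same key; multi-valued columns become sorted lists."""
    merged = {}
    for row in rows:
        entry = merged.get(row[key])
        if entry is None:
            # First time this entity shows up: copy the single-valued columns.
            entry = {k: v for k, v in row.items() if k not in multi}
            for column in multi:
                entry[column] = set()
            merged[row[key]] = entry
        for column in multi:
            value = row.get(column)
            if value:  # skip None and empty strings left by unbound OPTIONALs
                entry[column].add(value)
    # Replace the temporary sets with sorted lists for stable output.
    for entry in merged.values():
        for column in multi:
            entry[column] = sorted(entry[column])
    return list(merged.values())
```

Applied to the list of dicts behind the table, this would leave one record for Apache OpenNLP with three source code URLs and two website URLs instead of six rows.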