How to search for jobs using semantic search and an ontology in Python?

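I am building a simple online semantic search engine for finding jobs. I found a simple program that reads a local OWL file, but I want it to work online, using the Semantic Web and linked data to find jobs and employers.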

import re

from owlready2 import *


class SparqlQueries:
    def __init__(self):
        my_world = World()
        # Path to the OWL file is given here
        my_world.get_ontology("file://ExampleOntolohy.owl").load()
        # The reasoner is started and synchronized here
        sync_reasoner(my_world)
        self.graph = my_world.as_rdflib_graph()

    def search(self):
        # Search query is given here
        # Base URL of your ontology has to be given here
        query = "base <http://www.semanticweb.org/ExampleOntology> " \
                "SELECT ?s ?p ?o " \
                "WHERE { " \
                "?s ?p ?o . " \
                "}"

        # Query is being run
        results_list = self.graph.query(query)

        # Build a JSON-like list, keeping only the fragment after '#'
        response = []
        for item in results_list:
            s = re.sub(r'.*#', "", str(item['s'].toPython()))
            p = re.sub(r'.*#', "", str(item['p'].toPython()))
            o = re.sub(r'.*#', "", str(item['o'].toPython()))
            response.append({'s': s, 'p': p, 'o': o})

        print(response)  # just to show the output
        return response


run_query = SparqlQueries()
run_query.search()

I tried using RDFlib as mentioned in the documentation:

import rdflib

g = rdflib.Graph()
# Graph.load() was removed in rdflib 6; Graph.parse() fetches and parses the resource
g.parse('http://dbpedia.org/resource/Semantic_Web')

for s, p, o in g:
    print(s, p, o)

How should I get data and links about jobs? Or about employers or companies?

How should I write the OWL file?

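schema.org has a JobPosting specification. If you are lucky, you will find some sites that use it and use it well. Depending on how they do it (see the linked documentation), you will be able to scrape it into your own graph. That will at least save you from writing the ontology.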

I only looked at one job site, Monster.com, and they were kind enough to put Schema lists in JSON-LD on their collection pages (line 1187 in the linked source), as well as Schema JobPostings on the linked pages (line 261 in the linked source).
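To make the idea of JSON-LD embedded in a page concrete, here is a minimal sketch using only the standard library. It assumes the page carries its schema.org data in a `<script type="application/ld+json">` block holding a single JSON object. The sample HTML, the `JsonLdExtractor` class, the `extract_job_postings` helper and the employer name are all made up for illustration and are not taken from Monster.com.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.documents.append(json.loads("".join(self._buffer)))


def extract_job_postings(html):
    """Return the JSON-LD objects in `html` whose @type is JobPosting."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [doc for doc in parser.documents
            if isinstance(doc, dict) and doc.get("@type") == "JobPosting"]


# Hypothetical page content, for illustration only
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "JobPosting",
 "title": "Software Engineer",
 "hiringOrganization": {"@type": "Organization", "name": "Acme Corp"}}
</script>
</head><body></body></html>
"""

postings = extract_job_postings(SAMPLE_HTML)
for posting in postings:
    print(posting["title"], "-", posting["hiringOrganization"]["name"])
```

Real pages may wrap several objects in a list or an `@graph`, which this sketch does not handle.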

If you have both rdflib and rdflib-jsonld installed via pip, loading such a page into a graph is as simple as:

from rdflib import Graph
g = Graph()
g.parse("https://job-openings.monster.com/software-engineer-principal-software-engineer-northridge-ca-us-northrop-grumman/0d2caa9e-3b3c-46fa-94d1-cddc75d9ae27")

# Demo
print(len(g))
for s, p, o in g:
    print(s, p, o)