Gremlin io step from URL

https://www.compose.com/articles/importing-graphs-into-janusgraph/ shows how to import data into JanusGraph.

Because I could not get the JanusGraph docker image to work on localhost on my Mac, I tried connecting to a remote Ubuntu machine, on which I run JanusGraph with:

docker run -it -p 8182:8182 janusgraph/janusgraph

I then wanted to load data with gremlin-python, which failed. I tried the following to get a simple reproducible example:

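from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.structure.graph import Graph

server= ...
port=8182
graph = Graph()
janusgraphurl='ws://%s:%s/gremlin' % (server,port)
connection = DriverRemoteConnection(janusgraphurl, 'g')
g = graph.traversal().withRemote(connection)
dataurl="https://github.com/krlawrence/graph/raw/master/sample-data/air-routes.graphml"
# this is the call that fails with the error shown below
g.io(dataurl).read().iterate()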

I get the following error:

 File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/gremlin_python/driver/protocol.py", line 110, in data_received
    raise GremlinServerError(message["status"])
gremlin_python.driver.protocol.GremlinServerError: 500: https://github.com/krlawrence/graph/raw/master/sample-data/air-routes.graphml does not exist

although the link https://github.com/krlawrence/graph/raw/master/sample-data/air-routes.graphml seems to work fine.

What is the correct way to load graph data from a URL using the Python Gremlin language variant?

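Kelvin Lawrence is right. The io() step reads the file on the server side, so the path has to exist inside the docker container rather than on the client.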

Starting the container with bash:

docker run -it janusgraph/janusgraph /bin/bash

I can check which files are available:

root@8542ed1b8232:/opt/janusgraph# ls data
grateful-dead-janusgraph-schema.groovy  tinkerpop-crew-typed.json
grateful-dead-typed.json        tinkerpop-crew-v2d0-typed.json
grateful-dead-v2d0-typed.json       tinkerpop-crew-v2d0.json
grateful-dead-v2d0.json         tinkerpop-crew.json
grateful-dead.json          tinkerpop-crew.kryo
grateful-dead.kryo          tinkerpop-modern-typed.json
grateful-dead.txt           tinkerpop-modern-v2d0-typed.json
grateful-dead.xml           tinkerpop-modern-v2d0.json
script-input-grateful-dead.groovy   tinkerpop-modern.json
script-input-tinkerpop.groovy       tinkerpop-modern.kryo
tinkerpop-classic-typed.json        tinkerpop-modern.xml
tinkerpop-classic-v2d0-typed.json   tinkerpop-sink-typed.json
tinkerpop-classic-v2d0.json     tinkerpop-sink-v2d0-typed.json
tinkerpop-classic.json          tinkerpop-sink-v2d0.json
tinkerpop-classic.kryo          tinkerpop-sink.json
tinkerpop-classic.txt           tinkerpop-sink.kryo
tinkerpop-classic.xml

For a test I chose tinkerpop-modern.xml:

    file="data/tinkerpop-modern.xml"
    g.io(file).read().iterate()
    vCount=g.V().count().next()
    print ("%s has %d vertices" % (file,vCount))
    assert vCount==6

That works. Thanks!
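The file that worked above ends in .xml, while the URL in the question ends in .graphml. As far as I know, the io() step only detects the reader from the extensions .xml (GraphML), .json (GraphSON) and .kryo (Gryo), so a .graphml file would probably need an explicit reader even once it is on the server. A minimal sketch of picking the reader name from the file name; the function name `reader_for` and the extra `.graphml` mapping are my own assumptions, not part of gremlin-python:

```python
import os

# Extensions the io() step is documented to detect on its own,
# plus ".graphml" as an extra mapping assumed here for convenience.
READER_BY_EXTENSION = {
    ".xml": "graphml",
    ".graphml": "graphml",
    ".json": "graphson",
    ".kryo": "gryo",
}


def reader_for(path):
    """Return the reader name for path, or raise if the extension is unknown."""
    ext = os.path.splitext(path)[1].lower()
    try:
        return READER_BY_EXTENSION[ext]
    except KeyError:
        raise ValueError("no known graph reader for extension %r" % ext) from None
```

With gremlin-python 3.4 or later the result could then be passed along the lines of `g.io(path).with_(IO.reader, reader_for(path)).read().iterate()`, with `IO` imported from `gremlin_python.process.traversal`. This assumes the constants `IO.graphml`, `IO.graphson` and `IO.gryo` carry the same string values as used here; I have not checked this against the JanusGraph image above.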

To make "external" data available to the docker image, the --mount option can be used:

docker run -it -p 8182:8182 --mount src=<path to graphdata>,target=/graphdata,type=bind janusgraph/janusgraph
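With a bind mount like this in place, data from a URL can be downloaded into the mounted directory first, and the path as seen inside the container is then what gets passed to io(). A minimal sketch, assuming the client can write to the directory that is mounted at /graphdata; the function name `fetch_to_share` and its parameters are hypothetical:

```python
import os
import posixpath
import shutil
import urllib.request
from urllib.parse import urlparse


def fetch_to_share(url, local_share_dir, server_share_dir="/graphdata"):
    """Download url into local_share_dir and return the path as the server sees it."""
    # Derive the file name from the URL path, e.g. "air-routes.graphml".
    filename = posixpath.basename(urlparse(url).path)
    if not filename:
        raise ValueError("cannot derive a file name from %r" % url)
    local_path = os.path.join(local_share_dir, filename)
    # Stream the response to disk so a large graph is not held in memory.
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)
    # The container is Linux, so the server-side path uses forward slashes.
    return posixpath.join(server_share_dir, filename)
```

The returned string would be used as `g.io(fetch_to_share(dataurl, "<path to graphdata>")).read().iterate()`. The file keeps the name it has in the URL, so a .graphml file may still need to be renamed to .xml or read with an explicit reader.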

The following helper class helps with sharing files:

RemoteGremlin

'''
Created on 2020-03-30

@author: wf
'''
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.structure.graph import Graph
from shutil import copyfile
import os

class RemoteGremlin(object):
    '''
    helper for remote gremlin connections
    '''

    def __init__(self, server, port=8182):
        '''
        construct me with the given server and port
        '''
        self.server=server
        self.port=port    

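    def sharepoint(self,sharepoint,sharepath):
        '''
        set up the sharepoint: the shared directory as seen by this client
        and the same directory as seen by the server (both ending with "/")
        '''
        # stored under other names so the attributes do not shadow this method
        self.localshare=sharepoint
        self.servershare=sharepath


    def share(self,file):
        '''
        share the given file and return the path as seen by the server
        '''
        fbase=os.path.basename(file)
        # copy the file into the shared directory on the client side
        copyfile(file,self.localshare+fbase)
        return self.servershare+fbase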

    def open(self):
        '''
        open the remote connection
        '''
        self.graph = Graph()
        self.url='ws://%s:%s/gremlin' % (self.server,self.port)
        self.connection = DriverRemoteConnection(self.url, 'g')    
        # The connection should be closed on shut down to close open connections with connection.close()
        self.g = self.graph.traversal().withRemote(self.connection)

    def close(self):
        '''
        close the remote connection
        '''
        self.connection.close()

Python unit test:

'''
Created on 2020-03-28

@author: wf
'''
import unittest
from tp.gremlin import RemoteGremlin

class JanusGraphTest(unittest.TestCase):
    '''
    test access to a janus graph docker instance via the RemoteGremlin helper class
    '''

    def setUp(self):
        pass


    def tearDown(self):
        pass

    def test_loadGraph(self):
        # change to your server
        rg=RemoteGremlin("capri.bitplan.com")
        rg.open()
        try:
            # change to your shared path
            rg.sharepoint("/Volumes/bitplan/user/wf/graphdata/","/graphdata/")
            g=rg.g
            graphmlFile="air-routes-small.xml"
            shared=rg.share(graphmlFile)
            # drop the existing content of the graph
            g.V().drop().iterate()
            # read the content from the air routes example
            g.io(shared).read().iterate()
            vCount=g.V().count().next()
            print ("%s has %d vertices" % (shared,vCount))
            self.assertEqual(47,vCount)
        finally:
            # close the connection even if an assertion fails
            rg.close()


if __name__ == "__main__":
    #import sys;sys.argv = ['', 'Test.testName']
    unittest.main()