How should I convert this Neo4j Cypher/APOC load to neo4j-admin import?
I am processing email data and parsing it with Python, which produces a CSV every hour. From that CSV I run 5 separate LOAD CSV commands to create/update nodes and relationships. They are NO ATTACHMENT OR LINK, URL ONLY, ATTACHMENT ONLY, URL AND ATTACHMENT, and Attachment to Attachment Name, FileName Node.
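I would like to automate these imports with a batch job. Since I'm familiar with it, I'd prefer to do everything in Python, but I have been searching Stack Overflow and elsewhere, and people recommend neo4j-admin import. From the docs, it looks very different from what I'm doing, since it works with --nodes and --relationships. Can anyone show me how to convert the CYPHER/APOC LOAD CSV example I created below to neo4j-admin import?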
// URL AND ATTACHMENT
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM ("file:///sessions/4_hour_parsed_and_ready.csv") AS row
MERGE (a:Sender { name: row.From, domain: row.Sender_Sub_Fld})
MERGE (b:Link { name: row.Url_Sub_Fld, topLevelDomain: row.Url_Tld, htmlEncodedMessage: row.HTML_Encoded})
MERGE (c:Attachment { name: row.FileHash, fileExtension: row.FileName_Ext, containsMultipleExtensions: row.MultipleExtensions})
MERGE (d:Recipient { name: row.To})
WITH a,b,c,d,row
WHERE NOT row.Url_Tld = "false" AND NOT row.FileHash = "false"
CALL apoc.merge.relationship(a, row.Outcome, {}, {}, b) YIELD rel as rel1
CALL apoc.merge.relationship(b, row.Outcome2, {}, {}, d) YIELD rel as rel2
CALL apoc.merge.relationship(a, row.Outcome, {}, {}, c) YIELD rel as rel3
CALL apoc.merge.relationship(c, row.Outcome2, {}, {}, d) YIELD rel as rel4
RETURN a,b,c,d
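One way the statement above might be reshaped for neo4j-admin import is sketched below in Python. neo4j-admin import does not run MERGE logic; as far as I understand it, it expects separate, already de-duplicated node and relationship files whose headers use `:ID`, `:LABEL`, `:START_ID`, `:END_ID` and `:TYPE`, and it builds a database offline rather than updating a running one. The function names `build_tables` and `write_tables`, the output file names, and the choice of `name` as the unique key per label are assumptions made for illustration. Unlike the Cypher above, which MERGEs nodes before the WHERE filter, this sketch skips filtered rows entirely.

```python
import csv
from pathlib import Path

# Header rows in neo4j-admin import format; the ID spaces (Sender, Link, ...)
# keep identical names under different labels from colliding.
NODE_HEADERS = {
    "senders": ["name:ID(Sender)", "domain", ":LABEL"],
    "links": ["name:ID(Link)", "topLevelDomain", "htmlEncodedMessage", ":LABEL"],
    "attachments": ["name:ID(Attachment)", "fileExtension",
                    "containsMultipleExtensions", ":LABEL"],
    "recipients": ["name:ID(Recipient)", ":LABEL"],
}
# One relationship file per (start ID space, end ID space) pair.
REL_SPACES = {
    "sender_link": ("Sender", "Link"),
    "link_recipient": ("Link", "Recipient"),
    "sender_attachment": ("Sender", "Attachment"),
    "attachment_recipient": ("Attachment", "Recipient"),
}


def build_tables(rows):
    """Turn parsed e-mail rows (dicts) into de-duplicated import tables."""
    nodes = {name: {} for name in NODE_HEADERS}
    rels = {name: set() for name in REL_SPACES}
    for row in rows:
        # Same condition as the WHERE clause of the URL AND ATTACHMENT load.
        if row["Url_Tld"] == "false" or row["FileHash"] == "false":
            continue
        s, l, a, r = row["From"], row["Url_Sub_Fld"], row["FileHash"], row["To"]
        # Assumption: `name` alone identifies a node; the first row seen wins.
        nodes["senders"].setdefault(s, [s, row["Sender_Sub_Fld"], "Sender"])
        nodes["links"].setdefault(l, [l, row["Url_Tld"], row["HTML_Encoded"], "Link"])
        nodes["attachments"].setdefault(
            a, [a, row["FileName_Ext"], row["MultipleExtensions"], "Attachment"])
        nodes["recipients"].setdefault(r, [r, "Recipient"])
        # The :TYPE column takes over the role of apoc.merge.relationship.
        rels["sender_link"].add((s, l, row["Outcome"]))
        rels["link_recipient"].add((l, r, row["Outcome2"]))
        rels["sender_attachment"].add((s, a, row["Outcome"]))
        rels["attachment_recipient"].add((a, r, row["Outcome2"]))
    tables = {}
    for name, header in NODE_HEADERS.items():
        tables[name] = [header] + list(nodes[name].values())
    for name, (start, end) in REL_SPACES.items():
        header = [f":START_ID({start})", f":END_ID({end})", ":TYPE"]
        tables[name] = [header] + [list(t) for t in sorted(rels[name])]
    return tables


def write_tables(tables, out_dir):
    """Write each table to <out_dir>/<name>.csv."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, table in tables.items():
        with open(out / f"{name}.csv", "w", newline="", encoding="utf-8") as fh:
            csv.writer(fh).writerows(table)
```

With Neo4j 4.x the resulting files would then be passed as repeated options, for example `neo4j-admin import --nodes=senders.csv --nodes=links.csv --relationships=sender_link.csv` and so on for the remaining files; the command name and options differ between versions, so check the documentation for the one in use. Because the tool targets an empty database, it may suit an initial bulk load better than an hourly update.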
Or, how can I wrap this code in py2neo?
I just created a module to hold the server connection info, wrapped everything in a py2neo query, and then executed it.
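# Connection settings live in a local module (py_2_neo_pass) so credentials stay out of this script.
from py_2_neo_pass import db_server, db_user, db_password
from py2neo import Graph
# py2neo reads the `host` and `user` settings; unrecognised keywords such as `ip_addr` may be silently ignored.
graph = Graph(host=db_server, user=db_user, password=db_password)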
query='''
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM ("file:///sessions/4_hour_parsed_and_ready.csv") AS row
MERGE (a:Sender { name: row.From, domain: row.Sender_Sub_Fld})
MERGE (b:Link { name: row.Url_Sub_Fld, topLevelDomain: row.Url_Tld, htmlEncodedMessage: row.HTML_Encoded})
MERGE (c:Attachment { name: row.FileHash, fileExtension: row.FileName_Ext, containsMultipleExtensions: row.MultipleExtensions})
MERGE (d:Recipient { name: row.To})
WITH a,b,c,d,row
WHERE NOT row.Url_Tld = "false" AND NOT row.FileHash = "false"
CALL apoc.merge.relationship(a, row.Outcome, {}, {}, b) YIELD rel as rel1
CALL apoc.merge.relationship(b, row.Outcome2, {}, {}, d) YIELD rel as rel2
CALL apoc.merge.relationship(a, row.Outcome, {}, {}, c) YIELD rel as rel3
CALL apoc.merge.relationship(c, row.Outcome2, {}, {}, d) YIELD rel as rel4
RETURN a,b,c,d
'''
graph.run(query)
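For an hourly batch job, an alternative to LOAD CSV that stays entirely in Python is to read the file with the `csv` module and send the rows as a query parameter, one batch at a time. The sketch below assumes the same column names as the query above; `load_csv_in_batches`, `batches` and `BATCH_QUERY` are hypothetical names, and the `run` argument is expected to be something like `graph.run`, which accepts query parameters as keyword arguments. Here the file is read by the Python process, so the path is a local path rather than a `file:///` URL resolved by the server.

```python
import csv
from itertools import islice

# Same statement as above, but fed by a $rows parameter instead of LOAD CSV.
BATCH_QUERY = """
UNWIND $rows AS row
MERGE (a:Sender { name: row.From, domain: row.Sender_Sub_Fld})
MERGE (b:Link { name: row.Url_Sub_Fld, topLevelDomain: row.Url_Tld, htmlEncodedMessage: row.HTML_Encoded})
MERGE (c:Attachment { name: row.FileHash, fileExtension: row.FileName_Ext, containsMultipleExtensions: row.MultipleExtensions})
MERGE (d:Recipient { name: row.To})
WITH a,b,c,d,row
WHERE NOT row.Url_Tld = "false" AND NOT row.FileHash = "false"
CALL apoc.merge.relationship(a, row.Outcome, {}, {}, b) YIELD rel as rel1
CALL apoc.merge.relationship(b, row.Outcome2, {}, {}, d) YIELD rel as rel2
CALL apoc.merge.relationship(a, row.Outcome, {}, {}, c) YIELD rel as rel3
CALL apoc.merge.relationship(c, row.Outcome2, {}, {}, d) YIELD rel as rel4
RETURN count(*) AS processed
"""


def batches(iterable, size):
    """Yield lists of at most `size` items until the iterable is exhausted."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_csv_in_batches(run, csv_path, batch_size=1000):
    """Send the CSV to Neo4j in batches; returns the number of rows sent.

    `run` is any callable with the shape run(cypher, **params),
    for example the `run` method of a py2neo Graph.
    """
    total = 0
    with open(csv_path, newline="", encoding="utf-8") as fh:
        for chunk in batches(csv.DictReader(fh), batch_size):
            run(BATCH_QUERY, rows=chunk)  # each call handles one batch
            total += len(chunk)
    return total
```

A call might look like `load_csv_in_batches(graph.run, "4_hour_parsed_and_ready.csv")`, where `graph` is the py2neo `Graph` created above. The batch size of 1000 simply mirrors the `USING PERIODIC COMMIT 1000` setting and can be tuned.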