neo4j load csv - some part doesn't work
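I'm having a problem importing from CSV.
I'm running the following in the shell, and the last part, MERGE (e1)-[:NEXT]->(hit), never happens.
It's a bit frustrating...
Each session has x hits.
I want to find the last hit inserted for the session and connect it to the new hit with a NEXT relationship.
PSV sample: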
Session_id|date_time
Xxx|2015-01-01T01:00:00
Xxx|2015-02-02T09:00:00
Yyy|2015-03-03T06:00:44
Code:
USING PERIODIC COMMIT 100
LOAD CSV WITH HEADERS FROM 'file:///home/xxx.csv' AS line FIELDTERMINATOR '|'
MERGE (session :Session { session_id:line.session_id })
MERGE (hit:Hit{date:line.date_time})
// ....... more merges ......
// relationships
CREATE (hit)-[:IN_SESSION]->(session)
CREATE ....//more relations
WITH session
MATCH (prev_hit:Hit)-[:IN_SESSION]->(session)
WITH prev_hit ORDER BY prev_hit.date_time DESC LIMIT 2
WITH collect(prev_hit) as entries
FOREACH(i in RANGE(0, length(entries)-1) |
FOREACH(e1 in [entries[i]] |
MERGE (e1)-[:NEXT]->(hit)))
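I don't understand what you are trying to achieve with the nested FOREACH loops.
If you already have the hit node and the session node, a simple MERGE should be enough. However, I think you have to include hit in the WITH statement. After WITH session, the hit variable is no longer in scope, so the final MERGE does not refer to the node you just created.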
MERGE (session :Session { id: "xxx" })
MERGE (hit:Hit { date_time:"2015-04-03T06:00:44" })
CREATE (hit)-[:IN_SESSION]->(session)
WITH session, hit
MATCH (prev_hit:Hit)-[:IN_SESSION]->(session)
WHERE prev_hit <> hit // make sure that you only match other hits
WITH hit, prev_hit
ORDER BY prev_hit.date_time DESC LIMIT 1
MERGE (prev_hit)-[:NEXT]->(hit) // create relationship between the two
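Update
I updated the query so that prev_hit only matches hits other than the current one. The query above works the way you want, i.e. it creates a NEXT relationship to a single Hit node related to the same Session. See here: http://console.neo4j.org/?id=ov7mer
There might be a problem with date_time. I think you store it as a string, so the ordering might not always give you the expected result.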
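To make the ordering concern concrete: ISO-8601 strings in one uniform format, like those in the sample, do sort chronologically as plain strings, but that breaks as soon as formats are mixed. A minimal sketch of one way around it, assuming a Neo4j version with temporal types (3.4 or later) and using `ts` as a hypothetical property name that is not part of the original data model:

```cypher
// Assumption: Neo4j 3.4+ where localdatetime() exists.
// `ts` is a hypothetical property; the original string in date_time is kept.
MATCH (hit:Hit)
WHERE hit.date_time IS NOT NULL
SET hit.ts = localdatetime(hit.date_time)
```

Queries could then use ORDER BY hit.ts instead of ordering on the string.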
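Update 2
Regarding your second comment: if you go through the file line by line and add Hit nodes, you can only add relationships to Hit nodes that have already been added. If you want a continuous chain of NEXT relationships between the Hit nodes, you can only do this in a single query if you make sure that the entries of the CSV file are ordered by date_time ascending.
You can add the NEXT relationships between the Hit nodes later, as described here: http://www.markhneedham.com/blog/2014/04/19/neo4j-cypher-creating-relationships-between-a-collection-of-nodes-invalid-input/
Starting query: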
MATCH (s:Session)--(hit:Hit)
// first order by hit.date_time
WITH DISTINCT s, hit ORDER BY hit.date_time DESC
// this will return one row per session with the hits in a collection
WITH s, collect(hit) AS this_session_hits
// try this to check the ordering:
// RETURN s.session_id, this_session_hits
// the following queries will be done on each row, this is like iterating over the sessions
FOREACH(i in RANGE(0, length(this_session_hits)-2) |
FOREACH(e1 in [this_session_hits[i]] |
FOREACH(e2 in [this_session_hits[i+1]] |
MERGE (e1)-[:NEXT]->(e2))))
Final answer ;)
This query works on the dataset in your neo4j console (http://console.neo4j.org/?id=mginka). It connects all the Hit nodes of a session with NEXT relationships.
MATCH (s:Session)<--(hit:Hit)
WITH DISTINCT s, hit
ORDER BY hit.date_time ASC
WITH s, collect(hit) AS this_session_hits
FOREACH (i IN RANGE(0, length(this_session_hits)-2)|
FOREACH (e1 IN [this_session_hits[i]]|
FOREACH (e2 IN [this_session_hits[i+1]]|
MERGE (e1)-[:NEXT]->(e2))))
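On newer Neo4j releases, length() no longer accepts a list and size() is used instead. A possible variant of the same idea, given as a sketch and not tested against the console dataset, replaces the nested FOREACH with UNWIND. It assumes the IN_SESSION relationship type from the question; `hits`, `earlier` and `later` are illustrative names:

```cypher
// Assumption: a Neo4j version where size() is the list-length function.
MATCH (s:Session)<-[:IN_SESSION]-(hit:Hit)
WITH DISTINCT s, hit
ORDER BY hit.date_time ASC
// One row per session, hits collected in ascending date_time order
WITH s, collect(hit) AS hits
// One row per adjacent pair; range(0, -1) is empty, so sessions with a single hit are skipped
UNWIND range(0, size(hits) - 2) AS i
WITH hits[i] AS earlier, hits[i + 1] AS later
MERGE (earlier)-[:NEXT]->(later)
```

Because MERGE is used, running the query again should not create duplicate NEXT relationships between the same pair of hits.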