Time-based data query with Neo4j showing more relationships than expected

I'm trying to get my head around Neo4j and time-based data.

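What I basically want to build is a data structure that can give me, for a given point in time, a tracking node (a page view) together with its referrer and the referrers of that referrer.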

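My problem is that, even though I save the data with a relationship to a time tree, relationships still show up that should not appear when I query a specific time by hour.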

During my research I found this article on modeling time series data with Neo4j.

So far so good, but the referrers and their child relationships are not abstracted by time.

To illustrate the problem better, here is the data structure first.

I created indexes:

CREATE INDEX ON :Year(value);
CREATE INDEX ON :Month(value);
CREATE INDEX ON :Day(value);
CREATE INDEX ON :Hour(value);
CREATE INDEX ON :Minute(value);
CREATE INDEX ON :Second(value);
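
For readers on a newer Neo4j release: as far as I know, the `CREATE INDEX ON :Label(prop)` form above is the legacy syntax, and Neo4j 4.x and later use a `FOR ... ON` form instead. A minimal sketch of that form, where the index names (`year_value` and so on) are made up for illustration:

```cypher
// Same indexes in the newer syntax; the index names are hypothetical
CREATE INDEX year_value FOR (n:Year) ON (n.value);
CREATE INDEX month_value FOR (n:Month) ON (n.value);
CREATE INDEX day_value FOR (n:Day) ON (n.value);
CREATE INDEX hour_value FOR (n:Hour) ON (n.value);
```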

and put the time nodes in place:

// Create time tree with day depth
WITH range(2015, 2017) AS years, range(1, 12) AS months
FOREACH (year IN years |
  CREATE (y:Year {value: year})
  FOREACH (month IN months |
    CREATE (m:Month {value: month})
    MERGE (y)-[:CONTAINS]->(m)
    FOREACH (day IN (CASE
                       WHEN month IN [1,3,5,7,8,10,12] THEN range(1,31)
                       WHEN month = 2 THEN
                         CASE
                           WHEN year % 4 <> 0 THEN range(1,28)
                           // leap year: divisible by 4, and not by 100 unless also by 400
                           WHEN year % 100 <> 0 OR year % 400 = 0 THEN range(1,29)
                           ELSE range(1,28)
                         END
                       ELSE range(1,30)
                     END) |
      CREATE (d:Day {value: day})
      MERGE (m)-[:CONTAINS]->(d))))
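
A quick sanity check of the tree can be sketched as a count of days per month for one year. It uses only the labels and the `CONTAINS` relationship from the script above; February of a leap year such as 2016 is the row worth looking at.

```cypher
// Count the Day nodes hanging off each Month of 2016
MATCH (:Year {value: 2016})-[:CONTAINS]->(m:Month)-[:CONTAINS]->(d:Day)
RETURN m.value AS month, count(d) AS days
ORDER BY month
```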

If I now save data:

MERGE (a:tracking {ip:'someniceid', type:'page_view', timestamp:'2154645'})
MERGE (f:Domain {name:'domain1.com'})
MERGE (e:Domain {name:'domain2.com'})
MERGE (d:Domain {name:'domain3.com'})
MERGE (z:Domain {name:'domain4.com'})
MERGE (a)-[:CAME_FROM]->(f)
MERGE (f)-[:REFERRED_BY]->(e)
MERGE (e)-[:REFERRED_BY]->(d)
MERGE (d)-[:REFERRED_BY]->(z)
WITH a, 2016 AS y 
MATCH (year:Year {value: y})
WITH a, year, 5 AS m 
MATCH (year)-[:CONTAINS]->(month:Month {value: m})
WITH a, month, 9 AS d 
MATCH (month)-[:CONTAINS]->(day:Day {value: d})
WITH a, day, 14 AS h 
MERGE (day)-[:CONTAINS]->(hour:Hour {value: h})
MERGE (a)-[:HAPPENED_ON]->(hour)

I get the following graph with this query:

MATCH (y)-[:CONTAINS]->(m:Month {value: 5}) WITH y, m
MATCH (m)-[:CONTAINS]->(d {value: 9}) WITH y, m, d
MATCH (d)-[:CONTAINS]->(h {value: 14}) WITH y, m, d, h
MATCH (a:tracking)-[:HAPPENED_ON]->(h),(a)-[:CAME_FROM|:REFERRED_BY*]->(dom) RETURN dom AS D, a AS A
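
The same lookup can also be written as one path through the time tree. The sketch below assumes the year should be pinned to 2016 (the query above leaves the year open) and adds the labels used when the tree was built. It is meant as an equivalent, somewhat stricter formulation, not as a fix for the extra relationships.

```cypher
// Walk the time tree down to one hour (the year 2016 is an assumption)
MATCH (:Year {value: 2016})-[:CONTAINS]->(:Month {value: 5})
      -[:CONTAINS]->(:Day {value: 9})-[:CONTAINS]->(h:Hour {value: 14})
// Tracking nodes attached to that hour
MATCH (a:tracking)-[:HAPPENED_ON]->(h)
// Follow the referrer chain of any length from each tracking node
MATCH (a)-[:CAME_FROM|REFERRED_BY*]->(dom)
RETURN dom AS D, a AS A
```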

When I now save one more data set, where the only difference is the hour and one domain (instead of domain4 we now have domain6), for example:

MERGE (a:tracking {ip:'someniceid', type:'page_view', timestamp:'2154645'})
MERGE (f:Domain {name:'domain1.com'})
MERGE (e:Domain {name:'domain2.com'})
MERGE (d:Domain {name:'domain3.com'})
MERGE (z:Domain {name:'domain6.com'})
MERGE (a)-[:CAME_FROM]->(f)
MERGE (f)-[:REFERRED_BY]->(e)
MERGE (e)-[:REFERRED_BY]->(d)
MERGE (d)-[:REFERRED_BY]->(z)
WITH a, 2016 AS y 
MATCH (year:Year {value: y})
WITH a, year, 5 AS m 
MATCH (year)-[:CONTAINS]->(month:Month {value: m})
WITH a, month, 9 AS d 
MATCH (month)-[:CONTAINS]->(day:Day {value: d})
WITH a, day, 10 AS h 
MERGE (day)-[:CONTAINS]->(hour:Hour {value: h})
MERGE (a)-[:HAPPENED_ON]->(hour)

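As a result, the same query as above now returns one more referrer, which I think should not happen, because the time (hour) node related to the tracking node is different.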

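Although the tracking is connected to a different hour node, the referrer relationship is still shown! What am I doing wrong? To me, domain6 should not be visible, because the related tracking has no connection to that time node... Does anyone have an idea?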
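
Two queries along these lines can help inspect what the two write statements actually stored. They are only a sketch using the labels and relationship types from the statements above. The background is that `MERGE` on a node pattern reuses an existing node with the same label and properties instead of creating a new one. That applies to the `Domain` nodes, and also to the `tracking` node, because both statements use identical `ip`, `type` and `timestamp` values.

```cypher
// 1) Which hours is each tracking node attached to?
MATCH (a:tracking)-[:HAPPENED_ON]->(h:Hour)
RETURN a, collect(h.value) AS hours;

// 2) Which referrers does each Domain node point to?
MATCH (d:Domain)-[:REFERRED_BY]->(ref:Domain)
RETURN d.name AS domain, collect(ref.name) AS referrers
ORDER BY domain;
```

If a `Domain` row lists more than one referrer, any variable-length traversal that reaches that domain will follow all of them, regardless of which event wrote which relationship.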

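The problem is that MERGE does not create new Domain nodes for each tracked event, so the existing ones are reused and you end up storing an incorrect sequence of domains. Try creating, for each tracking, links that point to the domains: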

MERGE (a:tracking {ip:'someniceid', type:'page_view', timestamp:'2154645'})
MERGE (_f:Domain {name:'domain1.com'})
MERGE (_e:Domain {name:'domain2.com'})
MERGE (_d:Domain {name:'domain3.com'})
MERGE (_z:Domain {name:'domain4.com'})
CREATE (f:Symlink)-[:Symlink]->(_f)
CREATE (e:Symlink)-[:Symlink]->(_e)
CREATE (d:Symlink)-[:Symlink]->(_d)
CREATE (z:Symlink)-[:Symlink]->(_z)
MERGE (a)-[:CAME_FROM]->(f)
MERGE (f)-[:REFERRED_BY]->(e)
MERGE (e)-[:REFERRED_BY]->(d)
MERGE (d)-[:REFERRED_BY]->(z)
WITH a, 2016 AS y 
MATCH (year:Year {value: y})
WITH a, year, 5 AS m 
MATCH (year)-[:CONTAINS]->(month:Month {value: m})
WITH a, month, 9 AS d 
MATCH (month)-[:CONTAINS]->(day:Day {value: d})
WITH a, day, 14 AS h 
MERGE (day)-[:CONTAINS]->(hour:Hour {value: h})
MERGE (a)-[:HAPPENED_ON]->(hour)
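
With this model, reading the referrer chain back means following the per-event `Symlink` nodes and then resolving each one to its shared `Domain`. A minimal sketch, with two assumptions: each event has its own `tracking` node (for example because its `timestamp` differs), and the hour lookup is shortened to a plain `Hour` match instead of the full year/month/day path:

```cypher
// Tracking nodes recorded for hour 14 (simplified: no year/month/day filter)
MATCH (a:tracking)-[:HAPPENED_ON]->(:Hour {value: 14})
// Follow the per-event Symlink chain, then resolve each Symlink to its Domain
MATCH (a)-[:CAME_FROM|REFERRED_BY*]->(s:Symlink)-[:Symlink]->(dom:Domain)
RETURN a AS A, dom.name AS domain
```

Because the `REFERRED_BY` relationships now connect `Symlink` nodes created for one event only, the traversal cannot wander into another event's chain, while the `Domain` nodes themselves stay unique.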