How to append reasons to the similarity output in Gremlin
I have the following simple graph:
User -- Likes --> Item
I am using the following Gremlin code to find the top 10 users most similar to user u:
u.out('Likes').in('Likes').filter([u]).groupCount.cap.orderMap(T.decr)[0..10].map()
This outputs something like:
==>{userid=1}
==>{userid=5}
==>{userid=10}
==>{userid=15}
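I would like the output to be more informative and include additional details, such as the rank in the ordered map and the items (itemid) shared with the original user, like this: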
==>{userid=1, rank=0, reason_items={1,2,3,5}}
==>{userid=5, rank=1, reason_items={1,2,10}}
==>{userid=10, rank=2, reason_items={1,2,4}}
==>{userid=15, rank=3, reason_items={1,2}}
An efficient gremlin-groovy code example would be great!
Thanks.
By appending an appropriate transform closure to your query:
rank = 0; itemsU1 = [] as Set; u1.out('Likes').aggregate(itemsU1).in('Likes')
.filter{it != u1}.groupCount.cap.orderMap(T.decr)
.transform{[id:it.id, rank:rank++, reason_item_ids:itemsU1.intersect(it.out('Likes').toSet()).collect{it.id}]}
...you get:
==>{id=User6, rank=0, reason_item_ids=[Item1, Item5]}
==>{id=User4, rank=1, reason_item_ids=[Item1, Item2]}
==>{id=User2, rank=2, reason_item_ids=[Item1]}
==>{id=User5, rank=3, reason_item_ids=[Item5]}
==>{id=User3, rank=4, reason_item_ids=[Item2]}
for the following sample graph:
g = new TinkerGraph()
u1 = g.addVertex('User1')
u2 = g.addVertex('User2')
u3 = g.addVertex('User3')
u4 = g.addVertex('User4')
u5 = g.addVertex('User5')
u6 = g.addVertex('User6')
i1 = g.addVertex('Item1')
i2 = g.addVertex('Item2')
i3 = g.addVertex('Item3')
i4 = g.addVertex('Item4')
i5 = g.addVertex('Item5')
g.addEdge(u1,i1,'Likes')
g.addEdge(u1,i2,'Likes')
g.addEdge(u1,i5,'Likes')
g.addEdge(u2,i1,'Likes')
g.addEdge(u2,i4,'Likes')
g.addEdge(u3,i2,'Likes')
g.addEdge(u4,i1,'Likes')
g.addEdge(u4,i2,'Likes')
g.addEdge(u4,i3,'Likes')
g.addEdge(u5,i4,'Likes')
g.addEdge(u5,i5,'Likes')
g.addEdge(u6,i1,'Likes')
g.addEdge(u6,i4,'Likes')
g.addEdge(u6,i5,'Likes')
Given Faber's sample graph, you could do this:
u = u1; m = [:].withDefault {[]}; rank = 0; key = null
u.out('Likes').as('item').in('Likes').except([u]).as('user').select().groupBy {
key = it.getColumn('user')
} {
m[key] << it.getColumn('item').id
} {
it.size()
}.cap().orderMap(T.decr)[0..10].transform {[
'userid' : it.id,
'rank' : rank++,
'reason_item_ids': m[it]
]}
This works without a nested traversal inside .transform().
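To sanity-check the ranking and the shared items independently of Gremlin, here is a minimal plain-Groovy sketch that computes the same result on an in-memory map mirroring the sample graph. The `likes` map and the `similarUsers` helper are hypothetical names introduced only for illustration; they are not part of Gremlin or of either answer above.

```groovy
// Hypothetical in-memory copy of the sample graph: user -> liked items.
def likes = [
    User1: ['Item1', 'Item2', 'Item5'],
    User2: ['Item1', 'Item4'],
    User3: ['Item2'],
    User4: ['Item1', 'Item2', 'Item3'],
    User5: ['Item4', 'Item5'],
    User6: ['Item1', 'Item4', 'Item5']
]

// Hypothetical helper: rank other users by the number of items shared with `user`.
def similarUsers(Map likes, String user, int limit = 10) {
    def own = likes[user] as Set
    // For every other user, keep only the items also liked by `user`.
    def shared = likes.findAll { name, items -> name != user }
        .collectEntries { name, items -> [(name): items.findAll { it in own }] }
        .findAll { name, items -> items }   // drop users with nothing in common
    // Most shared items first; the order among equal counts is not significant.
    def ranked = shared.sort { a, b -> b.value.size() <=> a.value.size() }
    def rank = 0
    ranked.take(limit).collect { name, items ->
        [userid: name, rank: rank++, reason_item_ids: items]
    }
}

similarUsers(likes, 'User1').each { println it }
```

As in the Gremlin output above, User4 and User6 (two shared items each) come before User2, User3 and User5 (one shared item each); which of the tied users comes first depends on iteration order.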