SPARQL: randomly select one connection for each node
I have the following data:
<node:1><urn:connectTo><node:2>
<node:1><urn:connectTo><node:3>
<node:1><urn:connectTo><node:4>
<node:2><urn:connectTo><node:10>
<node:2><urn:connectTo><node:11>
<node:2><urn:connectTo><node:12>
<node:3><urn:connectTo><node:21>
<node:3><urn:connectTo><node:13>
<node:3><urn:connectTo><node:41>
<node:3><urn:connectTo><node:100>
<node:4><urn:connectTo><node:119>
<node:4><urn:connectTo><node:120>
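As you can see, each node has multiple connections. I want to select one random connection for each node. How can I do this? I tried the following queries, but none of them solved the problem: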
select ?currentNode ?nextNode where {
?currentNode ?p ?nextNode
BIND(RAND() AS ?orderKey)
}
ORDER BY ?orderKey
LIMIT 1
select ?currentNode SAMPLE(?nextNode) as ?nextNode1
where {
?currentNode ?p ?nextNode
}
GROUP BY ?currentNode
Note: this returns the first connection of each node, but the choice is not random.
select ?currentNode ?nextNode (COUNT(?nextNode) AS ?noOfChoices)
where {
?currentNode ?p ?nextNode
BIND(RAND() AS ?orderKey)
}
GROUP BY ?currentNode
ORDER BY ?orderKey
OFFSET (RAND()*?noOfChoices)
LIMIT 1
The sample aggregate returns one individual from a group. From the SPARQL 1.1 specification:
Sample is a set function which returns an arbitrary value from the
multiset passed to it. … For example, given Sample({"a", "b",
"c"}), "a", "b", and "c" are all valid return values. Note that
Sample() is not required to be deterministic for a given input, the
only restriction is that the output value must be present in the input
multiset.
That would be a query like this:
prefix node: <node:>
prefix urn: <urn:>
select ?source (sample(?_target) as ?target) where {
?source urn:connectTo ?_target
}
group by ?source
---------------------
| source | target |
=====================
| node:1 | node:2 |
| node:2 | node:10 |
| node:3 | node:13 |
| node:4 | node:119 |
---------------------
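Of course, as you noticed, an implementation is only required to return an arbitrary individual, and it may well return the same one every time. You could order the results in a subquery and hope that randomizing the order of the targets makes sample return different results, but nothing requires the order of the subquery results to be preserved. It would look like this: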
prefix node: <node:>
prefix urn: <urn:>
select ?source (sample(?_target) as ?target) where {
{ select ?source ?_target {
?source urn:connectTo ?_target
}
order by rand() }
}
group by ?source
This seems to work in Apache Jena. Here are the results of repeated invocations:
---------------------
| source | target |
=====================
| node:1 | node:2 |
| node:2 | node:11 |
| node:3 | node:100 |
| node:4 | node:120 |
---------------------
---------------------
| source | target |
=====================
| node:1 | node:3 |
| node:2 | node:11 |
| node:3 | node:13 |
| node:4 | node:120 |
---------------------
---------------------
| source | target |
=====================
| node:1 | node:3 |
| node:2 | node:10 |
| node:3 | node:21 |
| node:4 | node:119 |
---------------------
---------------------
| source | target |
=====================
| node:1 | node:3 |
| node:2 | node:10 |
| node:3 | node:100 |
| node:4 | node:119 |
---------------------
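Since neither sample nor the preserved subquery order is guaranteed to be random, one fallback is to fetch all (source, target) pairs with a plain query and make the random choice on the client. The following is only a minimal sketch of that idea in Python, not something the engines above prescribe. The rows list is hard-coded from the data in the question, and the function name pick_one_per_source is hypothetical; in practice the rows would come from whatever SPARQL client you use.

```python
import random
from collections import defaultdict

# Assumed input: (source, target) pairs, as a plain
# "select ?source ?target { ?source urn:connectTo ?target }" would return.
# Hard-coded here from the data in the question.
rows = [
    ("node:1", "node:2"), ("node:1", "node:3"), ("node:1", "node:4"),
    ("node:2", "node:10"), ("node:2", "node:11"), ("node:2", "node:12"),
    ("node:3", "node:21"), ("node:3", "node:13"), ("node:3", "node:41"),
    ("node:3", "node:100"),
    ("node:4", "node:119"), ("node:4", "node:120"),
]


def pick_one_per_source(rows, rng=random):
    """Group targets by source, then pick one target uniformly per source.

    Hypothetical helper name; rng can be a seeded random.Random instance
    when reproducible picks are wanted.
    """
    targets = defaultdict(list)
    for source, target in rows:
        targets[source].append(target)
    return {source: rng.choice(candidates)
            for source, candidates in targets.items()}


if __name__ == "__main__":
    # Prints one randomly chosen target per source; output varies per run.
    for source, target in sorted(pick_one_per_source(rows).items()):
        print(source, target)
```

This moves the randomness out of the query, so the result no longer depends on how a particular engine implements sample or orders subquery results. The cost is that every connection has to be transferred to the client first.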